Paper Digest: Recent Papers on Transformer
The Paper Digest Team extracted all recent Transformer (NLP) related papers on our radar and generated a highlight sentence for each. The results are sorted by relevance and date. In addition to this ‘static’ page, we also provide a real-time version of this article, which has broader coverage and is continuously updated to include the most recent work on this topic.
Based in New York, Paper Digest is dedicated to producing high-quality text analysis results that people can actually use on a daily basis. Since 2018, we have been serving users across the world with a number of exclusive services to track, search, review and rewrite scientific literature.
You are welcome to follow us on Twitter and LinkedIn to stay updated with new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: Recent Papers on Transformer
Paper | Author(s) | Source | Date |
---|---|---|---|
1 | Refuse Whenever You Feel Unsafe: Improving Safety in LLMs Via Decoupled Refusal Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel approach, Decoupled Refusal Training (DeRTa), designed to empower LLMs to refuse compliance to harmful prompts at any response position, significantly enhancing their safety capabilities. |
YOULIANG YUAN et al. | arxiv-cs.CL | 2024-07-12 |
2 | Robustness of LLMs to Perturbations in Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To that end, we artificially introduce varying levels of noise into a diverse set of datasets and systematically evaluate LLMs’ robustness against the corrupt variations of the original text. |
Ayush Singh; Navpreet Singh; Shubham Vatsal; | arxiv-cs.CL | 2024-07-12 |
3 | ASTPrompter: Weakly Supervised Automated Language Model Red-Teaming to Identify Likely Toxic Prompts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we propose a reinforcement learning formulation of the LLM red-teaming task which allows us to discover prompts that both (1) trigger toxic outputs from a frozen defender and (2) have low perplexity as scored by the defender. |
Amelia F. Hardy; Houjun Liu; Bernard Lange; Mykel J. Kochenderfer; | arxiv-cs.CL | 2024-07-12 |
4 | Movie Recommendation with Poster Attention Via Multi-modal Transformer Feature Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a multi-modal movie recommendation system that extracts features from the well-designed poster of each movie and from the movie’s narrative text description. |
Linhan Xia; Yicheng Yang; Ziou Chen; Zheng Yang; Shengxin Zhu; | arxiv-cs.IR | 2024-07-12 |
5 | The Two Sides of The Coin: Hallucination Generation and Detection with LLMs As Evaluators for LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents our participation in the CLEF ELOQUENT HalluciGen shared task, where the goal is to develop evaluators for both generating and detecting hallucinated content. |
ANH THU MARIA BUI et al. | arxiv-cs.AI | 2024-07-12 |
6 | On Exact Bit-level Reversible Transformers Without Changing Architectures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose exact bit-level reversible transformers without changing the architectures in the inference procedure. |
Guoqiang Zhang; J. P. Lewis; W. B. Kleijn; | arxiv-cs.LG | 2024-07-12 |
7 | Detect Llama — Finding Vulnerabilities in Smart Contracts Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we test the hypothesis that although OpenAI’s GPT-4 performs well generally, we can fine-tune open-source models to outperform GPT-4 in smart contract vulnerability detection. |
Peter Ince; Xiapu Luo; Jiangshan Yu; Joseph K. Liu; Xiaoning Du; | arxiv-cs.CR | 2024-07-11 |
8 | Self-Evolving GPT: A Lifelong Autonomous Experiential Learner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they rely on manual efforts to acquire and apply such experience for each task, which is not feasible for the growing demand for LLMs and the variety of user questions. To address this issue, we design a lifelong autonomous experiential learning framework based on LLMs to explore whether LLMs can imitate human ability for learning and utilizing experience. |
JINGLONG GAO et al. | arxiv-cs.CL | 2024-07-11 |
9 | LLMs’ Morphological Analyses of Complex FST-generated Finnish Words Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work aims to shed light on the issue by evaluating state-of-the-art LLMs in a task of morphological analysis of complex Finnish noun forms. |
Anssi Moisio; Mathias Creutz; Mikko Kurimo; | arxiv-cs.CL | 2024-07-11 |
10 | GPT-4 Is Judged More Human Than Humans in Displaced and Inverted Turing Tests Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We found that both AI and displaced human judges were less accurate than interactive interrogators, with below-chance accuracy overall. |
Ishika Rathi; Sydney Taylor; Benjamin K. Bergen; Cameron R. Jones; | arxiv-cs.HC | 2024-07-11 |
11 | Teaching Transformers Causal Reasoning Through Axiomatic Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since interventional data is costly to generate, we study to what extent an agent can learn causal reasoning from passive data. |
Aniket Vashishtha; Abhinav Kumar; Abbavaram Gowtham Reddy; Vineeth N Balasubramanian; Amit Sharma; | arxiv-cs.LG | 2024-07-10 |
12 | FsPONER: Few-shot Prompt Optimization for Named Entity Recognition in Domain-specific Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, none of the previous approaches has investigated the efficiency of LLM-based few-shot learning in domain-specific scenarios. To address this gap, we introduce FsPONER, a novel approach for optimizing few-shot prompts, and evaluate its performance on domain-specific NER datasets, with a focus on industrial manufacturing and maintenance, while using multiple LLMs — GPT-4-32K, GPT-3.5-Turbo, LLaMA 2-chat, and Vicuna. |
Yongjian Tang; Rakebul Hasan; Thomas Runkler; | arxiv-cs.CL | 2024-07-10 |
13 | ROSA: Random Subspace Adaptation for Efficient Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we propose Random Subspace Adaptation (ROSA), a method that outperforms previous PEFT methods by a significant margin, while maintaining a zero latency overhead during inference time. |
Marawan Gamal Abdel Hameed; Aristides Milios; Siva Reddy; Guillaume Rabusseau; | arxiv-cs.LG | 2024-07-10 |
14 | Prompting Techniques for Secure Code Generation: A Systematic Investigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: OBJECTIVE: In this study, we investigate the impact of different prompting techniques on the security of code generated from NL instructions by LLMs. |
Catherine Tony; Nicolás E. Díaz Ferreyra; Markus Mutas; Salem Dhiff; Riccardo Scandariato; | arxiv-cs.SE | 2024-07-09 |
15 | Mixture-of-Modules: Reinventing Transformers As Dynamic Assemblies of Modules Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that MoM provides not only a unified framework for Transformers and their numerous variants but also a flexible and learnable approach for reducing redundancy in Transformer parameterization. |
ZHUOCHENG GONG et al. | arxiv-cs.CL | 2024-07-09 |
16 | PEER: Expertizing Domain-Specific Tasks with A Multi-Agent Framework and Tuning Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: High performance requires sophisticated processing techniques, yet managing multiple agents within a complex workflow often proves costly and challenging. To address this, we introduce the PEER (Plan, Execute, Express, Review) multi-agent framework. |
YIYING WANG et al. | arxiv-cs.AI | 2024-07-09 |
17 | Using Large Language Models for Generating Smart Contracts for Health Insurance from Textual Policies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To assess the LLM output, we propose completeness, soundness, clarity, syntax, and functioning code as metrics. |
Inwon Kang; William Van Woensel; Oshani Seneviratne; | arxiv-cs.CL | 2024-07-09 |
18 | A Comparison of Vulnerability Feature Extraction Methods from Textual Attack Patterns Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we examine five feature extraction methods (TF-IDF, LSI, BERT, MiniLM, RoBERTa) and find that Term Frequency-Inverse Document Frequency (TF-IDF) outperforms the other four methods with a precision of 75% and an F1 score of 64%. |
Refat Othman; Bruno Rossi; Barbara Russo; | arxiv-cs.CR | 2024-07-09 |
19 | Multilingual Blending: LLM Safety Alignment Evaluation with Language Mixture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce Multilingual Blending, a mixed-language query-response scheme designed to evaluate the safety alignment of various state-of-the-art LLMs (e.g., GPT-4o, GPT-3.5, Llama3) under sophisticated, multilingual conditions. |
Jiayang Song; Yuheng Huang; Zhehua Zhou; Lei Ma; | arxiv-cs.CL | 2024-07-09 |
20 | Cross-domain Few-shot In-context Learning for Enhancing Traffic Sign Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a cross-domain few-shot in-context learning method based on the MLLM for enhancing traffic sign recognition (TSR). |
YAOZONG GAN et al. | arxiv-cs.CV | 2024-07-08 |
21 | Surprising Gender Biases in GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present seven experiments exploring gender biases in GPT. |
Raluca Alexandra Fulgu; Valerio Capraro; | arxiv-cs.CY | 2024-07-08 |
22 | Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces the Multimodal Diffusion Transformer (MDT), a novel diffusion policy framework that excels at learning versatile behavior from multimodal goal specifications with few language annotations. |
Moritz Reuss; Ömer Erdinç Yağmurlu; Fabian Wenzel; Rudolf Lioutikov; | arxiv-cs.RO | 2024-07-08 |
23 | On The Power of Convolution Augmented Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, recent architectural recipes, such as state-space models, have bridged the performance gap. Motivated by this, we examine the benefits of Convolution-Augmented Transformer (CAT) for recall, copying, and length generalization tasks. |
Mingchen Li; Xuechen Zhang; Yixiao Huang; Samet Oymak; | arxiv-cs.LG | 2024-07-08 |
24 | Enhancing Computer Programming Education with LLMs: A Study on Effective Prompt Engineering for Python Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study underscores the crucial role of prompt engineering in maximizing the educational benefits of LLMs. By systematically categorizing and testing these strategies, we provide a comprehensive framework for both educators and students to optimize LLM-based learning experiences. |
Tianyu Wang; Nianjun Zhou; Zhixiong Chen; | arxiv-cs.AI | 2024-07-07 |
25 | Image-Conditional Diffusion Transformer for Underwater Image Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by the recent advance in generative models, we propose a novel UIE method based on image-conditional diffusion transformer (ICDT). |
XINGYANG NIE et al. | arxiv-cs.CV | 2024-07-07 |
26 | Associative Recurrent Memory Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper addresses the challenge of creating a neural architecture for very long sequences that requires constant time for processing new information at each time step. |
Ivan Rodkin; Yuri Kuratov; Aydar Bulatov; Mikhail Burtsev; | arxiv-cs.CL | 2024-07-05 |
27 | MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current datasets and benchmarks primarily focus on relatively simple scientific tasks and figures, lacking comprehensive assessments across diverse advanced scientific disciplines. To bridge this gap, we collected a multimodal, multidisciplinary dataset from open-access scientific articles published in Nature Communications journals. |
ZEKUN LI et al. | arxiv-cs.CL | 2024-07-05 |
28 | Using LLMs to Label Medical Papers According to The CIViC Evidence Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the sequence classification problem CIViC Evidence to the field of medical NLP. |
Markus Hisch; Xing David Wang; | arxiv-cs.CL | 2024-07-05 |
29 | Generalists Vs. Specialists: Evaluating Large Language Models for Urdu Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare general-purpose pretrained models, GPT-4-Turbo and Llama-3-8b-Instruct, with special-purpose models fine-tuned on specific tasks: XLM-Roberta-large, mT5-large, and Llama-3-8b-Instruct. |
Samee Arif; Abdul Hameed Azeemi; Agha Ali Raza; Awais Athar; | arxiv-cs.CL | 2024-07-05 |
30 | GPT Vs RETRO: Exploring The Intersection of Retrieval and Parameter-Efficient Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we apply PEFT methods (P-tuning, Adapters, and LoRA) to a modified Retrieval-Enhanced Transformer (RETRO) and a baseline GPT model across several sizes, ranging from 823 million to 48 billion parameters. |
Aleksander Ficek; Jiaqi Zeng; Oleksii Kuchaiev; | arxiv-cs.CL | 2024-07-05 |
31 | Towards Automating Text Annotation: A Case Study on Semantic Proximity Annotation Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main aim of this paper is to reuse human annotation guidelines along with some annotated data to design automatic prompts for LLMs, focusing on the semantic proximity annotation task. |
Sachin Yadav; Tejaswi Choppa; Dominik Schlechtweg; | arxiv-cs.CL | 2024-07-04 |
32 | From Data to Commonsense Reasoning: The Use of Large Language Models for Explainable AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the effectiveness of large language models (LLMs) on different QA tasks with a focus on their abilities in reasoning and explainability. |
Stefanie Krause; Frieder Stolzenburg; | arxiv-cs.AI | 2024-07-04 |
33 | TrackPGD: A White-box Attack Using Binary Masks Against Robust Transformer Trackers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel white-box attack named TrackPGD, which relies on the predicted object binary mask to attack robust transformer trackers. |
Fatemeh Nourilenjan Nokabadi; Yann Batiste Pequignot; Jean-Francois Lalonde; Christian Gagné; | arxiv-cs.CV | 2024-07-04 |
34 | HYBRINFOX at CheckThat! 2024 — Task 2: Enriching BERT Models with The Expert System VAGO for Subjectivity Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the HYBRINFOX method used to solve Task 2 of Subjectivity detection of the CLEF 2024 CheckThat! |
MORGANE CASANOVA et al. | arxiv-cs.CL | 2024-07-04 |
35 | Adaptive Step-size Perception Unfolding Network with Non-local Hybrid Attention for Hyperspectral Image Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To overcome the aforementioned drawbacks, we propose an adaptive step-size perception unfolding network (ASPUN), a deep unfolding network based on the FISTA algorithm, which uses an adaptive step-size perception module to estimate the update step size of each spectral channel. |
Yanan Yang; Like Xin; | arxiv-cs.CV | 2024-07-04 |
36 | GPT-4 Vs. Human Translators: A Comprehensive Evaluation of Translation Quality Across Languages, Domains, and Expertise Levels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study comprehensively evaluates the translation quality of Large Language Models (LLMs), specifically GPT-4, against human translators of varying expertise levels across multiple language pairs and domains. |
JIANHAO YAN et al. | arxiv-cs.CL | 2024-07-04 |
37 | Question-Analysis Prompting Improves LLM Performance in Reasoning Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel prompting strategy called Question Analysis Prompting (QAP), in which the model is prompted to explain the question in $n$ words before solving. |
Dharunish Yugeswardeenoo; Kevin Zhu; Sean O’Brien; | arxiv-cs.CL | 2024-07-04 |
38 | Slice-100K: A Multimodal Dataset for Extrusion-based 3D Printing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Currently there does not exist a large repository of curated CAD models along with their corresponding G-code files for additive manufacturing. To address this issue, we present SLICE-100K, a first-of-its-kind dataset of over 100,000 G-code files, along with their tessellated CAD model, LVIS (Large Vocabulary Instance Segmentation) categories, geometric properties, and renderings. |
ANUSHRUT JIGNASU et al. | arxiv-cs.CV | 2024-07-04 |
39 | CATT: Character-based Arabic Tashkeel Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a new approach to training ATD models. |
Faris Alasmary; Orjuwan Zaafarani; Ahmad Ghannam; | arxiv-cs.CL | 2024-07-03 |
40 | Large Language Models As Evaluators for Scientific Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study explores how well the state-of-the-art Large Language Models (LLMs), like GPT-4 and Mistral, can assess the quality of scientific summaries or, more fittingly, scientific syntheses, comparing their evaluations to those of human annotators. |
Julia Evans; Jennifer D’Souza; Sören Auer; | arxiv-cs.CL | 2024-07-03 |
41 | Mast Kalandar at SemEval-2024 Task 8: On The Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: SemEval 2024 introduces the task of Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, aiming to develop automated systems for identifying machine-generated text and detecting potential misuse. In this paper, we i) propose a RoBERTa-BiLSTM based classifier designed to classify text into two categories, AI-generated or human, and ii) conduct a comparative study of our model with baseline approaches to evaluate its effectiveness. |
Jainit Sushil Bafna; Hardik Mittal; Suyash Sethia; Manish Shrivastava; Radhika Mamidi; | arxiv-cs.CL | 2024-07-03 |
42 | InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present InternLM-XComposer-2.5 (IXC-2.5), a versatile large-vision language model that supports long-contextual input and output. |
PAN ZHANG et al. | arxiv-cs.CV | 2024-07-03 |
43 | Assessing The Code Clone Detection Capability of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to assess the performance of two advanced Large Language Models (LLMs), GPT-3.5 and GPT-4, in the task of code clone detection. |
Zixian Zhang; Takfarinas Saber; | arxiv-cs.SE | 2024-07-02 |
44 | RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel instruction fine-tuning framework RankRAG, which instruction-tunes a single LLM for the dual purpose of context ranking and answer generation in RAG. |
YUE YU et al. | arxiv-cs.CL | 2024-07-02 |
45 | Hypformer: Exploring Efficient Hyperbolic Transformer Fully in Hyperbolic Space Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This stems primarily from: (i) the absence of well-defined modules in hyperbolic space, including linear transformation layers, LayerNorm layers, activation functions, dropout operations, etc.; (ii) the quadratic time complexity of the existing hyperbolic self-attention module w.r.t. the number of input tokens, which hinders its scalability. To address these challenges, we propose Hypformer, a novel hyperbolic Transformer based on the Lorentz model of hyperbolic geometry. |
MENGLIN YANG et al. | arxiv-cs.LG | 2024-07-01 |
46 | Beyond Numeric Awards: In-Context Dueling Bandits with LLM Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, LLMs struggle to converge even when explicitly prompted to do so, and are sensitive to prompt variations. To overcome these issues, we introduce an LLM-augmented algorithm, IF-Enhanced LLM, which takes advantage of both in-context decision-making capabilities of LLMs and theoretical guarantees inherited from classic DB algorithms. |
Fanzeng Xia; Hao Liu; Yisong Yue; Tongxin Li; | arxiv-cs.LG | 2024-07-01 |
47 | MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents MMLongBench-Doc, a long-context, multi-modal benchmark comprising 1,062 expert-annotated questions. |
YUBO MA et al. | arxiv-cs.CV | 2024-07-01 |
48 | Analyzing Persuasive Strategies in Meme Texts: A Fusion of Language Models with Paraphrase Enrichment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes our approach to hierarchical multi-label detection of persuasion techniques in meme texts. |
Kota Shamanth Ramanath Nayak; Leila Kosseim; | arxiv-cs.CL | 2024-07-01 |
49 | Global-local Feature Learning for Fine-grained Food Classification Based on Swin Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jun-Hwa Kim; Namho Kim; Chee Sun Won; | Eng. Appl. Artif. Intell. | 2024-07-01 |
50 | Parm: Efficient Training of Large Sparsely-Activated Models with Dedicated Schedules Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the wide adoption of hybrid parallel paradigms like model parallelism, expert parallelism, and expert-sharding parallelism (i.e., MP+EP+ESP) to support MoE model training on GPU clusters, the training efficiency is hindered by communication costs introduced by these parallel paradigms. To address this limitation, we propose Parm, a system that accelerates MP+EP+ESP training by designing two dedicated schedules for placing communication tasks. |
XINGLIN PAN et al. | arxiv-cs.DC | 2024-06-30 |
51 | LegalTurk Optimized BERT for Multi-Label Text Classification and NER Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce our innovative modified pre-training approach by combining diverse masking strategies. |
Farnaz Zeidi; Mehmet Fatih Amasyali; Çiğdem Erol; | arxiv-cs.CL | 2024-06-30 |
52 | LLM-Generated Natural Language Meets Scaling Laws: New Explorations and Data Augmentation Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nonetheless, prior research harbors two primary concerns: firstly, a lack of contemplation regarding whether the natural language generated by LLM (LLMNL) truly aligns with human natural language (HNL), a critical foundational question; secondly, an oversight that augmented data is randomly generated by LLM, implying that not all data may possess equal training value, which could impede the performance of classifiers. To address these challenges, we introduce the scaling laws to intrinsically calculate LLMNL and HNL. |
Zhenhua Wang; Guang Xu; Ming Ren; | arxiv-cs.CL | 2024-06-29 |
53 | Can GPT-4 Help Detect Quit Vaping Intentions? An Exploration of Automatic Data Annotation Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we extracted a sample dataset from one vaping sub-community on Reddit to analyze users’ quit-vaping intentions. |
SAI KRISHNA REVANTH VURUMA et al. | arxiv-cs.CL | 2024-06-28 |
54 | Machine Learning Predictors for Min-Entropy Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Utilizing data from Generalized Binary Autoregressive Models, a subset of Markov processes, we demonstrate that machine learning models (including a hybrid of convolutional and recurrent Long Short-Term Memory layers and the transformer-based GPT-2 model) outperform traditional NIST SP 800-90B predictors in certain scenarios. |
Javier Blanco-Romero; Vicente Lorenzo; Florina Almenares Mendoza; Daniel Díaz-Sánchez; | arxiv-cs.LG | 2024-06-28 |
55 | ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, the practical efficiency of this paradigm remains unverified, particularly in the context of large language models (LLMs). This paper introduces the first scalable instantiation of this paradigm called ScaleBiO, focusing on bilevel optimization for large-scale LLM data reweighting. |
RUI PAN et al. | arxiv-cs.LG | 2024-06-28 |
56 | The Model Arena for Cross-lingual Sentiment Analysis: A Comparative Study in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Sentiment analysis serves as a pivotal component in Natural Language Processing (NLP). |
Xiliang Zhu; Shayna Gardiner; Tere Roldán; David Rossouw; | arxiv-cs.CL | 2024-06-27 |
57 | Fine-tuned Network Relies on Generic Representation to Solve Unseen Cognitive Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning pretrained language models has shown promising results on a wide range of tasks, but when encountering a novel task, do they rely more on generic pretrained representation, or develop brand new task-specific solutions? |
Dongyan Lin; | arxiv-cs.LG | 2024-06-27 |
58 | NTFormer: A Composite Node Tokenized Graph Transformer for Node Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a new graph Transformer called NTFormer to address this issue. |
Jinsong Chen; Siyu Jiang; Kun He; | arxiv-cs.LG | 2024-06-27 |
59 | FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose FRED, a wafer-scale interconnect that is tailored for the high-BW requirements of wafer-scale networks and can efficiently execute communication patterns of different parallelization strategies. |
Saeed Rashidi; William Won; Sudarshan Srinivasan; Puneet Gupta; Tushar Krishna; | arxiv-cs.AR | 2024-06-27 |
60 | HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge Into Multimodal LLMs at Scale Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using PubMedVision, we train a 34B medical MLLM HuatuoGPT-Vision, which shows superior performance in medical multimodal scenarios among open-source MLLMs. |
JUNYING CHEN et al. | arxiv-cs.CV | 2024-06-27 |
61 | BADGE: BADminton Report Generation and Evaluation with LLM Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel framework named BADGE, designed for this purpose using LLM. |
Shang-Hsuan Chiang; Lin-Wei Chao; Kuang-Da Wang; Chih-Chuan Wang; Wen-Chih Peng; | arxiv-cs.CL | 2024-06-26 |
62 | SetBERT: Enhancing Retrieval Performance for Boolean Logic and Set Operation Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce SetBERT, a fine-tuned BERT-based model designed to enhance query embeddings for set operations and Boolean logic queries, such as Intersection (AND), Difference (NOT), and Union (OR). |
Quan Mai; Susan Gauch; Douglas Adams; | arxiv-cs.CL | 2024-06-25 |
63 | This Paper Had The Smartest Reviewers — Flattery Detection Utilising An Audio-Textual Transformer-Based Approach Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Its automatic detection can thus enhance the naturalness of human-AI interactions. To meet this need, we present a novel audio-textual dataset comprising 20 hours of speech and train machine learning models for automatic flattery detection. |
LUKAS CHRIST et al. | arxiv-cs.SD | 2024-06-25 |
64 | Improving Entity Recognition Using Ensembles of Deep Learning and Fine-tuned Large Language Models: A Case Study on Adverse Event Extraction from Multiple Sources Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we utilized reports and posts from the VAERS (n=621), Twitter (n=9,133), and Reddit (n=131) as our corpora. |
YIMING LI et al. | arxiv-cs.CL | 2024-06-25 |
65 | CTBench: A Comprehensive Benchmark for Evaluating Language Model Capabilities in Clinical Trial Design Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: CTBench is introduced as a benchmark to assess language models (LMs) in aiding clinical study design. |
NAFIS NEEHAL et al. | arxiv-cs.CL | 2024-06-25 |
66 | CausalFormer: An Interpretable Transformer for Temporal Causal Discovery Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To facilitate the utilization of whole deep learning models in temporal causal discovery, we propose an interpretable transformer-based causal discovery model termed CausalFormer, which consists of a causality-aware transformer and a decomposition-based causality detector. |
LINGBAI KONG et al. | arxiv-cs.LG | 2024-06-24 |
67 | Unambiguous Recognition Should Not Rely Solely on Natural Language Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This bias stems from the inherent characteristics of the dataset. To mitigate this bias, we propose a LaTeX printed text recognition model trained on a mixed dataset of pseudo-formulas and pseudo-text. |
Renqing Luo; Yuhan Xu; | arxiv-cs.CV | 2024-06-24 |
68 | Exploring The Capability of Mamba in Speech Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compared Mamba with state-of-the-art Transformer variants for various speech applications, including ASR, text-to-speech, spoken language understanding, and speech summarization. |
Koichi Miyazaki; Yoshiki Masuyama; Masato Murata; | arxiv-cs.SD | 2024-06-24 |
69 | GPT-4V Explorations: Mining Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the application of the GPT-4V(ision) large visual language model to autonomous driving in mining environments, where traditional systems often falter in understanding intentions and making accurate decisions during emergencies. |
Zixuan Li; | arxiv-cs.CV | 2024-06-24 |
70 | Exploring Factual Entailment with NLI: A News Media Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the relationship between factuality and Natural Language Inference (NLI) by introducing FactRel — a novel annotation scheme that models *factual* rather than *textual* entailment, and use it to annotate a dataset of naturally occurring sentences from news articles. |
Guy Mor-Lan; Effi Levi; | arxiv-cs.CL | 2024-06-24 |
71 | DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present DreamBench++, a human-aligned benchmark automated by advanced multimodal GPT models. |
YUANG PENG et al. | arxiv-cs.CV | 2024-06-24 |
72 | The GPT-WritingPrompts Dataset: A Comparative Analysis of Character Portrayal in Short Stories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We quantify and compare the emotional and descriptive features of storytelling from both generative processes, human and machine, along a set of six dimensions. |
Xi Yu Huang; Krishnapriya Vishnubhotla; Frank Rudzicz; | arxiv-cs.CL | 2024-06-24 |
73 | OlympicArena Medal Ranks: Who Is The Most Intelligent AI So Far? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this report, we pose the following question: Who is the most intelligent AI model to date, as measured by the OlympicArena (an Olympic-level, multi-discipline, multi-modal benchmark for superintelligent AI)? |
Zhen Huang; Zengzhi Wang; Shijie Xia; Pengfei Liu; | arxiv-cs.CL | 2024-06-24 |
74 | Finding Transformer Circuits with Edge Pruning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we frame automated circuit discovery as an optimization problem and propose *Edge Pruning* as an effective and scalable solution. |
Adithya Bhaskar; Alexander Wettig; Dan Friedman; Danqi Chen; | arxiv-cs.CL | 2024-06-24 |
75 | GraphEval2000: Benchmarking and Improving Large Language Models on Graph Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, recent studies have identified limitations in LLMs’ ability to reason about graph-structured data. To address this gap, we introduce GraphEval2000, the first comprehensive graph dataset, comprising 40 graph data structure problems along with 2000 test cases. |
Qiming Wu; Zichen Chen; Will Corcoran; Misha Sra; Ambuj K. Singh; | arxiv-cs.AI | 2024-06-23 |
76 | Multi-Scale Temporal Difference Transformer for Video-Text Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they commonly neglect the transformer’s inferior ability to model local temporal information. To tackle this problem, we propose a transformer variant named Multi-Scale Temporal Difference Transformer (MSTDT). |
Ni Wang; Dongliang Liao; Xing Xu; | arxiv-cs.CV | 2024-06-23 |
77 | Beyond Individual Facts: Investigating Categorical Knowledge Locality of Taxonomy and Meronomy Concepts in GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate a broader view of knowledge location, that of concepts or clusters of related information, instead of disparate individual facts. |
Christopher Burger; Yifan Hu; Thai Le; | arxiv-cs.LG | 2024-06-22 |
78 | Evaluating The Effectiveness of The Foundational Models for Q&A Classification in Mental Health Care Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we conducted experiments using four different types of learning approaches: traditional feature extraction, PLMs as feature extractors, fine-tuning PLMs, and prompting large language models (GPT-3.5 and GPT-4) in zero-shot and few-shot learning settings. |
Hassan Alhuzali; Ashwag Alasmari; | arxiv-cs.CL | 2024-06-22 |
79 | How Effective Is GPT-4 Turbo in Generating School-Level Questions from Textbooks Based on Bloom’s Revised Taxonomy? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate the effectiveness of GPT-4 Turbo in generating educational questions from NCERT textbooks in zero-shot mode. |
Subhankar Maity; Aniket Deroy; Sudeshna Sarkar; | arxiv-cs.CL | 2024-06-21 |
80 | Toward Informal Language Processing: Knowledge of Slang in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Using movie subtitles, we construct a dataset that supports evaluation on a diverse set of tasks pertaining to automatic processing of slang. |
Zhewei Sun; Qian Hu; Rahul Gupta; Richard Zemel; Yang Xu; | naacl | 2024-06-20 |
81 | VertAttack: Taking Advantage of Text Classifiers’ Horizontal Vision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Vertically written words will not be recognized by a classifier. In contrast, humans are easily able to recognize and read words written both horizontally and vertically. Hence, a human adversary could write problematic words vertically and the meaning would still be preserved to other humans. We simulate such an attack, VertAttack. |
Jonathan Rusert; | naacl | 2024-06-20 |
82 | MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to perform a thorough evaluation of the non-English capabilities of SoTA LLMs (GPT-3. |
SANCHIT AHUJA et al. | naacl | 2024-06-20 |
83 | ChatGPT As Research Scientist: Probing GPT’s Capabilities As A Research Librarian, Research Ethicist, Data Generator and Data Predictor Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: How good a research scientist is ChatGPT? We systematically probed the capabilities of GPT-3.5 and GPT-4 across four central components of the scientific process: as a Research … |
Steven A. Lehr; Aylin Caliskan; Suneragiri Liyanage; Mahzarin R. Banaji; | arxiv-cs.AI | 2024-06-20 |
84 | Communication-Efficient Byzantine-Resilient Federated Zero-Order Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CYBER-0, the first zero-order optimization algorithm for memory-and-communication efficient Federated Learning, resilient to Byzantine faults. |
Afonso de Sá Delgado Neto; Maximilian Egger; Mayank Bakshi; Rawad Bitar; | arxiv-cs.LG | 2024-06-20 |
85 | A Symbolic Framework for Evaluating Mathematical Reasoning and Generalisation with Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a methodology for generating and perturbing detailed derivations of equations at scale, aided by a symbolic engine, to evaluate the generalisability of Transformers to out-of-distribution mathematical reasoning problems. |
Jordan Meadows; Marco Valentino; Damien Teney; Andre Freitas; | naacl | 2024-06-20 |
86 | Branch-Solve-Merge Improves Large Language Model Evaluation and Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their performance can fall short, due to the model’s lack of coherence and inability to plan and decompose the problem. We propose Branch-Solve-Merge (BSM), a Large Language Model program (Schlag et al., 2023) for tackling such challenging natural language tasks. |
SWARNADEEP SAHA et al. | naacl | 2024-06-20 |
87 | Does GPT-4 Pass The Turing Test? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: AI models with the ability to masquerade as humans could have widespread societal consequences, and we analyse the effectiveness of different strategies and criteria for judging humanlikeness. |
Cameron Jones; Ben Bergen; | naacl | 2024-06-20 |
88 | Does GPT Really Get It? A Hierarchical Scale to Quantify Human Vs AI’s Understanding of Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use the hierarchy to design and conduct a study with human subjects (undergraduate and graduate students) as well as large language models (generations of GPT), revealing interesting similarities and differences. |
Mirabel Reid; Santosh S. Vempala; | arxiv-cs.AI | 2024-06-20 |
89 | Metacognitive Prompting Improves Understanding in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce Metacognitive Prompting (MP), a strategy inspired by human introspective reasoning processes. |
Yuqing Wang; Yun Zhao; | naacl | 2024-06-20 |
90 | Removing RLHF Protections in GPT-4 Via Fine-Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show the contrary: fine-tuning allows attackers to remove RLHF protections with as few as 340 examples and a 95% success rate. |
QIUSI ZHAN et al. | naacl | 2024-06-20 |
91 | Struc-Bench: Are Large Language Models Good at Generating Complex Structured Tabular Data? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study assesses LLMs’ proficiency in structuring tables and introduces a novel fine-tuning method, cognizant of data structures, to bolster their performance. |
XIANGRU TANG et al. | naacl | 2024-06-20 |
92 | A Continued Pretrained LLM Approach for Automatic Medical Note Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce HEAL, the first continuously trained 13B LLaMA2-based LLM that is purpose-built for medical conversations and measured on automated scribing. |
DONG YUAN et al. | naacl | 2024-06-20 |
93 | Transformers Can Represent N-gram Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the relationship between transformer LMs and n-gram LMs, a simple and historically relevant class of language models. |
Anej Svete; Ryan Cotterell; | naacl | 2024-06-20 |
94 | Revisiting Zero-Shot Abstractive Summarization in The Era of Large Language Models from The Perspective of Position Bias Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We characterize and study zero-shot abstractive summarization in Large Language Models (LLMs) by measuring position bias, which we propose as a general formulation of the more restrictive lead bias phenomenon studied previously in the literature. |
Anshuman Chhabra; Hadi Askari; Prasant Mohapatra; | naacl | 2024-06-20 |
95 | SemRoDe: Macro Adversarial Training to Learn Representations That Are Robust to Word-Level Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial Training strategy to enhance the robustness of LMs. |
Brian Formento; Wenjie Feng; Chuan-Sheng Foo; Anh Tuan Luu; See-Kiong Ng; | naacl | 2024-06-20 |
96 | CPopQA: Ranking Cultural Concept Popularity By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the extent to which an LLM effectively captures corpus-level statistical trends of concepts for reasoning, especially long-tail ones, is largely underexplored. In this study, we introduce a novel few-shot question-answering task (CPopQA) that examines LLMs’ statistical ranking abilities for long-tail cultural concepts (e.g., holidays), particularly focusing on these concepts’ popularity in the United States and the United Kingdom, respectively. |
Ming Jiang; Mansi Joshi; | naacl | 2024-06-20 |
97 | On Retrieval Augmentation and The Limitations of Language Model Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we rule out one previously posited possibility: the “softmax bottleneck”. |
TING-RUI CHIANG et al. | naacl | 2024-06-20 |
98 | Unveiling Divergent Inductive Biases of LLMs on Temporal Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the adeptness of Large Language Models (LLMs) in discerning patterns and relationships from data, their inherent comprehension of temporal dynamics remains a formidable challenge. This research meticulously explores these intrinsic challenges within LLMs, with a specific emphasis on evaluating the performance of GPT-3. |
Sindhu Kishore; Hangfeng He; | naacl | 2024-06-20 |
99 | Evaluating Implicit Bias in Large Language Models By Attacking From A Psychometric Perspective Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As Large Language Models (LLMs) become an important way of information seeking, there have been increasing concerns about the unethical content LLMs may generate. In this paper, we conduct a rigorous evaluation of LLMs’ implicit bias towards certain groups by attacking them with carefully crafted instructions to elicit biased responses. |
Yuchen Wen; Keping Bi; Wei Chen; Jiafeng Guo; Xueqi Cheng; | arxiv-cs.CL | 2024-06-20 |
100 | CryptoGPT: A 7B Model Rivaling GPT-4 in The Task of Analyzing and Classifying Real-time Financial News Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: CryptoGPT: a 7B model competing with GPT-4 in a specific task — The Impact of Automatic Annotation and Strategic Fine-Tuning via QLoRA. In this article, we present a method aimed at refining a dedicated LLM of reasonable quality with limited resources in an industrial setting via CryptoGPT. |
Ying Zhang; Matthieu Petit Guillaume; Aurélien Krauth; Manel Labidi; | arxiv-cs.AI | 2024-06-20 |
101 | SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using Training Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these models are extremely memory intensive during their fine-tuning process, making them difficult to deploy on GPUs with limited memory resources. To address this issue, we introduce a new tool called SlimFit that reduces the memory requirements of these models by dynamically analyzing their training dynamics and freezing less-contributory layers during fine-tuning. |
ARASH ARDAKANI et al. | naacl | 2024-06-20 |
102 | A Decision-Making GPT Model Augmented with Entropy Regularization for Autonomous Vehicles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, the decision-making challenges associated with autonomous vehicles are conceptualized through the framework of the Constrained Markov Decision Process (CMDP) and approached as a sequence modeling problem. |
JIAQI LIU et al. | arxiv-cs.RO | 2024-06-19 |
103 | Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates how LLMs, specifically GPT-3.5 and GPT-4, can develop tailored questions for Grade 9 math, aligning with active learning principles. |
Hamdireza Rouzegar; Masoud Makrehchi; | arxiv-cs.CL | 2024-06-19 |
104 | Fine-Tuning BERTs for Definition Extraction from Mathematical Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we fine-tuned three pre-trained BERT models on the task of definition extraction from mathematical English written in LaTeX. |
Lucy Horowitz; Ryan Hathaway; | arxiv-cs.CL | 2024-06-19 |
105 | ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce ChatGLM, an evolving family of large language models that we have been developing over time. |
TEAM GLM et al. | arxiv-cs.CL | 2024-06-18 |
106 | SwinStyleformer Is A Favorable Choice for Image Inversion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the first pure Transformer structure inversion network called SwinStyleformer, which can compensate for the shortcomings of the CNNs inversion framework by handling long-range dependencies and learning the global structure of objects. |
Jiawei Mao; Guangyi Zhao; Xuesong Yin; Yuanqi Chang; | arxiv-cs.CV | 2024-06-18 |
107 | Adversarial Attacks on Multimodal Agents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that multimodal agents raise new safety risks, even though attacking agents is more challenging than prior attacks due to limited access to and knowledge about the environment. |
Chen Henry Wu; Jing Yu Koh; Ruslan Salakhutdinov; Daniel Fried; Aditi Raghunathan; | arxiv-cs.LG | 2024-06-18 |
108 | What Makes Two Language Models Think Alike? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Do architectural differences significantly affect the way models represent and process language? We propose a new approach, based on metric-learning encoding models (MLEMs), as a first step to answer this question. |
Jeanne Salle; Louis Jalouzot; Nur Lan; Emmanuel Chemla; Yair Lakretz; | arxiv-cs.CL | 2024-06-18 |
109 | Vernacular? I Barely Know Her: Challenges with Style Control and Stereotyping Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a thorough analysis and discussion of the results. |
ANKIT AICH et al. | arxiv-cs.CL | 2024-06-18 |
110 | Generating Educational Materials with Different Levels of Readability Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the leveled-text generation task, aiming to rewrite educational materials to specific readability levels while preserving meaning. |
Chieh-Yang Huang; Jing Wei; Ting-Hao ‘Kenneth’ Huang; | arxiv-cs.CL | 2024-06-18 |
111 | Promises, Outlooks and Challenges of Diffusion Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For example, autoregressive token generation is notably slow and can be prone to *exposure bias*. Diffusion-based language models were proposed as an alternative to autoregressive generation to address some of these limitations. |
Justin Deschenaux; Caglar Gulcehre; | arxiv-cs.CL | 2024-06-17 |
112 | Can Many-Shot In-Context Learning Help Long-Context LLM Judges? See More, Judge Better! Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this type of approach is affected by the potential biases in LLMs, raising concerns about the reliability of the evaluation results. To mitigate this issue, we propose and study two versions of many-shot in-context prompts, which rely on two existing settings of many-shot ICL for helping GPT-4o-as-a-Judge in single answer grading to mitigate the potential biases in LLMs, Reinforced ICL and Unsupervised ICL. |
Mingyang Song; Mao Zheng; Xuan Luo; | arxiv-cs.CL | 2024-06-17 |
113 | DB-GPT-Hub: Towards Open Benchmarking Text-to-SQL Empowered By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present DB-GPT-Hub, an open benchmark suite for LLM-empowered text-to-SQL, which primarily focuses on tuning LLMs at large scales. |
FAN ZHOU et al. | arxiv-cs.DB | 2024-06-17 |
114 | Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we identify a pitfall of vanilla iterative DPO – improved response quality can lead to increased verbosity. |
JIE LIU et al. | arxiv-cs.CL | 2024-06-17 |
115 | Cultural Conditioning or Placebo? On The Effectiveness of Socio-Demographic Prompting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we systematically probe four LLMs (Llama 3, Mistral v0.2, GPT-3.5 Turbo and GPT-4) with prompts that are conditioned on culturally sensitive and non-sensitive cues, on datasets that are supposed to be culturally sensitive (EtiCor and CALI) or neutral (MMLU and ETHICS). |
SAGNIK MUKHERJEE et al. | arxiv-cs.CL | 2024-06-17 |
116 | A Two-dimensional Zero-shot Dialogue State Tracking Evaluation Method Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a two-dimensional zero-shot evaluation method for DST using GPT-4, which divides the evaluation into two dimensions: accuracy and completeness. |
Ming Gu; Yan Yang; | arxiv-cs.CL | 2024-06-17 |
117 | Look Further Ahead: Testing The Limits of GPT-4 in Path Planning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, they still face challenges with long-horizon planning. To study this, we propose path planning tasks as a platform to evaluate LLMs’ ability to navigate long trajectories under geometric constraints. |
Mohamed Aghzal; Erion Plaku; Ziyu Yao; | arxiv-cs.AI | 2024-06-17 |
118 | Minimal Self in Humanoid Robot Alter3 Driven By Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Alter3, a humanoid robot that demonstrates spontaneous motion generation through the integration of GPT-4, a Large Language Model (LLM). |
Takahide Yoshida; Suzune Baba; Atsushi Masumori; Takashi Ikegami; | arxiv-cs.RO | 2024-06-17 |
119 | GPT-Powered Elicitation Interview Script Generator for Requirements Engineering Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We employ a prompt chaining approach to mitigate GPT’s output length constraint and generate thorough and detailed interview scripts. |
Binnur Görer; Fatma Başak Aydemir; | arxiv-cs.SE | 2024-06-17 |
120 | VideoVista: A Versatile Benchmark for Video Understanding and Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, it encompasses 19 types of understanding tasks (e.g., anomaly detection, interaction understanding) and 8 reasoning tasks (e.g., logical reasoning, causal reasoning). To achieve this, we present an automatic data construction framework, leveraging powerful GPT-4o alongside advanced analysis tools (e.g., video splitting, object segmenting, and tracking). |
YUNXIN LI et al. | arxiv-cs.CV | 2024-06-17 |
121 | WellDunn: On The Robustness and Explainability of Language Models and Large Language Models in Identifying Wellness Dimensions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Language Models (LMs) are being proposed for mental health applications where the heightened risk of adverse outcomes means predictive performance may not be a sufficient litmus test of a model’s utility in clinical practice. |
SEYEDALI MOHAMMADI et al. | arxiv-cs.AI | 2024-06-17 |
122 | Connecting The Dots: Evaluating Abstract Reasoning Capabilities of LLMs Using The New York Times Connections Word Game Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To deepen our understanding we create a taxonomy of the knowledge types required to successfully categorize words in the Connections game, revealing that LLMs struggle with associative, encyclopedic, and linguistic knowledge. |
PRISHA SAMADARSHI et. al. | arxiv-cs.CL | 2024-06-16 |
123 | Exposing The Achilles’ Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel dataset MWP-MISTAKE, incorporating MWPs with both correct and incorrect reasoning steps generated through rule-based methods and smaller language models. |
Joykirat Singh; Akshay Nambi; Vibhav Vineet; | arxiv-cs.CL | 2024-06-16 |
124 | Grading Massive Open Online Courses Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explore the feasibility of using large language models (LLMs) to replace peer grading in MOOCs. |
Shahriar Golchin; Nikhil Garuda; Christopher Impey; Matthew Wenger; | arxiv-cs.CL | 2024-06-16 |
125 | Large Language Models for Automatic Milestone Detection in Group Discussions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate an LLM’s performance on recordings of a group oral communication task in which utterances are often truncated or not well-formed. |
ZHUOXU DUAN et. al. | arxiv-cs.CL | 2024-06-16 |
126 | Enhancing Supermarket Robot Interaction: A Multi-Level LLM Conversational Interface for Handling Diverse Customer Intents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the design and evaluation of a novel multi-level LLM interface for supermarket robots to assist customers. |
Chandran Nandkumar; Luka Peternel; | arxiv-cs.RO | 2024-06-16 |
127 | Distilling Opinions at Scale: Incremental Opinion Summarization Using XL-OPSUMM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To mitigate this, we propose a scalable framework called Xl-OpSumm that generates summaries incrementally. |
SRI RAGHAVA MUDDU et. al. | arxiv-cs.CL | 2024-06-16 |
128 | Generating Tables from The Parametric Knowledge of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For evaluation, we introduce a novel benchmark, WikiTabGen which contains 100 curated Wikipedia tables. |
Yevgeni Berkovitch; Oren Glickman; Amit Somech; Tomer Wolfson; | arxiv-cs.CL | 2024-06-16 |
129 | Breaking Boundaries: Investigating The Effects of Model Editing on Cross-linguistic Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper strategically identifies the need for linguistic equity by examining several knowledge editing techniques in multilingual contexts. |
SOMNATH BANERJEE et. al. | arxiv-cs.CL | 2024-06-16 |
130 | ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, we present Video Diffusion GPT (ViD-GPT). |
Kaifeng Gao; Jiaxin Shi; Hanwang Zhang; Chunping Wang; Jun Xiao; | arxiv-cs.CV | 2024-06-16 |
131 | KGPA: Robustness Evaluation for Large Language Models Via Cross-Domain Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a framework that systematically evaluates the robustness of LLMs under adversarial attack scenarios by leveraging knowledge graphs (KGs). |
AIHUA PEI et. al. | arxiv-cs.CL | 2024-06-16 |
132 | Multilingual Large Language Models and Curse of Multilinguality Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Multilingual Large Language Models (LLMs) have gained large popularity among Natural Language Processing (NLP) researchers and practitioners. These models, trained on huge datasets, show proficiency across various languages and demonstrate effectiveness in numerous downstream tasks. |
Daniil Gurgurov; Tanja Bäumel; Tatiana Anikina; | arxiv-cs.CL | 2024-06-15 |
133 | ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning Via Shared Low-Rank Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces an approach to optimize Parameter Efficient Fine Tuning (PEFT) for Pretrained Language Models (PLMs) by implementing a Shared Low Rank Adaptation (ShareLoRA). |
Yurun Song; Junchen Zhao; Ian G. Harris; Sangeetha Abdu Jyothi; | arxiv-cs.CL | 2024-06-15 |
134 | Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we leverage the edited videos on a popular short video platform, i.e., TikTok, and build a video VQA benchmark (named EditVid-QA) covering four typical editing categories, i.e., effect, funny, meme, and game. |
LU XU et. al. | arxiv-cs.CV | 2024-06-14 |
135 | Optimizing Layer-Fused Scheduling of Transformer Networks on Multi-accelerator Platforms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work enables extensive hardware/mapping exploration by extending the DSE framework Stream towards support for transformers across a wide variety of hardware architectures and different execution schedules. |
Steven Colleman; Arne Symons; Victor J. B. Jung; Marian Verhelst; | arxiv-cs.AR | 2024-06-14 |
136 | GPT-4o: Visual Perception Performance of Multimodal Large Language Models in Piglet Activity Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The initial evaluation experiments in this study validate the potential of multimodal large language models in livestock scene video understanding and provide new directions and references for future research on animal behavior video understanding. |
Yiqi Wu; Xiaodan Hu; Ziming Fu; Siling Zhou; Jiangong Li; | arxiv-cs.CV | 2024-06-14 |
137 | The Devil Is in The Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we try to unveil the mystery of social bias inside language models by introducing the concept of Social Bias Neurons. |
YAN LIU et. al. | arxiv-cs.CL | 2024-06-14 |
138 | GPT-4V(ision) Is A Human-Aligned Evaluator for Text-to-3D Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents an automatic, versatile, and human-aligned evaluation metric for text-to-3D generative models. |
TONG WU et. al. | cvpr | 2024-06-13 |
139 | Alleviating Distortion in Image Generation Via Multi-Resolution Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents innovative enhancements to diffusion models by integrating a novel multi-resolution network and time-dependent layer normalization. |
QIHAO LIU et. al. | arxiv-cs.CV | 2024-06-13 |
140 | Complex Image-Generative Diffusion Transformer for Audio Denoising Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to enhance audio denoising performance, this paper introduces a complex image-generative diffusion transformer that captures more information from the complex Fourier domain. |
Junhui Li; Pu Wang; Jialu Li; Youshan Zhang; | arxiv-cs.SD | 2024-06-13 |
141 | Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, bidirectional feature propagation is highly sensitive to inaccurate optical flow in blurry frames, leading to error accumulation during the propagation process. To address these issues, we propose BSSTNet, a Blur-aware Spatio-temporal Sparse Transformer Network. |
Huicong Zhang; Haozhe Xie; Hongxun Yao; | cvpr | 2024-06-13 |
142 | Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose VisualFactChecker (VFC), a flexible, training-free pipeline that generates high-fidelity and detailed captions for both 2D images and 3D objects. |
YUNHAO GE et. al. | cvpr | 2024-06-13 |
143 | GPT-Fabric: Folding and Smoothing Fabric By Leveraging Pre-Trained Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose GPT-Fabric for the canonical tasks of fabric folding and smoothing, where GPT directly outputs an action informing a robot where to grasp and pull a fabric. |
Vedant Raval; Enyu Zhao; Hejia Zhang; Stefanos Nikolaidis; Daniel Seita; | arxiv-cs.RO | 2024-06-13 |
144 | Mean-Shift Feature Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Transformer models developed in NLP have made a great impact on computer vision, producing promising performance on various tasks. |
Takumi Kobayashi; | cvpr | 2024-06-13 |
145 | SDPose: Tokenized Pose Estimation Via Circulation-Guide Self-Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Those transformer-based models that require fewer resources are prone to under-fitting due to their smaller scale, and thus perform notably worse than their larger counterparts. Given this conundrum, we introduce SDPose, a new self-distillation method for improving the performance of small transformer-based models. |
SICHEN CHEN et. al. | cvpr | 2024-06-13 |
146 | MoMask: Generative Masked Modeling of 3D Human Motions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce MoMask, a novel masked modeling framework for text-driven 3D human motion generation. |
Chuan Guo; Yuxuan Mu; Muhammad Gohar Javed; Sen Wang; Li Cheng; | cvpr | 2024-06-13 |
147 | MoST: Motion Style Transformer Between Diverse Action Contents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This challenge lies in the lack of clear separation between content and style of a motion. To tackle this challenge, we propose a novel motion style transformer that effectively disentangles style from content and generates a plausible motion with transferred style from a source motion. |
Boeun Kim; Jungho Kim; Hyung Jin Chang; Jin Young Choi; | cvpr | 2024-06-13 |
148 | ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Most existing studies are devoted to designing vision-specific transformers to solve the above problems, which introduce additional pre-training costs. Therefore, we present a plain, pre-training-free, and feature-enhanced ViT backbone with Convolutional Multi-scale feature interaction, named ViT-CoMer, which facilitates bidirectional interaction between CNN and transformer. |
Chunlong Xia; Xinliang Wang; Feng Lv; Xin Hao; Yifeng Shi; | cvpr | 2024-06-13 |
149 | Condition-Aware Neural Network for Controlled Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. |
Han Cai; Muyang Li; Qinsheng Zhang; Ming-Yu Liu; Song Han; | cvpr | 2024-06-13 |
150 | OmniMotionGPT: Animal Motion Generation with Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our paper aims to generate diverse and realistic animal motion sequences from textual descriptions without a large-scale animal text-motion dataset. |
ZHANGSIHAO YANG et. al. | cvpr | 2024-06-13 |
151 | Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a large-scale study of linguistic bias exhibited by ChatGPT covering ten dialects of English (Standard American English, Standard British English, and eight widely spoken non-standard varieties from around the world). |
EVE FLEISIG et. al. | arxiv-cs.CL | 2024-06-13 |
152 | Permutation Equivariance of Transformers and Its Applications Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose our definition of permutation equivariance, a broader concept covering both inter- and intra-token permutation in the forward and backward propagation of neural networks. |
HENGYUAN XU et. al. | cvpr | 2024-06-13 |
153 | General Point Model Pretraining with Autoencoding and Autoregressive Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the General Language Model, we propose a General Point Model (GPM) that seamlessly integrates autoencoding and autoregressive tasks in a point cloud transformer. |
ZHE LI et. al. | cvpr | 2024-06-13 |
154 | Agent Instructs Large Language Models to Be General Zero-Shot Reasoners IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a method to improve the zero-shot reasoning abilities of large language models on general language understanding tasks. |
Nicholas Crispino; Kyle Montgomery; Fankun Zeng; Dawn Song; Chenguang Wang; | icml | 2024-06-12 |
155 | SpikeZIP-TF: Conversion Is All You Need for Transformer-based SNN Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce a novel ANN-to-SNN conversion method called SpikeZIP-TF, where ANN and SNN are exactly equivalent, thus incurring no accuracy degradation. |
KANG YOU et. al. | icml | 2024-06-12 |
156 | Visual Transformer with Differentiable Channel Selection: An Information Bottleneck Inspired Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel and compact transformer block, Transformer with Differentiable Channel Selection, or DCS-Transformer. |
Yancheng Wang; Ping Li; Yingzhen Yang; | icml | 2024-06-12 |
157 | AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, achieving faithful attributions for the entirety of a black-box transformer model and maintaining computational efficiency is an unsolved challenge. By extending the Layer-wise Relevance Propagation attribution method to handle attention layers, we address these challenges effectively. |
REDUAN ACHTIBAT et. al. | icml | 2024-06-12 |
158 | Privacy-Preserving Embedding Via Look-up Table Evaluation with Fully Homomorphic Encryption Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, our study proposes an efficient algorithm for privacy-preserving embedding via look-up table evaluation with HE (HELUT) by developing an encrypted indicator function (EIF) that assures high precision with the use of the approximate HE scheme (CKKS). |
Jae-yun Kim; Saerom Park; Joohee Lee; Jung Hee Cheon; | icml | 2024-06-12 |
159 | GPT-4V(ision) Is A Generalist Web Agent, If Grounded IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we explore the potential of LMMs like GPT-4V as a generalist web agent that can follow natural language instructions to complete tasks on any given website. |
Boyuan Zheng; Boyu Gou; Jihyung Kil; Huan Sun; Yu Su; | icml | 2024-06-12 |
160 | Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that, for linearized transformer networks, ICL can be made explicit and permanent through the inclusion of bias terms. |
Brian K Chen; Tianyang Hu; Hui Jin; Hwee Kuan Lee; Kenji Kawaguchi; | icml | 2024-06-12 |
161 | In-context Learning on Function Classes Unveiled for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given some training examples, a pre-trained model can make accurate predictions on an unseen input. |
Zhijie Wang; Bo Jiang; Shuai Li; | icml | 2024-06-12 |
162 | Accelerating Transformer Pre-training with 2:4 Sparsity Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: First, we define a “flip rate” to monitor the stability of a 2:4 training process. Utilizing this metric, we propose three techniques to preserve accuracy: to modify the sparse-refined straight-through estimator by applying the masked decay term on gradients, to determine a feasible decay factor in the warm-up stage, and to enhance the model’s quality by a dense fine-tuning procedure near the end of pre-training. |
Yuezhou Hu; Kang Zhao; Weiyu Huang; Jianfei Chen; Jun Zhu; | icml | 2024-06-12 |
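The “flip rate” metric in the highlight above lends itself to a short illustration. Below is a minimal sketch, assuming a 2:4 pattern that keeps the two largest-magnitude weights in each group of four; the paper’s decay and fine-tuning techniques are omitted, and the function names are ours, not the authors’.

```python
import torch

def mask_2to4(w: torch.Tensor) -> torch.Tensor:
    # Keep the 2 largest-magnitude weights in each group of 4
    # (assumes w.numel() is divisible by 4, as in typical linear layers).
    groups = w.abs().reshape(-1, 4)
    idx = groups.topk(2, dim=1).indices
    mask = torch.zeros_like(groups, dtype=torch.bool)
    mask.scatter_(1, idx, True)
    return mask.reshape(w.shape)

def flip_rate(prev_mask: torch.Tensor, curr_mask: torch.Tensor) -> float:
    # Fraction of mask entries that changed between two training steps.
    return (prev_mask != curr_mask).float().mean().item()

w = torch.randn(8, 8)
m1 = mask_2to4(w)
m2 = mask_2to4(w + 0.1 * torch.randn(8, 8))  # simulate a weight update
print(flip_rate(m1, m2))
```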
163 | VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Subsequently, we present a Description-Program-Reasoning (DPR) chain to enhance the logical accuracy of reasoning processes through graphical structure description generation and algorithm-aware multi-step reasoning. |
YUNXIN LI et. al. | icml | 2024-06-12 |
164 | Asymmetry in Low-Rank Adapters of Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by an effort to investigate the different roles of LoRA matrices during fine-tuning, this paper characterizes and leverages unexpected asymmetry in the importance of low-rank adapter matrices. |
JIACHENG ZHU et. al. | icml | 2024-06-12 |
165 | Label-aware Hard Negative Sampling Strategies with Momentum Contrastive Learning for Implicit Hate Speech Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Label-aware Hard Negative sampling strategies (LAHN) that encourage the model to learn detailed features from hard negative samples, instead of naive negative samples in random batch, using momentum-integrated contrastive learning. |
Jaehoon Kim; Seungwan Jin; Sohyun Park; Someen Park; Kyungsik Han; | arxiv-cs.CL | 2024-06-12 |
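A rough sketch of the label-aware hard negative selection described above, under assumed simplifications: negatives come from a momentum feature queue, and “hardness” is cosine similarity to the anchor. The contrastive loss itself is omitted, and the function is illustrative rather than the paper’s implementation.

```python
import torch
import torch.nn.functional as F

def hardest_negatives(anchor, queue_feats, queue_labels, anchor_label, k=8):
    # Cosine similarity between the anchor and every queued feature.
    sims = F.cosine_similarity(anchor.unsqueeze(0), queue_feats)
    # Label-aware masking: same-label entries cannot serve as negatives.
    sims = sims.masked_fill(queue_labels == anchor_label, -1.0)
    # The hardest negatives are the most similar remaining entries.
    return queue_feats[sims.topk(k).indices]
```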
166 | An Empirical Study of Mamba-based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In a controlled setting (e.g., same data), however, studies so far have only presented small scale experiments comparing SSMs to Transformers. To understand the strengths and weaknesses of these architectures at larger scales, we present a direct comparison between 8B-parameter Mamba, Mamba-2, and Transformer models trained on the same datasets of up to 3.5T tokens. |
ROGER WALEFFE et. al. | arxiv-cs.LG | 2024-06-12 |
167 | Stealing Part of A Production Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI’s ChatGPT or Google’s PaLM-2. |
NICHOLAS CARLINI et. al. | icml | 2024-06-12 |
168 | Long Is More for Alignment: A Simple But Tough-to-Beat Baseline for Instruction Fine-Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: LIMA (NeurIPS 2023) and AlpaGasus (ICLR 2024) are state-of-the-art methods for selecting such high-quality examples, either via manual curation or using GPT-3.5-Turbo as a quality scorer. We show that the extremely simple baseline of selecting the 1,000 instructions with longest responses—that intuitively contain more learnable information and are harder to overfit—from standard datasets can consistently outperform these sophisticated methods according to GPT-4 and PaLM-2 as judges, while remaining competitive on the Open LLM benchmarks that test factual knowledge. |
Hao Zhao; Maksym Andriushchenko; Francesco Croce; Nicolas Flammarion; | icml | 2024-06-12 |
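The baseline in this highlight is simple enough to state in a few lines. A sketch, assuming the dataset is a list of dicts with a response field (the field names are illustrative, not from the paper):

```python
def select_longest(dataset, k=1000):
    """Keep the k instruction examples with the longest responses."""
    return sorted(dataset, key=lambda ex: len(ex["response"]), reverse=True)[:k]

toy = [
    {"instruction": "Define recall.", "response": "Recall is TP / (TP + FN)."},
    {"instruction": "Explain overfitting.",
     "response": "Overfitting occurs when a model memorizes training noise ..."},
]
print(select_longest(toy, k=1))
```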
169 | COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For a comprehensive assessment of LLM safety, it is essential to consider jailbreaks with diverse attributes, such as contextual coherence and sentiment/stylistic variations, and hence it is beneficial to study controllable jailbreaking, i.e. how to enforce control on LLM attacks. In this paper, we formally formulate the controllable attack generation problem, and build a novel connection between this problem and controllable text generation, a well-explored topic of natural language processing. |
Xingang Guo; Fangxu Yu; Huan Zhang; Lianhui Qin; Bin Hu; | icml | 2024-06-12 |
170 | Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the slow speed of current trackers limits their applicability on devices with constrained computational resources. To address this challenge, we introduce ABTrack, an adaptive computation framework that adaptively bypasses transformer blocks for efficient visual tracking. |
XIANGYANG YANG et. al. | arxiv-cs.CV | 2024-06-12 |
171 | How Language Model Hallucinations Can Snowball IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: A major risk of using language models in practical applications is their tendency to hallucinate incorrect statements. To study this, we construct three question-answering datasets where LMs often state an incorrect answer which is followed by an explanation with at least one incorrect claim. |
Muru Zhang; Ofir Press; William Merrill; Alisa Liu; Noah A. Smith; | icml | 2024-06-12 |
172 | Timer: Generative Pre-trained Transformers Are Large Time Series Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To change the status quo of training scenario-specific small models from scratch, this paper aims at the early development of large time series models (LTSM). |
YONG LIU et. al. | icml | 2024-06-12 |
173 | Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by previous theoretical study of static version of the attention multiplication problem [Zandieh, Han, Daliri, and Karbasi ICML 2023, Alman and Song NeurIPS 2023], we formally define a dynamic version of attention matrix multiplication problem. |
Jan van den Brand; Zhao Song; Tianyi Zhou; | icml | 2024-06-12 |
174 | Ditto: Quantization-aware Secure Inference of Transformers Upon MPC Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the integration of quantization widely used in plaintext inference into the MPC domain remains unclear. To bridge this gap, we propose the framework named Ditto to enable more efficient quantization-aware secure Transformer inference. |
HAOQI WU et. al. | icml | 2024-06-12 |
175 | Delving Into Differentially Private Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such ‘reduction’ is done by first identifying the hardness unique to DP Transformer training: the attention distraction phenomenon and a lack of compatibility with existing techniques for efficient gradient clipping. To deal with these two issues, we propose the Re-Attention Mechanism and Phantom Clipping, respectively. |
YOULONG DING et. al. | icml | 2024-06-12 |
176 | Trainable Transformer in Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a new efficient construction, Transformer in Transformer (in short, TINT), that allows a transformer to simulate and fine-tune more complex models during inference (e.g., pre-trained language models). |
Abhishek Panigrahi; Sadhika Malladi; Mengzhou Xia; Sanjeev Arora; | icml | 2024-06-12 |
177 | PolySketchFormer: Fast Transformers Via Sketching Polynomial Kernels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent theoretical results indicate the intractability of sub-quadratic softmax attention approximation under reasonable complexity assumptions. This paper addresses this challenge by first demonstrating that polynomial attention with high degree can effectively replace softmax without sacrificing model quality. |
Praneeth Kacham; Vahab Mirrokni; Peilin Zhong; | icml | 2024-06-12 |
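To make the kernel swap concrete, here is a naive, quadratic-time sketch of degree-p polynomial attention. The paper’s sketching technique for achieving sub-quadratic cost is not reproduced here, and the non-negativity clamp is an illustrative choice of ours.

```python
import torch

def polynomial_attention(q, k, v, p=4):
    # Replace exp(q.k) with a degree-p polynomial kernel, then normalize.
    scores = (q @ k.transpose(-2, -1)).clamp(min=0).pow(p)
    weights = scores / scores.sum(dim=-1, keepdim=True).clamp(min=1e-9)
    return weights @ v

q = k = v = torch.randn(16, 32)  # (sequence length, head dimension)
out = polynomial_attention(q, k, v)
```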
178 | Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these challenges, we first introduce LoCoV1, a 12-task benchmark constructed to measure long-context retrieval where chunking is not possible or not effective. We next present the M2-BERT retrieval encoder, an 80M-parameter state-space encoder model built from the Monarch Mixer architecture, capable of scaling to documents up to 32K tokens long. |
Jon Saad-Falcon; Daniel Y Fu; Simran Arora; Neel Guha; Christopher Re; | icml | 2024-06-12 |
179 | Entropy-Reinforced Planning with Large Language Models for Drug Discovery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we propose ERP, Entropy-Reinforced Planning for Transformer Decoding, which employs an entropy-reinforced planning algorithm to enhance the Transformer decoding process and strike a balance between exploitation and exploration. |
Xuefeng Liu; Chih-chan Tien; Peng Ding; Songhao Jiang; Rick L. Stevens; | icml | 2024-06-12 |
180 | InstructRetro: Instruction Tuning Post Retrieval-Augmented Pretraining IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Retro 48B, the largest LLM pretrained with retrieval. |
BOXIN WANG et. al. | icml | 2024-06-12 |
181 | Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, future superhuman models will behave in complex ways too difficult for humans to reliably evaluate; humans will only be able to *weakly supervise* superhuman models. We study an analogy to this problem: can weak model supervision elicit the full capabilities of a much stronger model? |
COLLIN BURNS et. al. | icml | 2024-06-12 |
182 | Do Efficient Transformers Really Save Computation? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to understand the capabilities and limitations of efficient Transformers, specifically the Sparse Transformer and the Linear Transformer. |
KAI YANG et. al. | icml | 2024-06-12 |
183 | In-Context Principle Learning from Mistakes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nonetheless, all ICL-based approaches only learn from correct input-output pairs. In this paper, we revisit this paradigm, by learning more from the few given input-output examples. |
TIANJUN ZHANG et. al. | icml | 2024-06-12 |
184 | Fine-Tuned ‘Small’ LLMs (Still) Significantly Outperform Zero-Shot Generative AI Models in Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that smaller, fine-tuned LLMs (still) consistently and significantly outperform larger, zero-shot prompted models in text classification. |
Martin Juan José Bucher; Marco Martini; | arxiv-cs.CL | 2024-06-12 |
185 | Prodigy: An Expeditiously Adaptive Parameter-Free Learner IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Prodigy, an algorithm that provably estimates the distance to the solution $D$, which is needed to set the learning rate optimally. |
Konstantin Mishchenko; Aaron Defazio; | icml | 2024-06-12 |
186 | What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the capabilities of the transformer architecture with varying depth. |
Xingwu Chen; Difan Zou; | icml | 2024-06-12 |
187 | Rethinking Generative Large Language Model Evaluation for Semantic Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we introduce an RWQ-Elo rating system, engaging 24 LLMs such as GPT-4, GPT-3.5, Google-Gemini-Pro and LLaMA-1/-2, in a two-player competitive format, with GPT-4 serving as the judge. |
Fangyun Wei; Xi Chen; Lin Luo; | icml | 2024-06-12 |
188 | Position: On The Possibilities of AI-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce guidelines on the required text data quantity, either through sample size or sequence length, for reliable AI text detection, through derivations of sample complexity bounds. |
SOURADIP CHAKRABORTY et. al. | icml | 2024-06-12 |
189 | Enhancing Vision Transformer: Amplifying Non-Linearity in Feedforward Network Module Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on improving the FFN module within the vision transformer. |
YIXING XU et. al. | icml | 2024-06-12 |
190 | How Smooth Is Attention? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a detailed study of the Lipschitz constant of self-attention in several practical scenarios, discussing the impact of the sequence length $n$ and layer normalization on the local Lipschitz constant of both unmasked and masked self-attention. |
Valérie Castin; Pierre Ablin; Gabriel Peyré; | icml | 2024-06-12 |
191 | Prototypical Transformer As Unified Motion Learners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Prototypical Transformer (ProtoFormer), a general and unified framework that approaches various motion tasks from a prototype perspective. |
CHENG HAN et. al. | icml | 2024-06-12 |
192 | Outlier-Efficient Hopfield Layers for Large Transformer-Based Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an Outlier-Efficient Modern Hopfield Model (termed OutEffHop) and use it to address the outlier inefficiency problem of training gigantic transformer-based models. |
JERRY YAO-CHIEH HU et. al. | icml | 2024-06-12 |
193 | Gated Linear Attention Transformers with Hardware-Efficient Training IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work describes a hardware-efficient algorithm for linear attention that trades off memory movement against parallelizability. |
Songlin Yang; Bailin Wang; Yikang Shen; Rameswar Panda; Yoon Kim; | icml | 2024-06-12 |
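Linear attention admits a simple recurrent view that helps explain what a data-dependent gate buys. A minimal sketch, assuming a scalar decay gate per step; the paper’s gating parameterization and hardware-efficient chunked algorithm are considerably more involved.

```python
import torch

def gated_linear_attention(q, k, v, g):
    # q, k: (T, d_k); v: (T, d_v); g: (T,) decay gates in (0, 1).
    S = torch.zeros(q.shape[1], v.shape[1])  # running key-value state
    out = []
    for t in range(q.shape[0]):
        S = g[t] * S + torch.outer(k[t], v[t])  # decayed state update
        out.append(q[t] @ S)                    # read-out for query t
    return torch.stack(out)
```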
194 | Discrete Diffusion Modeling By Estimating The Ratios of The Data Distribution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Crucially, standard diffusion models rely on the well-established theory of score matching, but efforts to generalize this to discrete structures have not yielded the same empirical gains. In this work, we bridge this gap by proposing score entropy, a novel loss that naturally extends score matching to discrete spaces, integrates seamlessly to build discrete diffusion models, and significantly boosts performance. |
Aaron Lou; Chenlin Meng; Stefano Ermon; | icml | 2024-06-12 |
195 | Teaching Language Models to Self-Improve By Learning from Language Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present Self-Refinement Tuning (SRT), a method that leverages model feedback for alignment, thereby reducing reliance on human annotations. |
Chi Hu; Yimin Hu; Hang Cao; Tong Xiao; Jingbo Zhu; | arxiv-cs.CL | 2024-06-11 |
196 | Improving Autoformalization Using Type Checking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our analysis shows that the performance of these models is largely limited by their inability to generate formal statements that successfully type-check (i.e., are syntactically correct and consistent with types) – with a whopping 86.6% of GPT-4o errors starting from a type-check failure. In this work, we propose a method to fix this issue through decoding with type-check filtering, where we initially sample a diverse set of candidate formalizations for an informal statement, then use the Lean proof assistant to filter out candidates that do not type-check. |
Auguste Poiroux; Gail Weiss; Viktor Kunčak; Antoine Bosselut; | arxiv-cs.CL | 2024-06-11 |
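The proposed decoding scheme reduces to a sample-then-filter loop. A sketch, where sample_candidates (the LLM call) and lean_type_checks (the Lean verification) are hypothetical stand-ins supplied by the caller:

```python
def formalize(informal_statement, sample_candidates, lean_type_checks, n=16):
    # Sample a diverse set of candidate formalizations, then keep only
    # those the proof assistant accepts.
    candidates = sample_candidates(informal_statement, n=n)
    valid = [c for c in candidates if lean_type_checks(c)]
    return valid[0] if valid else None
```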
197 | Anomaly Detection on Unstable Logs with GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we report on an experimental comparison of a fine-tuned LLM and alternative models for anomaly detection on unstable logs. |
Fatemeh Hadadi; Qinghua Xu; Domenico Bianculli; Lionel Briand; | arxiv-cs.SE | 2024-06-11 |
198 | Towards Generalized Hydrological Forecasting Using Transformer Models for 120-Hour Streamflow Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Utilizing data from the preceding 72 hours, including precipitation, evapotranspiration, and discharge values, we developed a generalized model to predict future streamflow. |
Bekir Z. Demiray; Ibrahim Demir; | arxiv-cs.LG | 2024-06-11 |
199 | LT4SG@SMM4H24: Tweets Classification for Digital Epidemiology of Childhood Health Outcomes Using Pre-Trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents our approaches for the SMM4H24 Shared Task 5 on the binary classification of English tweets reporting children’s medical disorders. |
Dasun Athukoralage; Thushari Atapattu; Menasha Thilakaratne; Katrina Falkner; | arxiv-cs.CL | 2024-06-11 |
200 | Bilingual Sexism Classification: Fine-Tuned XLM-RoBERTa and GPT-3.5 Few-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to improve sexism identification in bilingual contexts (English and Spanish) by leveraging natural language processing models. |
AmirMohammad Azadi; Baktash Ansari; Sina Zamani; | arxiv-cs.CL | 2024-06-11 |
201 | Symmetric Dot-Product Attention for Efficient Training of BERT Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose an alternative compatibility function for the self-attention mechanism introduced by the Transformer architecture. |
Martin Courtois; Malte Ostendorff; Leonhard Hennig; Georg Rehm; | arxiv-cs.CL | 2024-06-10 |
202 | Annotation Alignment: Comparing LLM and Human Annotations of Conversational Safety Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that larger datasets are needed to resolve whether GPT-4 exhibits disparities in how well it correlates with demographic groups. |
Rajiv Movva; Pang Wei Koh; Emma Pierson; | arxiv-cs.CL | 2024-06-10 |
203 | Unveiling The Safety of GPT-4o: An Empirical Study Using Jailbreak Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, this paper adopts a series of multi-modal and uni-modal jailbreak attacks on 4 commonly used benchmarks encompassing three modalities (i.e., text, speech, and image), which involves the optimization of over 4,000 initial text queries and the analysis and statistical evaluation of more than 8,000 responses from GPT-4o. |
Zonghao Ying; Aishan Liu; Xianglong Liu; Dacheng Tao; | arxiv-cs.CR | 2024-06-10 |
204 | LLM-dCache: Improving Tool-Augmented LLMs with GPT-Driven Localized Data Caching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce LLM-dCache to optimize data accesses by treating cache operations as callable API functions exposed to the tool-augmented agent. |
SIMRANJIT SINGH et. al. | arxiv-cs.DC | 2024-06-10 |
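The core idea, cache operations exposed as tools the agent may call, can be illustrated with two hypothetical functions registered in an agent’s tool list; the GPT-driven policy for deciding when to call them is not shown, and the names are ours.

```python
_cache: dict = {}

def cache_get(key: str):
    """Tool: return the cached result for `key`, or None on a miss."""
    return _cache.get(key)

def cache_put(key: str, value) -> str:
    """Tool: store `value` under `key` for later reuse."""
    _cache[key] = value
    return "ok"
```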
205 | Large Language Models for Generating Rules, Yay or Nay? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel approach that leverages Large Language Models (LLMs), such as GPT-3.5 and GPT-4, as a potential world model to accelerate the engineering of software systems. |
SHANGEETHA SIVASOTHY et. al. | arxiv-cs.SE | 2024-06-10 |
206 | Validating LLM-Generated Programs with Metamorphic Prompt Testing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Research is required to comprehensively explore these critical concerns surrounding LLM-generated code. In this paper, we propose a novel solution called metamorphic prompt testing to address these challenges. |
Xiaoyin Wang; Dakai Zhu; | arxiv-cs.SE | 2024-06-10 |
207 | In-Context Learning and Fine-Tuning GPT for Argument Mining Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an ICL strategy for ATC combining kNN-based examples selection and majority vote ensembling. |
Jérémie Cabessa; Hugo Hernault; Umer Mushtaq; | arxiv-cs.CL | 2024-06-10 |
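Read literally, the strategy combines retrieval with ensembling. A sketch under assumptions: embed and llm_classify are hypothetical callables, and Euclidean distance over embeddings stands in for whatever similarity the authors use.

```python
import numpy as np
from collections import Counter

def knn_icl_predict(x, train_set, embed, llm_classify, k=4, n_votes=5):
    # Retrieve the k most similar labeled examples as demonstrations.
    dists = [np.linalg.norm(embed(x) - embed(ex["text"])) for ex in train_set]
    demos = [train_set[i] for i in np.argsort(dists)[:k]]
    # Query the model several times and take the majority label.
    votes = [llm_classify(demos, x) for _ in range(n_votes)]
    return Counter(votes).most_common(1)[0][0]
```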
208 | Hidden Holes: Topological Aspects of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The methods developed in this paper are novel in the field and based on mathematical apparatus that might be unfamiliar to the target audience. |
Stephen Fitz; Peter Romero; Jiyan Jonas Schneider; | arxiv-cs.CL | 2024-06-09 |
209 | Multi-attribute Auction-based Resource Allocation for Twins Migration in Vehicular Metaverses: A GPT-based DRL Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, resource-intensive VTs updating and the high mobility of vehicles require intensive computation, communication, and storage resources, especially for their migration among RSUs with limited coverage. To address these issues, we propose an attribute-aware auction-based mechanism to optimize resource allocation during VTs migration by considering both price and non-monetary attributes, e.g., location and reputation. |
YONGJU TONG et. al. | arxiv-cs.AI | 2024-06-08 |
210 | MaTableGPT: GPT-based Table Data Extractor from Materials Science Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the tables reported in materials science papers exist in highly diverse forms; thus, rule-based extractions are an ineffective approach. To overcome this challenge, we present MaTableGPT, which is a GPT-based table data extractor from the materials science literature. |
GYEONG HOON YI et. al. | arxiv-cs.CL | 2024-06-08 |
211 | Automata Extraction from Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an automata extraction algorithm specifically designed for Transformer models. |
Yihao Zhang; Zeming Wei; Meng Sun; | arxiv-cs.LG | 2024-06-08 |
212 | Do LLMs Recognize Me, When I Is Not Me: Assessment of LLMs Understanding of Turkish Indexical Pronouns in Indexical Shift Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first study examining indexical shift in any language, releasing a Turkish dataset specifically designed for this purpose. |
Metehan Oğuz; Yusuf Umut Ciftci; Yavuz Faruk Bakman; | arxiv-cs.CL | 2024-06-08 |
213 | G-Transformer: Counterfactual Outcome Prediction Under Dynamic and Time-varying Treatment Regimes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present G-Transformer, a Transformer-based framework supporting g-computation for counterfactual prediction under dynamic and time-varying treatment strategies. |
Hong Xiong; Feng Wu; Leon Deng; Megan Su; Li-wei H Lehman; | arxiv-cs.LG | 2024-06-08 |
214 | SelfDefend: LLMs Can Defend Themselves Against Jailbreaking in A Practical Manner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by how the traditional security concept of shadow stacks defends against memory overflow attacks, this paper introduces a generic LLM jailbreak defense framework called SelfDefend, which establishes a shadow LLM defense instance to concurrently protect the target LLM instance in the normal stack and collaborate with it for checkpoint-based access control. |
XUNGUANG WANG et. al. | arxiv-cs.CR | 2024-06-08 |
215 | VTrans: Accelerating Transformer Compression with Variational Information Bottleneck Based Pruning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, they require extensive compression time with large datasets to maintain performance in pruned models. To address these challenges, we propose VTrans, an iterative pruning framework guided by the Variational Information Bottleneck (VIB) principle. |
Oshin Dutta; Ritvik Gupta; Sumeet Agarwal; | arxiv-cs.LG | 2024-06-07 |
216 | Concept Formation and Alignment in Language Models: Bridging Statistical Patterns in Latent Space to Concept Taxonomy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a mechanism for identifying concepts and their hierarchical organization within the semantic representations learned by various LMs, encompassing a spectrum from early models like GloVe to transformer-based language models like ALBERT and T5. |
Mehrdad Khatir; Chandan K. Reddy; | arxiv-cs.CL | 2024-06-07 |
217 | BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper outlines our approach to SemEval 2024 Task 9, BRAINTEASER: A Novel Task Defying Common Sense. |
Baktash Ansari; Mohammadmostafa Rostamkhani; Sauleh Eetemadi; | arxiv-cs.CL | 2024-06-07 |
218 | Are Large Language Models More Empathetic Than Humans? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive study exploring the empathetic responding capabilities of four state-of-the-art LLMs: GPT-4, LLaMA-2-70B-Chat, Gemini-1.0-Pro, and Mixtral-8x7B-Instruct in comparison to a human baseline. |
Anuradha Welivita; Pearl Pu; | arxiv-cs.CL | 2024-06-07 |
219 | Advancing Semantic Textual Similarity Modeling: A Regression Framework with Translated ReLU and Smooth K2 Loss Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Nonetheless, Sentence-BERT tackles STS tasks from a classification perspective, overlooking the progressive nature of semantic relationships, which results in suboptimal performance. To bridge this gap, this paper presents an innovative regression framework and proposes two simple yet effective loss functions: Translated ReLU and Smooth K2 Loss. |
Bowen Zhang; Chunping Li; | arxiv-cs.CL | 2024-06-07 |
220 | Transformer Conformal Prediction for Time Series Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a conformal prediction method for time series using the Transformer architecture to capture long-memory and long-range dependencies. |
Junghwan Lee; Chen Xu; Yao Xie; | arxiv-cs.LG | 2024-06-07 |
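For readers unfamiliar with conformal prediction, a generic split-conformal sketch shows the calibration step a Transformer forecaster would be wrapped in. This is textbook split conformal under an assumed absolute-residual score, not the paper’s specific time-series procedure.

```python
import numpy as np

def conformal_interval(cal_residuals, y_hat, alpha=0.1):
    # Calibrate on held-out absolute residuals, then wrap the point
    # forecast in an interval with roughly (1 - alpha) coverage.
    n = len(cal_residuals)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(np.abs(cal_residuals), level)
    return y_hat - q, y_hat + q

lo, hi = conformal_interval(np.random.randn(500) * 0.3, y_hat=1.7)
```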
221 | Low-Resource Cross-Lingual Summarization Through Few-Shot Learning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the few-shot XLS performance of various models, including Mistral-7B-Instruct-v0.2, GPT-3.5, and GPT-4. |
Gyutae Park; Seojin Hwang; Hwanhee Lee; | arxiv-cs.CL | 2024-06-07 |
222 | Mixture-of-Agents Enhances Large Language Model Capabilities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With the growing number of LLMs, how to harness the collective expertise of multiple LLMs is an exciting open direction. Toward this goal, we propose a new approach that leverages the collective strengths of multiple LLMs through a Mixture-of-Agents (MoA) methodology. |
Junlin Wang; Jue Wang; Ben Athiwaratkun; Ce Zhang; James Zou; | arxiv-cs.CL | 2024-06-07 |
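One MoA layer can be sketched in a few lines, with ask as a hypothetical single-call LLM wrapper and an illustrative synthesis prompt; the full method stacks several such layers, feeding each layer’s outputs into the next.

```python
def moa_layer(prompt, proposer_models, aggregator_model, ask):
    # Each proposer answers independently; an aggregator synthesizes.
    drafts = [ask(m, prompt) for m in proposer_models]
    synthesis = prompt + "\n\nCandidate responses:\n" + "\n---\n".join(drafts)
    return ask(aggregator_model, synthesis)
```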
223 | Logic Synthesis with Generative Deep Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a logic synthesis rewriting operator based on the Circuit Transformer model, named ctrw (Circuit Transformer Rewriting), which incorporates the following techniques: (1) a two-stage training scheme for the Circuit Transformer tailored for logic synthesis, with iterative improvement of optimality through self-improvement training; (2) integration of the Circuit Transformer with state-of-the-art rewriting techniques to address scalability issues, allowing for guided DAG-aware rewriting. |
XIHAN LI et. al. | arxiv-cs.LO | 2024-06-07 |
224 | Exploring The Latest LLMs for Leaderboard Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore three types of contextual inputs to the models: DocTAET (Document Title, Abstract, Experimental Setup, and Tabular Information), DocREC (Results, Experiments, and Conclusions), and DocFULL (entire document). |
Salomon Kabongo; Jennifer D’Souza; Sören Auer; | arxiv-cs.CL | 2024-06-06 |
225 | Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Interestingly, our study presents conflicting evidence for the role of the quality of KG tuples in generating implicit explanations. |
NEEMESH YADAV et. al. | arxiv-cs.CL | 2024-06-06 |
226 | MUSE: Flexible Voiceprint Receptive Fields and Multi-Path Fusion Enhanced Taylor Transformer for U-Net-based Speech Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Multi-path Enhanced Taylor (MET) Transformer based U-net for Speech Enhancement (MUSE), a lightweight speech enhancement network built upon the Unet architecture. |
Zizhen Lin; Xiaoting Chen; Junyu Wang; | arxiv-cs.SD | 2024-06-06 |
227 | Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs By Sampling with People Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by methods from cognitive science, we propose an iterative method for simultaneously eliciting conversational tones and sentences, where participants alternate between two tasks: (1) one participant identifies the tone of a given sentence and (2) a different participant generates a sentence based on that tone. |
Dun-Ming Huang; Pol Van Rijn; Ilia Sucholutsky; Raja Marjieh; Nori Jacoby; | arxiv-cs.CL | 2024-06-06 |
228 | The Good, The Bad, and The Hulk-like GPT: Analyzing Emotional Decisions of Large Language Models in Cooperation and Bargaining Games Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel methodology and framework to study both the decision-making of LLMs and their alignment with human behavior under emotional states. |
MIKHAIL MOZIKOV et. al. | arxiv-cs.AI | 2024-06-05 |
229 | Global Clipper: Enhancing Safety and Reliability of Transformer-based Object Detection Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the Global Clipper and Global Hybrid Clipper, effective mitigation strategies specifically designed for transformer-based models. |
QUTUB SYED SHA et. al. | arxiv-cs.CV | 2024-06-05 |
230 | CSI-GPT: Integrating Generative Pre-Trained Transformer with Federated-Tuning to Acquire Downlink Massive MIMO Channels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, by integrating the generative pre-trained Transformer (GPT) with federated-tuning, we propose a CSI-GPT approach to realize efficient downlink CSI acquisition. |
YE ZENG et. al. | arxiv-cs.IT | 2024-06-05 |
231 | From Tarzan to Tolkien: Controlling The Language Proficiency Level of LLMs for Content Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the problem of controlling the difficulty level of text generated by Large Language Models (LLMs) for contexts where end-users are not fully proficient, such as language learners. |
Ali Malik; Stephen Mayhew; Chris Piech; Klinton Bicknell; | arxiv-cs.CL | 2024-06-05 |
232 | Learning to Grok: Emergence of In-context Learning and Skill Composition in Modular Arithmetic Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we study the emergence of in-context learning and skill composition in a collection of modular arithmetic tasks. |
Tianyu He; Darshil Doshi; Aritra Das; Andrey Gromov; | arxiv-cs.LG | 2024-06-04 |
233 | Probing The Category of Verbal Aspect in Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate how pretrained language models (PLM) encode the grammatical category of verbal aspect in Russian. |
Anisia Katinskaia; Roman Yangarber; | arxiv-cs.CL | 2024-06-04 |
234 | Randomized Geometric Algebra Methods for Convex Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce randomized algorithms to Clifford’s Geometric Algebra, generalizing randomized linear algebra to hypercomplex vector spaces. |
Yifei Wang; Sungyoon Kim; Paul Chu; Indu Subramaniam; Mert Pilanci; | arxiv-cs.LG | 2024-06-04 |
235 | OccamLLM: Fast and Exact Language Model Arithmetic in A Single Step Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a framework that enables exact arithmetic in a single autoregressive step, providing faster, more secure, and more interpretable LLM systems with arithmetic capabilities. |
OWEN DUGAN et. al. | arxiv-cs.CL | 2024-06-04 |
236 | Too Big to Fail: Larger Language Models Are Disproportionately Resilient to Induction of Dementia-Related Linguistic Anomalies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Previous findings show that changes in PPL when masking attention layers in pre-trained transformer-based NLMs reflect linguistic anomalies associated with Alzheimer’s disease dementia. Building upon this, we explore a novel bidirectional attention head ablation method that exhibits properties attributed to the concepts of cognitive and brain reserve in human brain studies, which postulate that people with more neurons in the brain and more efficient processing are more resilient to neurodegeneration. |
Changye Li; Zhecheng Sheng; Trevor Cohen; Serguei Pakhomov; | arxiv-cs.CL | 2024-06-04 |
237 | Multi-layer Learnable Attention Mask for Multimodal Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Comprehensive experimental validation on various datasets, such as MADv2, QVHighlights, ImageNet 1K, and MSRVTT, demonstrates the efficacy of the LAM, exemplifying its ability to enhance model performance while mitigating redundant computations. This pioneering approach presents a significant advancement in enhancing the understanding of complex scenarios, such as in movie understanding. |
Wayner Barrios; SouYoung Jin; | arxiv-cs.CV | 2024-06-04 |
238 | A Temporal Kolmogorov-Arnold Transformer for Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Capturing complex temporal patterns and relationships within multivariate data streams is a difficult task. We propose the Temporal Kolmogorov-Arnold Transformer (TKAT), a novel attention-based architecture designed to address this task using Temporal Kolmogorov-Arnold Networks (TKANs). |
Remi Genet; Hugo Inzirillo; | arxiv-cs.LG | 2024-06-04 |
239 | Eliciting The Priors of Large Language Models Using Iterated In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop a prompt-based workflow for eliciting prior distributions from LLMs. |
Jian-Qiao Zhu; Thomas L. Griffiths; | arxiv-cs.CL | 2024-06-03 |
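The workflow echoes iterated learning from cognitive science: chaining the model’s answers so the sequence drifts toward its prior. A sketch with a hypothetical ask_model call; the prompt wording is illustrative, not the authors’.

```python
def iterated_prior_samples(ask_model, seed_value, n_iters=50):
    # Chain the model's answers: each query conditions on the previous
    # estimate, so later samples approximate draws from the model's prior.
    samples, value = [], seed_value
    for _ in range(n_iters):
        value = ask_model(f"Previous estimate: {value}. Give your own estimate.")
        samples.append(value)
    return samples
```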
240 | Reproducibility Study on Adversarial Attacks Against Robust Transformer Trackers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our empirical study focuses on evaluating adversarial robustness of object trackers based on bounding box versus binary mask predictions, and attack methods at different levels of perturbations. |
Fatemeh Nourilenjan Nokabadi; Jean-François Lalonde; Christian Gagné; | arxiv-cs.CV | 2024-06-03 |
241 | Superhuman Performance in Urology Board Questions By An Explainable Large Language Model Enabled for Context Integration of The European Association of Urology Guidelines: The UroBot Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: UroBot was developed using OpenAI’s GPT-3.5, GPT-4, and GPT-4o models, employing retrieval-augmented generation (RAG) and the latest 2023 guidelines from the European Association of Urology (EAU). |
MARTIN J. HETZ et. al. | arxiv-cs.CL | 2024-06-03 |
242 | SemCoder: Training Code Language Models with Comprehensive Semantics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to bridge the gap between Code LLMs’ reliance on static text data and the need for thorough semantic understanding for complex tasks like debugging and program repair. |
YANGRUIBO DING et. al. | arxiv-cs.CL | 2024-06-03 |
243 | Prototypical Transformer As Unified Motion Learners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Prototypical Transformer (ProtoFormer), a general and unified framework that approaches various motion tasks from a prototype perspective. |
CHENG HAN et. al. | arxiv-cs.CV | 2024-06-03 |
244 | In-Context Learning of Physical Properties: Few-Shot Adaptation to Out-of-Distribution Molecular Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address the question: can we leverage in-context learning to predict out-of-distribution materials properties? |
GRZEGORZ KASZUBA et. al. | arxiv-cs.LG | 2024-06-03 |
245 | Evaluating Mathematical Reasoning of Large Language Models: A Focus on Error Identification and Correction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: From the examiner perspective, we define four evaluation tasks for error identification and correction along with a new dataset with annotated error types and steps. |
XIAOYUAN LI et. al. | arxiv-cs.CL | 2024-06-02 |
246 | Maximum-Entropy Regularized Decision Transformer with Reward Relabelling for Dynamic Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we introduce a novel methodology named Max-Entropy enhanced Decision Transformer with Reward Relabeling for Offline RLRS (EDT4Rec). |
Xiaocong Chen; Siyu Wang; Lina Yao; | arxiv-cs.IR | 2024-06-02 |
247 | Annotation Guidelines-Based Knowledge Augmentation: Towards Enhancing Large Language Models for Educational Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose the Annotation Guidelines-based Knowledge Augmentation (AGKA) approach to improve LLMs. |
SHIQI LIU et. al. | arxiv-cs.CL | 2024-06-02 |
248 | RoBERTa-BiLSTM: A Context-Aware Hybrid Model for Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a novel hybrid deep learning model, RoBERTa-BiLSTM, which combines the Robustly Optimized BERT Pretraining Approach (RoBERTa) with Bidirectional Long Short-Term Memory (BiLSTM) networks. |
Md. Mostafizer Rahman; Ariful Islam Shiplu; Yutaka Watanobe; Md. Ashad Alam; | arxiv-cs.CL | 2024-06-01 |
249 | Multimodal Metadata Assignment for Cultural Heritage Artifacts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We develop a multimodal classifier for the cultural heritage domain using a late fusion approach and introduce a novel dataset. |
LUIS REI et. al. | arxiv-cs.CV | 2024-06-01 |
250 | EdgeTran: Device-Aware Co-Search of Transformers for Efficient Inference on Mobile Edge Platforms Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Automated design of efficient transformer models has recently attracted significant attention from industry and academia. However, most works only focus on certain metrics while … |
Shikhar Tuli; N. Jha; | IEEE Transactions on Mobile Computing | 2024-06-01 |
251 | Multi-granularity Cross Transformer Network for Person Re-identification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yanping Li; Duoqian Miao; Hongyun Zhang; Jie Zhou; Cairong Zhao; | Pattern Recognit. | 2024-06-01 |
252 | Beyond Metrics: Evaluating LLMs’ Effectiveness in Culturally Nuanced, Low-Resource Real-World Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our evaluation includes both quantitative analysis using metrics like F1 score and qualitative assessment of LLMs’ explanations for their predictions. We find that, while Mistral-7b and Mixtral-8x7b achieved high F1 scores, they and other LLMs such as GPT-3.5-Turbo, Llama-2-70b, and Gemma-7b struggled with linguistic and contextual nuances and lacked transparency in their decision-making, as observed from their explanations. |
MILLICENT OCHIENG et. al. | arxiv-cs.CL | 2024-06-01 |
253 | Bi-Directional Transformers Vs. Word2vec: Discovering Vulnerabilities in Lifted Compiled Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Detecting vulnerabilities within compiled binaries is challenging due to lost high-level code structures and other factors such as architectural dependencies, compilers, and optimization options. To address these obstacles, this research explores vulnerability detection by using natural language processing (NLP) embedding techniques with word2vec, BERT, and RoBERTa to learn semantics from intermediate representation (LLVM) code. |
Gary A. McCully; John D. Hastings; Shengjie Xu; Adam Fortier; | arxiv-cs.CR | 2024-05-30 |
254 | Divide-and-Conquer Meets Consensus: Unleashing The Power of Functions in Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose FunCoder, a code generation framework incorporating the divide-and-conquer strategy with functional consensus. |
JINGCHANG CHEN et. al. | arxiv-cs.CL | 2024-05-30 |
255 | QClusformer: A Quantum Transformer-based Framework for Unsupervised Visual Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce QClusformer, a pioneering Transformer-based framework leveraging Quantum machines to tackle unsupervised vision clustering challenges. |
XUAN-BAC NGUYEN et. al. | arxiv-cs.CV | 2024-05-30 |
256 | Automatic Graph Topology-Aware Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes an evolutionary graph Transformer architecture search framework (EGTAS) to automate the construction of strong graph Transformers. |
CHAO WANG et. al. | arxiv-cs.NE | 2024-05-30 |
257 | The Point of View of A Sentiment: Towards Clinician Bias Detection in Psychiatric Notes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By leveraging large language models, this work aims to identify the sentiment expressed in psychiatric clinical notes based on the reader’s point of view. |
Alissa A. Valentine; Lauren A. Lepow; Alexander W. Charney; Isotta Landi; | arxiv-cs.CL | 2024-05-30 |
258 | Hyper-Transformer for Amodal Completion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Learning shape priors is crucial for effective amodal completion, but traditional methods often rely on two-stage processes or additional information, leading to inefficiencies and potential error accumulation. To address these shortcomings, we introduce a novel framework named the Hyper-Transformer Amodal Network (H-TAN). |
Jianxiong Gao; Xuelin Qian; Longfei Liang; Junwei Han; Yanwei Fu; | arxiv-cs.CV | 2024-05-30 |
259 | DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the knowledge gap, we propose a new benchmark named DevEval, which has three advances. |
JIA LI et. al. | arxiv-cs.CL | 2024-05-30 |
260 | Reverse Image Retrieval Cues Parametric Memory in Multimodal LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: RIR robustly improves knowledge-intensive visual question answering (VQA) of GPT-4V by 37-43%, GPT-4 Turbo by 25-27%, and GPT-4o by 18-20% in terms of open-ended VQA evaluation metrics. To our surprise, we discover that RIR helps the model to better access its own world knowledge. |
Jialiang Xu; Michael Moor; Jure Leskovec; | arxiv-cs.CL | 2024-05-29 |
261 | Multi-objective Cross-task Learning Via Goal-conditioned GPT-based Decision Transformers for Surgical Robot Task Automation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new learning-based framework by leveraging the strong reasoning capability of the GPT-based architecture to automate surgical robotic tasks. |
Jiawei Fu; Yonghao Long; Kai Chen; Wang Wei; Qi Dou; | arxiv-cs.RO | 2024-05-29 |
262 | Voice Jailbreak Attacks Against GPT-4o Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present the first systematic measurement of jailbreak attacks against the voice mode of GPT-4o. |
Xinyue Shen; Yixin Wu; Michael Backes; Yang Zhang; | arxiv-cs.CR | 2024-05-29 |
263 | AutoBreach: Universal and Adaptive Jailbreaking with Efficient Wordplay-Guided Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we rethink the approach to jailbreaking LLMs and formally define three essential properties from the attacker’s perspective, which contributes to guiding the design of jailbreak methods. |
JIAWEI CHEN et. al. | arxiv-cs.CV | 2024-05-29 |
264 | Evaluating Zero-Shot GPT-4V Performance on 3D Visual Question Answering Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As interest in reformulating the 3D Visual Question Answering (VQA) problem in the context of foundation models grows, it is imperative to assess how these new paradigms influence existing closed-vocabulary datasets. In this case study, we evaluate the zero-shot performance of foundational models (GPT-4 Vision and GPT-4) on well-established 3D VQA benchmarks, namely 3D-VQA and ScanQA. |
Simranjit Singh; Georgios Pavlakos; Dimitrios Stamoulis; | arxiv-cs.CV | 2024-05-29 |
265 | Are You Sure? Rank Them Again: Repeated Ranking For Better Preference Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose the Repeat Ranking method – where we evaluate the same responses multiple times and train only on those responses which are consistently ranked. |
Peter Devine; | arxiv-cs.CL | 2024-05-29 |
266 | A Multi-Source Retrieval Question Answering Framework Based on RAG Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing RAG paradigms are inevitably influenced by erroneous retrieval information, thereby reducing the reliability and correctness of generated results. Therefore, to improve the relevance of retrieval information, this study proposes a method that replaces traditional retrievers with GPT-3.5, leveraging its vast corpus knowledge to generate retrieval information. |
RIDONG WU et. al. | arxiv-cs.IR | 2024-05-29 |
267 | LMO-DP: Optimizing The Randomization Mechanism for Differentially Private Fine-Tuning (Large) Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they rely heavily on the Gaussian mechanism, which may overly perturb the gradients and degrade the accuracy, especially in stronger privacy regimes (e.g., the privacy budget $\epsilon < 3$). To address such limitations, we propose a novel Language Model-based Optimal Differential Privacy (LMO-DP) mechanism, which takes the first step to enable the tight composition of accurately fine-tuning (large) language models with a sub-optimal DP mechanism, even in strong privacy regimes (e.g., $0.1\leq \epsilon<3$). |
QIN YANG et. al. | arxiv-cs.CR | 2024-05-29 |
268 | MDS-ViTNet: Improving Saliency Prediction for Eye-Tracking with Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel methodology we call MDS-ViTNet (Multi Decoder Saliency by Vision Transformer Network) for enhancing visual saliency prediction or eye-tracking. |
Polezhaev Ignat; Goncharenko Igor; Iurina Natalya; | arxiv-cs.CV | 2024-05-29 |
269 | Data-Efficient Approach to Humanoid Control Via Fine-Tuning A Pre-Trained GPT on Action Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we train a GPT on a large dataset of noisy expert policy rollout observations from a humanoid motion dataset as a pre-trained model and fine-tune that model on a smaller dataset of noisy expert policy rollout observations and actions to autoregressively generate physically plausible motion trajectories. |
Siddharth Padmanabhan; Kazuki Miyazawa; Takato Horii; Takayuki Nagai; | arxiv-cs.RO | 2024-05-28 |
270 | Can GPT Redefine Medical Understanding? Evaluating GPT on Biomedical Machine Reading Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate GPT on four closed-book biomedical MRC benchmarks. |
Shubham Vatsal; Ayush Singh; | arxiv-cs.CL | 2024-05-28 |
271 | Edinburgh Clinical NLP at MEDIQA-CORR 2024: Guiding Large Language Models with Hints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we evaluate the capability of general LLMs, specifically GPT-3.5 and GPT-4, to identify and correct medical errors with multiple prompting strategies. |
ARYO PRADIPTA GEMA et. al. | arxiv-cs.CL | 2024-05-28 |
272 | Notes on Applicability of GPT-4 to Document Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We perform a missing, reproducible evaluation of all publicly available GPT-4 family models concerning the Document Understanding field, where it is frequently required to comprehend the spatial arrangement of text and visual clues in addition to textual semantics. |
Łukasz Borchmann; | arxiv-cs.CL | 2024-05-28 |
273 | An Empirical Analysis on Large Language Models in Debate Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the capabilities and inherent biases of advanced large language models (LLMs) such as GPT-3.5 and GPT-4 in the context of debate evaluation. |
Xinyi Liu; Pinxin Liu; Hangfeng He; | arxiv-cs.CL | 2024-05-28 |
274 | I See You: Teacher Analytics with GPT-4 Vision-Powered Observational Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our approach aims to revolutionize teachers’ assessment of students’ practices by leveraging Generative Artificial Intelligence (GenAI) to offer detailed insights into classroom dynamics. |
UNGGI LEE et. al. | arxiv-cs.HC | 2024-05-28 |
275 | Multi-objective Representation for Numbers in Clinical Narratives Using CamemBERT-bio Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research aims to classify numerical values extracted from medical documents across seven distinct physiological categories, employing CamemBERT-bio. |
Boammani Aser Lompo; Thanh-Dung Le; | arxiv-cs.CL | 2024-05-27 |
276 | RLAIF-V: Aligning MLLMs Through Open-Source AI Feedback for Super GPT-4V Trustworthiness Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce RLAIF-V, a novel framework that aligns MLLMs in a fully open-source paradigm for super GPT-4V trustworthiness. |
TIANYU YU et. al. | arxiv-cs.CL | 2024-05-27 |
277 | PivotMesh: Generic 3D Mesh Generation Via Pivot Vertices Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a generic and scalable mesh generation framework PivotMesh, which makes an initial attempt to extend the native mesh generation to large-scale datasets. |
Haohan Weng; Yikai Wang; Tong Zhang; C. L. Philip Chen; Jun Zhu; | arxiv-cs.CV | 2024-05-27 |
278 | Are Self-Attentions Effective for Time Series Forecasting? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we shift focus from the overall architecture of the Transformer to the effectiveness of self-attentions for time series forecasting. |
Dongbin Kim; Jinseong Park; Jaewook Lee; Hoki Kim; | arxiv-cs.LG | 2024-05-27 |
279 | Vision-and-Language Navigation Generative Pretrained Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our proposal, the Vision-and-Language Navigation Generative Pretrained Transformer (VLN-GPT), adopts a transformer decoder model (GPT2) to model trajectory sequence dependencies, bypassing the need for historical encoding modules. |
Wen Hanlin; | arxiv-cs.AI | 2024-05-27 |
280 | Deployment of NLP and LLM Techniques to Control Mobile Robots at The Edge: A Case Study Using GPT-4-Turbo and LLaMA 2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the possibility of intuitive human-robot interaction through the application of Natural Language Processing (NLP) and Large Language Models (LLMs) in mobile robotics. We aim to explore the feasibility of using these technologies for edge-based deployment, where traditional cloud dependencies are eliminated. |
PASCAL SIKORSKI et. al. | arxiv-cs.RO | 2024-05-27 |
281 | InversionView: A General-Purpose Method for Reading Information from Neural Activations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The inner workings of neural networks can be better understood if we can fully decipher the information encoded in neural activations. In this paper, we argue that this information is embodied by the subset of inputs that give rise to similar activations. |
Xinting Huang; Madhur Panwar; Navin Goyal; Michael Hahn; | arxiv-cs.LG | 2024-05-27 |
282 | LoReTrack: Efficient and Accurate Low-Resolution Transformer Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Albeit faster, this considerably hurts tracking accuracy due to information loss in low-resolution tracking. In this paper, we aim to mitigate such information loss to boost the performance of low-resolution Transformer tracking via dual knowledge distillation from a frozen high-resolution (but not larger) Transformer tracker. |
Shaohua Dong; Yunhe Feng; Qing Yang; Yuewei Lin; Heng Fan; | arxiv-cs.CV | 2024-05-27 |
283 | Performance Evaluation of Reddit Comments Using Machine Learning and Natural Language Processing Methods in Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the efficacy of sentiment analysis models is hindered by the lack of expansive and fine-grained emotion datasets. To address this gap, our study leverages the GoEmotions dataset, comprising a diverse range of emotions, to evaluate sentiment analysis methods across a substantial corpus of 58,000 comments. |
Xiaoxia Zhang; Xiuyuan Qi; Zixin Teng; | arxiv-cs.CL | 2024-05-26 |
284 | M$^3$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents M$^3$GPT, an advanced Multimodal, Multitask framework for Motion comprehension and generation. |
MINGSHUANG LUO et. al. | arxiv-cs.CV | 2024-05-25 |
285 | Accelerating Transformers with Spectrum-Preserving Token Merging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Prior works have proposed algorithms based on Bipartite Soft Matching (BSM), which divides tokens into distinct sets and merges the top k similar tokens. |
HOAI-CHAU TRAN et. al. | arxiv-cs.LG | 2024-05-25 |
286 | AutoManual: Generating Instruction Manuals By LLM Agents Via Interactive Environmental Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce AutoManual, a framework enabling LLM agents to autonomously build their understanding through interaction and adapt to new environments. |
MINGHAO CHEN et. al. | arxiv-cs.AI | 2024-05-25 |
287 | Zero-Shot Spam Email Classification Using Pre-trained Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the application of pre-trained large language models (LLMs) for spam email classification using zero-shot prompting. |
Sergio Rojas-Galeano; | arxiv-cs.CL | 2024-05-24 |
288 | Incremental Comprehension of Garden-Path Sentences By Large Language Models: Semantic Interpretation, Syntactic Re-Analysis, and Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we investigate the processing of garden-path sentences and the fate of lingering misinterpretations using four large language models (LLMs): GPT-2, LLaMA-2, Flan-T5, and RoBERTa. |
ANDREW LI et. al. | arxiv-cs.CL | 2024-05-24 |
289 | Enhancing Augmentative and Alternative Communication with Card Prediction and Colourful Semantics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents an approach to enhancing Augmentative and Alternative Communication (AAC) systems by integrating Colourful Semantics (CS) with transformer-based language models specifically tailored for Brazilian Portuguese. |
Jayr Pereira; Francisco Rodrigues; Jaylton Pereira; Cleber Zanchettin; Robson Fidalgo; | arxiv-cs.CL | 2024-05-24 |
290 | Spectraformer: A Unified Random Feature Framework for Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Spectraformer, a unified framework for approximating and learning the kernel function in linearized attention of the Transformer. |
Duke Nguyen; Aditya Joshi; Flora Salim; | arxiv-cs.LG | 2024-05-24 |
291 | GPTZoo: A Large-scale Dataset of GPTs for The Research Community Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To support academic research on GPTs, we introduce GPTZoo, a large-scale dataset comprising 730,420 GPT instances. |
Xinyi Hou; Yanjie Zhao; Shenao Wang; Haoyu Wang; | arxiv-cs.SE | 2024-05-24 |
292 | A Comparative Analysis of Distributed Training Strategies for GPT-2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The rapid advancement in Large Language Models has been met with significant challenges in their training processes, primarily due to their considerable computational and memory demands. This research examines parallelization techniques developed to address these challenges, enabling the efficient and scalable training of Large Language Models. |
Ishan Patwardhan; Shubham Gandhi; Om Khare; Amit Joshi; Suraj Sawant; | arxiv-cs.DC | 2024-05-24 |
293 | Transformer-XL for Long Sequence Tasks in Robotic Learning from Demonstration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents an innovative application of Transformer-XL for long sequence tasks in robotic learning from demonstrations (LfD). |
Gao Tianci; | arxiv-cs.RO | 2024-05-24 |
294 | Activator: GLU Activations As The Core Functions of A Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Experimental assessments show that both the proposed modifications and reductions offer performance competitive with baseline architectures, supporting this work’s aim of establishing a more efficient yet capable alternative to the traditional attention mechanism as the core component in designing transformer architectures. |
Abdullah Nazhat Abdullah; Tarkan Aydin; | arxiv-cs.CV | 2024-05-24 |
295 | SMART: Scalable Multi-agent Real-time Simulation Via Next-token Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Data-driven autonomous driving motion generation tasks are frequently impacted by the limitations of dataset size and the domain gap between datasets, which precludes their extensive application in real-world scenarios. To address this issue, we introduce SMART, a novel autonomous driving motion generation paradigm that models vectorized map and agent trajectory data into discrete sequence tokens. |
Wei Wu; Xiaoxin Feng; Ziyan Gao; Yuheng Kan; | arxiv-cs.RO | 2024-05-24 |
296 | Transformer-based Federated Learning for Multi-Label Remote Sensing Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the capability of state-of-the-art transformer architectures (MLP-Mixer, ConvMixer, and PoolFormer) to address the challenges related to non-IID training data across various clients in the context of FL for multi-label classification (MLC) problems in remote sensing (RS). |
Barış Büyüktaş; Kenneth Weitzel; Sebastian Völkers; Felix Zailskas; Begüm Demir; | arxiv-cs.CV | 2024-05-24 |
297 | Large Language Models Reflect Human Citation Patterns with A Heightened Citation Bias Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: By analyzing citation graphs, we show that the references recommended by GPT-4 are embedded in the relevant citation context, suggesting an even deeper conceptual internalization of the citation networks. |
ANDRES ALGABA et. al. | arxiv-cs.DL | 2024-05-24 |
298 | SG-Adapter: Enhancing Text-to-Image Generation with Scene Graph Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce the Scene Graph Adapter (SG-Adapter), leveraging the structured representation of scene graphs to rectify inaccuracies in the original text embeddings. |
GUIBAO SHEN et. al. | arxiv-cs.CV | 2024-05-24 |
299 | PoinTramba: A Hybrid Transformer-Mamba Framework for Point Cloud Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce PoinTramba, a pioneering hybrid framework that synergizes the analytical power of the Transformer with the remarkable computational efficiency of Mamba for enhanced point cloud analysis. |
ZICHENG WANG et. al. | arxiv-cs.CV | 2024-05-24 |
300 | GPT Is Not An Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores whether an LLM (specifically, GPT-3.5-Turbo) can assist with the task of developing a bias benchmark dataset from responses to an open-ended community survey. |
Virginia K. Felkner; Jennifer A. Thompson; Jonathan May; | arxiv-cs.CL | 2024-05-24 |
301 | Steerable Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we introduce Steerable Transformers, an extension of the Vision Transformer mechanism that maintains equivariance to the special Euclidean group $\mathrm{SE}(d)$. |
Soumyabrata Kundu; Risi Kondor; | arxiv-cs.CV | 2024-05-24 |
302 | CulturePark: Boosting Cross-cultural Understanding in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by cognitive theories on social communication, this paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection. |
CHENG LI et. al. | arxiv-cs.AI | 2024-05-23 |
303 | An Evaluation of Estimative Uncertainty in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares estimative uncertainty in commonly used large language models (LLMs) like GPT-4 and ERNIE-4 to that of humans, and to each other. |
Zhisheng Tang; Ke Shen; Mayank Kejriwal; | arxiv-cs.CL | 2024-05-23 |
304 | Quantifying The Gain in Weak-to-Strong Generalization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a theoretical framework for understanding weak-to-strong generalization. |
Moses Charikar; Chirag Pabbaraju; Kirankumar Shiragur; | arxiv-cs.LG | 2024-05-23 |
305 | UDKAG: Augmenting Large Vision-Language Models with Up-to-Date Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a plug-and-play framework for augmenting existing LVLMs in handling visual question answering (VQA) about up-to-date knowledge, dubbed UDKAG. |
CHUANHAO LI et. al. | arxiv-cs.CV | 2024-05-23 |
306 | Grokked Transformers Are Implicit Reasoners: A Mechanistic Journey to The Edge of Generalization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We study whether transformers can learn to implicitly reason over parametric knowledge, a skill that even the most capable language models struggle with. |
Boshi Wang; Xiang Yue; Yu Su; Huan Sun; | arxiv-cs.CL | 2024-05-23 |
307 | Understanding The Training and Generalization of Pretrained Transformer for Sequential Decision Making Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider the supervised pretrained transformer for a class of sequential decision-making problems. |
HANZHAO WANG et. al. | arxiv-cs.LG | 2024-05-23 |
308 | JiuZhang3.0: Efficiently Improving Mathematical Reasoning By Training Small Data Synthesis Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To reduce the cost, based on openly available open-source texts, we propose an efficient approach that trains a small LLM for math problem synthesis to generate sufficient high-quality pre-training data. |
KUN ZHOU et. al. | arxiv-cs.CL | 2024-05-23 |
309 | CEEBERT: Cross-Domain Inference in Early Exit BERT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose an online learning algorithm named Cross-Domain Inference in Early Exit BERT (CeeBERT) that dynamically determines early exits of samples based on the level of confidence at each exit point. |
Divya Jyoti Bajpai; Manjesh Kumar Hanawal; | arxiv-cs.CL | 2024-05-23 |
310 | PuTR: A Pure Transformer for Decoupled and Online Multi-Object Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we demonstrate that the trajectory graph is a directed acyclic graph, which can be represented by an object sequence arranged by frame and a binary adjacency matrix. |
Chongwei Liu; Haojie Li; Zhihui Wang; Rui Xu; | arxiv-cs.CV | 2024-05-22 |
311 | ViHateT5: Enhancing Hate Speech Detection in Vietnamese With A Unified Text-to-Text Transformer Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this research, we introduce ViHateT5, a T5-based model pre-trained on our proposed large-scale domain-specific dataset named VOZ-HSD. |
Luan Thanh Nguyen; | arxiv-cs.CL | 2024-05-22 |
312 | Quantifying Emergence in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a quantifiable solution for estimating emergence. |
Hang Chen; Xinyu Yang; Jiaying Zhu; Wenya Wang; | arxiv-cs.CL | 2024-05-21 |
313 | How Reliable AI Chatbots Are for Disease Prediction from Patient Complaints? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Artificial Intelligence (AI) chatbots leveraging Large Language Models (LLMs) are gaining traction in healthcare for their potential to automate patient interactions and aid clinical decision-making. |
Ayesha Siddika Nipu; K M Sajjadul Islam; Praveen Madiraju; | arxiv-cs.AI | 2024-05-21 |
314 | Generative AI and Large Language Models for Cyber Security: All Insights You Need Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper provides a comprehensive review of the future of cybersecurity through Generative AI and Large Language Models (LLMs). |
MOHAMED AMINE FERRAG et. al. | arxiv-cs.CR | 2024-05-21 |
315 | Transformer in Touch: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This review aims to comprehensively outline the application and development of Transformers in tactile technology. |
Jing Gao; Ning Cheng; Bin Fang; Wenjuan Han; | arxiv-cs.LG | 2024-05-21 |
316 | Evaluating and Modeling Social Intelligence: A Comparative Study of Human and AI Capabilities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We developed a comprehensive theoretical framework for social dynamics and introduced two evaluation tasks: Inverse Reasoning (IR) and Inverse Inverse Planning (IIP). |
JUNQI WANG et. al. | arxiv-cs.AI | 2024-05-20 |
317 | Fennec: Fine-grained Language Model Evaluation and Correction Extended Through Branching and Bridging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Particularly, we present a step-by-step evaluation framework, Fennec, capable of Fine-grained EvaluatioN and correctioN Extended through branChing and bridging. |
XIAOBO LIANG et. al. | arxiv-cs.CL | 2024-05-20 |
318 | Automated Hardware Logic Obfuscation Framework Using GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Obfus-chat, a novel framework leveraging Generative Pre-trained Transformer (GPT) models to automate the obfuscation process. |
BANAFSHEH SABER LATIBARI et. al. | arxiv-cs.CR | 2024-05-20 |
319 | DaVinci at SemEval-2024 Task 9: Few-shot Prompting GPT-3.5 for Unconventional Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we tackle both types of questions using few-shot prompting on GPT-3.5 and gain insights regarding the difference in the nature of the two types. |
Suyash Vardhan Mathur; Akshett Rai Jindal; Manish Shrivastava; | arxiv-cs.CL | 2024-05-19 |
320 | Zero-Shot Stance Detection Using Contextual Data Generation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this approach, we aim to fine-tune an existing model at test time. |
Ghazaleh Mahmoudi; Babak Behkamkia; Sauleh Eetemadi; | arxiv-cs.CL | 2024-05-19 |
321 | Enhancing User Experience in Large Language Models Through Human-centered Design: Integrating Theoretical Insights with An Experimental Study to Meet Diverse Software Learning Needs with A Single Document Knowledge Base Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The experimental results demonstrate how the form and organization of different elements in the document, as well as the relevant GPT configurations, affect the effectiveness of interaction between GPT and software learners. |
Yuchen Wang; Yin-Shan Lin; Ruixin Huang; Jinyin Wang; Sensen Liu; | arxiv-cs.HC | 2024-05-19 |
322 | Can Public LLMs Be Used for Self-Diagnosis of Medical Conditions? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we prepare a prompt engineered dataset of 10000 samples and test the performance on the general task of self-diagnosis. |
Nikil Sharan Prabahar Balasubramanian; Sagnik Dakshit; | arxiv-cs.CL | 2024-05-18 |
323 | Transformer Based Neural Networks for Emotion Recognition in Conversations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper outlines the approach of the ISDS-NLP team in the SemEval 2024 Task 10: Emotion Discovery and Reasoning its Flip in Conversation (EDiReF). |
Claudiu Creanga; Liviu P. Dinu; | arxiv-cs.CL | 2024-05-18 |
324 | Few-Shot API Attack Detection: Overcoming Data Scarcity with GAN-Inspired Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel few-shot detection approach motivated by Natural Language Processing (NLP) and advanced Generative Adversarial Network (GAN)-inspired techniques. |
Udi Aharon; Revital Marbel; Ran Dubin; Amit Dvir; Chen Hajaj; | arxiv-cs.CR | 2024-05-18 |
325 | Language Models Can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates whether LLMs can generalize from labeled examples of predefined tasks to novel tasks. |
Anwoy Chatterjee; Eshaan Tanwar; Subhabrata Dutta; Tanmoy Chakraborty; | arxiv-cs.CL | 2024-05-17 |
326 | Benchmarking Large Language Models on CFLUE — A Chinese Financial Language Understanding Evaluation Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose CFLUE, the Chinese Financial Language Understanding Evaluation benchmark, designed to assess the capability of LLMs across various dimensions. |
Jie Zhu; Junhui Li; Yalong Wen; Lifan Guo; | arxiv-cs.CL | 2024-05-17 |
327 | GPTs Window Shopping: An Analysis of The Landscape of Custom ChatGPT Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Customization comes in the form of prompt-tuning, analysis of reference resources, browsing, and external API interactions, alongside a promise of revenue sharing for created custom GPTs. In this work, we peer into the window of the GPT Store and measure its impact. |
Benjamin Zi Hao Zhao; Muhammad Ikram; Mohamed Ali Kaafar; | arxiv-cs.SI | 2024-05-17 |
328 | GPT Store Mining and Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our findings aim to enhance understanding of the GPT ecosystem, providing valuable insights for future research, development, and policy-making in generative AI. |
Dongxun Su; Yanjie Zhao; Xinyi Hou; Shenao Wang; Haoyu Wang; | arxiv-cs.LG | 2024-05-16 |
329 | HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Identifying optimal model configurations under specific hardware constraints is becoming essential but remains challenging due to the computational load of exhaustive training and evaluation on multiple devices. To address this, we introduce HW-GPT-Bench, a hardware-aware benchmark that utilizes surrogate predictions to approximate various hardware metrics across 13 devices for architectures in the GPT-2 family, with architectures containing up to 774M parameters. |
RHEA SANJAY SUKTHANKER et. al. | arxiv-cs.LG | 2024-05-16 |
330 | Many-Shot In-Context Learning in Multimodal Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we evaluate the performance of multimodal foundation models scaling from few-shot to many-shot ICL. |
YIXING JIANG et. al. | arxiv-cs.LG | 2024-05-16 |
331 | FinTextQA: A Dataset for Long-form Financial Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces FinTextQA, a novel dataset for long-form question answering (LFQA) in finance. |
JIAN CHEN et. al. | arxiv-cs.CL | 2024-05-16 |
332 | Leveraging Large Language Models for Automated Web-Form-Test Generation: An Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To the best of our knowledge, no comparative study examining different LLMs has yet been reported for web-form-test generation. |
TAO LI et. al. | arxiv-cs.SE | 2024-05-16 |
333 | Comparing The Efficacy of GPT-4 and Chat-GPT in Mental Health Care: A Blind Assessment of Large Language Models for Psychological Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: This study aims to compare the performance of two large language models, GPT-4 and Chat-GPT, in responding to a set of 18 psychological prompts, to assess their potential applicability in mental health care settings. |
Birger Moell; | arxiv-cs.CL | 2024-05-15 |
334 | Optimization Techniques for Sentiment Analysis Based on LLM (GPT-3) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to explore sentiment analysis optimization techniques based on large pre-trained language models such as GPT-3 to improve model performance and effect and further promote the development of natural language processing (NLP). |
Tong Zhan; Chenxi Shi; Yadong Shi; Huixiang Li; Yiyu Lin; | arxiv-cs.CL | 2024-05-15 |
335 | ALPINE: Unveiling The Planning Capability of Autoregressive Learning in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present the findings of our Project ALPINE, which stands for “Autoregressive Learning for Planning In NEtworks.” |
SIWEI WANG et. al. | arxiv-cs.LG | 2024-05-15 |
336 | GPT-3.5 for Grammatical Error Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the application of GPT-3.5 for Grammatical Error Correction (GEC) in multiple languages in several settings: zero-shot GEC, fine-tuning for GEC, and using GPT-3.5 to re-rank correction hypotheses generated by other GEC models. |
Anisia Katinskaia; Roman Yangarber; | arxiv-cs.CL | 2024-05-14 |
337 | Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a theoretical framework that sheds light on the memorization process and performance dynamics of transformer-based language models. |
Xueyan Niu; Bo Bai; Lei Deng; Wei Han; | arxiv-cs.LG | 2024-05-14 |
338 | EVDA: Evolving Deepfake Audio Detection Continual Learning Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Continual learning, which acts as an effective tool for detecting newly emerged deepfake audio while maintaining performance on older types, lacks a well-constructed and user-friendly evaluation framework. To address this gap, we introduce EVDA, a benchmark for evaluating continual learning methods in deepfake audio detection. |
Xiaohui Zhang; Jiangyan Yi; Jianhua Tao; | arxiv-cs.SD | 2024-05-14 |
339 | Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work describes a concurrent programming framework for quantitatively analyzing the efficiency challenges in serving multiple long-context requests under limited size of GPU high-bandwidth memory (HBM) regime. |
Yao Fu; | arxiv-cs.LG | 2024-05-14 |
340 | Can GNN Be Good Adapter for LLMs? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, they also ignore the zero-shot inference capabilities of LLMs. Therefore, we propose GraphAdapter, which uses a graph neural network (GNN) as an efficient adapter in collaboration with LLMs to tackle TAGs. |
XUANWEN HUANG et. al. | www | 2024-05-13 |
341 | Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their capabilities in turning visual figures into executable code have not been thoroughly evaluated. To address this, we introduce Plot2Code, a comprehensive visual coding benchmark designed for a fair and in-depth assessment of MLLMs. |
CHENGYUE WU et. al. | arxiv-cs.CL | 2024-05-13 |
342 | Open-vocabulary Auditory Neural Decoding Using FMRI-prompted LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel method, the Brain Prompt GPT (BP-GPT). |
Xiaoyu Chen; Changde Du; Che Liu; Yizhe Wang; Huiguang He; | arxiv-cs.HC | 2024-05-13 |
343 | MacBehaviour: An R Package for Behavioural Experimentation on Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The package offers a comprehensive set of functions designed for LLM experiments, covering experiment design, stimuli presentation, model behaviour manipulation, logging response and token probability. |
Xufeng Duan; Shixuan Li; Zhenguang G. Cai; | arxiv-cs.CL | 2024-05-13 |
344 | Hierarchical Decision Mamba Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce two novel methods, Decision Mamba (DM) and Hierarchical Decision Mamba (HDM), aimed at enhancing the performance of the Transformer models. |
André Correia; Luís A. Alexandre; | arxiv-cs.LG | 2024-05-13 |
345 | COLA: Cross-city Mobility Transformer for Human Trajectory Simulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we are motivated to explore the intriguing problem of mobility transfer across cities, grasping the universal patterns of human trajectories to augment the powerful Transformer with external mobility data. |
Yu Wang; Tongya Zheng; Yuxuan Liang; Shunyu Liu; Mingli Song; | www | 2024-05-13 |
346 | Coding Historical Causes of Death Data with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates the feasibility of using pre-trained generative Large Language Models (LLMs) to automate the assignment of ICD-10 codes to historical causes of death. |
BJØRN PEDERSEN et. al. | arxiv-cs.LG | 2024-05-13 |
347 | Limited Ability of LLMs to Simulate Human Psychological Behaviours: A Psychometric Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we prompt OpenAI’s flagship models, GPT-3.5 and GPT-4, to assume different personas and respond to a range of standardized measures of personality constructs. |
Nikolay B Petrov; Gregory Serapio-García; Jason Rentfrow; | arxiv-cs.CL | 2024-05-12 |
348 | Can Language Models Explain Their Own Classification Behavior? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates whether LLMs can give faithful high-level explanations of their own internal processes. To explore this, we introduce a dataset, ArticulateRules, of few-shot text-based classification tasks generated by simple rules. |
Dane Sherburn; Bilal Chughtai; Owain Evans; | arxiv-cs.LG | 2024-05-12 |
349 | L(u)PIN: LLM-based Political Ideology Nowcasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a method to analyze ideological positions of individual parliamentary representatives by leveraging the latent knowledge of LLMs. |
Ken Kato; Annabelle Purnomo; Christopher Cochrane; Raeid Saqur; | arxiv-cs.CL | 2024-05-12 |
350 | Evaluating Task-based Effectiveness of MLLMs on Charts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore a forward-thinking question: Is GPT-4V effective at low-level data analysis tasks on charts? |
Yifan Wu; Lutao Yan; Yuyu Luo; Yunhai Wang; Nan Tang; | arxiv-cs.CL | 2024-05-11 |
351 | Quite Good, But Not Enough: Nationality Bias in Large Language Models — A Case Study of ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The research covers 195 countries, 4 temperature settings, and 3 distinct prompt types, generating 4,680 discourses about nationality descriptions in Chinese and English. |
Shucheng Zhu; Weikang Wang; Ying Liu; | arxiv-cs.CL | 2024-05-11 |
352 | Retrieval Enhanced Zero-Shot Video Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to take advantage of existing pre-trained large-scale vision and language models to directly generate captions with test time adaptation. |
YUNCHUAN MA et. al. | arxiv-cs.CV | 2024-05-11 |
353 | Automating Code Adaptation for MLOps — A Benchmarking Study on LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the possibilities of the current generation of Large Language Models for incorporating Machine Learning Operations (MLOps) functionalities into ML training code bases. |
HARSH PATEL et. al. | arxiv-cs.LG | 2024-05-10 |
354 | ChatGPTest: Opportunities and Cautionary Tales of Utilizing AI for Questionnaire Pretesting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article explores the use of GPT models as a useful tool for pretesting survey questionnaires, particularly in the early stages of survey design. |
Francisco Olivos; Minhui Liu; | arxiv-cs.CY | 2024-05-10 |
355 | Residual-based Attention Physics-informed Neural Networks for Efficient Spatio-Temporal Lifetime Assessment of Transformers Operated in Renewable Power Plants Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article introduces an efficient spatio-temporal model for transformer winding temperature and ageing estimation, which leverages physics-based partial differential equations (PDEs) with data-driven Neural Networks (NN) in a Physics Informed Neural Networks (PINNs) configuration to improve prediction accuracy and acquire spatio-temporal resolution. |
IBAI RAMIREZ et. al. | arxiv-cs.LG | 2024-05-10 |
356 | A Lightweight Transformer for Remote Sensing Image Change Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing transformer-based RSICC methods face challenges, e.g., high parameter counts and high computational complexity caused by the self-attention operation in the transformer encoder component. To alleviate these issues, this paper proposes a Sparse Focus Transformer (SFT) for the RSICC task. |
Dongwei Sun; Yajie Bao; Xiangyong Cao; | arxiv-cs.CV | 2024-05-10 |
357 | TacoERE: Cluster-aware Compression for Event Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing work mainly focuses on directly modeling the entire document, which cannot effectively handle long-range dependencies and information redundancy. To address these issues, we propose a cluster-aware compression method for improving event relation extraction (TacoERE), which explores a compression-then-extraction paradigm. |
YONG GUAN et. al. | arxiv-cs.CL | 2024-05-10 |
358 | Multimodal LLMs Struggle with Basic Visual Network Analysis: A VNA Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that while GPT-4 consistently outperforms LLaVa, both models struggle with every visual network analysis task we propose. |
Evan M. Williams; Kathleen M. Carley; | arxiv-cs.CV | 2024-05-10 |
359 | Reddit-Impacts: A Named Entity Recognition Dataset for Analyzing Clinical and Social Effects of Substance Use Derived from Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Reddit-Impacts, a challenging Named Entity Recognition (NER) dataset curated from subreddits dedicated to discussions on prescription and illicit opioids, as well as medications for opioid use disorder. |
YAO GE et. al. | arxiv-cs.CL | 2024-05-09 |
360 | Large Language Models Show Human-like Social Desirability Biases in Survey Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We developed an experimental framework using Big Five personality surveys and uncovered a previously undetected social desirability bias in a wide range of LLMs. |
AADESH SALECHA et. al. | arxiv-cs.AI | 2024-05-09 |
361 | From Human Judgements to Predictive Models: Unravelling Acceptability in Code-Mixed Sentences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we construct Cline – a dataset containing human acceptability judgements for English-Hindi (en-hi) code-mixed text. |
PRASHANT KODALI et. al. | arxiv-cs.CL | 2024-05-09 |
362 | Ditto: Quantization-aware Secure Inference of Transformers Upon MPC Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the integration of quantization widely used in plaintext inference into the MPC domain remains unclear. To bridge this gap, we propose the framework named Ditto to enable more efficient quantization-aware secure Transformer inference. |
HAOQI WU et. al. | arxiv-cs.CR | 2024-05-08 |
363 | Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we leverage a decision transformer architecture for topic recommendation in counseling conversations between patients and mental health professionals. |
Aylin Gunal; Baihan Lin; Djallel Bouneffouf; | arxiv-cs.CL | 2024-05-08 |
364 | Open Source Language Models Can Provide Feedback: Evaluating LLMs’ Ability to Help Students Using GPT-4-As-A-Judge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by recent work that has utilised very powerful LLMs, such as GPT-4, to evaluate the outputs produced by less powerful models, we conduct an automated analysis of the quality of the feedback produced by several open source models using a dataset from an introductory programming course. |
CHARLES KOUTCHEME et. al. | arxiv-cs.CL | 2024-05-08 |
365 | Evaluating Text Summaries Generated By Large Language Models Using OpenAI’s GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research examines the effectiveness of OpenAI’s GPT models as independent evaluators of text summaries generated by six transformer-based models from Hugging Face: DistilBART, BERT, ProphetNet, T5, BART, and PEGASUS. |
Hassan Shakil; Atqiya Munawara Mahi; Phuoc Nguyen; Zeydy Ortiz; Mamoun T. Mardini; | arxiv-cs.CL | 2024-05-07 |
366 | Utilizing GPT to Enhance Text Summarization: A Strategy to Minimize Hallucinations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we use the DistilBERT model to generate extractive summaries and the T5 model to generate abstractive summaries. |
Hassan Shakil; Zeydy Ortiz; Grant C. Forbes; | arxiv-cs.CL | 2024-05-07 |
367 | GPT-Enabled Cybersecurity Training: A Tailored Approach for Effective Awareness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the limitations of traditional Cybersecurity Awareness and Training (CSAT) programs and proposes an innovative solution using Generative Pre-Trained Transformers (GPT) to address these shortcomings. |
Nabil Al-Dhamari; Nathan Clarke; | arxiv-cs.CR | 2024-05-07 |
368 | How Does GPT-2 Predict Acronyms? Extracting and Understanding A Circuit Via Mechanistic Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on understanding how GPT-2 Small performs the task of predicting three-letter acronyms. |
Jorge García-Carrasco; Alejandro Maté; Juan Trujillo; | arxiv-cs.LG | 2024-05-07 |
369 | Structured Click Control in Transformer-based Interactive Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To improve the robustness of the response, we propose a structured click intent model based on graph neural networks, which adaptively obtains graph nodes via the global similarity of user-clicked Transformer tokens. |
Long Xu; Yongquan Chen; Rui Huang; Feng Wu; Shiwu Lai; | arxiv-cs.CV | 2024-05-07 |
370 | The Silicon Ceiling: Auditing GPT’s Race and Gender Biases in Hiring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) are increasingly being introduced in workplace settings, with the goals of improving efficiency and fairness. |
Lena Armstrong; Abbey Liu; Stephen MacNeil; Danaë Metaxa; | arxiv-cs.CY | 2024-05-07 |
371 | A Transformer with Stack Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite underpinning remarkably capable large language models, transformers are unable to model many context-free language tasks. In an attempt to address this limitation in the modeling power of transformer-based language models, we propose augmenting them with a differentiable, stack-based attention mechanism. |
Jiaoda Li; Jennifer C. White; Mrinmaya Sachan; Ryan Cotterell; | arxiv-cs.CL | 2024-05-07 |
372 | Anchored Answers: Unravelling Positional Bias in GPT-2’s Multiple-Choice Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This anchored bias challenges the integrity of GPT-2’s decision-making process, as it skews performance based on the position rather than the content of the choices in MCQs. In this study, we utilise the mechanistic interpretability approach to identify the internal modules within GPT-2 models responsible for this bias. |
Ruizhe Li; Yanjun Gao; | arxiv-cs.CL | 2024-05-06 |
373 | MAmmoTH2: Scaling Instructions from The Web IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a paradigm to efficiently harvest 10 million naturally existing instruction data from the pre-training web corpus to enhance LLM reasoning. |
Xiang Yue; Tuney Zheng; Ge Zhang; Wenhu Chen; | arxiv-cs.CL | 2024-05-06 |
374 | Towards A Human-in-the-Loop LLM Approach to Collaborative Discourse Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, LLMs have not yet been used to characterize synergistic learning in students’ collaborative discourse. In this exploratory work, we take a first step towards adopting a human-in-the-loop prompt engineering approach with GPT-4-Turbo to summarize and categorize students’ synergistic learning during collaborative discourse. |
Clayton Cohn; Caitlin Snyder; Justin Montenegro; Gautam Biswas; | arxiv-cs.CL | 2024-05-06 |
375 | Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite their widespread occurrence and potential impacts, our understanding of influence campaigns is limited by manual analysis of messages and subjective interpretation of their observable behavior. In this paper, we explore whether these limitations can be mitigated with large language models (LLMs), using GPT-3.5 as a case-study for coordinated campaign annotation. |
Keith Burghardt; Kai Chen; Kristina Lerman; | arxiv-cs.CL | 2024-05-06 |
376 | Detecting Anti-Semitic Hate Speech Using Transformer-based Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we developed a new data labeling technique and established a proof of concept targeting anti-Semitic hate speech, utilizing a variety of transformer models such as BERT (arXiv:1810.04805), DistilBERT (arXiv:1910.01108), RoBERTa (arXiv:1907.11692), and LLaMA-2 (arXiv:2307.09288), complemented by the LoRA fine-tuning approach (arXiv:2106.09685). |
Dengyi Liu; Minghao Wang; Andrew G. Catlin; | arxiv-cs.CL | 2024-05-06 |
377 | Unraveling The Dominance of Large Language Models Over Transformer Models for Bangla Natural Language Inference: A Comprehensive Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study addresses the underexplored area of evaluating LLMs in low-resourced languages such as Bengali. |
Fatema Tuj Johora Faria; Mukaffi Bin Moin; Asif Iftekher Fahim; Pronay Debnath; Faisal Muhammad Shah; | arxiv-cs.CL | 2024-05-05 |
378 | Traffic Performance GPT (TP-GPT): Real-Time Data Informed Intelligent ChatBot for Transportation Surveillance and Management Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, real-time traffic data access is typically limited due to privacy concerns. To bridge this gap, the integration of Large Language Models (LLMs) into the domain of traffic management presents a transformative approach to addressing the complexities and challenges inherent in modern transportation systems. |
Bingzhang Wang; Muhammad Monjurul Karim; Chenxi Liu; Yinhai Wang; | arxiv-cs.MA | 2024-05-05 |
379 | Can Large Language Models Make The Grade? An Empirical Study Evaluating LLMs Ability to Mark Short Answer Questions in K-12 Education Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper reports on a series of experiments with a novel dataset evaluating how well Large Language Models (LLMs) can mark (i.e. grade) open-text responses to short answer questions. Specifically, we explore how well different combinations of GPT version and prompt engineering strategies performed at marking real student answers to short answer questions across different domain areas (Science and History) and grade levels (spanning ages 5-16), using a new, never-used-before dataset from Carousel, a quizzing platform. |
Owen Henkel; Adam Boxer; Libby Hills; Bill Roberts; | arxiv-cs.CL | 2024-05-05 |
380 | A Combination of BERT and Transformer for Vietnamese Spelling Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, to our knowledge, there is no implementation in Vietnamese yet. Therefore, in this study, a combination of the Transformer architecture (the state-of-the-art encoder-decoder model) and BERT was proposed to deal with Vietnamese spelling correction. |
Hieu Ngo Trung; Duong Tran Ham; Tin Huynh; Kiem Hoang; | arxiv-cs.CL | 2024-05-04 |
381 | U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on self-attention with downsampled tokens, we propose a series of U-shaped DiTs (U-DiTs) in the paper and conduct extensive experiments to demonstrate the extraordinary performance of U-DiT models. |
YUCHUAN TIAN et. al. | arxiv-cs.CV | 2024-05-04 |
382 | REASONS: A Benchmark for REtrieval and Automated CitationS Of ScieNtific Sentences Using Public and Proprietary LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we investigate whether large language models (LLMs) are capable of generating references based on two forms of sentence queries: (a) Direct Queries, where LLMs are asked to provide the author names of a given research article, and (b) Indirect Queries, where LLMs are asked to provide the title of a mentioned article when given a sentence from a different article. |
DEEPA TILWANI et. al. | arxiv-cs.CL | 2024-05-03 |
383 | Analyzing Narrative Processing in Large Language Models (LLMs): Using GPT4 to Test BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the present study, we have performed the first steps to use LLMs as a model to understand fundamental mechanisms of language processing in neural networks, in order to make predictions and generate hypotheses on how the human brain does language processing. |
PATRICK KRAUSS et. al. | arxiv-cs.CL | 2024-05-03 |
384 | Exploring Combinatorial Problem Solving with Large Language Models: A Case Study on The Travelling Salesman Problem Using GPT-3.5 Turbo Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we investigate the potential of LLMs to solve the Travelling Salesman Problem (TSP). |
Mahmoud Masoud; Ahmed Abdelhay; Mohammed Elhenawy; | arxiv-cs.CL | 2024-05-03 |
385 | Structural Pruning of Pre-trained Language Models Via Neural Architecture Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike traditional pruning methods with fixed thresholds, we propose to adopt a multi-objective approach that identifies the Pareto optimal set of sub-networks, allowing for a more flexible and automated compression process. |
Aaron Klein; Jacek Golebiowski; Xingchen Ma; Valerio Perrone; Cedric Archambeau; | arxiv-cs.LG | 2024-05-03 |
386 | Investigating Wit, Creativity, and Detectability of Large Language Models in Domain-Specific Writing Style Adaptation of Reddit’s Showerthoughts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recent Large Language Models (LLMs) have shown the ability to generate content that is difficult or impossible to distinguish from human writing. |
TOLGA BUZ et. al. | arxiv-cs.CL | 2024-05-02 |
387 | The Effectiveness of LLMs As Annotators: A Comparative Overview and Empirical Analysis of Direct Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper provides a comparative overview of twelve studies investigating the potential of LLMs in labelling data. |
Maja Pavlovic; Massimo Poesio; | arxiv-cs.CL | 2024-05-02 |
388 | GroupedMixer: An Entropy Model with Group-wise Token-Mixers for Learned Image Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel transformer-based entropy model called GroupedMixer, which enjoys both faster coding speed and better compression performance than previous transformer-based methods. |
DAXIN LI et. al. | arxiv-cs.CV | 2024-05-02 |
389 | UQA: Corpus for Urdu Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces UQA, a novel dataset for question answering and text comprehension in Urdu, a low-resource language with over 70 million native speakers. |
Samee Arif; Sualeha Farid; Awais Athar; Agha Ali Raza; | arxiv-cs.CL | 2024-05-02 |
390 | Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, they do not possess the ability to evaluate based on custom evaluation criteria, focusing instead on general attributes like helpfulness and harmlessness. To address these issues, we introduce Prometheus 2, a more powerful evaluator LM than its predecessor that closely mirrors human and GPT-4 judgements. |
SEUNGONE KIM et. al. | arxiv-cs.CL | 2024-05-02 |
391 | A Named Entity Recognition and Topic Modeling-based Solution for Locating and Better Assessment of Natural Disasters in Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fully explore the potential of social media content in disaster informatics, access to relevant content and correct geo-location information is critical. In this paper, we propose a three-step solution to tackle these challenges. |
Ayaz Mehmood; Muhammad Tayyab Zamir; Muhammad Asif Ayub; Nasir Ahmad; Kashif Ahmad; | arxiv-cs.CL | 2024-05-01 |
392 | Vision Transformer: To Discover The Four Secrets of Image Patches Related Papers Related Patents Related Grants Related Venues Related Experts View |
TAO ZHOU et. al. | Inf. Fusion | 2024-05-01 |
393 | Chat-GPT; Validating Technology Acceptance Model (TAM) in Education Sector Via Ubiquitous Learning Mechanism Related Papers Related Patents Related Grants Related Venues Related Experts View |
N. SAIF et. al. | Comput. Hum. Behav. | 2024-05-01 |
394 | GRAformer: A Gated Residual Attention Transformer for Multivariate Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Chengcao Yang; Yutian Wang; Binghao Yang; Jun Chen; | Neurocomputing | 2024-05-01 |
395 | Hierarchical Vector Transformer Vehicle Trajectories Prediction with Diffusion Convolutional Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yingjuan Tang; Hongwen He; Yong Wang; | Neurocomputing | 2024-05-01 |
396 | Enhanced Visible-infrared Person Re-identification Based on Cross-attention Multiscale Residual Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Prodip Kumar Sarker; Qingjie Zhao; | Pattern Recognit. | 2024-05-01 |
397 | How Can I Get It Right? Using GPT to Rephrase Incorrect Trainee Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the recent advancements of large language models (LLMs), our study employed the GPT-4 model to build an explanatory feedback system. |
JIONGHAO LIN et. al. | arxiv-cs.CL | 2024-05-01 |
398 | A Novel Approach for Rumor Detection in Social Platforms: Memory-augmented Transformer with Graph Convolutional Networks Related Papers Related Patents Related Grants Related Venues Related Experts View |
Qian Chang; Xia Li; Zhao Duan; | Knowl. Based Syst. | 2024-05-01 |
399 | Learning Multiple Attention Transformer Super-resolution Method for Grape Disease Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
Haibin Jin; Xiaoquan Chu; Jianfang Qi; Jianying Feng; Weisong Mu; | Expert Syst. Appl. | 2024-05-01 |
400 | How Can I Improve? Using GPT to Highlight The Desired and Undesired Parts of Open-ended Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our aim is to equip tutors with actionable, explanatory feedback during online training lessons. |
JIONGHAO LIN et. al. | arxiv-cs.CL | 2024-04-30 |
401 | Do Large Language Models Understand Conversational Implicature — A Case Study with A Chinese Sitcom Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce SwordsmanImp, the first Chinese multi-turn-dialogue-based dataset aimed at conversational implicature, sourced from dialogues in the Chinese sitcom $\textit{My Own Swordsman}$. |
Shisen Yue; Siyuan Song; Xinyuan Cheng; Hai Hu; | arxiv-cs.CL | 2024-04-30 |
402 | RSCaMa: Remote Sensing Image Change Captioning with State Space Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite previous methods progressing in the spatial change perception, there are still weaknesses in joint spatial-temporal modeling. To address this, in this paper, we propose a novel RSCaMa model, which achieves efficient joint spatial-temporal modeling through multiple CaMa layers, enabling iterative refinement of bi-temporal features. |
CHENYANG LIU et. al. | arxiv-cs.CV | 2024-04-29 |
403 | Can GPT-4 Do L2 Analytic Assessment? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we perform a series of experiments using GPT-4 in a zero-shot fashion on a publicly available dataset annotated with holistic scores based on the Common European Framework of Reference and aim to extract detailed information about their underlying analytic components. |
Stefano Bannò; Hari Krishna Vydana; Kate M. Knill; Mark J. F. Gales; | arxiv-cs.CL | 2024-04-29 |
404 | Ethical Reasoning and Moral Value Alignment of LLMs Depend on The Language We Prompt Them in Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, moral values are not universal, but rather influenced by language and culture. This paper explores how three prominent LLMs — GPT-4, ChatGPT, and Llama2-70B-Chat — perform ethical reasoning in different languages and whether their moral judgement depends on the language in which they are prompted. |
Utkarsh Agarwal; Kumar Tanmay; Aditi Khandelwal; Monojit Choudhury; | arxiv-cs.CL | 2024-04-29 |
405 | Time Machine GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new approach: a series of point-in-time LLMs called Time Machine GPT (TiMaGPT), specifically designed to be nonprognosticative. |
Felix Drinkall; Eghbal Rahimikia; Janet B. Pierrehumbert; Stefan Zohren; | arxiv-cs.CL | 2024-04-29 |
406 | GPT-4 Passes Most of The 297 Written Polish Board Certification Examinations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Methods: We developed a software program to download and process PES exams and tested the performance of GPT models using the OpenAI Application Programming Interface. |
Jakub Pokrywka; Jeremi Kaczmarek; Edward Gorzelańczyk; | arxiv-cs.CL | 2024-04-29 |
407 | Can Perplexity Predict Fine-Tuning Performance? An Investigation of Tokenization Effects on Sequential Language Models for Nepali Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, studies of how subwording affects the understanding capacity of language models are few and limited to a handful of languages. To reduce this gap, we used 6 different tokenization schemes to pretrain relatively small language models in Nepali and used the representations learned to finetune on several downstream tasks. |
Nishant Luitel; Nirajan Bekoju; Anand Kumar Sah; Subarna Shakya; | arxiv-cs.CL | 2024-04-28 |
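For reference on entry 407: perplexity is the exponentiated average negative token log-likelihood, and comparisons across tokenization schemes are only meaningful when renormalized by a tokenizer-independent unit such as characters. A minimal sketch, assuming per-token log-probabilities are already available:

```python
import math

def perplexity(token_logprobs):
    """exp of the average negative log-probability per token."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def char_perplexity(token_logprobs, num_chars):
    """Renormalize per character so different tokenizers are comparable."""
    return math.exp(-sum(token_logprobs) / num_chars)

print(round(perplexity([-1.2, -0.4, -2.0]), 2))  # 3.32
```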
408 | PatentGPT: A Large Language Model for Intellectual Property Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the application of large language models in the Intellectual Property (IP) domain is challenging due to the strong need for specialized knowledge, privacy protection, and the processing of extremely long text in this field. In this technical report, we present for the first time a low-cost, standardized procedure for training IP-oriented LLMs, meeting the unique requirements of the IP domain. |
ZILONG BAI et. al. | arxiv-cs.CL | 2024-04-28 |
409 | CLFT: Camera-LiDAR Fusion Transformer for Semantic Segmentation in Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, the vision transformer is the novel ground-breaker that successfully brought the multi-head-attention mechanism to computer vision applications. Therefore, we propose a vision-transformer-based network to carry out camera-LiDAR fusion for semantic segmentation applied to autonomous driving. |
Junyi Gu; Mauro Bellone; Tomáš Pivoňka; Raivo Sell; | arxiv-cs.CV | 2024-04-27 |
410 | GPT for Games: A Scoping Review (2020-2023) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a scoping review of 55 articles to explore GPT’s potential for games, offering researchers a comprehensive understanding of the current applications and identifying both emerging trends and unexplored areas. |
Daijin Yang; Erica Kleinman; Casper Harteveld; | arxiv-cs.HC | 2024-04-27 |
411 | MRScore: Evaluating Radiology Report Generation with LLM-based Reward System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces MRScore, an automatic evaluation metric tailored for radiology report generation by leveraging Large Language Models (LLMs). |
YUNYI LIU et. al. | arxiv-cs.CL | 2024-04-27 |
412 | Detection of Conspiracy Theories Beyond Keyword Bias in German-Language Telegram Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our work addresses the task of detecting conspiracy theories in German Telegram messages. |
Milena Pustet; Elisabeth Steffen; Helena Mihaljević; | arxiv-cs.CL | 2024-04-27 |
413 | Evaluation of Few-Shot Learning for Classification Tasks in The Polish Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a few-shot benchmark consisting of 7 different classification tasks native to the Polish language. |
Tsimur Hadeliya; Dariusz Kajtoch; | arxiv-cs.CL | 2024-04-27 |
414 | CRISPR-GPT: An LLM Agent for Automated Design of Gene-Editing Experiments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce CRISPR-GPT, an LLM agent augmented with domain knowledge and external tools to automate and enhance the design process of CRISPR-based gene-editing experiments. |
KAIXUAN HUANG et. al. | arxiv-cs.AI | 2024-04-27 |
415 | Quantifying Memorization of Domain-Specific Pre-trained Language Models Using Japanese Newspaper and Paywalls Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Especially, few studies focus on domain-specific PLM. In this study, we pre-trained domain-specific GPT-2 models using a limited corpus of Japanese newspaper articles and quantified memorization of training data by comparing them with general Japanese GPT-2 models. |
Shotaro Ishihara; | arxiv-cs.CL | 2024-04-26 |
416 | ChatGPT Is Here to Help, Not to Replace Anybody — An Evaluation of Students’ Opinions On Integrating ChatGPT In CS Courses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, 52 first-year CS students were surveyed in order to assess their views on technologies with code-generation capabilities, both from academic and professional perspectives. |
Bruno Pereira Cipriano; Pedro Alves; | arxiv-cs.ET | 2024-04-26 |
417 | Enhancing Legal Compliance and Regulation Analysis with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With Industry 4.0 revolutionizing the food industry and with the General Data Protection Regulation (GDPR) reshaping privacy policies and data processing agreements, there is a growing gap between regulatory analysis and recent technological advancements. This study aims to bridge this gap by leveraging LLMs, namely BERT and GPT models, to accurately classify legal provisions and automate compliance checks. |
Shabnam Hassani; | arxiv-cs.SE | 2024-04-26 |
418 | Prompting Towards Alleviating Code-Switched Data Scarcity in Under-Resourced Languages with GPT As A Pivot Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: There is therefore a notable opportunity to refine prompting guidelines to yield sentences suitable for the fine-tuning of language models. We propose a framework for augmenting the diversity of synthetically generated code-switched data using GPT and propose leveraging this technology to mitigate data scarcity in low-resourced languages, underscoring the essential role of native speakers in this process. |
Michelle Terblanche; Kayode Olaleye; Vukosi Marivate; | arxiv-cs.CL | 2024-04-26 |
419 | UMass-BioNLP at MEDIQA-M3G 2024: DermPrompt — A Systematic Exploration of Prompt Engineering with GPT-4V for Dermatological Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a novel approach to diagnosing clinical dermatology cases by integrating large multimodal models, specifically leveraging the capabilities of GPT-4V under a retriever and a re-ranker framework. |
PARTH VASHISHT et. al. | arxiv-cs.AI | 2024-04-26 |
420 | TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present TinyChart, an efficient MLLM for chart understanding with only 3B parameters. |
LIANG ZHANG et. al. | arxiv-cs.CV | 2024-04-25 |
421 | Influence of Solution Efficiency and Valence of Instruction on Additive and Subtractive Solution Strategies in Humans and GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explored the addition bias, a cognitive tendency to prefer adding elements over removing them to alter an initial state or structure, by conducting four preregistered experiments examining the problem-solving behavior of both humans and OpenAI’s GPT-4 large language model. |
Lydia Uhler; Verena Jordan; Jürgen Buder; Markus Huff; Frank Papenmeier; | arxiv-cs.CL | 2024-04-25 |
422 | Exploring Internal Numeracy in Language Models: A Case Study on ALBERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It has been found that Transformer-based language models have the ability to perform basic quantitative reasoning. In this paper, we propose a method for studying how these models internally represent numerical data, and use our proposal to analyze the ALBERT family of language models. |
Ulme Wennberg; Gustav Eje Henter; | arxiv-cs.CL | 2024-04-25 |
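A toy version of the kind of probing entry 422 describes (illustrative only, not the authors' method) is to fit a linear read-out from number-word embeddings to their numeric values and check whether it extrapolates to held-out numbers. This sketch assumes the Hugging Face `transformers` library and the public `albert-base-v2` checkpoint:

```python
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("albert-base-v2")
emb = AutoModel.from_pretrained("albert-base-v2").get_input_embeddings()

def vec(word):
    """Mean input-embedding of a word's subword tokens."""
    ids = tok(word, add_special_tokens=False)["input_ids"]
    return emb(torch.tensor(ids)).mean(0).detach().numpy()

train = {"one": 1, "two": 2, "three": 3, "five": 5, "six": 6, "eight": 8}
test = {"four": 4, "seven": 7, "nine": 9}

X = np.stack([vec(w) for w in train])
y = np.array(list(train.values()), dtype=float)
w, *_ = np.linalg.lstsq(X, y, rcond=None)  # minimum-norm linear probe

for word, value in test.items():
    print(word, value, round(float(vec(word) @ w), 2))
```

With only six training words the probe fits the training set trivially; the held-out predictions are the informative part, and results will be noisy.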
423 | Player-Driven Emergence in LLM-Driven Game Narrative Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore how interaction with large language models (LLMs) can give rise to emergent behaviors, empowering players to participate in the evolution of game narratives. |
XIANGYU PENG et. al. | arxiv-cs.CL | 2024-04-25 |
424 | IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We evaluate a wide range of proprietary and open-source LLMs including GPT-3.5, GPT-4, PaLM-2, mT5, Gemma, BLOOM and LLaMA on IndicGenBench in a variety of settings. |
Harman Singh; Nitish Gupta; Shikhar Bharadwaj; Dinesh Tewari; Partha Talukdar; | arxiv-cs.CL | 2024-04-25 |
425 | Make-it-Real: Unleashing Large Multimodal Model for Painting 3D Objects with Realistic Materials Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we exploit advancements in Multimodal Large Language Models (MLLMs), particularly GPT-4V, to present a novel approach, Make-it-Real: 1) We demonstrate that GPT-4V can effectively recognize and describe materials, allowing the construction of a detailed material library. |
YE FANG et. al. | arxiv-cs.CV | 2024-04-25 |
426 | Promoting CNNs with Cross-Architecture Knowledge Distillation for Efficient Monocular Depth Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To effectively distill valuable information from the transformer teacher and bridge the gap between convolution and transformer features, we introduce a method to acclimate the teacher with a ghost decoder. |
Zhimeng Zheng; Tao Huang; Gongsheng Li; Zuyi Wang; | arxiv-cs.CV | 2024-04-25 |
427 | Towards Efficient Patient Recruitment for Clinical Trials: Application of A Prompt-Based Learning Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we aimed to evaluate the performance of a prompt-based large language model for the cohort selection task from unstructured medical notes collected in the EHR. |
Mojdeh Rahmanian; Seyed Mostafa Fakhrahmad; Seyedeh Zahra Mousavi; | arxiv-cs.CL | 2024-04-24 |
428 | GeckOpt: LLM System Efficiency Via Intent-Based Tool Selection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this preliminary study, we investigate a GPT-driven intent-based reasoning approach to streamline tool selection for large language models (LLMs) aimed at system efficiency. By … |
Michael Fore; Simranjit Singh; Dimitrios Stamoulis; | Proceedings of the Great Lakes Symposium on VLSI 2024 | 2024-04-24 |
429 | A Comprehensive Survey on Evaluating Large Language Model Applications in The Medical Industry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since the inception of the Transformer architecture in 2017, Large Language Models (LLMs) such as GPT and BERT have evolved significantly, impacting various industries with their advanced capabilities in language understanding and generation. These models have shown potential to transform the medical field, highlighting the necessity for specialized evaluation frameworks to ensure their effective and ethical deployment. |
Yining Huang; Keke Tang; Meilian Chen; Boyuan Wang; | arxiv-cs.CL | 2024-04-24 |
430 | The Promise and Challenges of Using LLMs to Accelerate The Screening Process of Systematic Reviews Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our objective is to investigate if Large Language Models (LLMs) can accelerate title-abstract screening by simplifying abstracts for human screeners, and automating title-abstract screening. |
Aleksi Huotala; Miikka Kuutila; Paul Ralph; Mika Mäntylä; | arxiv-cs.CL | 2024-04-24 |
431 | Automated Creation of Source Code Variants of A Cryptographic Hash Function Implementation Using Generative Pre-Trained Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study the ability of GPT models to generate novel and correct versions, and notably very insecure versions, of implementations of the cryptographic hash function SHA-1 is examined. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CR | 2024-04-24 |
432 | Transformers Can Represent $n$-gram Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the relationship between transformer LMs and $n$-gram LMs, a simple and historically relevant class of language models. |
Anej Svete; Ryan Cotterell; | arxiv-cs.CL | 2024-04-23 |
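For context on entry 432: an n-gram LM predicts each token from only the previous n-1 tokens. A self-contained maximum-likelihood bigram example of the model class the paper relates to transformer expressivity:

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Maximum-likelihood bigram LM: p(w_i | w_{i-1}) from raw counts."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    return {prev: {w: c / sum(nxt.values()) for w, c in nxt.items()}
            for prev, nxt in counts.items()}

lm = train_bigram(["the cat sat", "the dog sat"])
print(lm["the"])  # {'cat': 0.5, 'dog': 0.5}
```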
433 | Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on a specific use case, pharmaceutical manufacturing investigations, and propose that leveraging historical records of manufacturing incidents and deviations in an organization can be beneficial for addressing and closing new cases, or de-risking new manufacturing campaigns. |
Hossein Salami; Brandye Smith-Goettler; Vijay Yadav; | arxiv-cs.CL | 2024-04-23 |
434 | Pyramid Hierarchical Transformer for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The traditional Transformer model encounters challenges with variable-length input sequences, particularly in Hyperspectral Image Classification (HSIC), leading to efficiency and scalability concerns. To overcome this, we propose a pyramid-based hierarchical transformer (PyFormer). |
Muhammad Ahmad; Muhammad Hassaan Farooq Butt; Manuel Mazzara; Salvatore Distifano; | arxiv-cs.CV | 2024-04-23 |
435 | Combining Retrieval and Classification: Balancing Efficiency and Accuracy in Duplicate Bug Report Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this context, complex models, despite their high accuracy potential, can be time-consuming, while more efficient models may compromise on accuracy. To address this issue, we propose a transformer-based system designed to strike a balance between time efficiency and accuracy performance. |
Qianru Meng; Xiao Zhang; Guus Ramackers; Visser Joost; | arxiv-cs.SE | 2024-04-23 |
436 | PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present the first, end-to-end large-scale empirical evaluation of clinical trial matching using real-world EHRs. |
SHASHI KANT GUPTA et. al. | arxiv-cs.CL | 2024-04-23 |
437 | From Complexity to Clarity: How AI Enhances Perceptions of Scientists and The Public’s Understanding of Science Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper evaluated the effectiveness of using generative AI to simplify science communication and enhance the public’s understanding of science. |
David M. Markowitz; | arxiv-cs.CL | 2024-04-23 |
438 | Marking: Visual Grading with Highlighting Errors and Annotating Missing Bits Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Marking, a novel grading task that enhances automated grading systems by performing an in-depth analysis of student responses and providing students with visual highlights. |
Shashank Sonkar; Naiming Liu; Debshila B. Mallick; Richard G. Baraniuk; | arxiv-cs.CL | 2024-04-22 |
439 | What Do Transformers Know About Government? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates what insights about linguistic features and what knowledge about the structure of natural language can be obtained from the encodings in transformer language models. In particular, we explore how BERT encodes the government relation between constituents in a sentence. |
JUE HOU et. al. | arxiv-cs.CL | 2024-04-22 |
440 | A Multimodal Feature Distillation with CNN-Transformer Network for Brain Tumor Segmentation with Incomplete Modalities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a Multimodal feature distillation with Convolutional Neural Network (CNN)-Transformer hybrid network (MCTSeg) for accurate brain tumor segmentation with missing modalities. |
Ming Kang; Fung Fung Ting; Raphaël C. -W. Phan; Zongyuan Ge; Chee-Ming Ting; | arxiv-cs.CV | 2024-04-22 |
441 | Automated Multi-Language to English Machine Translation Using Generative Pre-Trained Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we examine using local Generative Pretrained Transformer (GPT) models to perform automated zero-shot, black-box, sentence-wise, multi-natural-language translation into English text. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CL | 2024-04-22 |
442 | How Well Can LLMs Echo Us? Evaluating AI Chatbots’ Role-Play Ability with ECHO Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Such an oversight limits the potential for advancements in digital human clones and non-player characters in video games. To bridge this gap, we introduce ECHO, an evaluative framework inspired by the Turing test. |
MAN TIK NG et. al. | arxiv-cs.CL | 2024-04-22 |
443 | Pre-Calc: Learning to Use The Calculator Improves Numeracy in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Pre-Calc, a simple pre-finetuning objective of learning to use the calculator for both encoder-only and encoder-decoder architectures, formulated as a discriminative and generative task respectively. |
Vishruth Veerendranath; Vishwa Shah; Kshitish Ghate; | arxiv-cs.CL | 2024-04-22 |
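One way to picture a calculator-use objective like entry 443's is to rewrite arithmetic spans in training text as explicit tool calls, so the model learns to emit a call rather than guess the result. The tag format, regex, and helper below are illustrative assumptions, not the paper's actual scheme:

```python
import re

def annotate_with_calculator(text):
    """Wrap simple arithmetic spans in calculator-call tags.
    The <calc>...</calc> convention is illustrative only."""
    def repl(match):
        expr = match.group(0)
        # The restricted pattern (digits, spaces, one operator)
        # keeps eval safe here.
        result = eval(expr)
        return f"<calc>{expr}={result}</calc>"
    return re.sub(r"\b\d+\s*[-+*/]\s*\d+\b", repl, text)

print(annotate_with_calculator("She bought 12 + 7 apples."))
# She bought <calc>12 + 7=19</calc> apples.
```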
444 | Assessing GPT-4-Vision’s Capabilities in UML-Based Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The emergence of advanced neural networks has opened up new ways in automated code generation from conceptual models, promising to enhance software development processes. This paper presents a preliminary evaluation of GPT-4-Vision, a state-of-the-art deep learning model, and its capabilities in transforming Unified Modeling Language (UML) class diagrams into fully operating Java class files. |
Gábor Antal; Richárd Vozár; Rudolf Ferenc; | arxiv-cs.SE | 2024-04-22 |
445 | SVGEditBench: A Benchmark Dataset for Quantitative Assessment of LLM’s SVG Editing Capabilities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For quantitative evaluation of LLMs’ ability to edit SVG, we propose SVGEditBench. |
Kunato Nishina; Yusuke Matsui; | arxiv-cs.CV | 2024-04-21 |
446 | Automated Text Mining of Experimental Methodologies from Biomedical Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes a fine-tuned DistilBERT, a methodology-specific, pre-trained generative classification language model for mining biomedical texts. |
Ziqing Guo; | arxiv-cs.CL | 2024-04-21 |
447 | Beyond Accuracy: Investigating Error Types in GPT-4 Responses to USMLE Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, current works use GPT-4 to only predict the correct option without providing any explanation and thus do not provide any insight into the thinking process and reasoning used by GPT-4 or other LLMs. Therefore, we introduce a new domain-specific error taxonomy derived from collaboration with medical students. |
SOUMYADEEP ROY et. al. | arxiv-cs.CL | 2024-04-20 |
448 | Evaluating Subword Tokenization: Alien Subword Composition and OOV Generalization Challenge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As a solution, we propose a combined intrinsic-extrinsic evaluation framework for subword tokenization. |
KHUYAGBAATAR BATSUREN et. al. | arxiv-cs.CL | 2024-04-20 |
449 | Do English Named Entity Recognizers Work Well on Global Englishes? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As such, it is unclear whether they generalize for analyzing use of English globally. To test this, we build a newswire dataset, the Worldwide English NER Dataset, to analyze NER model performance on low-resource English variants from around the world. |
Alexander Shan; John Bauer; Riley Carlson; Christopher Manning; | arxiv-cs.CL | 2024-04-20 |
450 | Transformer-Based Classification Outcome Prediction for Multimodal Stroke Treatment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a multi-modal fusion framework Multitrans based on the Transformer architecture and self-attention mechanism. |
Danqing Ma; Meng Wang; Ao Xiang; Zongqing Qi; Qin Yang; | arxiv-cs.CV | 2024-04-19 |
451 | Linearly-evolved Transformer for Pan-sharpening Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The standard modeling rules within these promising pan-sharpening methods are to roughly stack the transformer variants in a cascaded manner. Despite the remarkable advancement, their success may come at the huge cost of model parameters and FLOPs, preventing application on low-resource satellites. To address this trade-off between favorable performance and expensive computation, we tailor an efficient linearly-evolved transformer variant and employ it to construct a lightweight pan-sharpening framework. |
JUNMING HOU et. al. | arxiv-cs.CV | 2024-04-19 |
452 | Crowdsourcing Public Attitudes Toward Local Services Through The Lens of Google Maps Reviews: An Urban Density-based Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel data source and methodological framework that can be easily adapted to different regions, offering useful insights into public sentiment toward the built environment and shedding light on how planning policies can be designed to handle related challenges. |
Lingyao Li; Songhua Hu; Atiyya Shaw; Libby Hemphill; | arxiv-cs.SI | 2024-04-19 |
453 | Enabling Natural Zero-Shot Prompting on Encoder Models Via Statement-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we propose Statement-Tuning, a technique that models discriminative tasks as a set of finite statements and trains an Encoder model to discriminate between the potential statements to determine the label. |
Ahmed Elshabrawy; Yongxin Huang; Iryna Gurevych; Alham Fikri Aji; | arxiv-cs.CL | 2024-04-19 |
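Concretely, Statement-Tuning-style inference recasts one classification example as one statement per candidate label and lets the encoder pick the most plausible statement. A schematic sketch; the template and scoring function are stand-ins, not the paper's:

```python
def to_statements(text, labels, template="{text} The sentiment is {label}."):
    """Recast one classification example as |labels| binary statements."""
    return [template.format(text=text, label=lab) for lab in labels]

def predict(score_fn, text, labels):
    """score_fn: an encoder head mapping a statement to P(statement is true)."""
    statements = to_statements(text, labels)
    scores = [score_fn(s) for s in statements]
    return labels[scores.index(max(scores))]

def toy_score(statement):
    """Stand-in scorer; a trained encoder classification head goes here."""
    return 1.0 if "positive" in statement else 0.0

print(predict(toy_score, "Great movie!", ["positive", "negative"]))  # positive
```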
454 | TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present our submission to the unconstrained subtask of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages for morphological annotation, POS-tagging, lemmatization, character- and word-level gap-filling. |
Aleksei Dorkin; Kairit Sirts; | arxiv-cs.CL | 2024-04-19 |
455 | Dubo-SQL: Diverse Retrieval-Augmented Generation and Fine Tuning for Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce two new methods, Dubo-SQL v1 and v2. |
Dayton G. Thorpe; Andrew J. Duberstein; Ian A. Kinsey; | arxiv-cs.CL | 2024-04-18 |
456 | Transformer Tricks: Removing Weights for Skipless Transformers Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: He and Hofmann (arXiv:2311.01906) detailed a skipless transformer without the V and P (post-attention projection) linear layers, which reduces the total number of weights. … |
Nils Graef; | arxiv-cs.LG | 2024-04-18 |
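Entry 456 builds on the observation (He and Hofmann, arXiv:2311.01906) that the value (V) and post-attention projection (P) matrices can be removed. A minimal single-head sketch of attention without them; the cited papers additionally merge the remaining weights, which is omitted here:

```python
import math
import torch

def skipless_attention(x, Wq, Wk):
    """Self-attention with no V or P projections: the attention
    weights mix the raw inputs directly."""
    q, k = x @ Wq, x @ Wk
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.shape[-1])
    return torch.softmax(scores, dim=-1) @ x  # values are x itself

x = torch.randn(10, 64)                      # sequence of 10 tokens, d=64
Wq, Wk = torch.randn(64, 64), torch.randn(64, 64)
print(skipless_attention(x, Wq, Wk).shape)   # torch.Size([10, 64])
```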
457 | X-Light: Cross-City Traffic Signal Control Using Transformer on Transformer As Meta Multi-Agent Reinforcement Learner Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, a persisting issue remains: how can we obtain a multi-agent traffic signal control algorithm with remarkable transferability across diverse cities? In this paper, we propose a Transformer on Transformer (TonT) model for cross-city meta multi-agent traffic signal control, named X-Light: we input full Markov Decision Process trajectories, the Lower Transformer aggregates the states, actions, and rewards among the target intersection and its neighbors within a city, and the Upper Transformer learns general decision trajectories across different cities. |
HAOYUAN JIANG et. al. | arxiv-cs.AI | 2024-04-18 |
458 | Augmenting Emotion Features in Irony Detection with Large Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel method for irony detection, applying Large Language Models (LLMs) with prompt-based learning to facilitate emotion-centric text augmentation. |
Yucheng Lin; Yuhan Xia; Yunfei Long; | arxiv-cs.CL | 2024-04-18 |
459 | MIDGET: Music Conditioned 3D Dance Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a MusIc conditioned 3D Dance GEneraTion model, named MIDGET, based on a Dance motion Vector Quantised Variational AutoEncoder (VQ-VAE) model and a Motion Generative Pre-Training (GPT) model, to generate vibrant and high-quality dances that match the music rhythm. |
Jinwu Wang; Wei Mao; Miaomiao Liu; | arxiv-cs.SD | 2024-04-18 |
460 | Large Language Models in Targeted Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we investigate the use of decoder-based generative transformers for extracting sentiment towards the named entities in Russian news articles. |
Nicolay Rusnachenko; Anton Golubev; Natalia Loukachevitch; | arxiv-cs.CL | 2024-04-18 |
461 | EmrQA-msquad: A Medical Dataset Structured with The SQuAD V2.0 Framework, Enriched with EmrQA Medical Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: One key solution involves integrating specialized medical datasets and creating dedicated ones. This strategic approach enhances the accuracy of question answering systems (QAS), contributing to advancements in clinical decision-making and medical research. |
Jimenez Eladio; Hao Wu; | arxiv-cs.CL | 2024-04-18 |
462 | Octopus V3: Technical Report for On-device Sub-billion Multimodal AI Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a multimodal model that incorporates the concept of functional token specifically designed for AI agent applications. |
Wei Chen; Zhiyuan Li; | arxiv-cs.CL | 2024-04-17 |
463 | Cross-Problem Learning for Solving Vehicle Routing Problems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes the cross-problem learning to assist heuristics training for different downstream VRP variants. |
ZHUOYI LIN et. al. | arxiv-cs.AI | 2024-04-17 |
464 | CAUS: A Dataset for Question Generation Based on Human Cognition Leveraging Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Curious About Uncertain Scene (CAUS) dataset, designed to enable Large Language Models, specifically GPT-4, to emulate human cognitive processes for resolving uncertainties. |
Minjung Shin; Donghyun Kim; Jeh-Kwang Ryu; | arxiv-cs.AI | 2024-04-17 |
465 | In-Context Learning State Vector with Inner and Momentum Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the working mechanisms and optimization of these vectors are yet to be thoroughly explored. In this paper, we address this gap by presenting a comprehensive analysis of these compressed vectors, drawing parallels to the parameters trained with gradient descent, and introduce the concept of state vector. |
DONGFANG LI et. al. | arxiv-cs.CL | 2024-04-17 |
466 | CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an attribution-oriented Chain-of-Thought reasoning method to enhance the accuracy of attributions. |
Moshe Berchansky; Daniel Fleischer; Moshe Wasserblat; Peter Izsak; | arxiv-cs.CL | 2024-04-16 |
467 | MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we show how to build small models that have GPT-4-level performance but for 400x lower cost. |
Liyan Tang; Philippe Laban; Greg Durrett; | arxiv-cs.CL | 2024-04-16 |
468 | AIGeN: An Adversarial Approach for Instruction Generation in VLN Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose AIGeN, a novel architecture inspired by Generative Adversarial Networks (GANs) that produces meaningful and well-formed synthetic instructions to improve navigation agents’ performance. |
Niyati Rawal; Roberto Bigazzi; Lorenzo Baraldi; Rita Cucchiara; | arxiv-cs.CV | 2024-04-15 |
469 | Transformers, Contextualism, and Polysemy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, I argue that we can extract from the way the transformer architecture works a picture of the relationship between context and meaning. |
Jumbly Grindrod; | arxiv-cs.CL | 2024-04-15 |
470 | Harnessing GPT-4V(ision) for Insurance: A Preliminary Exploration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore GPT-4V’s capabilities in the insurance domain. |
Chenwei Lin; Hanjia Lyu; Jiebo Luo; Xian Xu; | arxiv-cs.CV | 2024-04-15 |
471 | Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present COME-robot, the first closed-loop framework utilizing the GPT-4V vision-language foundation model for open-ended reasoning and adaptive planning in real-world scenarios. |
PEIYUAN ZHI et. al. | arxiv-cs.RO | 2024-04-15 |
472 | Demonstration of DB-GPT: Next Generation Data Interaction System Empowered By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present DB-GPT, a revolutionary and product-ready Python library that integrates LLMs into traditional data interaction tasks to enhance user experience and accessibility. |
SIQIAO XUE et. al. | arxiv-cs.AI | 2024-04-15 |
473 | Few-shot Named Entity Recognition on StackOverflow Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: StackOverflow, with its vast question repository and limited labeled examples, poses an annotation challenge. We address this gap by proposing RoBERTa+MAML, a few-shot named entity recognition (NER) method leveraging meta-learning. |
Xinwei Chen; Kun Li; Tianyou Song; Jiangjian Guo; | arxiv-cs.CL | 2024-04-14 |
474 | Understanding The Role of Temperature in Diverse Question Generation By GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct a preliminary study of the effect of GPT’s temperature parameter on the diversity of GPT-4-generated questions. |
ARAV AGARWAL et. al. | arxiv-cs.CL | 2024-04-14 |
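Background for entry 474: temperature T rescales the logits before sampling, p_i ∝ exp(z_i / T), so higher values flatten the next-token distribution and should increase variety. A sketch of the kind of sweep such a study might run with the OpenAI Python client; the model name, prompt, and distinctness proxy are placeholder assumptions:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

prompt = "Write one multiple-choice question about recursion."  # placeholder

for temperature in (0.0, 0.7, 1.4):
    resp = client.chat.completions.create(
        model="gpt-4",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
        n=5,  # several samples per temperature setting
    )
    distinct = {choice.message.content for choice in resp.choices}
    # Crude diversity proxy: count exact-duplicate-free samples.
    print(f"T={temperature}: {len(distinct)} distinct questions out of 5")
```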
475 | Large Language Models for Mobile GUI Text Input Generation: An Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We collected 114 UI pages from 62 open-source Android apps and extracted contextual information from the UI pages to construct prompts for LLMs to generate text inputs. |
CHENHUI CUI et. al. | arxiv-cs.SE | 2024-04-13 |
476 | Constrained C-Test Generation Via Mixed-Integer Programming Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work proposes a novel method to generate C-Tests: a variant of cloze tests (a gap-filling exercise) in which only the last part of a word is turned into a gap. |
Ji-Ung Lee; Marc E. Pfetsch; Iryna Gurevych; | arxiv-cs.CL | 2024-04-12 |
477 | Small Models Are (Still) Effective Cross-Domain Argument Extractors Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, detailed explorations of these techniques’ ability to actually enable this transfer are lacking. In this work, we provide such a study, exploring zero-shot transfer using both techniques on six major EAE datasets at both the sentence and document levels. |
William Gantt; Aaron Steven White; | arxiv-cs.CL | 2024-04-12 |
478 | CreativEval: Evaluating Creativity of LLM-Based Hardware Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this research gap, we present CreativEval, a framework for evaluating the creativity of LLMs within the context of generating hardware designs. |
Matthew DeLorenzo; Vasudev Gohil; Jeyavijayan Rajendran; | arxiv-cs.CL | 2024-04-12 |
479 | Is Next Token Prediction Sufficient for GPT? Exploration on Code Logic Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to prove it, we introduce a new task, Logically Equivalent Code Selection, which necessitates the selection of logically equivalent code from a candidate set, given a query code. |
MENGNAN QI et. al. | arxiv-cs.PL | 2024-04-12 |
480 | Pre-training Small Base LMs with Fewer Tokens Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We investigate Inheritune in a slightly different setting where we train small LMs utilizing larger LMs and their full pre-training dataset. |
Sunny Sanyal; Sujay Sanghavi; Alexandros G. Dimakis; | arxiv-cs.CL | 2024-04-12 |
481 | Measuring Geographic Diversity of Foundation Models with A Natural Language–based Geo-guessing Experiment on GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: If we consider the resulting models as knowledge bases in their own right, this may open up new avenues for understanding places through the lens of machines. In this work, we adopt this thinking and select GPT-4, a state-of-the-art representative in the family of multimodal large language models, to study its geographic diversity regarding how well geographic features are represented. |
Zilong Liu; Krzysztof Janowicz; Kitty Currier; Meilin Shi; | arxiv-cs.CY | 2024-04-11 |
482 | Remembering Transformer for Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing data fine-tuning and regularization methods necessitate task identity information during inference and cannot eliminate interference among different tasks, while soft parameter sharing approaches encounter the problem of an increasing model parameter size. To tackle these challenges, we propose the Remembering Transformer, inspired by the brain’s Complementary Learning Systems (CLS). |
Yuwei Sun; Ippei Fujisawa; Arthur Juliani; Jun Sakuma; Ryota Kanai; | arxiv-cs.LG | 2024-04-11 |
483 | From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We analyze how well pre-trained large language models (e.g., Llama2, GPT-4, Claude 3, etc) can do linear and non-linear regression when given in-context examples, without any additional training or gradient updates. |
Robert Vacareanu; Vlad-Andrei Negru; Vasile Suciu; Mihai Surdeanu; | arxiv-cs.CL | 2024-04-11 |
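The setup in entry 483 amounts to serializing (x, y) pairs into a prompt and asking the model to complete the output for a held-out input. A minimal prompt-construction sketch; the pair format is an assumption, and the completion call is whatever LLM client one uses:

```python
def regression_prompt(examples, x_query):
    """Serialize numeric (x, y) pairs for in-context regression."""
    lines = [f"Input: {x}\nOutput: {y}" for x, y in examples]
    lines.append(f"Input: {x_query}\nOutput:")
    return "\n".join(lines)

examples = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2)]  # roughly y = 2x + 1
print(regression_prompt(examples, 4.0))
# A capable LLM completing this prompt should answer near 9.0.
```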
484 | Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore how large language models (LLMs) can be used to generate and evaluate multiple-choice reading comprehension items. |
Andreas Säuberli; Simon Clematide; | arxiv-cs.CL | 2024-04-11 |
485 | On Training Data Influence of GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents GPTfluence, a novel approach that leverages a featurized simulation to assess the impact of training examples on the training dynamics of GPT models. |
QINGYI LIU et. al. | arxiv-cs.CL | 2024-04-11 |
486 | Reflectance Estimation for Proximity Sensing By Vision-Language Models: Utilizing Distributional Semantics for Low-Level Cognition in Robotics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we verify that 1) LLMs such as GPT-3.5 and GPT-4 can estimate an object’s reflectance using only text as input; and 2) VLMs such as CLIP can increase their generalization capabilities in reflectance estimation from images. |
Masashi Osada; Gustavo A. Garcia Ricardez; Yosuke Suzuki; Tadahiro Taniguchi; | arxiv-cs.RO | 2024-04-11 |
487 | LLM Agents Can Autonomously Exploit One-day Vulnerabilities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that LLM agents can autonomously exploit one-day vulnerabilities in real-world systems. |
Richard Fang; Rohan Bindu; Akul Gupta; Daniel Kang; | arxiv-cs.CR | 2024-04-11 |
488 | Simpler Becomes Harder: Do LLMs Exhibit A Coherent Behavior on Simplified Corpora? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Text simplification seeks to improve readability while retaining the original content and meaning. Our study investigates whether pre-trained classifiers also maintain such coherence by comparing their predictions on both original and simplified inputs. |
Miriam Anschütz; Edoardo Mosca; Georg Groh; | arxiv-cs.CL | 2024-04-10 |
489 | Characterizing Multimodal Long-form Summarization: A Case Study on Financial Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a computational framework for characterizing multimodal long-form summarization and investigate the behavior of Claude 2.0/2.1, GPT-4/3.5, and Command. |
Tianyu Cao; Natraj Raman; Danial Dervovic; Chenhao Tan; | arxiv-cs.CL | 2024-04-09 |
490 | Generative Pre-Trained Transformer for Symbolic Regression Based on In-Context Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose FormulaGPT, which trains a GPT using massive sparse-reward learning histories of reinforcement learning-based SR algorithms as training data. |
YANJIE LI et. al. | arxiv-cs.LG | 2024-04-09 |
491 | InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, this research advances the patch division paradigm by introducing a novel extension: dynamic resolution with automatic patch configuration. |
XIAOYI DONG et. al. | arxiv-cs.CV | 2024-04-09 |
492 | PetKaz at SemEval-2024 Task 8: Can Linguistics Capture The Specifics of LLM-generated Text? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present our submission to the SemEval-2024 Task 8 Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, focusing on the detection of machine-generated texts (MGTs) in English. |
Kseniia Petukhova; Roman Kazakov; Ekaterina Kochmar; | arxiv-cs.CL | 2024-04-08 |
493 | OPSD: An Offensive Persian Social Media Dataset and Its Baseline Evaluations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While numerous datasets in the English language exist in this domain, few equivalent resources are available for the Persian language. To address this gap, this paper introduces two offensive datasets. |
MEHRAN SAFAYANI et. al. | arxiv-cs.CL | 2024-04-08 |
494 | Use of A Structured Knowledge Base Enhances Metadata Curation By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates the potential of large language models (LLMs), specifically GPT-4, to improve adherence to metadata standards. |
SOWMYA S. SUNDARAM et. al. | arxiv-cs.AI | 2024-04-08 |
495 | Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While these models demonstrate remarkable performance on general datasets, they can struggle in specialized domains such as medicine, where unique domain-specific terminologies, domain-specific abbreviations, and varying document structures are common. This paper explores strategies for adapting these models to domain-specific requirements, primarily through continuous pre-training on domain-specific data. |
AHMAD IDRISSI-YAGHIR et. al. | arxiv-cs.CL | 2024-04-08 |
496 | Relation Extraction Using Large Language Models: A Case Study on Acupuncture Point Locations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to compare the performance of GPT with traditional deep learning models (Long Short-Term Memory (LSTM) and Bidirectional Encoder Representations from Transformers for Biomedical Text Mining (BioBERT)) in extracting acupoint-related location relations and assess the impact of pretraining and fine-tuning on GPT’s performance. |
YIMING LI et. al. | arxiv-cs.CL | 2024-04-08 |
497 | PagPassGPT: Pattern Guided Password Guessing Via Generative Pretrained Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Amidst the surge in deep learning-based password guessing models, challenges of generating high-quality passwords and reducing duplicate passwords persist. To address these challenges, we present PagPassGPT, a password guessing model constructed on Generative Pretrained Transformer (GPT). |
XINGYU SU et. al. | arxiv-cs.CR | 2024-04-07 |
498 | Clinical Trials Protocol Authoring Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This report embarks on a mission to revolutionize clinical trial protocol development through the integration of advanced AI technologies. |
Morteza Maleki; | arxiv-cs.CE | 2024-04-07 |
499 | RecGPT: Generative Personalized Prompts for Sequential Recommendation Via ChatGPT Training Paradigm Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: ChatGPT has achieved remarkable success in natural language understanding. Considering that recommendation is indeed a conversation between users and the system with items as … |
YABIN ZHANG et. al. | ArXiv | 2024-04-06 |
500 | Joint Visual and Text Prompting for Improved Object-Centric Perception with Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel approach, Joint Visual and Text Prompting (VTPrompt), that employs fine-grained visual information to enhance the capability of MLLMs in VQA, especially for object-oriented perception. |
SONGTAO JIANG et. al. | arxiv-cs.CL | 2024-04-06 |
501 | Scope Ambiguities in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite this, there has been little research into how modern large language models treat them. In this paper, we investigate how different versions of certain autoregressive language models — GPT-2, GPT-3/3.5, Llama 2 and GPT-4 — treat scope-ambiguous sentences, and compare this with human judgments. |
Gaurav Kamath; Sebastian Schuster; Sowmya Vajjala; Siva Reddy; | arxiv-cs.CL | 2024-04-05 |
502 | Evaluating LLMs at Detecting Errors in LLM Responses Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work introduces ReaLMistake, the first error detection benchmark consisting of objective, realistic, and diverse errors made by LLMs. |
RYO KAMOI et. al. | arxiv-cs.CL | 2024-04-04 |
503 | Outlier-Efficient Hopfield Layers for Large Transformer-Based Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an Outlier-Efficient Modern Hopfield Model (termed OutEffHop) and use it to address the outlier inefficiency problem of training gigantic transformer-based models. |
JERRY YAO-CHIEH HU et. al. | arxiv-cs.LG | 2024-04-04 |
504 | Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper studies post-training large language models (LLMs) using preference feedback from a powerful oracle to help a model iteratively improve over itself. The typical … |
CORBY ROSSET et. al. | ArXiv | 2024-04-04 |
505 | Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Various jailbreak attacks have been proposed to red-team Large Language Models (LLMs) and revealed the vulnerable safeguards of LLMs. Besides, some methods are not limited to the … |
SHUO CHEN et. al. | arxiv-cs.LG | 2024-04-04 |
506 | NLP at UC Santa Cruz at SemEval-2024 Task 5: Legal Answer Validation Using Few-Shot Multi-Choice QA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present two approaches to solving the task of legal answer validation, given an introduction to the case, a question and an answer candidate. |
Anish Pahilajani; Samyak Rajesh Jain; Devasha Trivedi; | arxiv-cs.CL | 2024-04-03 |
507 | UTeBC-NLP at SemEval-2024 Task 9: Can LLMs Be Lateral Thinkers? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Through participating in SemEval-2024, task 9, Sentence Puzzle sub-task, we explore prompt engineering methods: chain of thoughts (CoT) and direct prompting, enhancing with informative descriptions, and employing contextualizing prompts using a retrieval augmented generation (RAG) pipeline. |
Pouya Sadeghi; Amirhossein Abaskohi; Yadollah Yaghoobzadeh; | arxiv-cs.CL | 2024-04-03 |
508 | Visual Autoregressive Modeling: Scalable Image Generation Via Next-Scale Prediction IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Visual AutoRegressive modeling (VAR), a new generation paradigm that redefines the autoregressive learning on images as coarse-to-fine next-scale prediction or next-resolution prediction, diverging from the standard raster-scan next-token prediction. |
Keyu Tian; Yi Jiang; Zehuan Yuan; Bingyue Peng; Liwei Wang; | arxiv-cs.CV | 2024-04-03 |
509 | Task Agnostic Architecture for Algorithm Induction Via Implicit Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Taking this trend of generalization to the extreme suggests the possibility of a single deep network architecture capable of solving all tasks. This position paper aims to explore developing such a unified architecture and proposes a theoretical framework of how it could be constructed. |
Sahil J. Sindhi; Ignas Budvytis; | arxiv-cs.LG | 2024-04-03 |
510 | BCAmirs at SemEval-2024 Task 4: Beyond Words: A Multimodal and Multilingual Exploration of Persuasion in Memes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by this, our team engaged in SemEval-2024 Task 4, a hierarchical multi-label classification task designed to identify rhetorical and psychological persuasion techniques embedded within memes. To tackle this problem, we introduced a caption generation step to assess the modality gap and the impact of additional semantic information from images, which improved our result. |
Amirhossein Abaskohi; Amirhossein Dabiriaghdam; Lele Wang; Giuseppe Carenini; | arxiv-cs.CL | 2024-04-03 |
511 | GPT-DETOX: An In-Context Learning-Based Paraphraser for Text Detoxification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose GPT-DETOX as a framework for prompt-based in-context learning for text detoxification using GPT-3.5 Turbo. |
Ali Pesaranghader; Nikhil Verma; Manasa Bharadwaj; | arxiv-cs.CL | 2024-04-03 |
512 | Collapse of Self-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In various fields of knowledge creation, including science, new ideas often build on pre-existing information. In this work, we explore this concept within the context of language models. |
David Herel; Tomas Mikolov; | arxiv-cs.CL | 2024-04-02 |
513 | Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this way, we achieve nearly 100% attack success rate — according to GPT-4 as a judge — on Vicuna-13B, Mistral-7B, Phi-3-Mini, Nemotron-4-340B, Llama-2-Chat-7B/13B/70B, Llama-3-Instruct-8B, Gemma-7B, GPT-3.5, GPT-4, and R2D2 from HarmBench that was adversarially trained against the GCG attack. |
Maksym Andriushchenko; Francesco Croce; Nicolas Flammarion; | arxiv-cs.CR | 2024-04-02 |
514 | Accelerating Transformer Pre-Training with 2:4 Sparsity Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Training large transformers is slow, but recent innovations on GPU architecture give us an advantage. NVIDIA Ampere GPUs can execute a fine-grained 2:4 sparse matrix … |
Yuezhou Hu; Kang Zhao; Wei Huang; Jianfei Chen; Jun Zhu; | ArXiv | 2024-04-02 |
515 | WcDT: World-centric Diffusion Transformer for Traffic Scene Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a novel approach for autonomous driving trajectory generation by harnessing the complementary strengths of diffusion probabilistic models (a.k.a., diffusion models) and transformers. |
Chen Yang; Aaron Xuxiang Tian; Dong Chen; Tianyu Shi; Arsalan Heydarian; | arxiv-cs.CV | 2024-04-02 |
516 | SGSH: Stimulate Large Language Models with Skeleton Heuristics for Knowledge Base Question Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose SGSH, a simple and effective framework to Stimulate GPT-3.5 with Skeleton Heuristics to enhance KBQG. |
SHASHA GUO et. al. | arxiv-cs.CL | 2024-04-02 |
517 | Generative AI-Based Text Generation Methods Using Pre-Trained GPT-2 Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through analysis of greedy search, beam search, top-k sampling, top-p sampling, contrastive searching, and locally typical searching, this work has provided valuable insights into the strengths, weaknesses, and potential applications of each method. |
ROHIT PANDEY et. al. | arxiv-cs.CL | 2024-04-02 |
518 | Release of Pre-Trained Models for The Japanese Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: AI democratization aims to create a world in which the average person can utilize AI techniques. |
KEI SAWADA et. al. | arxiv-cs.CL | 2024-04-02 |
519 | Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first comprehensive benchmarking study of LLMs across diverse Persian language tasks. |
AMIRHOSSEIN ABASKOHI et. al. | arxiv-cs.CL | 2024-04-02 |
520 | METAL: Towards Multilingual Meta-Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a framework for an end-to-end assessment of LLMs as evaluators in multilingual scenarios. |
Rishav Hada; Varun Gumma; Mohamed Ahmed; Kalika Bali; Sunayana Sitaram; | arxiv-cs.CL | 2024-04-02 |
521 | GPT-COPE: A Graph-Guided Point Transformer for Category-Level Object Pose Estimation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Category-level object pose estimation aims to predict the 6D pose and 3D metric size of objects from given categories. Due to significant intra-class shape variations among … |
Lu Zou; Zhangjin Huang; Naijie Gu; Guoping Wang; | IEEE Transactions on Circuits and Systems for Video … | 2024-04-01 |
522 | Syntactic Robustness for LLM-based Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on prompts that ask for code that generates solutions to variables in an equation, when given coefficients of the equation as input. |
Laboni Sarker; Mara Downing; Achintya Desai; Tevfik Bultan; | arxiv-cs.SE | 2024-04-01 |
523 | BERT-Enhanced Retrieval Tool for Homework Plagiarism Detection System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In existing research, detection of high-level plagiarism is still a challenge due to the lack of high-quality datasets. In this paper, we propose a plagiarized text data generation method based on GPT-3.5, which produces 32,927 pairs of text plagiarism detection datasets covering a wide range of plagiarism methods, bridging the gap in this part of research. |
Jiarong Xian; Jibao Yuan; Peiwei Zheng; Dexian Chen; | arxiv-cs.CL | 2024-04-01 |
524 | Condition-Aware Neural Network for Controlled Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. |
HAN CAI et. al. | arxiv-cs.CV | 2024-04-01 |
525 | Leveraging YOLO-World and GPT-4V LMMs for Zero-Shot Person Detection and Action Recognition in Drone Imagery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we explore the potential of zero-shot Large Multimodal Models (LMMs) in the domain of drone perception. |
Christian Limberg; Artur Gonçalves; Bastien Rigault; Helmut Prendinger; | arxiv-cs.CV | 2024-04-01 |
526 | Large Language Model Evaluation Via Multi AI Agents: Preliminary Results Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite extensive efforts to examine LLMs from various perspectives, there is a noticeable lack of multi-agent AI models specifically designed to evaluate the performance of different LLMs. To address this gap, we introduce a novel multi-agent AI model that aims to assess and compare the performance of various LLMs. |
Zeeshan Rasheed; Muhammad Waseem; Kari Systä; Pekka Abrahamsson; | arxiv-cs.SE | 2024-04-01 |
527 | Advancing AI with Integrity: Ethical Challenges and Solutions in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We identify and address ethical issues through empirical studies. |
Richard Kimera; Yun-Seon Kim; Heeyoul Choi; | arxiv-cs.CL | 2024-04-01 |
528 | TWIN-GPT: Digital Twins for Clinical Trials Via Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a large language model-based digital twin creation approach, called TWIN-GPT. |
YUE WANG et. al. | arxiv-cs.LG | 2024-04-01 |
529 | Image Fusion for The Novelty Rotating Synthetic Aperture System Based on Vision Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
YU SUN et. al. | Inf. Fusion | 2024-04-01 |
530 | LLM-RadJudge: Achieving Radiologist-Level Evaluation for X-Ray Report Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a novel evaluation framework using large language models (LLMs) to compare radiology reports for assessment. |
Zilong Wang; Xufang Luo; Xinyang Jiang; Dongsheng Li; Lili Qiu; | arxiv-cs.CL | 2024-04-01 |
531 | CHOPS: CHat with CustOmer Profile Systems for Customer Service with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a practical dataset, the CPHOS-dataset, which includes a database, guiding files, and QA pairs collected from CPHOS, an online platform that facilitates the organization of simulated Physics Olympiads for high school teachers and students. |
JINGZHE SHI et. al. | arxiv-cs.CL | 2024-03-31 |
532 | EvoCodeBench: An Evolving Code Generation Benchmark Aligned with Real-World Code Repositories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a new benchmark, EvoCodeBench, to address the preceding problems; it has three primary advances. |
Jia Li; Ge Li; Xuanming Zhang; Yihong Dong; Zhi Jin; | arxiv-cs.CL | 2024-03-31 |
533 | Transformer Based Pluralistic Image Completion with Reduced Information Loss Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To mitigate these issues, we propose a new transformer-based framework called PUT. The indices of quantized pixels are used as tokens for the inputs and prediction targets of the transformer. |
QIANKUN LIU et. al. | arxiv-cs.CV | 2024-03-30 |
534 | Spread Your Wings: A Radial Strip Transformer for Image Deblurring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Radial Strip Transformer (RST), which is a transformer-based architecture that restores blurred images in a polar coordinate system instead of a Cartesian one. |
DUOSHENG CHEN et. al. | arxiv-cs.CV | 2024-03-30 |
535 | Jetsons at FinNLP 2024: Towards Understanding The ESG Impact of A News Article Using Transformer-based Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we describe the different approaches explored by the Jetsons team for the Multi-Lingual ESG Impact Duration Inference (ML-ESG-3) shared task. |
PARAG PRAVIN DAKLE et. al. | arxiv-cs.CL | 2024-03-30 |
536 | Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite efforts to create open-source models, their limited parameters often result in insufficient multi-step reasoning capabilities required for solving complex medical problems. To address this, we introduce Meerkat, a new family of medical AI systems ranging from 7 to 70 billion parameters. |
HYUNJAE KIM et. al. | arxiv-cs.CL | 2024-03-30 |
537 | A Comprehensive Study on NLP Data Augmentation for Hate Speech Detection: Legacy Methods, BERT, and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In pursuit of suitable data augmentation methods, this study explores both established legacy approaches and contemporary practices such as Large Language Models (LLMs), including GPT, in hate speech detection. |
Md Saroar Jahan; Mourad Oussalah; Djamila Romaissa Beddia; Jhuma kabir Mim; Nabil Arhab; | arxiv-cs.CL | 2024-03-30 |
538 | Edinburgh Clinical NLP at SemEval-2024 Task 2: Fine-tune Your Model Unless You Have Access to GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a PEFT method to improve the consistency of LLMs by merging adapters that were fine-tuned separately using triplet and language modelling objectives. |
Aryo Pradipta Gema; Giwon Hong; Pasquale Minervini; Luke Daines; Beatrice Alex; | arxiv-cs.CL | 2024-03-30 |
539 | Cross-lingual Named Entity Corpus for Slavic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a corpus manually annotated with named entities for six Slavic languages – Bulgarian, Czech, Polish, Slovenian, Russian, and Ukrainian. |
Jakub Piskorski; Michał Marcińczuk; Roman Yangarber; | arxiv-cs.CL | 2024-03-30 |
540 | TRABSA: Interpretable Sentiment Analysis of Tweets Using Attention-based BiLSTM and Twitter-RoBERTa Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose TRABSA, a hybrid framework integrating transformer-based architectures, attention mechanisms, and BiLSTM networks to address this. |
Md Abrar Jahin; Md Sakib Hossain Shovon; M. F. Mridha; Md Rashedul Islam; Yutaka Watanobe; | arxiv-cs.CL | 2024-03-30 |
541 | ChatGPT V.s. Media Bias: A Comparative Study of GPT-3.5 and Fine-tuned Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Can ChatGPT detect media bias? This study seeks to answer this question by leveraging the Media Bias Identification Benchmark (MBIB) to assess ChatGPT’s competency in distinguishing six categories of media bias, juxtaposed against fine-tuned models such as BART, ConvBERT, and GPT-2. |
Zehao Wen; Rabih Younes; | arxiv-cs.CL | 2024-03-29 |
542 | ReALM: Reference Resolution As Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper demonstrates how LLMs can be used to create an extremely effective system to resolve references of various types, by showing how reference resolution can be converted into a language modeling problem, despite involving forms of entities like those on screen that are not traditionally conducive to being reduced to a text-only modality. |
JOEL RUBEN ANTONY MONIZ et. al. | arxiv-cs.CL | 2024-03-29 |
543 | Classifying Conspiratorial Narratives At Scale: False Alarms and Erroneous Connections Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first large-scale classification study using posts from the most active conspiracy-related Reddit forums and find that only one-third of the posts are classified as positive. |
Ahmad Diab; Rr. Nefriana; Yu-Ru Lin; | arxiv-cs.CL | 2024-03-29 |
544 | Shallow Cross-Encoders for Low-Latency Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, keeping search latencies low is important for user satisfaction and energy usage. In this paper, we show that weaker shallow transformer models (i.e., transformers with a limited number of layers) actually perform better than full-scale models when constrained to these practical low-latency settings since they can estimate the relevance of more documents in the same time budget. |
Aleksandr V. Petrov; Sean MacAvaney; Craig Macdonald; | arxiv-cs.IR | 2024-03-29 |
545 | Decision Mamba: Reinforcement Learning Via Sequence Modeling with Selective State Spaces Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work contributes to the advancement of sequential decision-making models, suggesting that the architecture and training methodology of neural networks can significantly impact their performance in complex tasks, and highlighting the potential of Mamba as a valuable tool for improving the efficacy of Transformer-based models in reinforcement learning scenarios. |
Toshihiro Ota; | arxiv-cs.LG | 2024-03-28 |
546 | Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we delve into several mechanisms employed by Transformer-based language models (LLMs) for factual recall tasks. |
ANG LV et. al. | arxiv-cs.CL | 2024-03-28 |
547 | Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that off-the-shelf text-based Transformers, with no additional training, can perform few-shot in-context visual imitation learning, mapping visual observations to action sequences that emulate the demonstrator’s behaviour. We achieve this by transforming visual observations (inputs) and trajectories of actions (outputs) into sequences of tokens that a text-pretrained Transformer (GPT-4 Turbo) can ingest and generate, via a framework we call Keypoint Action Tokens (KAT). |
Norman Di Palo; Edward Johns; | arxiv-cs.RO | 2024-03-28 |
548 | Reshaping Free-Text Radiology Notes Into Structured Reports With Generative Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a pipeline to extract information from free-text radiology reports that fits the items of the reference SR registry proposed by a national society of interventional and medical radiology, focusing on CT staging of patients with lymphoma. |
LAURA BERGOMI et. al. | arxiv-cs.CL | 2024-03-27 |
549 | 3P-LLM: Probabilistic Path Planning Using Large Language Model for Autonomous Robot Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research assesses the feasibility of using an LLM (OpenAI’s GPT-3.5-turbo chatbot) for robotic path planning. |
Ehsan Latif; | arxiv-cs.RO | 2024-03-27 |
550 | The Topos of Transformer Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a theoretical analysis of the expressivity of the transformer architecture through the lens of topos theory. |
Mattia Jacopo Villani; Peter McBurney; | arxiv-cs.LG | 2024-03-27 |
551 | BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Can smaller, more targeted models compete? To address this question, we build and release BioMedLM, a 2.7 billion parameter GPT-style autoregressive model trained exclusively on PubMed abstracts and full articles. |
ELLIOT BOLTON et. al. | arxiv-cs.CL | 2024-03-27 |
552 | SemRoDe: Macro Adversarial Training to Learn Representations That Are Robust to Word-Level Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial Training strategy to enhance the robustness of LMs. |
Brian Formento; Wenjie Feng; Chuan Sheng Foo; Luu Anh Tuan; See-Kiong Ng; | arxiv-cs.CL | 2024-03-27 |
553 | PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a multimodal interactive robot (PhysicsAssistant) built on YOLOv8 object detection, cameras, speech recognition, and chatbot using LLM to provide assistance to students’ physics labs. |
Ehsan Latif; Ramviyas Parasuraman; Xiaoming Zhai; | arxiv-cs.RO | 2024-03-27 |
554 | RankMamba: Benchmarking Mamba’s Document Ranking Performance in The Era of Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we examine Mamba’s efficacy through the lens of a classical IR task — document ranking. |
Zhichao Xu; | arxiv-cs.IR | 2024-03-27 |
555 | Evaluating Large Language Models for Health-Related Text Classification Tasks with Public Social Media Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We developed three approaches for leveraging LLMs for text classification: employing LLMs as zero-shot classifiers, using LLMs as annotators to annotate training data for supervised classifiers, and utilizing LLMs with few-shot examples for augmentation of manually annotated data. |
Yuting Guo; Anthony Ovadje; Mohammed Ali Al-Garadi; Abeed Sarker; | arxiv-cs.CL | 2024-03-27 |
556 | AcTED: Automatic Acquisition of Typical Event Duration for Semi-supervised Temporal Commonsense QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a voting-driven semi-supervised approach to automatically acquire the typical duration of an event and use it as pseudo-labeled data. |
FELIX VIRGO et. al. | arxiv-cs.CL | 2024-03-27 |
557 | A Survey on Large Language Models from Concept to Implementation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent advancements in Large Language Models (LLMs), particularly those built on Transformer architectures, have significantly broadened the scope of natural language processing (NLP) applications, transcending their initial use in chatbot technology. This paper investigates the multifaceted applications of these models, with an emphasis on the GPT series. |
Chen Wang; Jin Zhao; Jiaqi Gong; | arxiv-cs.CL | 2024-03-27 |
558 | Evaluating The Efficacy of Prompt-Engineered Large Multimodal Models Versus Fine-Tuned Vision Transformers in Image-Based Security Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The success of Large Language Models (LLMs) has led to a parallel rise in the development of Large Multimodal Models (LMMs), which have begun to transform a variety of applications. These sophisticated multimodal models are designed to interpret and analyze complex data by integrating multiple modalities such as text and images, thereby opening new avenues for a range of applications. |
Fouad Trad; Ali Chehab; | arxiv-cs.AI | 2024-03-26 |
559 | Enhancing Legal Document Retrieval: A Multi-Phase Approach with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research focuses on maximizing the potential of prompting by placing it as the final phase of the retrieval system, preceded by the support of two phases: BM25 Pre-ranking and BERT-based Re-ranking. |
HAI-LONG NGUYEN et. al. | arxiv-cs.CL | 2024-03-26 |
560 | MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by the empirical findings, we propose a novel LLM-based Multi-Agent framework for GitHub Issue reSolution, MAGIS, consisting of four agents customized for software evolution: Manager, Repository Custodian, Developer, and Quality Assurance Engineer agents. |
WEI TAO et. al. | arxiv-cs.SE | 2024-03-26 |
561 | Verbing Weirds Language (Models): Evaluation of English Zero-Derivation in Five LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We design a task for testing lexical-syntactic flexibility — the degree to which models can generalize over words in a construction with a non-prototypical part of speech. |
David R. Mortensen; Valentina Izrailevitch; Yunze Xiao; Hinrich Schütze; Leonie Weissweiler; | arxiv-cs.CL | 2024-03-26 |
562 | LLMs in HCI Data Work: Bridging The Gap Between Information Retrieval and Responsible Research Practices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Efficient and accurate information extraction from scientific papers is significant for the literature review process in the rapidly developing field of human-computer interaction research. |
Neda Taghizadeh Serajeh; Iman Mohammadi; Vittorio Fuccella; Mattia De Rosa; | arxiv-cs.HC | 2024-03-26 |
563 | Decoding Probing: Revealing Internal Linguistic Structures in Neural Language Models Using Minimal Pairs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by cognitive neuroscience studies, we introduce a novel ‘decoding probing’ method that uses the minimal pairs benchmark (BLiMP) to probe internal linguistic characteristics in neural language models layer by layer. |
Linyang He; Peili Chen; Ercong Nie; Yuanning Li; Jonathan R. Brennan; | arxiv-cs.CL | 2024-03-25 |
564 | State Space Models As Foundation Models: A Control Theoretic Overview Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: In recent years, there has been a growing interest in integrating linear state-space models (SSM) in deep neural network architectures of foundation models. This is exemplified by … |
Carmen Amo Alonso; Jerome Sieber; M. Zeilinger; | ArXiv | 2024-03-25 |
565 | CYGENT: A Cybersecurity Conversational Agent with Log Summarization Powered By GPT-3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response to the escalating cyber-attacks in the modern IT and IoT landscape, we developed CYGENT, a conversational agent framework powered by the GPT-3.5 turbo model, designed to aid system administrators in ensuring optimal performance and uninterrupted resource availability. |
Prasasthy Balasubramanian; Justin Seby; Panos Kostakos; | arxiv-cs.CR | 2024-03-25 |
566 | GPT-4 Understands Discourse at Least As Well As Humans Do Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We test whether a leading AI system GPT-4 understands discourse as well as humans do, using a standardized test of discourse comprehension. Participants are presented with brief … |
Thomas Shultz; Jamie Wise; Ardavan Salehi Nobandegani; | arxiv-cs.CL | 2024-03-25 |
567 | Grammatical Vs Spelling Error Correction: An Investigation Into The Responsiveness of Transformer-based Language Models Using BART and MarianMT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This project aims to analyze the different kinds of errors that occur in text documents. |
Rohit Raju; Peeta Basa Pati; SA Gandheesh; Gayatri Sanjana Sannala; Suriya KS; | arxiv-cs.CL | 2024-03-25 |
568 | Can Language Models Pretend Solvers? Logic Code Simulation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Subsequently, we introduce a pioneering LLM-based code simulation technique, Dual Chains of Logic (DCoL). |
MINYU CHEN et. al. | arxiv-cs.AI | 2024-03-24 |
569 | A Transformer Approach for Electricity Price Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a novel approach to electricity price forecasting (EPF) using a pure Transformer model. |
Oscar Llorente; Jose Portela; | arxiv-cs.LG | 2024-03-24 |
570 | LlamBERT: Large-scale Low-cost Data Annotation in NLP Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present LlamBERT, a hybrid approach that leverages LLMs to annotate a small subset of large, unlabeled databases and uses the results for fine-tuning transformer encoders like BERT and RoBERTa. |
Bálint Csanády; Lajos Muzsai; Péter Vedres; Zoltán Nádasdy; András Lukács; | arxiv-cs.CL | 2024-03-23 |
571 | VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To support future research, CafeBERT is made publicly available. |
Phong Nguyen-Thuan Do; Son Quoc Tran; Phu Gia Hoang; Kiet Van Nguyen; Ngan Luu-Thuy Nguyen; | arxiv-cs.CL | 2024-03-23 |
572 | Using Large Language Models for OntoClean-based Ontology Refinement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the integration of Large Language Models (LLMs) such as GPT-3.5 and GPT-4 into the ontology refinement process, specifically focusing on the OntoClean methodology. |
Yihang Zhao; Neil Vetter; Kaveh Aryan; | arxiv-cs.AI | 2024-03-23 |
573 | ParFormer: Vision Transformer Baseline with Parallel Local Global Token Mixer and Convolution Attention Patch Embedding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work presents ParFormer as an enhanced transformer architecture that allows the incorporation of different token mixers into a single stage, hence improving feature extraction capabilities. |
NOVENDRA SETYAWAN et. al. | arxiv-cs.CV | 2024-03-22 |
574 | MasonTigers at SemEval-2024 Task 8: Performance Analysis of Transformer-based Models on Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the MasonTigers entry to the SemEval-2024 Task 8 – Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection. |
SADIYA SAYARA CHOWDHURY PUSPO et. al. | arxiv-cs.CL | 2024-03-22 |
575 | Technical Report: Masked Skeleton Sequence Modeling for Learning Larval Zebrafish Behavior Latent Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this report, we introduce a novel self-supervised learning method for extracting latent embeddings from behaviors of larval zebrafish. |
Lanxin Xu; Shuo Wang; | arxiv-cs.CV | 2024-03-22 |
576 | Can Large Language Models Explore In-context? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making. |
Akshay Krishnamurthy; Keegan Harris; Dylan J. Foster; Cyril Zhang; Aleksandrs Slivkins; | arxiv-cs.LG | 2024-03-22 |
577 | Comprehensive Evaluation and Insights Into The Use of Large Language Models in The Automation of Behavior-Driven Development Acceptance Test Formulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this manuscript, we propose a novel approach to enhance BDD practices using large language models (LLMs) to automate acceptance test generation. |
SHANTHI KARPURAPU et. al. | arxiv-cs.SE | 2024-03-22 |
578 | GPT-Connect: Interaction Between Text-Driven Human Motion Generator and 3D Scenes in A Training-free Manner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, naively training a separate scene-aware motion generator in a supervised way can require a large number of motion samples to be laboriously collected and annotated across a wide range of different 3D scenes. To handle this task in a more convenient manner, in this paper we propose a novel GPT-connect framework. |
Haoxuan Qu; Ziyan Guo; Jun Liu; | arxiv-cs.CV | 2024-03-22 |
579 | On Zero-Shot Counterspeech Generation By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Counterspeech generation is one such key task where efforts are made to develop generative models by fine-tuning LLMs with hatespeech-counterspeech pairs, but none of these attempts explores the intrinsic properties of large language models in zero-shot settings. In this work, we present a comprehensive analysis of the performances of four LLMs, namely GPT-2, DialoGPT, ChatGPT and FlanT5, in zero-shot settings for counterspeech generation, which is the first of its kind. |
Punyajoy Saha; Aalok Agrawal; Abhik Jana; Chris Biemann; Animesh Mukherjee; | arxiv-cs.CL | 2024-03-22 |
580 | K-Act2Emo: Korean Commonsense Knowledge Graph for Indirect Emotional Expression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In many literary texts, emotions are indirectly conveyed through descriptions of actions, facial expressions, and appearances, necessitating emotion inference for narrative understanding. In this paper, we introduce K-Act2Emo, a Korean commonsense knowledge graph (CSKG) comprising 1,900 indirect emotional expressions and the emotions inferable from them. |
Kyuhee Kim; Surin Lee; Sangah Lee; | arxiv-cs.CL | 2024-03-21 |
581 | LLM-based Extraction of Contradictions from Patents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper goes one step further, as it presents a method to extract TRIZ contradictions from patent texts based on Prompt Engineering using a generative Large Language Model (LLM), namely OpenAI’s GPT-4. |
Stefan Trapp; Joachim Warschat; | arxiv-cs.CL | 2024-03-21 |
582 | Extracting Emotion Phrases from Tweets Using BART Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we applied an approach to sentiment analysis based on a question-answering framework. |
Mahdi Rezapour; | arxiv-cs.CL | 2024-03-20 |
583 | Retina Vision Transformer (RetinaViT): Introducing Scaled Patches Into Vision Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Humans see low and high spatial frequency components at the same time, and combine the information from both to form a visual scene. Drawing on this neuroscientific inspiration, we propose an altered Vision Transformer architecture where patches from scaled down versions of the input image are added to the input of the first Transformer Encoder layer. |
Yuyang Shu; Michael E. Bain; | arxiv-cs.CV | 2024-03-20 |
584 | AUD-TGN: Advancing Action Unit Detection with Temporal Convolution and GPT-2 in Wild Audiovisual Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional methods for integrating such multimodal information often stumble, leading to less-than-ideal outcomes in the task of facial action unit detection. To overcome these shortcomings, we propose a novel approach utilizing audio-visual multimodal data. |
JUN YU et. al. | arxiv-cs.CV | 2024-03-20 |
585 | Open Access NAO (OAN): A ROS2-based Software Framework for HRI Applications with The NAO Robot Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new software framework for HRI experimentation with the sixth version of the common NAO robot produced by the United Robotics Group. |
Antonio Bono; Kenji Brameld; Luigi D’Alfonso; Giuseppe Fedele; | arxiv-cs.RO | 2024-03-20 |
586 | Ax-to-Grind Urdu: Benchmark Dataset for Urdu Fake News Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we curate and contribute the first and largest publicly available dataset for Urdu FND, Ax-to-Grind Urdu, to bridge the identified gaps and limitations of existing Urdu datasets in the literature. |
Sheetal Harris; Jinshuo Liu; Hassan Jalil Hadi; Yue Cao; | arxiv-cs.CL | 2024-03-20 |
587 | End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a neuro-symbolic framework for jointly learning structured states and symbolic policies, whose key idea is to distill the vision foundation model into an efficient perception module and refine it during policy learning. |
LIRUI LUO et. al. | arxiv-cs.AI | 2024-03-19 |
588 | Navigating Compiler Errors with AI Assistance — A Study of GPT Hints in An Introductory Programming Course Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We examined the efficacy of AI-assisted learning in an introductory programming course at the university level by using a GPT-4 model to generate personalized hints for compiler errors within a platform for automated assessment of programming assignments. |
Maciej Pankiewicz; Ryan S. Baker; | arxiv-cs.SE | 2024-03-19 |
589 | Generating Automatic Feedback on UI Mockups with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the potential of using large language models for automatic feedback. |
Peitong Duan; Jeremy Warner; Yang Li; Bjoern Hartmann; | arxiv-cs.HC | 2024-03-19 |
590 | Automated Data Curation for Robust Language Model Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an automated data curation pipeline CLEAR (Confidence-based LLM Evaluation And Rectification) for instruction tuning datasets, that can be used with any LLM and fine-tuning procedure. |
Jiuhai Chen; Jonas Mueller; | arxiv-cs.CL | 2024-03-19 |
591 | Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a new configuration for encoder-decoder models that improves efficiency on structured output and decomposable tasks where multiple outputs are required for a single shared input. |
BO-RU LU et. al. | arxiv-cs.CL | 2024-03-19 |
592 | TT-BLIP: Enhancing Fake News Detection Using BLIP and Tri-Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an end-to-end model called TT-BLIP that applies the bootstrapping language-image pretraining for unified vision-language understanding and generation (BLIP) for three types of information: BERT and BLIP-Txt for text, ResNet and BLIP-Img for images, and bidirectional BLIP encoders for multimodal information. |
Eunjee Choi; Jong-Kook Kim; | arxiv-cs.LG | 2024-03-19 |
593 | Automatic Information Extraction From Employment Tribunal Judgements Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive study on the application of GPT-4, a large language model, for automatic information extraction from UK Employment Tribunal (UKET) cases. |
Joana Ribeiro de Faria; Huiyuan Xie; Felix Steffek; | arxiv-cs.CL | 2024-03-19 |
594 | LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The challenge is that information entropy may be a suboptimal compression metric: (i) it only leverages unidirectional context and may fail to capture all essential information needed for prompt compression; (ii) it is not aligned with the prompt compression objective. To address these issues, we propose a data distillation procedure to derive knowledge from an LLM to compress prompts without losing crucial information, and meantime, introduce an extractive text compression dataset. |
ZHUOSHI PAN et. al. | arxiv-cs.CL | 2024-03-19 |
595 | CICLe: Conformal In-Context Learning for Largescale Multi-Class Food Risk Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We describe the dataset and benchmark naive, traditional, and Transformer models. |
Korbinian Randl; John Pavlopoulos; Aron Henriksson; Tony Lindgren; | arxiv-cs.CL | 2024-03-18 |
596 | Shifting The Lens: Detecting Malware in Npm Ecosystem with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The goal of this study is to assist security analysts in identifying malicious packages through the empirical study of large language models (LLMs) to detect potential malware in the npm ecosystem. |
Nusrat Zahan; Philipp Burckhardt; Mikola Lysenko; Feross Aboukhadijeh; Laurie Williams; | arxiv-cs.CR | 2024-03-18 |
597 | Prompt-based and Fine-tuned GPT Models for Context-Dependent and -Independent Deductive Coding in Social Annotation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: GPT has demonstrated impressive capabilities in executing various natural language processing (NLP) and reasoning tasks, showcasing its potential for deductive coding in social … |
CHENYU HOU et. al. | Proceedings of the 14th Learning Analytics and Knowledge … | 2024-03-18 |
598 | How Far Are We on The Decision-Making of LLMs? Evaluating LLMs’ Gaming Ability in Multi-Agent Environments Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Subsequently, we introduce our framework, GAMA-Bench, including eight classical multi-agent games. |
JEN-TSE HUANG et. al. | arxiv-cs.AI | 2024-03-18 |
599 | Collage Prompting: Budget-Friendly Visual Recognition with GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite its impressive capabilities, the financial cost associated with GPT-4V’s inference presents a substantial barrier for its wide use. To address this challenge, our work introduces Collage Prompting, a budget-friendly prompting approach that concatenates multiple images into a single visual input. |
Siyu Xu; Yunke Wang; Daochang Liu; Chang Xu; | arxiv-cs.CV | 2024-03-18 |
600 | Embracing The Generative AI Revolution: Advancing Tertiary Education in Cybersecurity with GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigated the impact of GPTs, specifically ChatGPT, on tertiary education in cybersecurity, and provided recommendations for universities to adapt their curricula to meet the evolving needs of the industry. |
Raza Nowrozy; David Jam; | arxiv-cs.CY | 2024-03-17 |
601 | Reasoning in Transformers – Mitigating Spurious Correlations and Reasoning Shortcuts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we investigate to what extent transformers can be trained to a) approximate reasoning in propositional logic while b) avoiding known reasoning shortcuts via spurious correlations in the training data. |
Daniel Enström; Viktor Kjellberg; Moa Johansson; | arxiv-cs.LG | 2024-03-17 |
602 | An Empirical Study on JIT Defect Prediction Based on BERT-style Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we perform a systematic empirical study to understand the impact of the settings of the fine-tuning process on BERT-style pre-trained model for JIT defect prediction. |
Yuxiang Guo; Xiaopeng Gao; Bo Jiang; | arxiv-cs.SE | 2024-03-17 |
603 | Using An LLM to Turn Sign Spottings Into Spoken Language Sentences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a hybrid SLT approach, Spotter+GPT, that utilizes a sign spotter and a powerful Large Language Model (LLM) to improve SLT performance. |
Ozge Mercanoglu Sincan; Necati Cihan Camgoz; Richard Bowden; | arxiv-cs.CV | 2024-03-15 |
604 | ATOM: Asynchronous Training of Massive Models for Deep Learning in A Decentralized Environment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce ATOM, a resilient distributed training framework designed for asynchronous training of vast models in a decentralized setting using cost-effective hardware, including consumer-grade GPUs and Ethernet. |
Xiaofeng Wu; Jia Rao; Wei Chen; | arxiv-cs.DC | 2024-03-15 |
605 | From Words to Routes: Applying Large Language Models to Vehicle Routing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The success of LLMs in these tasks leads us to wonder: What is the ability of LLMs to solve vehicle routing problems (VRPs) with natural language task descriptions? In this work, we study this question in three steps. |
Zhehui Huang; Guangyao Shi; Gaurav S. Sukhatme; | arxiv-cs.CL | 2024-03-15 |
606 | GiT: Towards Generalist Vision Transformer Through Universal Language Interface Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a simple, yet effective framework, called GiT, simultaneously applicable for various vision tasks only with a vanilla ViT. |
HAIYANG WANG et. al. | arxiv-cs.CV | 2024-03-14 |
607 | Reality Bites: Assessing The Realism of Driving Scenarios with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large Language Models (LLMs) are demonstrating outstanding potential for tasks such as text generation, summarization, and classification. |
Jiahui Wu; Chengjie Lu; Aitor Arrieta; Tao Yue; Shaukat Ali; | arxiv-cs.SE | 2024-03-14 |
608 | Sabiá-2: A New Generation of Portuguese Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Sabiá-2, a family of large language models trained on Portuguese texts. |
Thales Sales Almeida; Hugo Abonizio; Rodrigo Nogueira; Ramon Pires; | arxiv-cs.CL | 2024-03-14 |
609 | FBPT: A Fully Binary Point Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel Fully Binary Point Cloud Transformer (FBPT) model which has the potential to be widely applied and expanded in the fields of robotics and mobile devices. |
Zhixing Hou; Yuzhang Shang; Yan Yan; | arxiv-cs.CV | 2024-03-14 |
610 | Evaluating LLMs for Gender Disparities in Notable Persons Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study examines the use of Large Language Models (LLMs) for retrieving factual information, addressing concerns over their propensity to produce factually incorrect hallucinated responses or to decline to answer prompts at all. |
Lauren Rhue; Sofie Goethals; Arun Sundararajan; | arxiv-cs.CL | 2024-03-14 |
611 | AI on AI: Exploring The Utility of GPT As An Expert Annotator of AI Publications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our results indicate that with effective prompt engineering, chatbots can be used as reliable data annotators even where subject-area expertise is required. To evaluate the utility of chatbot-annotated datasets on downstream classification tasks, we train a new classifier on GPT-labeled data and compare its performance to the arXiv-trained model. |
Autumn Toney-Wails; Christian Schoeberl; James Dunham; | arxiv-cs.CL | 2024-03-14 |
612 | Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Targeting VL PEFT tasks, we propose a family of operations, called routing functions, to enhance VL alignment in the low-rank bottlenecks. |
Tingyu Qu; Tinne Tuytelaars; Marie-Francine Moens; | arxiv-cs.CV | 2024-03-14 |
613 | ViTCN: Vision Transformer Contrastive Network For Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Zhang et al. proposed a dataset called RAVEN which can be used to test a machine learning model’s abstract reasoning ability. In this paper, we propose the Vision Transformer Contrastive Network, which builds on previous work with the Contrastive Perceptual Inference network (CoPiNet), which set a new benchmark for permutation-invariant models on Raven Progressive Matrices by incorporating contrast effects from psychology, cognition, and education, and extends this foundation by leveraging the cutting-edge Vision Transformer architecture. |
Bo Song; Yuanhao Xu; Yichao Wu; | arxiv-cs.CV | 2024-03-14 |
614 | A Continued Pretrained LLM Approach for Automatic Medical Note Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce HEAL, the first continuously trained 13B LLaMA2-based LLM that is purpose-built for medical conversations and measured on automated scribing. |
DONG YUAN et. al. | arxiv-cs.CL | 2024-03-13 |
615 | GPT, Ontology, and CAABAC: A Tripartite Personalized Access Control Model Anchored By Compliance, Context and Attribute Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents the GPT-Onto-CAABAC framework, integrating Generative Pretrained Transformer (GPT), medical-legal ontologies and Context-Aware Attribute-Based Access Control (CAABAC) to enhance EHR access security. |
Raza Nowrozy; Khandakar Ahmed; Hua Wang; | arxiv-cs.CY | 2024-03-13 |
616 | Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare four of the currently most relevant large, web-crawled corpora (CC100, MaCoCu, mC4 and OSCAR) across eleven lower-resourced European languages. |
RIK VAN NOORD et. al. | arxiv-cs.CL | 2024-03-13 |
617 | In-context Learning Enables Multimodal Large Language Models to Classify Cancer Pathology Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In language processing, in-context learning provides an alternative, where models learn from within prompts, bypassing the need for parameter updates. |
DYKE FERBER et. al. | arxiv-cs.CV | 2024-03-12 |
618 | Towards A Clinically Accessible Radiology Foundation Model: Open-access and Lightweight, with Automated Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For training, we assemble a large dataset of over 697 thousand radiology image-text pairs. For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation. |
JUAN MANUEL ZAMBRANO CHAVES et. al. | arxiv-cs.CL | 2024-03-12 |
619 | Rethinking ASTE: A Minimalist Tagging Scheme Alongside Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing approaches to ASTE often complicate the task with additional structures or external data. In this research, we propose a novel tagging scheme and employ a contrastive learning approach to mitigate these challenges. |
Qiao Sun; Liujia Yang; Minghao Ma; Nanyang Ye; Qinying Gu; | arxiv-cs.CL | 2024-03-12 |
620 | The Future of Document Indexing: GPT and Donut Revolutionize Table of Content Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Industrial projects rely heavily on lengthy, complex specification documents, making tedious manual extraction of structured information a major bottleneck. This paper introduces an innovative approach to automate this process, leveraging the capabilities of two cutting-edge AI models: Donut, a model that extracts information directly from scanned documents without OCR, and OpenAI GPT-3.5 Turbo, a robust large language model. |
Degaga Wolde Feyisa; Haylemicheal Berihun; Amanuel Zewdu; Mahsa Najimoghadam; Marzieh Zare; | arxiv-cs.IR | 2024-03-12 |
621 | GPT-generated Text Detection: Benchmark Dataset and Tensor-based Detection Method Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present GPT Reddit Dataset (GRiD), a novel Generative Pretrained Transformer (GPT)-generated text detection dataset designed to assess the performance of detection models in identifying generated responses from ChatGPT. |
Zubair Qazi; William Shiao; Evangelos E. Papalexakis; | arxiv-cs.CL | 2024-03-12 |
622 | SIFiD: Reassess Summary Factual Inconsistency Detection with LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we reassess summary inconsistency detection with LLMs, comparing the performances of GPT-3.5 and GPT-4. |
JIUDING YANG et. al. | arxiv-cs.CL | 2024-03-12 |
623 | Language, Camera, Autonomy! Prompt-engineered Robot Control for Rapidly Evolving Deployment Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The Context-observant LLM-Enabled Autonomous Robots (CLEAR) platform offers a general solution for large language model (LLM)-enabled robot autonomy. CLEAR-controlled robots use … |
JACOB P. MACDONALD et. al. | Companion of the 2024 ACM/IEEE International Conference on … | 2024-03-11 |
624 | Stealing Part of A Production Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI’s ChatGPT or Google’s PaLM-2. |
NICHOLAS CARLINI et. al. | arxiv-cs.CR | 2024-03-11 |
625 | QuantTune: Optimizing Model Quantization with Adaptive Outlier-Driven Fine Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, significant challenges arise during post-training linear quantization, leading to noticeable reductions in inference accuracy. Our study focuses on uncovering the underlying causes of these accuracy drops and proposing a quantization-friendly fine-tuning method, QuantTune. |
JIUN-MAN CHEN et. al. | arxiv-cs.CV | 2024-03-11 |
626 | Development of A Reliable and Accessible Caregiving Language Model (CaLM) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we focused on caregivers of individuals with Alzheimer’s Disease Related Dementias. |
BAMBANG PARMANTO et. al. | arxiv-cs.CL | 2024-03-11 |
627 | Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We use these in another set of transformer encoder layers to learn the inter-chunk representations. We analyze the adaptability of Large Language Models (LLMs) with multi-billion parameters (GPT-Neo and GPT-J) within the hierarchical framework of MESc and compare them with their standalone performance on legal texts. |
Nishchal Prasad; Mohand Boughanem; Taoufiq Dkaki; | arxiv-cs.CL | 2024-03-11 |
628 | S3L: Spectrum Transformer for Self-Supervised Learning in Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the realm of Earth observation and remote sensing data analysis, the advancement of hyperspectral imaging (HSI) classification technology is of paramount importance. … |
Hufeng Guo; Wenyi Liu; | Remote. Sens. | 2024-03-10 |
629 | LLMs Still Can’t Avoid Instanceof: An Investigation Into GPT-3.5, GPT-4 and Bard’s Capacity to Handle Object-Oriented Programming Assignments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we experimented with three prominent LLMs – GPT-3.5, GPT-4, and Bard – to solve real-world OOP exercises used in educational settings, subsequently validating their solutions using an Automatic Assessment Tool (AAT). |
Bruno Pereira Cipriano; Pedro Alves; | arxiv-cs.SE | 2024-03-10 |
630 | GPT As Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Multimodal large language models (MLLMs) are designed to process and integrate information from multiple sources, such as text, speech, images, and videos. |
HAO LU et. al. | arxiv-cs.CV | 2024-03-09 |
631 | Will GPT-4 Run DOOM? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that GPT-4’s reasoning and planning capabilities extend to the 1993 first-person shooter Doom. |
Adrian de Wynter; | arxiv-cs.CL | 2024-03-08 |
632 | To Err Is Human, But Llamas Can Learn It Too Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores enhancing grammatical error correction (GEC) through artificial error generation (AEG) using language models (LMs). |
Agnes Luhtaru; Taido Purason; Martin Vainikko; Maksym Del; Mark Fishel; | arxiv-cs.CL | 2024-03-08 |
633 | How Well Do Multi-modal LLMs Interpret CT Scans? An Auto-Evaluation Framework for Analyses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this is challenging mainly due to the scarcity of adequate datasets and reference standards for evaluation. This study aims to bridge this gap by introducing a novel evaluation framework, named “GPTRadScore”. |
QINGQING ZHU et. al. | arxiv-cs.AI | 2024-03-08 |
634 | RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We explore how iteratively revising a chain of thoughts with the help of information retrieval significantly improves large language models’ reasoning and generation ability in long-horizon generation tasks, while greatly mitigating hallucination. |
ZIHAO WANG et. al. | arxiv-cs.CL | 2024-03-08 |
635 | The Impact of Quantization on The Robustness of Transformer-based Text Classifiers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the effect of quantization on the robustness of Transformer-based models. |
Seyed Parsa Neshaei; Yasaman Boreshban; Gholamreza Ghassem-Sani; Seyed Abolghasem Mirroshandel; | arxiv-cs.CL | 2024-03-08 |
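For entry 635, the basic experimental ingredient is easy to reproduce with PyTorch's built-in post-training dynamic quantization, which stores `nn.Linear` weights in int8 and dequantizes them on the fly. The toy classifier below merely stands in for the Transformer-based text classifiers the paper actually studies.

```python
import torch
import torch.nn as nn

# Toy stand-in for a Transformer text-classifier head.
model = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, 2))

# Post-training dynamic quantization of all Linear modules to int8.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 768)
print(model(x))      # full-precision logits
print(quantized(x))  # int8-weight logits; robustness is then compared on perturbed inputs
```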
636 | An In-depth Evaluation of GPT-4 in Sentence Simplification with Error-based Human Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We design an error-based human annotation framework to assess the GPT-4’s simplification capabilities. |
Xuanxin Wu; Yuki Arase; | arxiv-cs.CL | 2024-03-07 |
637 | Federated Recommendation Via Hybrid Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose GPT-FedRec, a federated recommendation framework leveraging ChatGPT and a novel hybrid Retrieval Augmented Generation (RAG) mechanism. |
Huimin Zeng; Zhenrui Yue; Qian Jiang; Dong Wang; | arxiv-cs.IR | 2024-03-07 |
638 | A Large Scale RCT on Effective Error Messages in CS1 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, we evaluate the most effective error message types through a large-scale randomized controlled trial conducted in an open-access, online introductory computer … |
Sierra Wang; John C. Mitchell; C. Piech; | Proceedings of the 55th ACM Technical Symposium on Computer … | 2024-03-07 |
639 | Feedback-Generation for Programming Exercises With GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Ever since Large Language Models (LLMs) and related applications have become broadly available, several studies investigated their potential for assisting educators and supporting students in higher education. |
Imen Azaiz; Natalie Kiesler; Sven Strickroth; | arxiv-cs.AI | 2024-03-07 |
640 | Assessing The Aesthetic Evaluation Capabilities of GPT-4 with Vision: Insights from Group and Individual Assessments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, it has been recognized that large language models demonstrate high performance on various intellectual tasks. |
Yoshia Abe; Tatsuya Daikoku; Yasuo Kuniyoshi; | arxiv-cs.AI | 2024-03-06 |
641 | Rapidly Developing High-quality Instruction Data and Evaluation Benchmark for Large Language Models with Minimal Human Effort: A Case Study on Japanese Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Instead of following the popular practice of directly translating existing English resources into Japanese (e.g., Japanese-Alpaca), we propose an efficient self-instruct method based on GPT-4. |
YIKUN SUN et. al. | arxiv-cs.CL | 2024-03-06 |
642 | Whodunit: Classifying Code As Human Authored or GPT-4 Generated — A Case Study on CodeChef Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study shows that code stylometry is a promising approach for distinguishing between GPT-4 generated code and human-authored code. |
Oseremen Joy Idialu; Noble Saji Mathews; Rungroj Maipradit; Joanne M. Atlee; Mei Nagappan; | arxiv-cs.SE | 2024-03-06 |
643 | Probabilistic Topic Modelling with Transformer Representations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose the Transformer-Representation Neural Topic Model (TNTM), which combines the benefits of topic representations in transformer-based embedding spaces and probabilistic modelling. |
Arik Reuter; Anton Thielmann; Christoph Weisser; Benjamin Säfken; Thomas Kneib; | arxiv-cs.LG | 2024-03-06 |
644 | Can Large Language Models Do Analytical Reasoning? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the analytical reasoning abilities of cutting-edge Large Language Models on sports tasks. |
YEBOWEN HU et. al. | arxiv-cs.CL | 2024-03-06 |
645 | Designing Informative Metrics for Few-Shot Example Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a complexity-based prompt selection approach for sequence tagging tasks. |
Rishabh Adiga; Lakshminarayanan Subramanian; Varun Chandrasekaran; | arxiv-cs.CL | 2024-03-06 |
646 | Japanese-English Sentence Translation Exercises Dataset for Automatic Grading Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the task of automatic assessment of Sentence Translation Exercises (STEs), that have been used in the early stage of L2 language learning. |
NAOKI MIURA et. al. | arxiv-cs.CL | 2024-03-05 |
647 | AI Insights: A Case Study on Utilizing ChatGPT Intelligence for Research Paper Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper discusses the effectiveness of leveraging Chatbot: Generative Pre-trained Transformer (ChatGPT) versions 3.5 and 4 for analyzing research papers for effective writing of scientific literature surveys. |
Anjalee De Silva; Janaka L. Wijekoon; Rashini Liyanarachchi; Rrubaa Panchendrarajan; Weranga Rajapaksha; | arxiv-cs.AI | 2024-03-05 |
648 | Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled By GPT-4 for Enhanced Interpretability and Public Engagement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, inquiring into and understanding socio-cultural and institutional factors requires complex techniques, which often hinders the public’s understanding of flood risks. To overcome these challenges, our study introduces an innovative solution: a customized AI Assistant powered by the GPT-4 Large Language Model. |
Rafaela Martelo; Ruo-Qian Wang; | arxiv-cs.AI | 2024-03-05 |
649 | On The Limitations of Fine-tuned Judge Models for LLM Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recently, there has been a growing trend of utilizing Large Language Model (LLM) to evaluate the quality of other LLMs. |
HUI HUANG et. al. | arxiv-cs.CL | 2024-03-05 |
650 | Design2Code: How Far Are We From Automating Front-End Engineering? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This can enable a new paradigm of front-end development, in which multimodal LLMs might directly convert visual designs into code implementations. In this work, we formalize this as a Design2Code task and conduct comprehensive benchmarking. |
Chenglei Si; Yanzhe Zhang; Zhengyuan Yang; Ruibo Liu; Diyi Yang; | arxiv-cs.CL | 2024-03-05 |
651 | InjectTST: A Transformer Method of Injecting Global Information Into Independent Channels for Long Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the problem, this paper proposes InjectTST, a method for injecting global information into channel-independent Transformers. |
CE CHI et. al. | arxiv-cs.LG | 2024-03-05 |
652 | Evolution Transformer: In-Context Evolutionary Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: An alternative promising approach is to leverage data and directly discover powerful optimization principles via meta-optimization. In this work, we follow such a paradigm and introduce Evolution Transformer, a causal Transformer architecture, which can flexibly characterize a family of Evolution Strategies. |
Robert Tjarko Lange; Yingtao Tian; Yujin Tang; | arxiv-cs.AI | 2024-03-05 |
653 | JMI at SemEval 2024 Task 3: Two-step Approach for Multimodal ECAC Using In-context Learning with GPT and Instruction-tuned Llama Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents our system development for SemEval-2024 Task 3: The Competition of Multimodal Emotion Cause Analysis in Conversations. |
Mohammed Abbas Ansari; Chandni Saxena; Tanvir Ahmad; | arxiv-cs.CL | 2024-03-05 |
654 | PHAnToM: Personality Has An Effect on Theory-of-Mind Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In today’s landscape where role-play is a common strategy when using LLMs, our research highlights the need for caution, as models that adopt specific personas with personalities potentially also alter their reasoning abilities in an unexpected manner. |
FIONA ANTING TAN et. al. | arxiv-cs.CL | 2024-03-04 |
655 | Using LLMs for The Extraction and Normalization of Product Attribute Values Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Web Data Commons – Product Attribute Value Extraction (WDC-PAVE) benchmark dataset for our experiments. |
Alexander Brinkmann; Nick Baumann; Christian Bizer; | arxiv-cs.CL | 2024-03-04 |
656 | What Is Missing in Multilingual Visual Reasoning and How to Fix It Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: NLP models today strive for supporting multiple languages and modalities, improving accessibility for diverse users. In this paper, we evaluate their multilingual, multimodal … |
Yueqi Song; Simran Khanuja; Graham Neubig; | ArXiv | 2024-03-03 |
657 | LM4OPT: Unveiling The Potential of Large Language Models in Formulating Mathematical Optimization Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the rapidly evolving field of natural language processing, the translation of linguistic descriptions into mathematical formulation of optimization problems presents a formidable challenge, demanding intricate understanding and processing capabilities from Large Language Models (LLMs). This study compares prominent LLMs, including GPT-3.5, GPT-4, and Llama-2-7b, in zero-shot and one-shot settings for this task. |
Tasnim Ahmed; Salimur Choudhury; | arxiv-cs.CL | 2024-03-02 |
658 | Analysis of Privacy Leakage in Federated Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: With the rapid adoption of Federated Learning (FL) as the training and tuning protocol for applications utilizing Large Language Models (LLMs), recent research highlights the need … |
Minh N. Vu; Truc D. T. Nguyen; Tre’ R. Jeter; My T. Thai; | International Conference on Artificial Intelligence and … | 2024-03-02 |
659 | Improving The Validity of Automatically Generated Feedback Via Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we address both problems of automatically generating and evaluating feedback while considering both correctness and alignment. |
Alexander Scarlatos; Digory Smith; Simon Woodhead; Andrew Lan; | arxiv-cs.CL | 2024-03-02 |
660 | LAB: Large-Scale Alignment for ChatBots Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces LAB (Large-scale Alignment for chatBots), a novel methodology designed to overcome the scalability challenges in the instruction-tuning phase of large language model (LLM) training. |
SHIVCHANDER SUDALAIRAJ et. al. | arxiv-cs.CL | 2024-03-01 |
661 | STP: Self-supervised Transfer Learning Based on Transformer for Noninvasive Blood Pressure Estimation Using Photoplethysmography Related Papers Related Patents Related Grants Related Venues Related Experts View |
CHENBIN MA et. al. | Expert Syst. Appl. | 2024-03-01 |
662 | LCDFormer: Long-term Correlations Dual-graph Transformer for Traffic Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jiongbiao Cai; Chia-Hung Wang; Kun Hu; | Expert Syst. Appl. | 2024-03-01 |
663 | Surveying The Dead Minds: Historical-Psychological Text Analysis with Contextualized Construct Representation (CCR) for Classical Chinese Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we develop a pipeline for historical-psychological text analysis in classical Chinese. |
Yuqi Chen; Sixuan Li; Ying Li; Mohammad Atari; | arxiv-cs.CL | 2024-03-01 |
664 | 2-D Transformer-Based Approach for Process Monitoring of Metal 3-D Printing Via Coaxial High-Speed Imaging Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Defects in the metal 3-D printing process exhibit randomness and low frequency, making them difficult to predict and control. This severely hinders the application of this … |
WEIHAO ZHANG et. al. | IEEE Transactions on Industrial Informatics | 2024-03-01 |
665 | GTMFuse: Group-attention Transformer-driven Multiscale Dense Feature-enhanced Network for Infrared and Visible Image Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View |
LIYE MEI et. al. | Knowl. Based Syst. | 2024-03-01 |
666 | A Systematic Evaluation of Large Language Models for Generating Programming Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In most LeetCode and GeeksforGeeks coding contests evaluated in this study, GPT-4 employing the optimal prompt strategy outperforms 85 percent of human participants. |
Wenpin Hou; Zhicheng Ji; | arxiv-cs.SE | 2024-03-01 |
667 | K-NN Attention-based Video Vision Transformer for Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
Weirong Sun; Yujun Ma; Ruili Wang; | Neurocomputing | 2024-03-01 |
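The k-NN attention named in entry 667 keeps, for each query, only its k most similar keys. A minimal generic sketch follows; the paper's video-specific tokenization and temporal design are not reproduced here.

```python
import torch

def knn_attention(q, k, v, topk=8):
    """Attention restricted to each query's top-k keys; the rest are masked out."""
    scores = q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5   # (..., Lq, Lk)
    kth = scores.topk(topk, dim=-1).values[..., -1:]        # k-th largest score per query
    scores = scores.masked_fill(scores < kth, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(2, 16, 64)
out = knn_attention(q, k, v, topk=4)   # (2, 16, 64)
```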
668 | Multi-modal Person Re-identification Based on Transformer Relational Regularization Related Papers Related Patents Related Grants Related Venues Related Experts View |
XIANGTIAN ZHENG et. al. | Inf. Fusion | 2024-03-01 |
669 | Query-OPT: Optimizing Inference of Large Language Models Via Multi-Query Instructions in Meeting Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, repeated calls to the LLM inference endpoints would significantly increase the costs of using them in production, making LLMs impractical for many real-world use cases. To address this problem, in this paper, we investigate whether combining the queries for the same input context in a single prompt to minimize repeated calls can be successfully used in meeting summarization. |
Md Tahmid Rahman Laskar; Elena Khasanova; Xue-Yong Fu; Cheng Chen; Shashi Bhushan TN; | arxiv-cs.CL | 2024-02-29 |
670 | Here’s A Free Lunch: Sanitizing Backdoored Models with Model Merge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Compared to multiple advanced defensive approaches, our method offers an effective and efficient inference-stage defense against backdoor attacks on classification and instruction-tuned tasks without additional resources or specific knowledge. |
ANSH ARORA et. al. | arxiv-cs.CL | 2024-02-29 |
671 | PeLLE: Encoder-based Language Models for Brazilian Portuguese Based on Open Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we present PeLLE, a family of large language models based on the RoBERTa architecture, for Brazilian Portuguese, trained on curated, open data from the Carolina corpus. |
GUILHERME LAMARTINE DE MELLO et. al. | arxiv-cs.CL | 2024-02-29 |
672 | RL-GPT: Integrating Reinforcement Learning and Code-as-policy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To seamlessly integrate both modalities, we introduce a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent. |
SHAOTENG LIU et. al. | arxiv-cs.AI | 2024-02-29 |
673 | PROC2PDDL: Open-Domain Planning Representations from Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent approaches have used language models to predict a planning domain definition (e.g., PDDL) but have only been evaluated in closed-domain simulated environments. To address this, we present Proc2PDDL, the first dataset containing open-domain procedural texts paired with expert-annotated PDDL representations. |
TIANYI ZHANG et. al. | arxiv-cs.CL | 2024-02-29 |
674 | Can GPT Improve The State of Prior Authorization Via Guideline Based Automated Question Answering? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate whether GPT can validate numerous key factors, in turn helping health plans reach a decision drastically faster. |
Shubham Vatsal; Ayush Singh; Shabnam Tafreshi; | arxiv-cs.CL | 2024-02-28 |
675 | A Language Model Based Framework for New Concept Placement in Ontologies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In all steps, we propose to leverage neural methods, where we apply embedding-based methods and contrastive learning with Pre-trained Language Models (PLMs) such as BERT for edge search, and adapt a BERT fine-tuning-based multi-label Edge-Cross-encoder, and Large Language Models (LLMs) such as GPT series, FLAN-T5, and Llama 2, for edge selection. |
Hang Dong; Jiaoyan Chen; Yuan He; Yongsheng Gao; Ian Horrocks; | arxiv-cs.CL | 2024-02-27 |
676 | STC-ViT: Spatio Temporal Continuous Vision Transformer for Weather Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, transformers are discrete models which limit their ability to learn the continuous spatio-temporal features of the dynamical weather system. We address this issue with STC-ViT, a Spatio-Temporal Continuous Vision Transformer for weather forecasting. |
Hira Saleem; Flora Salim; Cormac Purcell; | arxiv-cs.LG | 2024-02-27 |
677 | Benchmarking GPT-4 on Algorithmic Problems: A Systematic Evaluation of Prompting Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we offer a systematic benchmarking of GPT-4, one of the most advanced LLMs available, on three algorithmic tasks characterized by the possibility to control the problem difficulty with two parameters. |
Flavio Petruzzellis; Alberto Testolin; Alessandro Sperduti; | arxiv-cs.CL | 2024-02-27 |
678 | Variational Learning Is Effective for Large Deep Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show several new use cases of IVON where we improve finetuning and model merging in Large Language Models, accurately predict generalization error, and faithfully estimate sensitivity to data. |
YUESONG SHEN et. al. | arxiv-cs.LG | 2024-02-27 |
679 | Can GPT-4 Identify Propaganda? Annotation and Detection of Propaganda Spans in News Articles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The majority of the recent initiatives targeting medium to low-resource languages produced relatively small annotated datasets, with a skewed distribution, posing challenges for the development of sophisticated propaganda detection models. To address this challenge, we carefully develop the largest propaganda dataset to date, ArPro, comprised of 8K paragraphs from newspaper articles, labeled at the text span level following a taxonomy of 23 propagandistic techniques. |
Maram Hasanain; Fatema Ahmed; Firoj Alam; | arxiv-cs.CL | 2024-02-27 |
680 | Latent Attention for Linear Time Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The time complexity of the standard attention mechanism in a transformer scales quadratically with the length of the sequence. We introduce a method to reduce this to linear scaling with time, based on defining attention via latent vectors. |
Rares Dolga; Marius Cobzarenco; David Barber; | arxiv-cs.CL | 2024-02-27 |
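The construction in entry 680 can be illustrated by routing attention through a small set of m latent vectors: the latents first summarize the sequence, then queries read from the summaries, for O(L·m) rather than O(L²) cost. This is a hedged sketch of the general latent-attention idea, not the paper's exact formulation.

```python
import torch

def latent_attention(q, k, v, latents):
    """Two-stage attention through m latent vectors (m << sequence length L)."""
    d = q.shape[-1] ** 0.5
    # Stage 1: each latent attends over all L keys and summarizes the values.
    summary = torch.softmax(latents @ k.transpose(-2, -1) / d, dim=-1) @ v   # (m, dim)
    # Stage 2: each query attends over the m latent summaries only.
    return torch.softmax(q @ latents.transpose(-2, -1) / d, dim=-1) @ summary

L, m, dim = 128, 16, 64
q = k = v = torch.randn(L, dim)
latents = torch.randn(m, dim)              # learnable parameters in a real model
out = latent_attention(q, k, v, latents)   # (L, dim)
```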
681 | CAPT: Category-level Articulation Estimation from A Single Point Cloud Using Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose CAPT: category-level articulation estimation from a point cloud using Transformer. |
Lian Fu; Ryoichi Ishikawa; Yoshihiro Sato; Takeshi Oishi; | arxiv-cs.CV | 2024-02-27 |
682 | GeoLLM: Extracting Geospatial Knowledge from Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here we explore the question of whether the vast amounts of knowledge found in Internet language corpora, now compressed within large language models (LLMs), can be leveraged for geospatial prediction tasks. |
ROHIN MANVI et. al. | iclr | 2024-02-26 |
683 | An LLM Can Fool Itself: A Prompt-Based Adversarial Attack IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes an efficient tool to audit the LLM’s adversarial robustness via a prompt-based adversarial attack (PromptAttack). |
XILIE XU et. al. | iclr | 2024-02-26 |
684 | Massive Editing for Large Language Models Via Meta Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To mitigate the problem, we propose the MAssive Language Model Editing Network (MALMEN), which formulates the parameter shift aggregation as the least square problem, subsequently updating the LM parameter using the normal equation. |
Chenmien Tan; Ge Zhang; Jie Fu; | iclr | 2024-02-26 |
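Entry 684's least-squares formulation admits a compact illustration: stack the keys of many edits into K and their desired residuals into R, then solve for one aggregated parameter shift W with the (regularized) normal equation. The sketch below shows only that solve; MALMEN's hyper-network that produces K and R is omitted, and all names are illustrative.

```python
import torch

def aggregate_shift(K, R, lam=1e-3):
    """Solve min_W ||K W - R||^2 + lam ||W||^2 via the normal equation."""
    d = K.shape[1]
    A = K.T @ K + lam * torch.eye(d)       # (d, d) regularized Gram matrix
    return torch.linalg.solve(A, K.T @ R)  # W = (K^T K + lam I)^{-1} K^T R

K = torch.randn(100, 32)   # keys from 100 edits
R = torch.randn(100, 16)   # desired output residuals
W = aggregate_shift(K, R)  # (32, 16): one shift applied to the LM parameter
```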
685 | Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Generative pre-trained models have demonstrated remarkable effectiveness in language and vision domains by learning useful representations. In this paper, we extend the scope of this effectiveness by showing that visual robot manipulation can significantly benefit from large-scale video generative pre-training. |
HONGTAO WU et. al. | iclr | 2024-02-26 |
686 | CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we design a special Transformer, i.e., **C**hannel **A**ligned **R**obust Blen**d** Transformer (CARD for short), that addresses key shortcomings of CI type Transformer in time series forecasting. |
XUE WANG et. al. | iclr | 2024-02-26 |
687 | MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large Language Models (LLMs) and Large Multimodal Models (LMMs) exhibit impressive problem-solving skills in many tasks and domains, but their ability in mathematical reasoning in visual contexts has not been systematically studied. To bridge this gap, we present MathVista, a benchmark designed to combine challenges from diverse mathematical and visual tasks. |
PAN LU et. al. | iclr | 2024-02-26 |
688 | MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We believe that the enhanced multi-modal generation capabilities of GPT-4 stem from the utilization of sophisticated large language models (LLM). To examine this phenomenon, we present MiniGPT-4, which aligns a frozen visual encoder with a frozen advanced LLM, Vicuna, using one projection layer. |
Deyao Zhu; Jun Chen; Xiaoqian Shen; Xiang Li; Mohamed Elhoseiny; | iclr | 2024-02-26 |
689 | AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Given that vector graphics are typically encoded using low-level graphics primitives, generating them directly is difficult. To address this, we propose the use of TikZ, a well-known abstract graphics language that can be compiled to vector graphics, as an intermediate representation of scientific figures. |
Jonas Belouadi; Anne Lauscher; Steffen Eger; | iclr | 2024-02-26 |
690 | The Devil Is in The Neurons: Interpreting and Mitigating Social Biases in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we try to unveil the mystery of social bias inside language models by introducing the concept of {\sc Social Bias Neurons}. |
YAN LIU et. al. | iclr | 2024-02-26 |
691 | Transformer-VQ: Linear-Time Transformers Via Vector Quantization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Transformer-VQ, a decoder-only transformer computing softmax-based dense self-attention in linear time. |
Lucas Dax Lingle; | iclr | 2024-02-26 |
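The step that makes entry 691 possible is vector-quantizing the keys so that every key takes one of c codebook values, after which attention statistics can be pooled per codeword in linear time. The sketch below shows only the nearest-codeword assignment; the linear-time attention kernel and codebook training built on top of it are omitted.

```python
import torch

def quantize_keys(k, codebook):
    """Replace each key with its nearest codebook vector (vector quantization)."""
    d2 = torch.cdist(k, codebook) ** 2   # (L, c) squared distances
    codes = d2.argmin(dim=-1)            # nearest codeword index per key
    return codebook[codes], codes        # quantized keys, same shape as k

k = torch.randn(128, 64)
codebook = torch.randn(32, 64)           # learnable in the real model
k_hat, codes = quantize_keys(k, codebook)
```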
692 | Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the effect of code on enhancing LLMs’ reasoning capability by introducing different constraints on the Code Usage Frequency of GPT-4 Code Interpreter. |
AOJUN ZHOU et. al. | iclr | 2024-02-26 |
693 | If in A Crowdsourced Data Annotation Pipeline, A GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies indicated GPT-4 outperforms online crowd workers in data labeling accuracy, notably workers from Amazon Mechanical Turk (MTurk). However, these studies were … |
Zeyu He; Huang Chieh-Yang; C. C. Ding; Shaurya Rohatgi; Ting-Hao ‘Kenneth’ Huang; | ArXiv | 2024-02-26 |
694 | Graph Transformers on EHRs: Better Representation Improves Downstream Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose GT-BEHRT, a new approach that leverages temporal visit embeddings extracted from a graph transformer and uses a BERT-based model to obtain more robust patient representations, especially on longer EHR sequences. |
Raphael Poulain; Rahmatollah Beheshti; | iclr | 2024-02-26 |
695 | The Hedgehog & The Porcupine: Expressive Linear Attentions with Softmax Mimicry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We thus propose Hedgehog, a learnable linear attention that retains the spiky and monotonic properties of softmax attention while maintaining linear complexity. |
Michael Zhang; Kush Bhatia; Hermann Kumbong; Christopher Re; | iclr | 2024-02-26 |
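Entry 695's premise is a learnable feature map phi such that phi(q)·phi(k) imitates softmax attention while keeping linear complexity. Below is a simplified sketch using phi(x) = exp(xW), which is element-wise positive and spiky; Hedgehog's actual mimicry training objective is not included.

```python
import torch

class FeatureMapAttention(torch.nn.Module):
    """Linear attention with a learnable, softmax-mimicking feature map."""
    def __init__(self, dim, feat_dim=64):
        super().__init__()
        self.proj = torch.nn.Linear(dim, feat_dim, bias=False)

    def phi(self, x):
        return torch.exp(self.proj(x))      # positive, spiky features

    def forward(self, q, k, v):
        qf, kf = self.phi(q), self.phi(k)   # (L, f)
        kv = kf.transpose(-2, -1) @ v       # (f, d): computed once, O(L) overall
        z = qf @ kf.sum(dim=-2, keepdim=True).transpose(-2, -1)  # per-query normalizer
        return (qf @ kv) / (z + 1e-6)

attn = FeatureMapAttention(64)
q = k = v = torch.randn(128, 64)
out = attn(q, k, v)                         # (128, 64) without forming any L x L matrix
```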
696 | Looped Transformers Are Better at Learning Learning Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the absence of an inherent iterative structure in the transformer architecture presents a challenge in emulating the iterative algorithms, which are commonly employed in traditional machine learning methods. To address this, we propose the utilization of looped transformer architecture and its associated training methodology, with the aim of incorporating iterative characteristics into the transformer architectures. |
Liu Yang; Kangwook Lee; Robert D Nowak; Dimitris Papailiopoulos; | iclr | 2024-02-26 |
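Entry 696's looped architecture is simple to sketch: one weight-tied block applied T times supplies the iterative structure that ordinary depth-stacked transformers lack. Input injection and loop-count schedules follow the paper, not this sketch.

```python
import torch

# One shared encoder block reused every iteration (weight tying).
block = torch.nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)

def looped_forward(x, T=6):
    for _ in range(T):   # the same parameters are applied T times
        x = block(x)
    return x

x = torch.randn(2, 16, 64)
y = looped_forward(x)    # (2, 16, 64); T controls the number of "algorithm steps"
```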
697 | NOLA: Compressing LoRA Using Linear Combination of Random Basis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce NOLA, which overcomes the rank one lower bound present in LoRA. |
Soroush Abbasi Koohpayegani; Navaneet K L; Parsa Nooralinejad; Soheil Kolouri; Hamed Pirsiavash; | iclr | 2024-02-26 |
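Entry 697 sidesteps LoRA's rank-one lower bound on parameter count by writing the low-rank factors as linear combinations of frozen random basis matrices, so only the mixing coefficients train. A hedged PyTorch sketch, with illustrative rank and basis counts:

```python
import torch

class NolaLinear(torch.nn.Module):
    """LoRA-style adapter whose factors are mixtures of frozen random bases."""
    def __init__(self, base: torch.nn.Linear, rank=4, n_basis=64):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                        # backbone stays frozen
        out_f, in_f = base.weight.shape
        # Frozen random bases (re-generable from a seed, so they need not be stored).
        self.register_buffer("A", torch.randn(n_basis, rank, in_f) * 0.01)
        self.register_buffer("B", torch.randn(n_basis, out_f, rank) * 0.01)
        self.alpha = torch.nn.Parameter(torch.zeros(n_basis))  # trainable coefficients
        self.beta = torch.nn.Parameter(torch.zeros(n_basis))

    def forward(self, x):
        A = torch.einsum("n,nri->ri", self.alpha, self.A)  # (rank, in_f)
        B = torch.einsum("n,nor->or", self.beta, self.B)   # (out_f, rank)
        return self.base(x) + x @ A.T @ B.T

layer = NolaLinear(torch.nn.Linear(64, 64))
out = layer(torch.randn(8, 64))   # only 2 * n_basis scalars are trainable
```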
698 | Xformer: Hybrid X-Shaped Transformer for Image Denoising Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a hybrid X-shaped vision Transformer, named Xformer, which performs notably on image denoising tasks. |
JIALE ZHANG et. al. | iclr | 2024-02-26 |
699 | DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Conventional training-based methods have limitations in flexibility, particularly when adapting to new domains, and they often lack explanatory power. To address this gap, we propose a novel training-free detection strategy called Divergent N-Gram Analysis (DNA-GPT). |
XIANJUN YANG et. al. | iclr | 2024-02-26 |
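Entry 699's detector can be paraphrased as: truncate a suspect text, sample continuations of the prefix from the candidate model, and score n-gram overlap between the original ending and the regenerations; high overlap suggests machine authorship. In this sketch `regenerate` is a hypothetical placeholder for a model call, and the paper's scoring and thresholding details are omitted.

```python
def ngrams(tokens, n):
    """Set of n-grams over a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_score(original_tail, regenerations, n=4):
    """Mean fraction of the original ending's n-grams found in each regeneration."""
    ref = ngrams(original_tail.split(), n)
    if not ref:
        return 0.0
    return sum(len(ref & ngrams(g.split(), n)) / len(ref)
               for g in regenerations) / len(regenerations)

# Hypothetical usage: cut the text, regenerate the suffix 10 times, then score.
# score = overlap_score(text[cut:], [regenerate(text[:cut]) for _ in range(10)])
```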
700 | Masked Distillation Advances Self-Supervised Transformer Architecture Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this paper proposes a masked image modelling (MIM) based self-supervised neural architecture search method specifically designed for vision transformers, termed as MaskTAS, which completely avoids the expensive costs of data labeling inherited from supervised learning. |
CAIXIA YAN et. al. | iclr | 2024-02-26 |
701 | The Reversal Curse: LLMs Trained on “A Is B” Fail to Learn “B Is A” Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It is worth noting, however, that if “_A_ is _B_” appears _in-context_, models can deduce the reverse relationship. We provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on fictitious statements such as “Uriah Hawthorne is the composer of _Abyssal Melodies_” and showing that they fail to correctly answer “Who composed _Abyssal Melodies_?” |
LUKAS BERGLUND et. al. | iclr | 2024-02-26 |
702 | Neuroformer: Multimodal and Multitask Generative Pretraining for Brain Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We used an ablation study to show that joint training on neuronal responses and behavior boosted performance, highlighting the model’s ability to associate behavioral and neural representations in an unsupervised manner. |
Antonis Antoniades; Yiyi Yu; Joe S Canzano; William Yang Wang; Spencer Smith; | iclr | 2024-02-26 |
703 | Quantum Linear Algebra Is All You Need for Transformer Architectures Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative machine learning methods such as large-language models are revolutionizing the creation of text and images. While these models are powerful they also harness a large … |
Naixu Guo; Zhan Yu; Aman Agrawal; P. Rebentrost; | ArXiv | 2024-02-26 |
704 | Is Self-Repair A Silver Bullet for Code Generation? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we analyze Code Llama, GPT-3.5 and GPT-4’s ability to perform self-repair on problems taken from HumanEval and APPS. |
Theo X. Olausson; Jeevana Priya Inala; Chenglong Wang; Jianfeng Gao; Armando Solar-Lezama; | iclr | 2024-02-26 |
705 | MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The recently released GPT-4 Code Interpreter has demonstrated remarkable proficiency in solving challenging math problems, primarily attributed to its ability to seamlessly reason with natural language, generate code, execute code, and continue reasoning based on the execution output. In this paper, we present a method to fine-tune open-source language models, enabling them to use code for modeling and deriving math equations and, consequently, enhancing their mathematical reasoning abilities. |
KE WANG et. al. | iclr | 2024-02-26 |
706 | If in A Crowdsourced Data Annotation Pipeline, A GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies indicated GPT-4 outperforms online crowd workers in data labeling accuracy, notably workers from Amazon Mechanical Turk (MTurk). However, these studies were … |
Zeyu He; Chieh-Yang Huang; Chien-Kuang Cornelia Ding; Shaurya Rohatgi; Ting-Hao ‘Kenneth’ Huang; | arxiv-cs.HC | 2024-02-26 |
707 | Test-Time Training on Nearest Neighbors for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We build a large-scale distributed index based on text embeddings of the Pile dataset. |
Moritz Hardt; Yu Sun; | iclr | 2024-02-26 |
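Entry 707's recipe is: embed the test input, retrieve its nearest neighbors from an index of training texts, and briefly fine-tune on them before predicting. The small-scale sketch below replaces the paper's large distributed index over the Pile with an in-memory tensor; all names are illustrative.

```python
import torch

def test_time_train(model, loss_fn, query_emb, index_embs, index_batches, k=8, lr=1e-5):
    """Adapt `model` on the k nearest neighbors of one test query, then return it."""
    sims = torch.nn.functional.cosine_similarity(query_emb[None, :], index_embs)
    neighbors = sims.topk(k).indices.tolist()        # indices of the closest training texts
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for i in neighbors:                              # one gradient step per neighbor
        inputs, targets = index_batches[i]
        opt.zero_grad()
        loss_fn(model(inputs), targets).backward()
        opt.step()
    return model   # run the actual test-time prediction with the adapted model
```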
708 | Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring The Design of Next-generation Neuromorphic Chips IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a general Transformer-based SNN architecture, termed “Meta-SpikeFormer”, whose goals are: (1) *Low power*: it supports the spike-driven paradigm, with only sparse addition in the network; (2) *Versatility*: it handles various vision tasks; (3) *High performance*: it shows overwhelming performance advantages over CNN-based SNNs; (4) *Meta-architecture*: it provides inspiration for future next-generation Transformer-based neuromorphic chip designs. |
MAN YAO et. al. | iclr | 2024-02-26 |
709 | Towards Open-ended Visual Quality Comparison IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Comparative settings (e.g. pairwise choice, listwise ranking) have been adopted by a wide range of subjective studies for image quality assessment (IQA), as it inherently … |
HAONING WU et. al. | ArXiv | 2024-02-26 |
710 | A Multi-Level Framework for Accelerating Training Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by a set of key observations of inter- and intra-layer similarities among feature maps and attentions that can be identified from typical training processes, we propose a multi-level framework for training acceleration. |
Longwei Zou; Han Zhang; Yangdong Deng; | iclr | 2024-02-26 |
711 | Large Language Model Cascades with Mixture of Thought Representations for Cost-Efficient Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we are motivated to study building an LLM cascade to save the cost of using LLMs, particularly for performing (e.g., mathematical, causal) reasoning tasks. |
Murong Yue; Jie Zhao; Min Zhang; Liang Du; Ziyu Yao; | iclr | 2024-02-26 |
712 | HPE Transformer: Learning to Optimize Multi-Group Multicast Beamforming Under Nonconvex QoS Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To facilitate real-time implementations, this paper proposes a deep learning-based approach, which consists of a beamforming structure assisted problem transformation and a customized neural network architecture named hierarchical permutation equivariance (HPE) transformer. |
Yang Li; Ya-Feng Liu; | arxiv-cs.IT | 2024-02-25 |
713 | From Text to Transformation: A Comprehensive Review of Large Language Models’ Versatility IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This groundbreaking study explores the expanse of Large Language Models (LLMs), such as Generative Pre-Trained Transformer (GPT) and Bidirectional Encoder Representations from Transformers (BERT), across varied domains ranging from technology and finance to healthcare and education. |
PRAVNEET KAUR et. al. | arxiv-cs.CL | 2024-02-25 |
714 | Deep Learning Approaches for Improving Question Answering Systems in Hepatocellular Carcinoma Research IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Models such as BERT and GPT-3, trained on vast amounts of data, have revolutionized language understanding and generation. These pre-trained models serve as robust bases for various tasks including semantic understanding, intelligent writing, and reasoning, paving the way for a more generalized form of artificial intelligence. |
Shuning Huo; Yafei Xiang; Hanyi Yu; Mengran Zhu; Yulu Gong; | arxiv-cs.CL | 2024-02-25 |
715 | SemEval-2024 Task 8: Weighted Layer Averaging RoBERTa for Black-Box Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we lay out how using weighted averages of RoBERTa layers lets us capture information about text that is relevant to machine-generated text detection. |
Ayan Datta; Aryan Chandramania; Radhika Mamidi; | arxiv-cs.CL | 2024-02-24 |
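The pooling in entry 715 is easy to sketch: learn one softmax-normalized weight per hidden layer and average the stacked hidden states. The module below assumes the hidden states arrive stacked in a single tensor, e.g. from a RoBERTa forward pass with output_hidden_states=True.

```python
import torch

class WeightedLayerPooler(torch.nn.Module):
    """Learnable weighted average over a Transformer's hidden layers."""
    def __init__(self, n_layers):
        super().__init__()
        self.w = torch.nn.Parameter(torch.zeros(n_layers))   # one weight per layer

    def forward(self, hidden_states):
        # hidden_states: (n_layers, batch, seq, dim)
        weights = torch.softmax(self.w, dim=0)               # weights sum to 1
        return torch.einsum("l,lbsd->bsd", weights, hidden_states)

pooler = WeightedLayerPooler(n_layers=13)   # e.g. RoBERTa-base: 12 layers + embeddings
hs = torch.randn(13, 2, 16, 768)
pooled = pooler(hs)                         # (2, 16, 768) features for the detector head
```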
716 | Increasing SAM Zero-Shot Performance on Multimodal Medical Images Using GPT-4 Generated Descriptive Prompts Without Human Annotation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study develops and evaluates a novel multimodal medical image zero-shot segmentation algorithm named Text-Visual-Prompt SAM (TV-SAM) without any manual annotations. |
ZEKUN JIANG et. al. | arxiv-cs.CV | 2024-02-24 |
717 | A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate the potential of large language models to generate patient summaries based on doctors’ notes and study the effect of training data on the faithfulness and quality of the generated summaries. |
STEFAN HEGSELMANN et. al. | arxiv-cs.CL | 2024-02-23 |
718 | Self-Supervised Pre-Training for Table Structure Recognition Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we resolve the issue by proposing a self-supervised pre-training (SSP) method for TSR transformers. |
ShengYun Peng; Seongmin Lee; Xiaojing Wang; Rajarajeswari Balasubramaniyan; Duen Horng Chau; | arxiv-cs.CV | 2024-02-23 |
719 | ArabianGPT: Native Arabic GPT-based Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, there is a theoretical and practical imperative for developing LLMs predominantly focused on Arabic linguistic elements. To address this gap, this paper proposes ArabianGPT, a series of transformer-based models within the ArabianLLM suite designed explicitly for Arabic. |
Anis Koubaa; Adel Ammar; Lahouari Ghouti; Omar Najar; Serry Sibaee; | arxiv-cs.CL | 2024-02-23 |
720 | Advancing Parameter Efficiency in Fine-tuning Via Representation Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing PEFT methods pose challenges in hyperparameter selection, such as choosing the rank for LoRA or Adapter, or specifying the length of soft prompts. To address these challenges, we propose a novel fine-tuning approach for neural models, named Representation EDiting (RED), which modifies the representations generated at some layers through the application of scaling and biasing operations. |
MULING WU et. al. | arxiv-cs.LG | 2024-02-23 |
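Entry 720's representation editing boils down to a learned per-dimension scale and bias applied to a frozen layer's hidden states, so each edited layer adds only 2·dim trainable parameters. A minimal sketch:

```python
import torch

class RepresentationEdit(torch.nn.Module):
    """Edit hidden representations as h' = h * scale + bias; the backbone stays frozen."""
    def __init__(self, dim):
        super().__init__()
        self.scale = torch.nn.Parameter(torch.ones(dim))   # identity at initialization
        self.bias = torch.nn.Parameter(torch.zeros(dim))

    def forward(self, h):
        return h * self.scale + self.bias

edit = RepresentationEdit(768)
h = torch.randn(2, 16, 768)   # hidden states from a frozen Transformer layer
h_edited = edit(h)            # same shape; only 2 * 768 parameters are trained
```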
721 | Towards Efficient Active Learning in NLP Via Pretrained Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning Large Language Models (LLMs) is now a common approach for text classification in a wide range of applications. |
Artem Vysogorets; Achintya Gopal; | arxiv-cs.LG | 2024-02-23 |
722 | Multimodal Transformer With A Low-Computational-Cost Guarantee Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, multimodal Transformers significantly suffer from a quadratic complexity of the multi-head attention with the input sequence length, especially as the number of modalities increases. To address this, we introduce Low-Cost Multimodal Transformer (LoCoMT), a novel multimodal attention mechanism that aims to reduce computational cost during training and inference with minimal performance loss. |
Sungjin Park; Edward Choi; | arxiv-cs.LG | 2024-02-23 |
723 | A First Look at GPT Apps: Landscape and Vulnerability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, given its debut, there is a lack of sufficient understanding of this new ecosystem. To fill this gap, this paper presents a first comprehensive longitudinal (5-month) study of the evolution, landscape, and vulnerability of the emerging LLM app ecosystem, focusing on two GPT app stores: \textit{GPTStore.AI} and the official \textit{OpenAI GPT Store}. |
ZEJUN ZHANG et. al. | arxiv-cs.CR | 2024-02-23 |
724 | RoboScript: Code Generation for Free-Form Manipulation Tasks Across Real and Simulation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, previous studies put much effort into the general common-sense reasoning and task-planning capabilities of large-scale language or multi-modal models, but relatively little into ensuring the deployability of generated code on real robots and into other fundamental components of autonomous robot systems, including robot perception, motion planning, and control. To bridge this “ideal-to-real” gap, this paper presents RobotScript, a platform for 1) a deployable robot manipulation pipeline powered by code generation; and 2) a code generation benchmark for robot manipulation tasks in free-form natural language. |
JUNTING CHEN et. al. | arxiv-cs.RO | 2024-02-22 |
725 | Whose LLM Is It Anyway? Linguistic Comparison and LLM Attribution for GPT-3.5, GPT-4 and Bard Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through a comprehensive linguistic analysis, we compare the vocabulary, Part-Of-Speech (POS) distribution, dependency distribution, and sentiment of texts generated by three of the most popular LLMs today (GPT-3.5, GPT-4, and Bard) in response to diverse inputs. |
Ariel Rosenfeld; Teddy Lazebnik; | arxiv-cs.CL | 2024-02-22 |
726 | Tokenization Counts: The Impact of Tokenization on Arithmetic in Frontier LLMs Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Tokenization, the division of input text into input tokens, is an often overlooked aspect of the large language model (LLM) pipeline and could be the source of useful or harmful … |
Aaditya K. Singh; DJ Strouse; | ArXiv | 2024-02-22 |
727 | OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, open-source models often lack the execution capabilities and iterative refinement of advanced systems like the GPT-4 Code Interpreter. To address this, we introduce OpenCodeInterpreter, a family of open-source code systems designed for generating, executing, and iteratively refining code. |
TIANYU ZHENG et. al. | arxiv-cs.SE | 2024-02-22 |
728 | Towards Understanding Counseling Conversations: Domain Knowledge and Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a systematic approach to examine the efficacy of domain knowledge and large language models (LLMs) in better representing conversations between a crisis counselor and a help seeker. |
Younghun Lee; Dan Goldwasser; Laura Schwab Reese; | arxiv-cs.CL | 2024-02-21 |
729 | Beyond Hate Speech: NLP’s Challenges and Opportunities in Uncovering Dehumanizing Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper evaluates the performance of cutting-edge NLP models, including GPT-4, GPT-3.5, and LLAMA-2, in identifying dehumanizing language. |
Hezhao Zhang; Lasana Harris; Nafise Sadat Moosavi; | arxiv-cs.CL | 2024-02-21 |
730 | Hallucinations or Attention Misdirection? The Path to Strategic Value Extraction in Business Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper highlights the best practices of the PGI (Persona, Grouping, and Intelligence) method, a strategic framework that achieved a remarkable error rate of only 3.15 percent across 4,000 responses generated by GPT in response to a real business challenge. |
Aline Ioste; | arxiv-cs.CL | 2024-02-21 |
731 | Knowledge Graph Enhanced Large Language Model Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing editing methods struggle to track and incorporate changes in knowledge associated with edits, which limits the generalization ability of post-edit LLMs in processing edited knowledge. To tackle these problems, we propose a novel model editing method that leverages knowledge graphs for enhancing LLM editing, namely GLAME. |
MENGQI ZHANG et. al. | arxiv-cs.CL | 2024-02-21 |
732 | TransGOP: Transformer-Based Gaze Object Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, this paper introduces Transformer into the fields of gaze object prediction and proposes an end-to-end Transformer-based gaze object prediction method named TransGOP. |
Binglu Wang; Chenxi Guo; Yang Jin; Haisheng Xia; Nian Liu; | arxiv-cs.CV | 2024-02-21 |
733 | On The Expressive Power of A Variant of The Looped Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide theoretical evidence of the expressive power of the AlgoFormer in solving some challenging problems, mirroring human-designed algorithms. |
YIHANG GAO et. al. | arxiv-cs.LG | 2024-02-21 |
734 | An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research paper, we present an optimized, fine-tuned transformer-based DistilBERT model designed for the detection of phishing emails. |
Mohammad Amaz Uddin; Iqbal H. Sarker; | arxiv-cs.LG | 2024-02-21 |
735 | SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we leverage 100B+ GPT variants to act as synthetic feedback experts offering expert-level edit feedback, which is used to reduce hallucinations and align weaker (<10B parameter) LLMs with medical facts using two distinct alignment algorithms (DPO & SALT), endeavoring to narrow the divide between AI-generated content and factual accuracy. |
PRAKAMYA MISHRA et. al. | arxiv-cs.CL | 2024-02-21 |
736 | How Easy Is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The remarkable advancements in Multimodal Large Language Models (MLLMs) have not rendered them immune to challenges, particularly in the context of handling deceptive information in prompts, thus producing hallucinated responses under such conditions. To quantitatively assess this vulnerability, we present MAD-Bench, a carefully curated benchmark that contains 850 test samples divided into 6 categories, such as non-existent objects, count of objects, spatial relationship, and visual confusion. |
Yusu Qian; Haotian Zhang; Yinfei Yang; Zhe Gan; | arxiv-cs.CV | 2024-02-20 |
737 | RhythmFormer: Extracting RPPG Signals Based on Hierarchical Temporal Periodic Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose RhythmFormer, a fully end-to-end transformer-based method for extracting rPPG signals by explicitly leveraging the quasi-periodic nature of rPPG. |
Bochao Zou; Zizheng Guo; Jiansheng Chen; Huimin Ma; | arxiv-cs.CV | 2024-02-20 |
738 | Advancing GenAI Assisted Programming–A Comparative Study on Prompt Efficiency and Code Quality Between GPT-4 and GLM-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to explore the best practices for utilizing GenAI as a programming tool, through a comparative analysis between GPT-4 and GLM-4. |
Angus Yang; Zehan Li; Jie Li; | arxiv-cs.SE | 2024-02-20 |
739 | Are ELECTRA’s Sentence Embeddings Beyond Repair? The Case of Semantic Textual Similarity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We notice a significant drop in performance when using the ELECTRA discriminator’s last layer in comparison to earlier layers. We explore this drop and devise a way to repair ELECTRA’s embeddings, proposing a novel truncated model fine-tuning (TMFT) method. |
Ivan Rep; David Dukić; Jan Šnajder; | arxiv-cs.CL | 2024-02-20 |
740 | Transformer Tricks: Precomputing The First Layer Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: This micro-paper describes a trick to speed up inference of transformers with RoPE (such as LLaMA, Mistral, PaLM, and Gemma). For these models, a large portion of the first … |
Nils Graef; | arxiv-cs.LG | 2024-02-20 |
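The truncated abstract points at a simple observation: in a pre-norm RoPE model, the first layer's Q, K, and V projections see only the (normalized) token embedding, so they can be precomputed once per vocabulary entry and served as lookups, trading memory for matmuls. A minimal NumPy sketch of that reading, with all dimensions and names our own:

```python
import numpy as np

# Hypothetical sizes for a small RoPE transformer.
vocab, d_model = 32000, 512

rng = np.random.default_rng(0)
E = rng.standard_normal((vocab, d_model))     # (normalized) embedding table
Wq = rng.standard_normal((d_model, d_model))  # first-layer projections
Wk = rng.standard_normal((d_model, d_model))
Wv = rng.standard_normal((d_model, d_model))

# Offline: layer-1 Q/K/V depend only on the token identity, so compute
# them once per vocabulary entry (costs roughly 3x the embedding table).
Q1, K1, V1 = E @ Wq, E @ Wk, E @ Wv

def first_layer_qkv(token_ids):
    """At inference, layer 1's projections become table lookups.
    RoPE is applied afterwards, since it depends on position, not content."""
    return Q1[token_ids], K1[token_ids], V1[token_ids]

q, k, v = first_layer_qkv(np.array([17, 42, 7]))
print(q.shape)  # (3, 512)
```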
741 | The Impact of Demonstrations on Multilingual In-Context Learning: A Multidimensional Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Although there have been extensive studies on English in-context learning, multilingual in-context learning remains under-explored, and we lack an in-depth understanding of the role of demonstrations in this context. To address this gap, we conduct a multidimensional analysis of multilingual in-context learning, experimenting with 5 models from different model families, 9 datasets covering classification and generation tasks, and 56 typologically diverse languages. |
MIAORAN ZHANG et. al. | arxiv-cs.CL | 2024-02-20 |
742 | Can Large Language Models Be Used to Provide Psychological Counselling? An Analysis of GPT-4-Generated Responses Using Role-play Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For this study, we collected counseling dialogue data via role-playing scenarios involving expert counselors, and the utterances were annotated with the intentions of the counselors. |
Michimasa Inaba; Mariko Ukiyo; Keiko Takamizo; | arxiv-cs.CL | 2024-02-20 |
743 | Enhancing Large Language Models for Text-to-Testcase Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: In this paper, we introduce a text-to-testcase generation approach based on a large language model (GPT-3.5) that is fine-tuned on our curated dataset with an effective prompt design. |
Saranya Alagarsamy; Chakkrit Tantithamthavorn; Chetan Arora; Aldeida Aleti; | arxiv-cs.SE | 2024-02-19 |
744 | Your Large Language Model Is Secretly A Fairness Proponent and You Should Prompt It Like One Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response to this, we validate that prompting LLMs with specific roles can allow LLMs to express diverse viewpoints. Building on this insight and observation, we develop FairThinking, a pipeline designed to automatically generate roles that enable LLMs to articulate diverse perspectives for fair expressions. |
TIANLIN LI et. al. | arxiv-cs.CL | 2024-02-19 |
745 | Surprising Efficacy of Fine-Tuned Transformers for Fact-Checking Over Larger Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the challenges associated with establishing an end-to-end fact-checking pipeline in a real-world context, covering over 90 languages. |
Vinay Setty; | arxiv-cs.CL | 2024-02-19 |
746 | Enabling Weak LLMs to Judge Response Reliability Via Meta Ranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, to enable weak LLMs to effectively assess the reliability of LLM responses, we propose a novel cross-query-comparison-based method called Meta Ranking (MR). |
ZIJUN LIU et. al. | arxiv-cs.CL | 2024-02-19 |
747 | Evaluation of ChatGPT’s Smart Contract Auditing Capabilities Based on Chain of Thought Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the potential of enhancing smart contract security audits using the GPT-4 model. |
Yuying Du; Xueyan Tang; | arxiv-cs.CR | 2024-02-19 |
748 | Acquiring Clean Language Models from Backdoor Poisoned Datasets By Downscaling Frequency Space Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate the learning mechanisms of backdoor LMs in the frequency space by Fourier analysis. |
Zongru Wu; Zhuosheng Zhang; Pengzhou Cheng; Gongshen Liu; | arxiv-cs.CL | 2024-02-19 |
749 | Query-Based Adversarial Prompt Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We improve on prior work with a query-based attack that leverages API access to a remote language model to construct adversarial examples that cause the model to emit harmful strings with (much) higher probability than with transfer-only attacks. |
Jonathan Hayase; Ema Borevkovic; Nicholas Carlini; Florian Tramèr; Milad Nasr; | arxiv-cs.CL | 2024-02-19 |
750 | Reflect-RL: Two-Player Online RL Fine-Tuning for LMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The benchmarks, dataset, and code involved in this work are publicly available: https://github.com/zhourunlong/Reflect-RL. |
Runlong Zhou; Simon S. Du; Beibin Li; | arxiv-cs.LG | 2024-02-19 |
751 | Tree-Planted Transformers: Unidirectional Transformer Language Models with Implicit Syntactic Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new method dubbed tree-planting: instead of explicitly generating syntactic structures, we plant trees into attention weights of unidirectional Transformer LMs to implicitly reflect syntactic structures of natural language. |
Ryo Yoshida; Taiga Someya; Yohei Oseki; | arxiv-cs.CL | 2024-02-19 |
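The highlight says trees are planted into the attention weights rather than generated explicitly. One plausible reading, sketched below with hypothetical names, is an auxiliary training loss that pulls selected heads' attention distributions toward a target distribution derived from gold syntax trees:

```python
import torch
import torch.nn.functional as F

def tree_planting_loss(attn, tree_target):
    """Hypothetical auxiliary loss. attn and tree_target have shape
    (batch, heads, len, len); each row is a distribution over preceding
    tokens, and tree_target concentrates mass according to syntactic
    proximity in the gold tree."""
    return F.kl_div(attn.clamp_min(1e-9).log(), tree_target,
                    reduction="batchmean")

# The total objective would combine this with the LM loss, e.g.:
# loss = lm_loss + lambda_tp * tree_planting_loss(attn, tree_target)
```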
752 | DeepCode AI Fix: Fixing Security Vulnerabilities with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the effectiveness of LLMs for solving code-repair task. |
BERKAY BERABI et. al. | arxiv-cs.CR | 2024-02-19 |
753 | Shallow Synthesis of Knowledge in GPT-Generated Texts: A Case Study in Automatic Related Work Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an analysis of AI-assisted scholarly writing generated with ScholaCite, a tool we built that is designed for organizing literature and composing Related Work sections for academic papers. |
Anna Martin-Boyle; Aahan Tyagi; Marti A. Hearst; Dongyeop Kang; | arxiv-cs.CL | 2024-02-19 |
754 | A Critical Evaluation of AI Feedback for Aligning Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that the improvements of the RL step are virtually entirely due to the widespread practice of using a weaker teacher model (e.g. GPT-3.5) for SFT data collection than the critic (e.g., GPT-4) used for AI feedback generation. |
ARCHIT SHARMA et. al. | arxiv-cs.LG | 2024-02-19 |
755 | Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a circuit discovery framework alternative to activation patching. |
ZHENGFU HE et. al. | arxiv-cs.LG | 2024-02-19 |
756 | Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Conclusion: In this paper, we show that while GPT-4 is superior to open-source models in zero-shot report labeling, the implementation of few-shot prompting can bring open-source models on par with GPT-4. |
FELIX J. DORFNER et. al. | arxiv-cs.CL | 2024-02-19 |
757 | Creating A Fine Grained Entity Type Taxonomy Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the potential of GPT-4 and its advanced iteration, GPT-4 Turbo, in autonomously developing a detailed entity type taxonomy. |
Michael Gunn; Dohyun Park; Nidhish Kamath; | arxiv-cs.CL | 2024-02-19 |
758 | FinBen: A Holistic Financial Benchmark for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce FinBen, the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks, covering seven critical aspects: information extraction (IE), textual analysis, question answering (QA), text generation, risk management, forecasting, and decision-making. |
QIANQIAN XIE et. al. | arxiv-cs.CL | 2024-02-19 |
759 | A Curious Case of Searching for The Correlation Between Training Data and Adversarial Robustness of Transformer Textual Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we want to prove that there is also a strong correlation between training data and model robustness. |
Cuong Dang; Dung D. Le; Thai Le; | arxiv-cs.LG | 2024-02-18 |
760 | Decoding News Narratives: A Critical Analysis of Large Language Models in Framing Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive analysis of GPT-4, GPT-3.5 Turbo, and FLAN-T5 models in detecting framing in news headlines. |
Valeria Pastorino; Jasivan A. Sivakumar; Nafise Sadat Moosavi; | arxiv-cs.CL | 2024-02-18 |
761 | LongAgent: Scaling Language Models to 128k Context Through Multi-Agent Collaboration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose LongAgent, a method based on multi-agent collaboration, which scales LLMs (e.g., LLaMA) to a context of 128K and demonstrates potential superiority in long-text processing compared to GPT-4. |
JUN ZHAO et. al. | arxiv-cs.CL | 2024-02-18 |
762 | Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, we propose a two-stage instruction tuning framework, in which VLMs are firstly finetuned on Vision-Flan and further tuned on GPT-4 synthesized data. We find this two-stage tuning framework significantly outperforms the traditional single-stage visual instruction tuning framework and achieves the state-of-the-art performance across a wide range of multi-modal evaluation benchmarks. |
ZHIYANG XU et. al. | arxiv-cs.CL | 2024-02-18 |
763 | Can Large Language Models Perform Relation-based Argument Mining? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that general-purpose Large Language Models (LLMs), appropriately primed and prompted, can significantly outperform the best performing (RoBERTa-based) baseline. |
Deniz Gorur; Antonio Rago; Francesca Toni; | arxiv-cs.CL | 2024-02-17 |
764 | Exploring ChatGPT for Next-generation Information Retrieval: Opportunities and Challenges Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike the traditional supervised learning approach in IR tasks, ChatGPT challenges existing paradigms, bringing forth new challenges and opportunities regarding text quality assurance, model bias, and efficiency. This paper seeks to examine the impact of ChatGPT on IR tasks and offer insights into its potential future developments. |
Yizheng Huang; Jimmy Huang; | arxiv-cs.IR | 2024-02-17 |
765 | Reasoning Before Comparison: LLM-Enhanced Semantic Similarity Metrics for Domain Specialized Text Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we leverage LLM to enhance the semantic analysis and develop similarity metrics for texts, addressing the limitations of traditional unsupervised NLP metrics like ROUGE and BLEU. |
SHAOCHEN XU et. al. | arxiv-cs.CL | 2024-02-17 |
766 | Can Separators Improve Chain-of-Thought Prompting? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by human cognition, we introduce COT-SEP, a method that strategically employs separators at the end of each exemplar in CoT prompting. |
Yoonjeong Park; Hyunjin Kim; Chanyeol Choi; Junseong Kim; Jy-yong Sohn; | arxiv-cs.CL | 2024-02-16 |
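Since the method is essentially prompt construction, it is easy to sketch: place an explicit separator after each chain-of-thought exemplar so the model can segment the demonstrations. The separator string and exemplar format below are illustrative, not necessarily the paper's:

```python
SEP = "\n###\n"  # hypothetical separator; the paper compares several choices

def build_cot_prompt(exemplars, question):
    """exemplars: list of (question, step-by-step rationale, answer) triples."""
    parts = []
    for q, rationale, a in exemplars:
        parts.append(f"Q: {q}\nA: {rationale} The answer is {a}.{SEP}")
    parts.append(f"Q: {question}\nA:")
    return "".join(parts)

prompt = build_cot_prompt(
    [("What is 2+2?", "2 plus 2 equals 4.", "4")],
    "What is 3+5?",
)
```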
767 | In Search of Needles in A 11M Haystack: Recurrent Memory Finds What LLMs Miss Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To evaluate different approaches, we introduce BABILong, a new benchmark designed to assess model capabilities in extracting and processing distributed facts within extensive texts. |
YURI KURATOV et. al. | arxiv-cs.CL | 2024-02-16 |
768 | Large Language Models Fall Short: Understanding Complex Relationships in Detective Narratives Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing datasets for narrative understanding often fail to represent the complexity and uncertainty of relationships in real-life social scenarios. To address this gap, we introduce a new benchmark, Conan, designed for extracting and analysing intricate character relation graphs from detective narratives. |
RUNCONG ZHAO et. al. | arxiv-cs.CL | 2024-02-16 |
769 | Enhancing ESG Impact Type Identification Through Early Fusion and Multilingual Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the evolving landscape of Environmental, Social, and Corporate Governance (ESG) impact assessment, the ML-ESG-2 shared task proposes identifying ESG impact types. To address this challenge, we present a comprehensive system leveraging ensemble learning techniques, capitalizing on early and late fusion approaches. |
Hariram Veeramani; Surendrabikram Thapa; Usman Naseem; | arxiv-cs.CL | 2024-02-16 |
770 | Inference to The Best Explanation in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes IBE-Eval, a framework inspired by philosophical accounts on Inference to the Best Explanation (IBE) to advance the interpretation and evaluation of LLMs’ explanations. |
Dhairya Dalal; Marco Valentino; André Freitas; Paul Buitelaar; | arxiv-cs.CL | 2024-02-16 |
771 | WilKE: Wise-Layer Knowledge Editor for Lifelong Knowledge Editing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a knowledge editing approach named Wise-Layer Knowledge Editor (WilKE), which selects the editing layer based on the degree to which the editing knowledge pattern-matches across different layers in language models. |
Chenhui Hu; Pengfei Cao; Yubo Chen; Kang Liu; Jun Zhao; | arxiv-cs.CL | 2024-02-16 |
772 | Fine-tuning Large Language Model (LLM) Artificial Intelligence Chatbots in Ophthalmology and LLM-based Evaluation Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Notably, qualitative analysis and the glaucoma sub-analysis revealed clinical inaccuracies in the LLM-generated responses, which were appropriately identified by the GPT-4 evaluation. |
TING FANG TAN et. al. | arxiv-cs.AI | 2024-02-15 |
773 | L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a language agent with chain-of-3D-thoughts (L3GO), an inference-time approach that can reason about part-based 3D mesh generation of unconventional objects that current data-driven diffusion models struggle with. |
YUTARO YAMADA et. al. | arxiv-cs.AI | 2024-02-14 |
774 | Leveraging Large Language Models for Enhanced NLP Task Performance Through Knowledge Distillation and Optimized Training Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our approach presents a scalable methodology that reduces manual annotation costs and increases efficiency, making it especially pertinent in resource-limited and closed-network environments. |
Yining Huang; Keke Tang; Meilian Chen; | arxiv-cs.CL | 2024-02-14 |
775 | GPT-4’s Assessment of Its Performance in A USMLE-based Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates GPT-4’s assessment of its performance in healthcare applications. |
UTTAM DHAKAL et. al. | arxiv-cs.AI | 2024-02-14 |
776 | Utilizing BERT for Information Retrieval: Survey, Applications, Resources, and Challenges Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent years have witnessed a substantial increase in the use of deep learning to solve various natural language processing (NLP) problems. Early deep learning models were … |
JIAJIA WANG et. al. | ACM Computing Surveys | 2024-02-14 |
777 | Changes By Butterflies: Farsighted Forecasting with Group Reservoir Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Group Reservoir Transformer to predict long-term events more accurately and robustly by overcoming two challenges in Chaos: (1) the extensive historical sequences and (2) the sensitivity to initial conditions. |
Md Kowsher; Abdul Rafae Khan; Jia Xu; | arxiv-cs.LG | 2024-02-14 |
778 | An Analysis of Language Frequency and Error Correction for Esperanto Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current Grammatical Error Correction (GEC) initiatives tend to focus on major languages, with less attention given to low-resource languages like Esperanto. In this article, we begin to bridge this gap by first conducting a comprehensive frequency analysis using the Eo-GP dataset, created explicitly for this purpose. |
Junhong Liang; | arxiv-cs.CL | 2024-02-14 |
779 | API Pack: A Massive Multi-Programming Language Dataset for API Call Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce API Pack, a massive multi-programming language dataset containing more than 1 million instruction-API call pairs to improve the API call generation capabilities of large language models. |
Zhen Guo; Adriana Meza Soria; Wei Sun; Yikang Shen; Rameswar Panda; | arxiv-cs.CL | 2024-02-14 |
780 | Research and Application of Transformer Based Anomaly Detection Model: A Literature Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To inspire research on Transformer-based anomaly detection, this review offers a fresh perspective on the concept of anomaly detection. |
Mingrui Ma; Lansheng Han; Chunjie Zhou; | arxiv-cs.LG | 2024-02-14 |
781 | Predictive Analytics in Mental Health Leveraging LLM Embeddings and Machine Learning Models for Social Media Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The prevalence of stress-related disorders has increased significantly in recent years, necessitating scalable methods to identify affected individuals. This paper proposes a … |
AHMAD RADWAN et. al. | Int. J. Web Serv. Res. | 2024-02-14 |
782 | Grounding LLMs For Robot Task Planning Using Closed-loop State Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an innovative planning algorithm that integrates LLMs into the robotics context, enhancing task-focused execution and success rates. |
Vineet Bhat; Ali Umut Kaypak; Prashanth Krishnamurthy; Ramesh Karri; Farshad Khorrami; | arxiv-cs.RO | 2024-02-13 |
783 | The Last JITAI? The Unreasonable Effectiveness of Large Language Models in Issuing Just-in-Time Adaptive Interventions: Fostering Physical Activity in A Prospective Cardiac Rehabilitation Setting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigated the viability of using Large Language Models (LLMs) for triggering and personalizing content for Just-in-Time Adaptive Interventions (JITAIs) in digital health. |
DAVID HAAG et. al. | arxiv-cs.HC | 2024-02-13 |
784 | Auditing Counterfire: Evaluating Advanced Counterargument Generation with Evidence and Style Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We audited large language models (LLMs) for their ability to create evidence-based and stylistic counter-arguments to posts from the Reddit ChangeMyView dataset. We benchmarked … |
Preetika Verma; Kokil Jaidka; Svetlana Churina; | arxiv-cs.CL | 2024-02-13 |
785 | Measuring and Controlling Instruction (In)Stability in Language Model Dialogs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To combat attention decay and instruction drift, we propose a lightweight method called split-softmax, which compares favorably against two strong baselines. |
KENNETH LI et. al. | arxiv-cs.CL | 2024-02-13 |
786 | Combining Insights From Multiple Large Language Models Improves Diagnostic Accuracy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Background: Large language models (LLMs) such as OpenAI’s GPT-4 or Google’s PaLM 2 are proposed as viable diagnostic support tools or even spoken of as replacements for curbside consults. |
Gioele Barabucci; Victor Shia; Eugene Chu; Benjamin Harack; Nathan Fu; | arxiv-cs.AI | 2024-02-13 |
787 | Addressing Cognitive Bias in Medical Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we developed BiasMedQA, a benchmark for evaluating cognitive biases in LLMs applied to medical tasks. |
SAMUEL SCHMIDGALL et. al. | arxiv-cs.CL | 2024-02-12 |
788 | Lissard: Long and Simple Sequential Reasoning Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Lissard, a benchmark comprising seven tasks whose goal is to assess the ability of models to process and generate wide-range sequence lengths, requiring repetitive procedural execution. |
Mirelle Bueno; Roberto Lotufo; Rodrigo Nogueira; | arxiv-cs.CL | 2024-02-12 |
789 | CyberMetric: A Benchmark Dataset Based on Retrieval-Augmented Generation for Evaluating LLMs in Cybersecurity Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To accurately test the general knowledge of LLMs in cybersecurity, the research community needs a diverse, accurate, and up-to-date dataset. To address this gap, we present CyberMetric-80, CyberMetric-500, CyberMetric-2000, and CyberMetric-10000, which are multiple-choice Q&A benchmark datasets comprising 80, 500, 2000, and 10,000 questions respectively. |
Norbert Tihanyi; Mohamed Amine Ferrag; Ridhi Jain; Tamas Bisztray; Merouane Debbah; | arxiv-cs.AI | 2024-02-12 |
790 | Investigating The Impact of Data Contamination of Large Language Models in Text-to-SQL Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the impact of Data Contamination on the performance of GPT-3.5 in the Text-to-SQL code-generating tasks. |
FEDERICO RANALDI et. al. | arxiv-cs.CL | 2024-02-12 |
791 | Transformer Mechanisms Mimic Frontostriatal Gating Operations When Trained on Human Working Memory Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we analyze the mechanisms that emerge within a vanilla attention-only Transformer trained on a simple sequence modeling task inspired by a task explicitly designed to study working memory gating in computational cognitive neuroscience. |
Aaron Traylor; Jack Merullo; Michael J. Frank; Ellie Pavlick; | arxiv-cs.AI | 2024-02-12 |
792 | Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We next present the M2-BERT retrieval encoder, an 80M parameter state-space encoder model built from the Monarch Mixer architecture, capable of scaling to documents up to 32K tokens long. We describe a pretraining data mixture which allows this encoder to process both short and long context sequences, and a finetuning approach that adapts this base model to retrieval with only single-sample batches. |
Jon Saad-Falcon; Daniel Y. Fu; Simran Arora; Neel Guha; Christopher Ré; | arxiv-cs.IR | 2024-02-12 |
793 | Enhancing Programming Error Messages in Real Time with Generative AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We extend this work by implementing feedback from ChatGPT for all programs submitted to our automated assessment tool, Athene, providing help for compiler, run-time, and logic errors. |
BAILEY KIMMEL et. al. | arxiv-cs.HC | 2024-02-12 |
794 | Enhancing Multi-Criteria Decision Analysis with AI: Integrating Analytic Hierarchy Process and GPT-4 for Automated Decision Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study presents a new framework that incorporates the Analytic Hierarchy Process (AHP) and Generative Pre-trained Transformer 4 (GPT-4) large language model (LLM), bringing novel approaches to cybersecurity Multiple-criteria Decision Making (MCDA). |
Igor Svoboda; Dmytro Lande; | arxiv-cs.AI | 2024-02-11 |
795 | Leveraging AI to Advance Science and Computing Education Across Africa: Challenges, Progress and Opportunities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this chapter, we discuss challenges with using AI to advance education across Africa. |
George Boateng; | arxiv-cs.CY | 2024-02-11 |
796 | Gemini Goes to Med School: Exploring The Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our analysis revealed that Gemini is highly susceptible to hallucinations, overconfidence, and knowledge gaps, which indicate risks if deployed uncritically. |
Ankit Pal; Malaikannan Sankarasubbu; | arxiv-cs.CL | 2024-02-10 |
797 | Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection Via Retrieval-Augmented GPT-4 and LLaMA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study details our approach for the CASE 2024 Shared Task on Climate Activism Stance and Hate Event Detection, focusing on Hate Speech Detection, Hate Speech Target Identification, and Stance Detection as classification challenges. |
MAREK ŠUPPA et. al. | arxiv-cs.CL | 2024-02-09 |
798 | UrbanKGent: A Unified Large Language Model Agent Framework for Urban Knowledge Graph Construction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents UrbanKGent, a unified large language model agent framework, for urban knowledge graph construction. |
Yansong Ning; Hao Liu; | arxiv-cs.AI | 2024-02-09 |
799 | Exploring Visual Culture Awareness in GPT-4V: A Comprehensive Probing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we introduce three vision-related tasks, i.e., caption classification, pairwise captioning, and culture tag selection, to systematically delve into fine-grained visual cultural evaluation. |
YONG CAO et. al. | arxiv-cs.CL | 2024-02-08 |
800 | Sparse-VQ Transformer: An FFN-Free Framework with Vector Quantization for Enhanced Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, time series data are uniquely challenging due to significant distribution shifts and intrinsic noise levels. To address these two challenges, we introduce the Sparse Vector Quantized FFN-Free Transformer (Sparse-VQ). |
YANJUN ZHAO et. al. | arxiv-cs.LG | 2024-02-08 |
801 | FACT-GPT: Fact-Checking Augmentation Via Claim Matching with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the societal challenge, we introduce FACT-GPT, a system leveraging Large Language Models (LLMs) to automate the claim matching stage of fact-checking. |
Eun Cheol Choi; Emilio Ferrara; | arxiv-cs.CL | 2024-02-08 |
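Claim matching here reduces to asking an LLM whether a social-media post repeats a previously fact-checked claim. A hedged, zero-shot prompt sketch follows; the wording and label set are ours, not the system's:

```python
def claim_matching_prompt(post: str, checked_claim: str) -> str:
    """Illustrative prompt for the claim-matching stage of fact-checking."""
    return (
        "You are assisting fact-checkers.\n"
        f"Fact-checked claim: {checked_claim}\n"
        f"Social media post: {post}\n"
        "Does the post make the same claim as the fact-checked claim? "
        "Answer with one of: MATCH, NO_MATCH, UNRELATED."
    )
```

The returned string would be sent to any chat-completion model; FACT-GPT's actual prompts, label space, and few-shot setup may differ.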
802 | Named Entity Recognition for Address Extraction in Speech-to-Text Transcriptions Using Synthetic Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an approach for building a Named Entity Recognition (NER) model upon a Bidirectional Encoder Representations from Transformers (BERT) architecture, specifically utilizing the SlovakBERT model. |
Bibiána Lajčinová; Patrik Valábek; Michal Spišiak; | arxiv-cs.CL | 2024-02-08 |
803 | Limits of Transformer Language Models on Learning to Compose Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we evaluate training LLaMA models and prompting GPT-4 and Gemini on four tasks demanding to learn a composition of several discrete sub-tasks. |
JONATHAN THOMM et. al. | arxiv-cs.LG | 2024-02-08 |
804 | Generative Pre-Trained Transformer (GPT) in Research: A Systematic Review on Data Augmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: GPT (Generative Pre-trained Transformer) represents advanced language models that have significantly reshaped the academic writing landscape. These sophisticated language models … |
F. Sufi; | Inf. | 2024-02-08 |
805 | Traditional Machine Learning Models and Bidirectional Encoder Representations From Transformer (BERT)-Based Automatic Classification of Tweets About Eating Disorders: Algorithm Development and Validation Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: Our goal was to identify efficient machine learning models for categorizing tweets related to eating disorders. |
José Alberto Benítez-Andrades; José-Manuel Alija-Pérez; Maria-Esther Vidal; Rafael Pastor-Vargas; María Teresa García-Ordás; | arxiv-cs.CL | 2024-02-08 |
806 | Model Editing with Canonical Examples Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce model editing with canonical examples, a setting in which (1) a single learning example is provided per desired behavior, (2) evaluation is performed exclusively out-of-distribution, and (3) deviation from an initial model is strictly limited. |
JOHN HEWITT et. al. | arxiv-cs.CL | 2024-02-08 |
807 | Efficient Models for The Detection of Hate, Abuse and Profanity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This is unacceptable in civil discourse. The detection of Hate, Abuse and Profanity in text is a vital component of creating civil and unbiased LLMs, which is needed not only for English, but for all languages. In this article, we briefly describe the creation of HAP detectors and various ways of using them to make models civil and acceptable in the output they generate. |
Christoph Tillmann; Aashka Trivedi; Bishwaranjan Bhattacharjee; | arxiv-cs.CL | 2024-02-08 |
808 | Opening The AI Black Box: Program Synthesis Via Mechanistic Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present MIPS, a novel method for program synthesis based on automated mechanistic interpretability of neural networks trained to perform the desired task, auto-distilling the learned algorithm into Python code. |
ERIC J. MICHAUD et. al. | arxiv-cs.LG | 2024-02-07 |
809 | Improving Cross-Domain Low-Resource Text Generation Through LLM Post-Editing: A Programmer-Interpreter Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, the editing strategies in these methods are not optimally designed for text-generation tasks. To address these limitations, we propose a neural programmer-interpreter approach that preserves the domain generalization ability of LLMs when editing their output. |
Zhuang Li; Levon Haroutunian; Raj Tumuluri; Philip Cohen; Gholamreza Haffari; | arxiv-cs.CL | 2024-02-07 |
810 | Spiking-PhysFormer: Camera-Based Remote Photoplethysmography with Parallel Spike-driven Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To the best of our knowledge, we are the first to introduce SNNs into the realm of rPPG, proposing a hybrid neural network (HNN) model, the Spiking-PhysFormer, aimed at reducing power consumption. |
MINGXUAN LIU et. al. | arxiv-cs.CV | 2024-02-07 |
811 | Behind The Screen: Investigating ChatGPT’s Dark Personality Traits and Conspiracy Beliefs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: ChatGPT is notorious for its opaque behavior. This paper tries to shed light on this, providing an in-depth analysis of the dark personality traits and conspiracy beliefs of GPT-3.5 and GPT-4. |
Erik Weber; Jérôme Rutinowski; Markus Pauly; | arxiv-cs.CL | 2024-02-06 |
812 | The Use of A Large Language Model for Cyberbullying Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Several machine learning (ML) algorithms have been proposed for this purpose. |
Bayode Ogunleye; Babitha Dharmaraj; | arxiv-cs.CL | 2024-02-06 |
813 | The Hedgehog & The Porcupine: Expressive Linear Attentions with Softmax Mimicry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We thus propose Hedgehog, a learnable linear attention that retains the spiky and monotonic properties of softmax attention while maintaining linear complexity. |
Michael Zhang; Kush Bhatia; Hermann Kumbong; Christopher Ré; | arxiv-cs.LG | 2024-02-06 |
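For orientation: linear attention replaces softmax(QK^T)V with phi(Q)(phi(K)^T V), which is linear rather than quadratic in sequence length; Hedgehog's point is to make phi learnable so the resulting weights keep softmax's spiky, monotonic shape. A generic non-causal sketch, where the feature map is a stand-in rather than the paper's exact parameterization:

```python
import torch
import torch.nn as nn

class LearnableFeatureMap(nn.Module):
    """Stand-in learnable feature map: Hedgehog trains phi so that
    phi(q)·phi(k) mimics softmax attention weights."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        return torch.exp(self.proj(x))  # positive features keep weights valid

def linear_attention(q, k, v, phi):
    """Non-causal variant for brevity; causal decoding uses running sums."""
    q, k = phi(q), phi(k)                     # (batch, len, dim)
    kv = torch.einsum("bld,ble->bde", k, v)   # accumulate K^T V once
    z = (q @ k.sum(dim=1, keepdim=True).transpose(1, 2)).clamp_min(1e-6)
    return torch.einsum("bld,bde->ble", q, kv) / z
```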
814 | Chinese Cyberbullying Detection Using XLNet and Deep Bi-LSTM Hybrid Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The popularization of the internet and the widespread use of smartphones have led to a rapid growth in the number of social media users. While information technology has brought … |
Shifeng Chen; Jialin Wang; Ketai He; | Inf. | 2024-02-06 |
815 | Grandmaster-Level Chess Without Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike traditional chess engines that rely on complex heuristics, explicit search, or a combination of both, we train a 270M parameter transformer model with supervised learning on a dataset of 10 million chess games. |
ANIAN RUOSS et. al. | arxiv-cs.LG | 2024-02-06 |
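With action-values in hand, play requires no tree search: score each legal move with one forward pass and take the argmax. A schematic sketch using the python-chess library, with the trained network stubbed out (the real system's board/move encoding and value binning differ):

```python
import chess  # python-chess handles board state; the model is hypothetical

def predict_win_prob(board_fen: str, move_uci: str) -> float:
    """Stand-in for the trained 270M-parameter transformer's
    action-value head: estimated win probability after the move."""
    raise NotImplementedError

def pick_move(board: chess.Board) -> chess.Move:
    # No search: one forward pass per legal move, then argmax.
    return max(board.legal_moves,
               key=lambda m: predict_win_prob(board.fen(), m.uci()))
```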
816 | CEHR-GPT: Generating Electronic Health Records with Chronological Patient Timelines Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on synthetic data generation and demonstrate the capability of training a GPT model using a particular patient representation derived from CEHR-BERT, enabling us to generate patient sequences that can be seamlessly converted to the Observational Medical Outcomes Partnership (OMOP) data format. |
CHAO PANG et. al. | arxiv-cs.LG | 2024-02-06 |
817 | A Survey on Transformer Compression Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer plays a vital role in the realms of natural language processing (NLP) and computer vision (CV), specially for constructing large language models (LLM) and large vision … |
YEHUI TANG et. al. | ArXiv | 2024-02-05 |
818 | Identifying Reasons for Contraceptive Switching from Real-World Data Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We demonstrate that GPT-4 can accurately extract reasons for contraceptive switching, outperforming baseline BERT-based models with microF1 scores of 0.849 and 0.881 for contraceptive start and stop extraction, respectively. |
BRENDA Y. MIAO et. al. | arxiv-cs.CL | 2024-02-05 |
819 | MobilityGPT: Enhanced Human Mobility Modeling with A GPT Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these issues, we reformat human mobility modeling as an autoregressive generation task, leveraging Generative Pre-trained Transformer (GPT). To ensure its controllable generation to alleviate the above challenges, we propose a geospatially-aware generative model, MobilityGPT. |
Ammar Haydari; Dongjie Chen; Zhengfeng Lai; Chen-Nee Chuah; | arxiv-cs.LG | 2024-02-05 |
820 | Self-Discover: Large Language Models Self-Compose Reasoning Structures IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the task-intrinsic reasoning structures to tackle complex reasoning problems that are challenging for typical prompting methods. |
PEI ZHOU et. al. | arxiv-cs.AI | 2024-02-05 |
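As described, SELF-DISCOVER is a two-stage prompting scheme: the model first composes a task-specific reasoning structure out of atomic reasoning modules, then follows that structure to solve the task. A skeletal sketch; call_llm, the module list, and the prompt wording are our assumptions:

```python
REASONING_MODULES = [
    "Break the problem into sub-problems.",
    "Think step by step.",
    "Reflect on potential errors in the reasoning.",
]  # illustrative; the paper draws on a larger seed set of modules

def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around any chat-completion API."""
    raise NotImplementedError

def self_discover(task: str) -> str:
    # Stage 1: the model self-composes a reasoning structure for this task.
    structure = call_llm(
        f"Task: {task}\nSelect, adapt, and compose the useful modules "
        "below into a step-by-step reasoning structure (as JSON):\n"
        + "\n".join(REASONING_MODULES)
    )
    # Stage 2: the model solves the task by following its own structure.
    return call_llm(
        f"Task: {task}\nFollow this reasoning structure, filling in each "
        f"step, to reach a final answer:\n{structure}"
    )
```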
821 | UniMem: Towards A Unified View of Long-Context Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We reformulate 16 existing methods based on UniMem and analyze four representative methods: Transformer-XL, Memorizing Transformer, RMT, and Longformer into equivalent UniMem forms to reveal their design principles and strengths. Based on these analyses, we propose UniMix, an innovative approach that integrates the strengths of these algorithms. |
JUNJIE FANG et. al. | arxiv-cs.CL | 2024-02-05 |
822 | Pard: Permutation-Invariant Autoregressive Diffusion for Graph Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce PARD, a Permutation-invariant Auto Regressive Diffusion model that integrates diffusion models with autoregressive methods. |
Lingxiao Zhao; Xueying Ding; Leman Akoglu; | arxiv-cs.LG | 2024-02-05 |
823 | Conversation Reconstruction Attack Against GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We will responsibly disclose our findings to the suppliers of related large language models. |
Junjie Chu; Zeyang Sha; Michael Backes; Yang Zhang; | arxiv-cs.CR | 2024-02-05 |
824 | Harnessing PubMed User Query Logs for Post Hoc Explanations of Recommended Similar Articles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our major contribution is building PubCLogs by repurposing 5.6 million pairs of co-clicked articles from PubMed’s user query logs. |
Ashley Shin; Qiao Jin; James Anibal; Zhiyong Lu; | arxiv-cs.IR | 2024-02-05 |
825 | SWAG: Storytelling With Action Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Storytelling With Action Guidance (SWAG), a novel approach to storytelling with LLMs. |
Zeeshan Patel; Karim El-Refai; Jonathan Pei; Tianle Li; | arxiv-cs.CL | 2024-02-05 |
826 | Comparing Abstraction in Humans and Large Language Models Using Multimodal Serial Reproduction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To investigate the effect of language on the formation of abstractions, we implement a novel multimodal serial reproduction framework by asking people who receive a visual stimulus to reproduce it in a linguistic format, and vice versa. |
SREEJAN KUMAR et. al. | arxiv-cs.AI | 2024-02-05 |
827 | DenseFormer: Enhancing Information Flow in Transformers Via Depth Weighted Averaging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The transformer architecture by Vaswani et al. (2017) is now ubiquitous across application domains, from natural language processing to speech processing and image understanding. We propose DenseFormer, a simple modification to the standard architecture that improves the perplexity of the model without increasing its size — adding a few thousand parameters for large-scale models in the 100B parameters range. |
Matteo Pagliardini; Amirkeivan Mohtashami; Francois Fleuret; Martin Jaggi; | arxiv-cs.CL | 2024-02-04 |
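The described change is small enough to sketch: after each block, replace the running representation with a learned weighted average of the current output and every earlier depth (embeddings included), adding only a handful of scalars per layer. A minimal PyTorch reading of that idea:

```python
import torch
import torch.nn as nn

class DenseFormerStack(nn.Module):
    """Sketch of depth-weighted averaging (DWA) between ordinary
    transformer blocks; only the alpha scalars are new parameters."""
    def __init__(self, blocks):
        super().__init__()
        self.blocks = nn.ModuleList(blocks)
        # alphas[i] weights representations 0..i+1 after block i.
        self.alphas = nn.ParameterList(
            nn.Parameter(torch.zeros(i + 2)) for i in range(len(blocks)))
        for a in self.alphas:
            a.data[-1] = 1.0  # start as the vanilla residual stream

    def forward(self, x):
        history = [x]  # the embedding output counts as depth 0
        for block, alpha in zip(self.blocks, self.alphas):
            history.append(block(history[-1]))
            # Depth-weighted average over all representations so far.
            history[-1] = sum(w * h for w, h in zip(alpha, history))
        return history[-1]
```

Initializing the newest weight to 1 and the rest to 0 makes the stack start out exactly as a standard transformer, which seems consistent with the paper's claim of adding only a few thousand parameters at the 100B scale.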
828 | Evaluating Large Language Models in Analysing Classroom Dialogue Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the application of Large Language Models (LLMs), specifically GPT-4, in the analysis of classroom dialogue, a crucial research task for both teaching diagnosis and quality improvement. |
Yun Long; Haifeng Luo; Yu Zhang; | arxiv-cs.CL | 2024-02-04 |
829 | Improving Assessment of Tutoring Practices Using Retrieval-Augmented Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: One-on-one tutoring is an effective instructional method for enhancing learning, yet its efficacy hinges on tutor competencies. Novice math tutors often prioritize … |
ZIFEI HAN et. al. | ArXiv | 2024-02-04 |
830 | Data Quality Matters: Suicide Intention Detection on Social Media Posts Using A RoBERTa-CNN Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on identifying suicidal intentions in SuicideWatch Reddit posts and present a novel approach to suicide detection using the cutting-edge RoBERTa-CNN model, a variant of RoBERTa (Robustly optimized BERT approach). |
Emily Lin; Jian Sun; Hsingyu Chen; Mohammad H. Mahoor; | arxiv-cs.CL | 2024-02-03 |
831 | Spin: An Efficient Secure Computation Framework with GPU Acceleration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose optimized protocols for non-linear functions that are critical for machine learning, as well as several novel optimizations specific to attention, the fundamental unit of Transformer models, allowing Spin to perform non-trivial CNN training and Transformer inference without sacrificing security. |
WUXUAN JIANG et. al. | arxiv-cs.CR | 2024-02-03 |
832 | GPT-4V As Traffic Assistant: An In-depth Look at Vision Language Model on Complex Traffic Events Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The advent of large vision-language models (VLMs) such as GPT-4V, has introduced innovative approaches to addressing this issue. In this paper, we explore the ability of GPT-4V with a set of representative traffic incident videos and delve into the model’s capacity of understanding these complex traffic situations. |
Xingcheng Zhou; Alois C. Knoll; | arxiv-cs.CV | 2024-02-03 |
833 | User Intent Recognition and Satisfaction with Large Language Models: A User Study with ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on a fine-grained intent taxonomy and intent-based prompt reformulations, we analyze (1) the quality of intent recognition and (2) user satisfaction with answers from intent-based prompt reformulations for two recent ChatGPT models, GPT-3.5 Turbo and GPT-4 Turbo. |
Anna Bodonhelyi; Efe Bozkir; Shuo Yang; Enkelejda Kasneci; Gjergji Kasneci; | arxiv-cs.HC | 2024-02-03 |
834 | ScribFormer: Transformer Makes CNN Work Better for Scribble-based Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite its multiple benefits, this framework can generally capture only short-range feature dependencies, since convolutional layers have local receptive fields, which makes it difficult to learn global shape information from the limited cues provided by scribble annotations. To address this issue, this paper proposes a new CNN-Transformer hybrid solution for scribble-supervised medical image segmentation called ScribFormer. |
ZIHAN LI et. al. | arxiv-cs.CV | 2024-02-02 |
835 | LLM-Detector: Improving AI-Generated Chinese Text Detection with Open-Source LLM Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: ChatGPT and other general large language models (LLMs) have achieved remarkable success, but they have also raised concerns about the misuse of AI-generated texts. |
RONGSHENG WANG et. al. | arxiv-cs.CL | 2024-02-02 |
836 | MasonPerplexity at Multimodal Hate Speech Event Detection 2024: Hate Speech and Target Detection Using Transformer Ensembles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the MasonPerplexity submission for the Shared Task on Multimodal Hate Speech Event Detection at CASE 2024 at EACL 2024. |
AMRITA GANGULY et. al. | arxiv-cs.CL | 2024-02-02 |
837 | Faster Inference of Integer SWIN Transformer By Removing The GELU Activation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we improve upon the inference latency of the state-of-the-art methods by removing the floating-point operations, which are associated with the GELU activation in Swin Transformer. |
Mohammadreza Tayaranian; Seyyed Hasan Mozafari; James J. Clark; Brett Meyer; Warren Gross; | arxiv-cs.CV | 2024-02-02 |
838 | COMET: Generating Commit Messages Using Delta Graph Context Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Comet (Context-Aware Commit Message Generation), a novel approach that captures context of code changes using a graph-based representation and leverages a transformer-based model to generate high-quality commit messages. |
Abhinav Reddy Mandli; Saurabhsingh Rajput; Tushar Sharma; | arxiv-cs.SE | 2024-02-02 |
839 | Ensemble of Ghost Convolution Block with Nested Transformer Encoder for Dense Object Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
Ponduri Vasanthi; L. Mohan; | Biomed. Signal Process. Control. | 2024-02-01 |
840 | Ultra Fast Transformers on FPGAs for Particle Physics Experiments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we have implemented critical components of a transformer model, such as multi-head attention and softmax layers. |
ZHIXING JIANG et. al. | arxiv-cs.LG | 2024-02-01 |
841 | Rail Surface Defect Detection Using A Transformer-based Network Related Papers Related Patents Related Grants Related Venues Related Experts View |
Feng Guo; Jian Liu; Yu Qian; Quanyi Xie; | J. Ind. Inf. Integr. | 2024-02-01 |
842 | Intelligent Fault Diagnosis of Consumer Electronics Sensor in IoE Via Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The IoE era is coming with the development of information and communication technology. As a typical representative of the IoE intelligent era, consumer electronics products have … |
Wen-Chieh Lin; | IEEE Transactions on Consumer Electronics | 2024-02-01 |
843 | COA-GPT: Generative Pre-trained Transformers for Accelerated Course of Action Development in Military Operations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The development of Courses of Action (COAs) in military operations is traditionally a time-consuming and intricate process. Addressing this challenge, this study introduces COA-GPT, a novel algorithm employing Large Language Models (LLMs) for rapid and efficient generation of valid COAs. |
Vinicius G. Goecks; Nicholas Waytowich; | arxiv-cs.AI | 2024-02-01 |
844 | Comparative Study of Large Language Model Architectures on Frontier Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Employing the same materials science text corpus and a comprehensive end-to-end pipeline, we conduct a comparative analysis of their training and downstream performance. |
Junqi Yin; Avishek Bose; Guojing Cong; Isaac Lyngaas; Quentin Anthony; | arxiv-cs.DC | 2024-02-01 |
845 | HARDSEA: Hybrid Analog-ReRAM Clustering and Digital-SRAM In-Memory Computing Accelerator for Dynamic Sparse Self-Attention in Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Self-attention-based transformers have outperformed recurrent and convolutional neural networks (RNN/ CNNs) in many applications. Despite the effectiveness, calculating … |
SHIWEI LIU et. al. | IEEE Transactions on Very Large Scale Integration (VLSI) … | 2024-02-01 |
846 | Understanding The Expressive Power and Mechanisms of Transformer for Sequence Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct a systematic study of the approximation properties of Transformer for sequence modeling with long, sparse and complicated memory. |
Mingze Wang; Weinan E; | arxiv-cs.LG | 2024-02-01 |
847 | Masked Siamese Prompt Tuning for Few-Shot Natural Language Understanding Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recently, prompt-based learning has shown excellent performance on few-shot scenarios. Using frozen language models to tune trainable continuous prompt embeddings has become a … |
Shiwen Ni; Hung-Yu Kao; | IEEE Transactions on Artificial Intelligence | 2024-02-01 |
848 | RobinNet: A Multimodal Speech Emotion Recognition System With Speaker Recognition for Social Interactions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: It is essential to understand the underlying emotions that are imparted through speech in order to study social communications as well as to generate seamless human–computer … |
Yash Khurana; Swamita Gupta; R. Sathyaraj; S. Raja; | IEEE Transactions on Computational Social Systems | 2024-02-01 |
849 | Lesion Identification in Fundus Images Via Convolutional Neural Network-vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jian Lian; Tianyu Liu; | Biomed. Signal Process. Control. | 2024-02-01 |
850 | HTC-Net: A Hybrid CNN-transformer Framework for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
HUI TANG et. al. | Biomed. Signal Process. Control. | 2024-02-01 |
851 | Generation, Distillation and Evaluation of Motivational Interviewing-Style Reflections with A Foundational Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a method for distilling the generation of reflections from a Foundational Language Model (GPT-4) into smaller models. |
ANDREW BROWN et. al. | arxiv-cs.CL | 2024-02-01 |
852 | Self-Supervised Contrastive Pre-Training for Multivariate Point Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a new paradigm for self-supervised learning for multivariate point processes using a transformer encoder. |
Xiao Shou; Dharmashankar Subramanian; Debarun Bhattacharjya; Tian Gao; Kristin P. Bennet; | arxiv-cs.LG | 2024-02-01 |
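The highlight names the ingredients, a transformer encoder over event sequences plus self-supervised contrastive pre-training, but not the exact objective. For orientation only, here is a generic InfoNCE loss of the kind such pre-training typically uses; how positives are formed for multivariate point processes is the paper's contribution and is not reproduced here:

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, temperature=0.1):
    """Generic InfoNCE: row i of `positive` is the positive view of row i
    of `anchor` (e.g., two views of the same event sequence); every other
    row in the batch acts as a negative."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature          # (batch, batch) similarities
    labels = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, labels)
```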
853 | Dendritic Learning-Incorporated Vision Transformer for Image Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View |
Zhiming Zhang; Zhenyu Lei; M. Omura; Hideyuki Hasegawa; Shangce Gao; | IEEE CAA J. Autom. Sinica | 2024-02-01 |
854 | Towards Integrated and Fine-grained Traffic Forecasting: A Spatio-Temporal Heterogeneous Graph Transformer Approach Related Papers Related Patents Related Grants Related Venues Related Experts View |
GUANGYUE LI et. al. | Inf. Fusion | 2024-02-01 |
855 | Mitigating The Problem of Strong Priors in LMs with Context Extrapolation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We apply it to eleven models including GPT-2, GPT-3, Llama 2, and Mistral on four tasks, and find improvements in 41 of the 44 model-task combinations. |
Raymond Douglas; Andis Draguns; Tomáš Gavenčiak; | arxiv-cs.CL | 2024-01-31 |
856 | Evaluating The Capabilities of LLMs for Supporting Anticipatory Impact Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we demonstrate the potential for generating high-quality and diverse impacts of AI in society by fine-tuning completion models (GPT-3 and Mistral-7B) on a diverse sample of articles from news media and comparing those outputs to the impacts generated by instruction-based (GPT-4 and Mistral-7B-Instruct) models. |
Mowafak Allaham; Nicholas Diakopoulos; | arxiv-cs.CL | 2024-01-31 |
857 | Paramanu: A Family of Novel Efficient Indic Generative Foundation Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Gyan AI Paramanu (atom), a family of novel language models for Indian languages. |
Mitodru Niyogi; Arnab Bhattacharya; | arxiv-cs.CL | 2024-01-31 |
858 | Global-Liar: Factuality of LLMs Over Time and Geographic Regions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce ‘Global-Liar,’ a dataset uniquely balanced in terms of geographic and temporal representation, facilitating a more nuanced evaluation of LLM biases. |
Shujaat Mirza; Bruno Coelho; Yuyuan Cui; Christina Pöpper; Damon McCoy; | arxiv-cs.CL | 2024-01-31 |
859 | Evaluating The Effectiveness of GPT-4 Turbo in Creating Defeaters for Assurance Cases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Identifying defeaters, arguments that refute these ACs, is essential for improving the robustness and confidence in ACs. To automate this task, we introduce a novel method that leverages the capabilities of GPT-4 Turbo, an advanced Large Language Model (LLM) developed by OpenAI, to identify defeaters within ACs formalized using the Eliminative Argumentation (EA) notation. |
KIMYA KHAKZAD SHAHANDASHTI et. al. | arxiv-cs.SE | 2024-01-31 |
860 | Fine-Tuning and Prompt Engineering for Large Language Models-based Code Review Automation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: We aim to investigate the performance of LLMs-based code review automation based on two contexts, i.e., when LLMs are leveraged by fine-tuning and prompting. |
Chanathip Pornprasit; Chakkrit Tantithamthavorn; | arxiv-cs.SE | 2024-01-31 |
861 | Spatial-Spectral BERT for Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Several deep learning and transformer models have been recommended in previous research to deal with the classification of hyperspectral images (HSIs). Among them, one of the most … |
MAHMOOD ASHRAF et. al. | Remote. Sens. | 2024-01-31 |
862 | ConSmax: Hardware-Friendly Alternative Softmax with Learnable Parameters Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Apart from the non-linearity, the low arithmetic intensity greatly reduces the processing parallelism, which becomes the bottleneck especially when dealing with a longer context. To address this challenge, we propose Constant Softmax (ConSmax), a software-hardware co-design as an efficient Softmax alternative. |
SHIWEI LIU et. al. | arxiv-cs.AR | 2024-01-31 |
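The entry above describes replacing softmax's row-wise reductions with learnable constants. As a rough illustration only (the paper's exact parameterization and hardware mapping are not reproduced here), a learnable-constant softmax substitute might look like this in PyTorch, with `beta` standing in for the per-row max and `gamma` for the normalizing sum:

```python
import torch
import torch.nn as nn

class ConstSoftmax(nn.Module):
    """Softmax-like activation with learnable scalar constants.

    Sketch of the general ConSmax idea: the per-row max subtraction and
    per-row sum of standard softmax are replaced by learnable scalars,
    so each element is computed independently, with no row-wise
    reduction at inference time. Details differ from the paper.
    """
    def __init__(self):
        super().__init__()
        self.beta = nn.Parameter(torch.zeros(1))   # stands in for max(x)
        self.gamma = nn.Parameter(torch.ones(1))   # stands in for sum(exp(x))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.exp(x - self.beta) / self.gamma

scores = torch.randn(2, 4, 8)           # e.g. attention logits
print(ConstSoftmax()(scores).shape)     # torch.Size([2, 4, 8])
```

Because the constants are learned rather than computed per row, the outputs are no longer guaranteed to sum to one; that relaxation is exactly what removes the reduction bottleneck the highlight mentions.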
863 | Towards AI-Assisted Synthesis of Verified Dafny Methods Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we demonstrate how to improve two pretrained models’ proficiency in the Dafny verification-aware language. |
Md Rakib Hossain Misu; Cristina V. Lopes; Iris Ma; James Noble; | arxiv-cs.SE | 2024-01-31 |
864 | Human-mediated Large Language Models for Robotic Intervention in Children with Autism Spectrum Disorders Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This practice restricts the use of robots to limited, pre-mediated instructional curricula. In this paper, we increase robot autonomy in one such robotic intervention for children with ASD by implementing perspective-taking teaching. |
Ruchik Mishra; Karla Conn Welch; Dan O Popa; | arxiv-cs.RO | 2024-01-31 |
865 | Lightweight Transformer Image Feature Extraction Network IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In recent years, the image feature extraction method based on Transformer has become a research hotspot. However, when using Transformer for image feature extraction, the model’s … |
Wenfeng Zheng; Siyu Lu; Youshuai Yang; Zhengtong Yin; Lirong Yin; | PeerJ Comput. Sci. | 2024-01-31 |
866 | BurstGPT: A Real-world Workload Dataset to Optimize LLM Serving Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work presents BurstGPT, an LLM serving workload with 5.29 million traces from regional Azure OpenAI GPT services over 121 days. |
YUXIN WANG et. al. | arxiv-cs.DC | 2024-01-31 |
867 | Scavenging Hyena: Distilling Transformers Into Long Convolution Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a pioneering approach to address the efficiency concerns associated with LLM pre-training, proposing the use of knowledge distillation for cross-architecture transfer. |
Tokiniaina Raharison Ralambomihanta; Shahrad Mohammadzadeh; Mohammad Sami Nur Islam; Wassim Jabbour; Laurence Liang; | arxiv-cs.CL | 2024-01-30 |
868 | SAL-PIM: A Subarray-level Processing-in-Memory Architecture with LUT-based Linear Interpolation for Transformer-based Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SAL-PIM, a subarray-level, HBM-based processing-in-memory architecture for the end-to-end acceleration of transformer-based text generation. |
Wontak Han; Hyunjun Cho; Donghyuk Kim; Joo-Young Kim; | arxiv-cs.AR | 2024-01-30 |
869 | Arabic Tweet Act: A Weighted Ensemble Pre-Trained Transformer Model for Classifying Arabic Speech Acts on Twitter Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a Twitter dialectal Arabic speech act classification approach based on a transformer deep learning neural network. |
Khadejaa Alshehri; Areej Alhothali; Nahed Alowidi; | arxiv-cs.CL | 2024-01-30 |
870 | Fine-tuning Transformer-based Encoder for Turkish Language Understanding Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we provide a Transformer-based model and a baseline benchmark for the Turkish Language. |
Savas Yildirim; | arxiv-cs.CL | 2024-01-30 |
871 | More Than Meets The AI: Evaluating The Performance of GPT-4 on Computer Graphics Assessment Questions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies have showcased the exceptional performance of LLMs (Large Language Models) on assessment questions across various discipline areas. This can be helpful if used to … |
Tony Haoran Feng; Paul Denny; Burkhard C. Wünsche; Andrew Luxton-Reilly; Steffan Hooper; | Proceedings of the 26th Australasian Computing Education … | 2024-01-29 |
872 | Response Generation for Cognitive Behavioral Therapy with Large Language Models: Comparative Study with Socratic Questioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we construct dialogue modules based on a CBT scenario focused on conventional Socratic questioning using two kinds of LLMs: a Transformer-based dialogue model further trained with a social media empathetic counseling dataset, provided by Osaka Prefecture (OsakaED), and GPT-4, a state-of-the art LLM created by OpenAI. |
KENTA IZUMI et. al. | arxiv-cs.CL | 2024-01-29 |
873 | TQCompressor: Improving Tensor Decomposition Methods in Neural Networks Via Permutations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce TQCompressor, a novel method for neural network model compression with improved tensor decompositions. |
V. ABRONIN et. al. | arxiv-cs.LG | 2024-01-29 |
874 | Breaking Free Transformer Models: Task-specific Context Attribution Promises Improved Generalizability Without Fine-tuning Pre-trained LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a framework that allows for maintaining generalizability, and enhances the performance on the downstream task by utilizing task-specific context attribution. |
Stepan Tytarenko; Mohammad Ruhul Amin; | arxiv-cs.CL | 2024-01-29 |
875 | PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the lack of a specialized, high-quality benchmark has impeded their development and precise evaluation. To address this, we introduce PathMMU, the largest and highest-quality expert-validated pathology benchmark for Large Multimodal Models (LMMs). |
YUXUAN SUN et. al. | arxiv-cs.CV | 2024-01-29 |
876 | 3DG: A Framework for Using Generative AI for Handling Sparse Learner Performance Data From Intelligent Tutoring Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Learning performance data (e.g., quiz scores and attempts) is significant for understanding learner engagement and knowledge mastery level. However, the learning performance data … |
Liang Zhang; Jionghao Lin; Conrad Borchers; Meng Cao; Xiangen Hu; | ArXiv | 2024-01-29 |
877 | You Tell Me: A Dataset of GPT-4-Based Behaviour Change Support Conversations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Conversational agents are increasingly used to address emotional needs on top of information needs. One use case of increasing interest are counselling-style mental health and … |
Selina Meyer; David Elsweiler; | arxiv-cs.HC | 2024-01-29 |
878 | Security Code Review By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: (4) GPT-4 is more adept at identifying security defects in code files that have fewer tokens, contain functional logic, and are written by developers with less involvement in the project. |
JIAXIN YU et. al. | arxiv-cs.SE | 2024-01-29 |
879 | Leveraging Professional Radiologists’ Expertise to Enhance LLMs’ Evaluation for Radiology Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Experimental results show that our Detailed GPT-4 (5-shot) model achieves a 0.48 score, outperforming the METEOR metric by 0.19, while our Regressed GPT-4 model shows even greater alignment with expert evaluations, exceeding the best existing metric by a 0.35 margin. |
QINGQING ZHU et. al. | arxiv-cs.CL | 2024-01-29 |
880 | Contrastive Learning and Mixture of Experts Enables Precise Vector Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We apply a novel Mixture of Experts (MoE) extension pipeline to pretrained BERT models, where every multi-layer perceptron section is enlarged and copied into multiple distinct experts. |
Logan Hallee; Rohan Kapur; Arjun Patel; Jason P. Gleghorn; Bohdan Khomtchouk; | arxiv-cs.LG | 2024-01-28 |
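The pipeline described above, copying each pretrained feed-forward block into several experts behind a router, can be sketched as follows. This is a hypothetical illustration of the stated idea, not the authors' code; `mlp_to_moe` and its dense (non-sparse) mixing are assumptions:

```python
import copy

import torch
import torch.nn as nn

def mlp_to_moe(mlp: nn.Module, hidden: int, n_experts: int = 4):
    """Copy a pretrained MLP block into distinct experts and mix their
    outputs per token with a learned router (dense mixture for brevity)."""
    experts = nn.ModuleList(copy.deepcopy(mlp) for _ in range(n_experts))
    router = nn.Linear(hidden, n_experts)

    def forward(x):                                      # x: (batch, seq, hidden)
        weights = torch.softmax(router(x), dim=-1)       # (batch, seq, E)
        outs = torch.stack([e(x) for e in experts], -1)  # (batch, seq, hidden, E)
        return (outs * weights.unsqueeze(2)).sum(-1)

    return forward

ffn = nn.Sequential(nn.Linear(16, 64), nn.GELU(), nn.Linear(64, 16))
moe = mlp_to_moe(ffn, hidden=16)
print(moe(torch.randn(2, 5, 16)).shape)  # torch.Size([2, 5, 16])
```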
881 | Evaluating LLM-Generated Multimodal Diagnosis from Medical Images and Symptom Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) constitute a breakthrough state-of-the-art Artificial Intelligence technology which is rapidly evolving and promises to aid in medical diagnosis. … |
Dimitrios P. Panagoulias; M. Virvou; G. Tsihrintzis; | ArXiv | 2024-01-28 |
882 | Identifying and Improving Disability Bias in GPT-Based Resume Screening Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, without examining the potential of bias, this may negatively impact marginalized populations, including people with disabilities. To address this important concern, we present a resume audit study, in which we ask ChatGPT (specifically, GPT-4) to rank a resume against the same resume enhanced with an additional leadership award, scholarship, panel presentation, and membership that are disability related. |
Kate Glazko; Yusuf Mohammed; Ben Kosa; Venkatesh Potluri; Jennifer Mankoff; | arxiv-cs.CY | 2024-01-28 |
883 | UnMASKed: Quantifying Gender Biases in Masked Language Models Through Linguistically Informed Job Market Prompts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluated six prominent models: BERT, RoBERTa, DistilBERT, BERT-multilingual, XLM-RoBERTa, and DistilBERT-multilingual. |
Iñigo Parra; | arxiv-cs.CL | 2024-01-28 |
884 | Semantics of Multiword Expressions in Transformer-Based Models: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Addressing this gap, we provide the first in-depth survey of MWE processing with transformer models. Overall, we find that they capture MWE semantics inconsistently, as shown by reliance on surface patterns and memorized information. |
Filip Miletić; Sabine Schulte im Walde; | arxiv-cs.CL | 2024-01-27 |
885 | Prompting Diverse Ideas: Increasing AI Idea Variance Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Unlike routine tasks where consistency is prized, in creativity and innovation the goal is to create a diverse set of ideas. This paper delves into the burgeoning interest in … |
Lennart Meincke; Ethan R. Mollick; Christian Terwiesch; | ArXiv | 2024-01-27 |
886 | A New Method for Vehicle Logo Recognition Based on Swin Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we implement real-time VLR using Swin Transformer and fine-tune it for optimal performance. |
Yang Li; Doudou Zhang; Jianli Xiao; | arxiv-cs.CV | 2024-01-27 |
887 | Large Language Model for Vulnerability Detection: Emerging Results and Future Directions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the effectiveness of LLMs in detecting software vulnerabilities is largely unexplored. This paper aims to bridge this gap by exploring how LLMs perform with various prompts, particularly focusing on two state-of-the-art LLMs: GPT-3.5 and GPT-4. |
Xin Zhou; Ting Zhang; David Lo; | arxiv-cs.SE | 2024-01-27 |
888 | Scalable Qualitative Coding with LLMs: Chain-of-Thought Reasoning Matches Human Performance in Some Hermeneutic Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Importantly, we find that coding fidelity improves considerably when the LLM is prompted to give rationale justifying its coding decisions (chain-of-thought reasoning). We present these and other findings along with a set of best practices for adapting traditional codebooks for LLMs. |
Zackary Okun Dunivin; | arxiv-cs.CL | 2024-01-26 |
889 | Dynamic Transformer Architecture for Continual Learning of Multimodal Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a transformer-based CL framework focusing on learning tasks that involve both vision and language, known as Vision-and-Language (VaL) tasks. |
Yuliang Cai; Mohammad Rostami; | arxiv-cs.CV | 2024-01-26 |
890 | From GPT-4 to Gemini and Beyond: Assessing The Landscape of MLLMs on Generalizability, Trustworthiness and Causality Through Four Modalities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Multi-modal Large Language Models (MLLMs) have shown impressive abilities in generating reasonable responses with respect to multi-modal content. |
CHAOCHAO LU et. al. | arxiv-cs.CV | 2024-01-26 |
891 | Investigate-Consolidate-Exploit: A General Strategy for Inter-Task Agent Self-Evolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Investigate-Consolidate-Exploit (ICE), a novel strategy for enhancing the adaptability and flexibility of AI agents through inter-task self-evolution. |
CHENG QIAN et. al. | arxiv-cs.CL | 2024-01-25 |
892 | Relative Value Biases in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Studies of reinforcement learning in humans and animals have demonstrated a preference for options that yielded relatively better outcomes in the past, even when those options are associated with lower absolute reward. |
William M. Hayes; Nicolas Yax; Stefano Palminteri; | arxiv-cs.CL | 2024-01-25 |
893 | Evaluating GPT-3.5’s Awareness and Summarization Abilities for European Constitutional Texts with Shared Topics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, using the renowned GPT-3.5, we leverage generative large language models to understand constitutional passages that transcend national boundaries. |
Candida M. Greco; A. Tagarelli; | arxiv-cs.CL | 2024-01-25 |
894 | MEDs for PETs: Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We train a multilingual transformer model (XLM-RoBERTa) to disambiguate potentially euphemistic terms (PETs) in multilingual and cross-lingual settings. |
PATRICK LEE et. al. | arxiv-cs.CL | 2024-01-25 |
895 | (Chat)GPT V BERT: Dawn of Justice for Semantic Change Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we specifically focus on the temporal problem of semantic change, and evaluate their ability to solve two diachronic extensions of the Word-in-Context (WiC) task: TempoWiC and HistoWiC. |
Francesco Periti; Haim Dubossarsky; Nina Tahmasebi; | arxiv-cs.CL | 2024-01-25 |
896 | An In-Depth Review of ChatGPT’s Pros and Cons for Learning and Teaching in Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: As technology progresses, there has been an increasing interest in using Chatbot GPT (Generative Pre-trained Transformer) in education. Chatbot GPT, or ChatGPT, gained one million … |
A. Samala; Xiaoming Zhai; Kumiko Aoki; Ljubiša Bojić; Simona Žikić; | Int. J. Interact. Mob. Technol. | 2024-01-25 |
897 | Unmasking and Quantifying Racial Bias of Large Language Models in Medical Report Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite a few attempts in the past, the precise impact and extent of these biases remain uncertain. Through both qualitative and quantitative analyses, we find that these models tend to project higher costs and longer hospitalizations for White populations and exhibit optimistic views in challenging medical scenarios with much higher survival rates. |
Yifan Yang; Xiaoyu Liu; Qiao Jin; Furong Huang; Zhiyong Lu; | arxiv-cs.CL | 2024-01-24 |
898 | ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce ConTextual, a novel dataset featuring human-crafted instructions that require context-sensitive reasoning for text-rich images. |
Rohan Wadhawan; Hritik Bansal; Kai-Wei Chang; Nanyun Peng; | arxiv-cs.CV | 2024-01-24 |
899 | Automated Root Causing of Cloud Incidents Using In-Context Learning with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the high cost of fine-tuning LLM, we propose an in-context learning approach for automated root causing, which eliminates the need for fine-tuning. |
XUCHAO ZHANG et. al. | arxiv-cs.CL | 2024-01-24 |
900 | Evaluation of General Large Language Models in Contextually Assessing Semantic Concepts Extracted from Adult Critical Care Electronic Health Record Notes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: We sought to evaluate the performance of LLMs in the complex clinical context of adult critical care medicine using systematic and comprehensible analytic methods, including clinician annotation and adjudication. |
DARREN LIU et. al. | arxiv-cs.CL | 2024-01-24 |
901 | A Comparative Study of Zero-shot Inference with Large Language Models and Supervised Modeling in Breast Cancer Pathology Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explored whether recent LLMs can reduce the need for large-scale data annotations. |
MADHUMITA SUSHIL et. al. | arxiv-cs.CL | 2024-01-24 |
902 | Discovering Mathematical Formulas from Data Via GPT-guided Monte Carlo Tree Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To optimize the trade-off between efficiency and versatility, we introduce SR-GPT, a novel algorithm for symbolic regression that integrates Monte Carlo Tree Search (MCTS) with a Generative Pre-Trained Transformer (GPT). |
YANJIE LI et. al. | arxiv-cs.LG | 2024-01-24 |
903 | Can GPT-3.5 Generate and Code Discharge Summaries? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We report micro- and macro-F1 scores on the full codeset, generation codes, and their families. |
MATÚŠ FALIS et. al. | arxiv-cs.CL | 2024-01-24 |
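For readers unfamiliar with the two aggregations reported above: micro-F1 pools all label decisions before computing the score, while macro-F1 averages per-class scores equally, so rare codes weigh more under macro. A quick scikit-learn check (toy labels, not the paper's data):

```python
from sklearn.metrics import f1_score

y_true = [0, 1, 2, 2, 1, 0]  # toy gold codes
y_pred = [0, 2, 2, 2, 1, 1]  # toy predictions

print(f1_score(y_true, y_pred, average="micro"))  # pooled over all decisions
print(f1_score(y_true, y_pred, average="macro"))  # unweighted mean over classes
```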
904 | Convolutional Initialization for Data-Efficient Vision Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In contrast, convolutional neural networks (CNNs) can achieve state-of-the-art performance by leveraging their architectural inductive bias. In this paper, we investigate whether this inductive bias can be reinterpreted as an initialization bias within a vision transformer network. |
Jianqiao Zheng; Xueqian Li; Simon Lucey; | arxiv-cs.CV | 2024-01-23 |
905 | TAT-LLM: A Specialized Language Model for Discrete Reasoning Over Tabular and Textual Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address question answering (QA) over a hybrid of tabular and textual data that are very common content on the Web (e.g. SEC filings), where discrete reasoning capabilities are often required. |
FENGBIN ZHU et. al. | arxiv-cs.CL | 2024-01-23 |
906 | Fine-tuning Large Language Models for Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on Subtask A & B. Each subtask is supported by three datasets for training, development, and testing. |
FENG XIONG et. al. | arxiv-cs.CL | 2024-01-22 |
907 | Contrastive Learning in Distilled Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these models have yet to perform well on Semantic Textual Similarity and may be too large to be deployed as lightweight edge applications. We seek to apply a suitable contrastive learning method based on the SimCSE paper to a model architecture adapted from a knowledge-distillation-based model, DistilBERT, to address these two issues. |
Valerie Lim; Kai Wen Ng; Kenneth Lim; | arxiv-cs.CL | 2024-01-22 |
908 | Enhancing In-context Learning Via Linear Probe Calibration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This approach uses prompts that include in-context demonstrations to generate the corresponding output for a new query input. |
MOMIN ABBAS et. al. | arxiv-cs.CL | 2024-01-22 |
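One plausible reading of probe-based calibration is fitting a small linear model on the LLM's label probabilities for a few labeled examples, then using it to re-map probabilities at test time. The sketch below assumes that setup; `train_probs` stands in for the LLM's next-token distribution over verbalized labels, and the numbers are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Label probabilities the LLM assigned to a handful of labeled examples.
train_probs = np.array([[0.9, 0.1], [0.7, 0.3], [0.4, 0.6], [0.2, 0.8]])
train_labels = np.array([0, 0, 1, 1])

probe = LogisticRegression().fit(train_probs, train_labels)  # linear probe

test_probs = np.array([[0.55, 0.45]])        # raw in-context prediction
print(probe.predict_proba(test_probs))       # calibrated label distribution
```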
909 | Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a fault protection mechanism that incurs zero space cost. |
BINGBING LI et. al. | arxiv-cs.LG | 2024-01-21 |
910 | Freely Long-Thinking Transformer (FraiLT) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Freely Long-Thinking Transformer (FraiLT) is an improved transformer model designed to enhance processing capabilities without scaling up size. It utilizes a recursive approach, iterating over a subset of layers multiple times, and introduces iteration encodings to maintain awareness across these cycles. |
Akbay Tabak; | arxiv-cs.LG | 2024-01-21 |
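The recursion-with-iteration-encodings idea reads naturally as reusing one block several times while tagging each pass. A minimal PyTorch sketch under that reading (where the encoding is injected, and how many layers recur, are assumptions):

```python
import torch
import torch.nn as nn

class RecurrentBlock(nn.Module):
    """One transformer layer applied n_iters times with the same
    weights; a learned iteration encoding is added before each pass so
    the model can tell the cycles apart."""
    def __init__(self, d_model=64, n_head=4, n_iters=3):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(
            d_model, n_head, dim_feedforward=128, batch_first=True)
        self.iter_emb = nn.Embedding(n_iters, d_model)
        self.n_iters = n_iters

    def forward(self, x):
        for i in range(self.n_iters):
            x = x + self.iter_emb.weight[i]  # iteration encoding
            x = self.layer(x)                # same weights reused
        return x

print(RecurrentBlock()(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```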
911 | Revolutionizing Finance with LLMs: An Overview of Applications and Insights IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we provide a comprehensive overview of the emerging integration of LLMs into various financial tasks. |
HUAQIN ZHAO et. al. | arxiv-cs.CL | 2024-01-21 |
912 | CheX-GPT: Harnessing Large Language Models for Enhanced Chest X-ray Report Labeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Free-text radiology reports present a rich data source for various medical tasks, but effectively labeling these texts remains challenging. |
JAWOOK GU et. al. | arxiv-cs.CL | 2024-01-21 |
913 | Unfair TOS: An Automated Approach Using Customized BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research paper, we present SOTA (state-of-the-art) results on unfair clause detection from ToS documents based on unprecedented custom BERT fine-tuning in conjunction with an SVC (Support Vector Classifier). |
Bathini Sai Akash; Akshara Kupireddy; Lalita Bhanu Murthy; | arxiv-cs.CL | 2024-01-20 |
914 | Drop Your Decoder: Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The usage of additional Transformer-based decoders also incurs significant computational costs. In this study, we aim to shed light on this issue by revealing that masked auto-encoder (MAE) pre-training with enhanced decoding significantly improves the term coverage of input tokens in dense representations, compared to vanilla BERT checkpoints. |
Guangyuan Ma; Xing Wu; Zijia Lin; Songlin Hu; | arxiv-cs.IR | 2024-01-20 |
915 | Visualization Generation with Large Language Models: An Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we evaluate the capability of a large language model to generate visualization specifications on the task of natural language to visualization (NL2VIS). |
GUOZHENG LI et. al. | arxiv-cs.HC | 2024-01-20 |
916 | Enhancing Large Language Models for Clinical Decision Support By Incorporating Clinical Practice Guidelines Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Methods We develop three distinct methods for incorporating CPGs into LLMs: Binary Decision Tree (BDT), Program-Aided Graph Construction (PAGC), and Chain-of-Thought-Few-Shot Prompting (CoT-FSP). |
DAVID ONIANI et. al. | arxiv-cs.CL | 2024-01-20 |
917 | Evaluating and Enhancing Large Language Models Performance in Domain-specific Medicine: Osteoarthritis Management with DocOA Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The efficacy of large language models (LLMs) in domain-specific medicine, particularly for managing complex diseases such as osteoarthritis (OA), remains largely unexplored. This … |
XI CHEN et. al. | ArXiv | 2024-01-20 |
918 | DB-GPT: Large Language Model Meets Database Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xuanhe Zhou; Zhaoyan Sun; Guoliang Li; | Data Sci. Eng. | 2024-01-19 |
919 | Custom Developer GPT for Ethical AI Solutions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main goal of this project is to create a new software artefact: a custom Generative Pre-trained Transformer (GPT) for developers to discuss and solve ethical issues through AI engineering. |
Lauren Olson; | arxiv-cs.SE | 2024-01-19 |
920 | Cross-lingual Editing in Multilingual Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For more comprehensive information, the dataset used in this research and the associated code are publicly available at the following URL: https://github.com/lingo-iitgn/XME. |
Himanshu Beniwal; Kowsik Nandagopan D; Mayank Singh; | arxiv-cs.CL | 2024-01-19 |
921 | Mining Experimental Data from Materials Science Literature with Large Language Models: An Evaluation Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel methodology for the comparative analysis of intricate material expressions, emphasising the standardisation of chemical formulas to tackle the complexities inherent in materials science information assessment. |
Luca Foppiano; Guillaume Lambard; Toshiyuki Amagasa; Masashi Ishii; | arxiv-cs.CL | 2024-01-19 |
922 | Speech Swin-Transformer: Exploring A Hierarchical Transformer with Shifted Windows for Speech Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In speech signals, emotional information is distributed across different scales of speech features, e.g., word, phrase, and utterance. Drawing on this inspiration, this paper presents a hierarchical speech Transformer with shifted windows to aggregate multi-scale emotion features for speech emotion recognition (SER), called Speech Swin-Transformer. |
YONG WANG et. al. | arxiv-cs.CL | 2024-01-19 |
923 | Emotion Classification In Software Engineering Texts: A Comparative Analysis of Pre-trained Transformers Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a comparative analysis of state-of-the-art Pre-trained Language Models (PTMs) for fine-grained emotion classification on two benchmark datasets from GitHub and Stack Overflow. |
Mia Mohammad Imran; | arxiv-cs.SE | 2024-01-19 |
924 | ChatQA: Surpassing GPT-4 on Conversational QA and RAG Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce ChatQA, a suite of models that outperform GPT-4 on retrieval-augmented generation (RAG) and conversational question answering (QA). |
ZIHAN LIU et. al. | arxiv-cs.CL | 2024-01-18 |
925 | Improving The Accuracy of Analog-Based In-Memory Computing Accelerators Post-Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose two Post-Training (PT) optimization methods to improve accuracy after training is performed. |
COREY LAMMIE et. al. | arxiv-cs.ET | 2024-01-18 |
926 | Gender Bias in Machine Translation and The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This chapter examines the role of Machine Translation in perpetuating gender bias, highlighting the challenges posed by cross-linguistic settings and statistical dependencies. |
Eva Vanmassenhove; | arxiv-cs.CL | 2024-01-18 |
927 | GPT in Sheep’s Clothing: The Risk of Customized GPTs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We aim to raise awareness of the fact that GPTs can be used maliciously, posing privacy and security risks to their users. |
SAGIV ANTEBI et. al. | arxiv-cs.CR | 2024-01-17 |
928 | Machines Do See Color: A Guideline to Classify Different Forms of Racist Discourse in Large Corpora Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article provides a step-by-step generalizable guideline to identify and classify different forms of racist discourse in large corpora. In our approach, we start by conceptualizing racism and its different manifestations. |
Diana Davila Gordillo; Joan Timoneda; Sebastian Vallejo Vera; | arxiv-cs.CL | 2024-01-17 |
929 | Efficient Slot Labelling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a lightweight method which performs on par with or better than the state-of-the-art PLM-based methods, while having almost 10x fewer trainable parameters. |
Vladimir Vlasov; | arxiv-cs.CL | 2024-01-17 |
930 | Land Cover Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We compare convolutional neural networks (CNN) against transformer-based methods, showcasing their applications and advantages in LC studies. |
Antonio Rangel; Juan Terven; Diana M. Cordova-Esparza; E. A. Chavez-Urbiola; | arxiv-cs.CV | 2024-01-17 |
931 | RAG Vs Fine-tuning: Pipelines, Tradeoffs, and A Case Study on Agriculture IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a pipeline for fine-tuning and RAG, and present the tradeoffs of both for multiple popular LLMs, including Llama2-13B, GPT-3.5, and GPT-4. |
ANGELS BALAGUER et. al. | arxiv-cs.CL | 2024-01-16 |
932 | Human Vs. LMMs: Exploring The Discrepancy in Emoji Interpretation and Usage in Digital Communication Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Leveraging Large Multimodal Models (LMMs) to simulate human behaviors when processing multimodal information, especially in the context of social media, has garnered immense interest due to its broad potential and far-reaching implications. |
Hanjia Lyu; Weihong Qi; Zhongyu Wei; Jiebo Luo; | arxiv-cs.CV | 2024-01-16 |
933 | Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This communication bottleneck exacerbates the already complex computational landscape, hindering the efficient utilization of high-performance computing resources. In this paper, we propose a lightweight optimization technique called ExFlow, to largely accelerate the inference of these MoE models. |
Jinghan Yao; Quentin Anthony; Aamir Shafi; Hari Subramoni; Dhabaleswar K.; | arxiv-cs.LG | 2024-01-16 |
934 | Hidden Flaws Behind Expert-Level Accuracy of GPT-4 Vision in Medicine Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we discovered that GPT-4V frequently presents flawed rationales in cases where it makes the correct final choices (35.5%), most prominently in image comprehension (27.2%). Despite GPT-4V’s high accuracy in multiple-choice questions, our findings emphasize the necessity for further in-depth evaluations of its rationales before integrating such multimodal AI models into clinical workflows. |
QIAO JIN et. al. | arxiv-cs.CV | 2024-01-16 |
935 | Enhancing Robustness of LLM-Synthetic Text Detectors for Academic Writing: A Comprehensive Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a comprehensive analysis of the impact of prompts on the text generated by LLMs and highlight the potential lack of robustness in one of the current state-of-the-art GPT detectors. |
Zhicheng Dou; Yuchen Guo; Ching-Chun Chang; Huy H. Nguyen; Isao Echizen; | arxiv-cs.CL | 2024-01-15 |
936 | The Chronicles of RAG: The Retriever, The Chunk and The Generator Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given all these challenges, every day a new technique to improve RAG appears, making it unfeasible to experiment with all combinations for your problem. In this context, this paper presents good practices to implement, optimize, and evaluate RAG for the Brazilian Portuguese language, focusing on the establishment of a simple pipeline for inference and experiments. |
PAULO FINARDI et. al. | arxiv-cs.LG | 2024-01-15 |
937 | Leveraging External Knowledge Resources to Enable Domain-Specific Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building on existing work, we introduce a method using Multi-Layer Perceptrons (MLPs) for aligning and integrating embeddings extracted from knowledge graphs with the embeddings spaces of pre-trained language models (LMs). |
Saptarshi Sengupta; Connor Heaton; Prasenjit Mitra; Soumalya Sarkar; | arxiv-cs.CL | 2024-01-15 |
938 | Cascaded Cross-Modal Transformer for Audio-Textual Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To attain superior classification performance, we propose to harness the inherent value of multimodal representations by transcribing speech using automatic speech recognition (ASR) models and translating the transcripts into different languages via pretrained translation models. |
Nicolae-Catalin Ristea; Andrei Anghel; Radu Tudor Ionescu; | arxiv-cs.CL | 2024-01-15 |
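The cascade described, ASR transcription followed by machine translation of the transcript, can be mocked up with off-the-shelf pipelines. Model names and the audio path below are placeholders, not the ones used in the paper:

```python
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny")
translate = pipeline("translation_en_to_fr", model="t5-small")

text = asr("sample.wav")["text"]                   # placeholder audio file
text_fr = translate(text)[0]["translation_text"]   # second language view
print(text, "->", text_fr)
# Both text views (plus audio features) would then feed the classifier.
```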
939 | Exploiting GPT-4 Vision for Zero-shot Point Cloud Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we tackle the challenge of classifying the object category in point clouds, which previous works like PointCLIP struggle to address due to the inherent limitations of the CLIP architecture. |
Qi Sun; Xiao Cui; Wengang Zhou; Houqiang Li; | arxiv-cs.CV | 2024-01-15 |
940 | Selene: Pioneering Automated Proof in Software Verification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Selene in this paper, which is the first project-level automated proof benchmark constructed based on the real-world industrial-level operating system microkernel, seL4. |
Lichen Zhang; Shuai Lu; Nan Duan; | arxiv-cs.SE | 2024-01-15 |
941 | Interference-Robust Millimeter-Wave Radar-Based Dynamic Hand Gesture Recognition Using 2-D CNN-Transformer Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Dynamic gesture recognition using millimeter-wave radar has a broad application prospect in the industrial Internet of Things (IoT) field. However, the existing methods in the … |
Biao Jin; Xiao Ma; Zhenkai Zhang; Zhuxian Lian; Biao Wang; | IEEE Internet of Things Journal | 2024-01-15 |
942 | Transformer-based Approach for Ethereum Price Prediction Using Crosscurrency Correlation and Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The model employs a transformer architecture for several setups from single-feature scenarios to complex configurations incorporating volume, sentiment, and correlated cryptocurrency prices. |
Shubham Singh; Mayur Bhat; | arxiv-cs.LG | 2024-01-15 |
943 | SemEval-2017 Task 4: Sentiment Analysis in Twitter Using BERT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper uses the BERT model, which is a transformer-based architecture, to solve subtask 4A (English) of the SemEval-2017 Sentiment Analysis in Twitter task. |
Rupak Kumar Das; Dr. Ted Pedersen; | arxiv-cs.CL | 2024-01-15 |
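As a generic illustration of the kind of BERT-based sentiment classification the paper describes (not the authors' training setup; the classification head below is randomly initialized until fine-tuned):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)  # negative / neutral / positive

batch = tok(["great movie!", "awful service"], padding=True,
            return_tensors="pt")
with torch.no_grad():
    logits = model(**batch).logits
print(logits.softmax(-1))  # class probabilities (untrained head)
```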
944 | Active Learning for NLP with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work investigates the accuracy and cost of using LLMs (GPT-3.5 and GPT-4) to label samples on 3 different datasets. |
Xuesong Wang; | arxiv-cs.CL | 2024-01-14 |
945 | Leveraging The Power of Transformers for Guilt Detection in Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we explore the applicability of three transformer-based language models for detecting guilt in text and compare their performance for general emotion detection and guilt detection. |
Abdul Gafar Manuel Meque; Jason Angel; Grigori Sidorov; Alexander Gelbukh; | arxiv-cs.CL | 2024-01-14 |
946 | Killer Apps: Low-Speed, Large-Scale AI Weapons Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the concept of AI weapons, their deployment, detection, and potential countermeasures. |
Philip Feldman; Aaron Dant; James R. Foulds; | arxiv-cs.CY | 2024-01-14 |
947 | Harnessing Large Language Models Over Transformer Models for Detecting Bengali Depressive Social Media Text: A Comprehensive Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our work provides full architecture details for each model and a methodical way to assess their performance in Bengali depressive text categorization using zero-shot and few-shot learning techniques. |
AHMADUL KARIM CHOWDHURY et. al. | arxiv-cs.CL | 2024-01-14 |
948 | MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a novel map-guided GPT-based agent, dubbed MapGPT, which introduces an online linguistic-formed map to encourage global exploration. |
JIAQI CHEN et. al. | arxiv-cs.AI | 2024-01-14 |
949 | Fuzzy Swin Transformer for Land Use/Land Cover Change Detection Using LISS-III Satellite Data Related Papers Related Patents Related Grants Related Venues Related Experts View |
Sam Navin Mohanrajan; L. Agilandeeswari; P. Manoharan; Farhan A. Alenizi; | Earth Science Informatics | 2024-01-13 |
950 | A Novel Multi-Stage Prompting Approach for Language Agnostic MCQ Generation Using GPT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a multi-stage prompting approach (MSP) for the generation of multiple choice questions (MCQs), harnessing the capabilities of GPT models such as text-davinci-003 and GPT-4, renowned for their excellence across various NLP tasks. |
Subhankar Maity; Aniket Deroy; Sudeshna Sarkar; | arxiv-cs.CL | 2024-01-13 |
951 | Comparing GPT-4 and Open-Source Language Models in Misinformation Mitigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that Zephyr-7b presents a consistently viable alternative, overcoming key limitations of commonly used approaches like Llama-2 and GPT-3.5. |
Tyler Vergho; Jean-Francois Godbout; Reihaneh Rabbany; Kellin Pelrine; | arxiv-cs.CL | 2024-01-12 |
952 | How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety By Humanizing LLMs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As large language models (LLMs) become increasingly common and competent, non-expert users can also impose risks during daily interactions. This paper introduces a new perspective to jailbreak LLMs as human-like communicators, to explore this overlooked intersection between everyday language interaction and AI safety. |
YI ZENG et. al. | arxiv-cs.CL | 2024-01-12 |
953 | DevEval: Evaluating Code Generation in Practical Software Projects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new benchmark named DevEval, aligned with Developers’ experiences in practical projects. |
JIA LI et. al. | arxiv-cs.SE | 2024-01-12 |
954 | Mission: Impossible Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we develop a set of synthetic impossible languages of differing complexity, each designed by systematically altering English data with unnatural word orders and grammar rules. |
Julie Kallini; Isabel Papadimitriou; Richard Futrell; Kyle Mahowald; Christopher Potts; | arxiv-cs.CL | 2024-01-12 |
955 | Transformer for Object Re-Identification: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Considering the trending unsupervised Re-ID, we propose a new Transformer baseline, UntransReID, achieving state-of-the-art performance on both single-/cross modal tasks. |
MANG YE et. al. | arxiv-cs.CV | 2024-01-12 |
956 | Intention Analysis Makes LLMs A Good Jailbreak Defender Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we present a simple yet highly effective defense strategy, i.e., Intention Analysis ($\mathbb{IA}$). |
Yuqi Zhang; Liang Ding; Lefei Zhang; Dacheng Tao; | arxiv-cs.CL | 2024-01-12 |
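The defense reads as a two-stage prompt: first elicit the intention behind a query, then answer with that analysis in context. A toy sketch under that reading; the prompt wording and the `llm` stub are illustrative assumptions, not the paper's templates:

```python
def llm(prompt: str) -> str:
    """Stand-in for a real chat-model call."""
    return f"(model response to: {prompt[:48]}...)"

def intention_analysis(user_query: str) -> str:
    # Stage 1: ask the model to surface the request's intention.
    intent = llm("Identify the essential intention of this request, "
                 "noting anything unsafe:\n" + user_query)
    # Stage 2: answer with the intention analysis in context.
    return llm("Given this intention analysis: " + intent +
               "\nRespond safely and helpfully to:\n" + user_query)

print(intention_analysis("How do I secure my home Wi-Fi?"))
```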
957 | Mapping Transformer Leveraged Embeddings for Cross-Lingual Document Representation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This research focuses on representing documents across languages by using Transformer Leveraged Document Representations (TLDRs) that are mapped to a cross-lingual domain. |
Tsegaye Misikir Tashu; Eduard-Raul Kontos; Matthia Sabatelli; Matias Valdenegro-Toro; | arxiv-cs.CL | 2024-01-12 |
958 | An Investigation of Structures Responsible for Gender Bias in BERT and DistilBERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A crucial issue is the fairness of the predictions made by both PLMs and their distilled counterparts. In this paper, we propose an empirical exploration of this problem by formalizing two questions: (1) Can we identify the neural mechanism(s) responsible for gender bias in BERT (and by extension DistilBERT)? |
Thibaud Leteno; Antoine Gourru; Charlotte Laclau; Christophe Gravier; | arxiv-cs.CL | 2024-01-12 |
959 | Prompt-based Mental Health Screening from Social Media Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article presents a method for prompt-based mental health screening from a large and noisy dataset of social media text. |
Wesley Ramos dos Santos; Ivandre Paraboni; | arxiv-cs.CL | 2024-01-11 |
960 | OTAS: An Elastic Transformer Serving System Via Token Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce OTAS, the first elastic serving system specially tailored for transformer models by exploring lightweight token management. |
JINYU CHEN et. al. | arxiv-cs.DC | 2024-01-10 |
961 | The Benefits of A Concise Chain of Thought on Problem-Solving in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Concise Chain-of-Thought (CCoT) prompting. |
Matthew Renze; Erhan Guven; | arxiv-cs.CL | 2024-01-10 |
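In spirit, CCoT swaps the usual open-ended step-by-step instruction for one that also asks for brevity. The contrast below is illustrative; the exact instructions used in the paper are not reproduced:

```python
question = "A train travels 60 km in 1.5 hours. What is its average speed?"

standard_cot = f"Q: {question}\nA: Let's think step by step."
concise_cot = (f"Q: {question}\nA: Let's think step by step, "
               "keeping each step as brief as possible.")

print(standard_cot)
print(concise_cot)  # same reasoning request, fewer output tokens
```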
962 | EmMixformer: Mix Transformer for Eye Movement Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although deep neural networks, such as convolutional neural network (CNN), have recently achieved promising performance, current solutions fail to capture local and global temporal dependencies within eye movement data. To overcome this problem, we propose in this paper a mixed transformer termed EmMixformer to extract time and frequency domain information for eye movement recognition. |
HUAFENG QIN et. al. | arxiv-cs.CV | 2024-01-10 |
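The time-and-frequency-domain framing is easy to picture: the same sequence yields a raw temporal view and a spectral view. A minimal sketch of producing both inputs (the mixed attention that fuses them is omitted):

```python
import torch

signal = torch.randn(2, 256)          # (batch, time steps) eye-movement traces
freq = torch.fft.rfft(signal).abs()   # magnitude spectrum, the frequency view
print(signal.shape, freq.shape)       # torch.Size([2, 256]) torch.Size([2, 129])
```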
963 | Reinforcement Learning for Optimizing RAG for Domain Chatbots Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: With the advent of Large Language Models (LLM), conversational assistants have become prevalent for domain use cases. LLMs acquire the ability to contextual question answering … |
Mandar Kulkarni; Praveen Tangarajan; Kyung Kim; Anusua Trivedi; | ArXiv | 2024-01-10 |
964 | Monte Carlo Tree Search for Recipe Generation Using GPT-2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose RecipeMC, a text generation method using GPT-2 that relies on Monte Carlo Tree Search (MCTS). |
Karan Taneja; Richard Segal; Richard Goodwin; | arxiv-cs.CL | 2024-01-10 |
965 | DepressionEmo: A Novel Dataset for Multilabel Classification of Depression Emotions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a novel dataset named DepressionEmo, designed to detect 8 emotions associated with depression and comprising 6037 examples of long Reddit user posts. |
Abu Bakar Siddiqur Rahman; Hoang-Thang Ta; Lotfollah Najjar; Azad Azadmanesh; Ali Saffet Gönül; | arxiv-cs.CL | 2024-01-09 |
966 | An Assessment on Comprehending Mental Health Through Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: On the other hand, across various applications, an outstanding question involves the capacity of large language models to comprehend expressions of human mental health conditions in natural language. This study presents an initial evaluation of large language models in addressing this gap. |
Mihael Arcan; David-Paul Niland; Fionn Delahunty; | arxiv-cs.CL | 2024-01-09 |
967 | Can AI Keep You Safe? A Study of Large Language Models for Phishing Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Phishing attacks continue to be a pervasive challenge in cybersecurity, with threat actors constantly developing new strategies to penetrate email inboxes and compromise sensitive … |
Robin Chataut; P. Gyawali; Yusuf Usman; | 2024 IEEE 14th Annual Computing and Communication Workshop … | 2024-01-08 |
968 | MARG: Multi-Agent Review Generation for Scientific Papers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We study the ability of LLMs to generate feedback for scientific papers and develop MARG, a feedback generation approach using multiple LLM instances that engage in internal discussion. |
Mike D’Arcy; Tom Hope; Larry Birnbaum; Doug Downey; | arxiv-cs.CL | 2024-01-08 |
969 | LLM4PLC: Harnessing Large Language Models for Verifiable Programming of PLCs in Industrial Control Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Although Large Language Models (LLMs) have established predominance in automated code generation, they are not devoid of shortcomings. The pertinent issues primarily relate to … |
MOHAMAD FAKIH et. al. | 2024 IEEE/ACM 46th International Conference on Software … | 2024-01-08 |
970 | Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Audio and video are the two most common modalities on mainstream media platforms, e.g., YouTube. To learn from multimodal videos effectively, in this work, we propose a novel … |
Wentao Zhu; | ArXiv | 2024-01-08 |
971 | Distortions in Judged Spatial Relations in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a benchmark for assessing the capability of Large Language Models (LLMs) to discern intercardinal directions between geographic locations and apply it to three prominent LLMs: GPT-3.5, GPT-4, and Llama-2. |
Nir Fulman; Abdulkadir Memduhoğlu; Alexander Zipf; | arxiv-cs.CL | 2024-01-08 |
972 | Mixtral of Experts IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model. |
ALBERT Q. JIANG et. al. | arxiv-cs.LG | 2024-01-08 |
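Mixtral's sparse MoE layer routes each token to 2 of 8 feed-forward experts and mixes their outputs by gate weights renormalized over the selected pair. A simplified single-file sketch of that routing (not the reference implementation; real systems batch tokens per expert for efficiency):

```python
import torch
import torch.nn as nn

class SparseMoE(nn.Module):
    def __init__(self, d=32, n_experts=8, k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d, 4 * d), nn.SiLU(), nn.Linear(4 * d, d))
            for _ in range(n_experts))
        self.gate = nn.Linear(d, n_experts)
        self.k = k

    def forward(self, x):                         # x: (tokens, d)
        weights, idx = torch.topk(self.gate(x), self.k, dim=-1)
        weights = torch.softmax(weights, dim=-1)   # renormalize over top-k
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e           # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

print(SparseMoE()(torch.randn(6, 32)).shape)  # torch.Size([6, 32])
```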
973 | MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: At the same time, Mixture of Experts (MoE) has significantly improved Transformer-based LLMs, including recent state-of-the-art open-source models. We propose that to unlock the potential of SSMs for scaling, they should be combined with MoE. |
Maciej Pióro; Kamil Ciebiera; Krystian Król; Jan Ludziejewski; Sebastian Jaszczur; | arxiv-cs.LG | 2024-01-08 |
974 | Effectiveness of ChatGPT in Coding: A Comparative Analysis of Popular Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study explores the effectiveness and efficiency of the popular OpenAI model ChatGPT, powered by GPT-3.5 and GPT-4, in programming tasks to understand its impact on … |
Carlos Eduardo Andino Coello; Mohammed Nazeh Alimam; Rand Kouatly; | Digit. | 2024-01-08 |
975 | GloTSFormer: Global Video Text Spotting Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel Global Video Text Spotting Transformer GloTSFormer to model the tracking problem as global associations and utilize the Gaussian Wasserstein distance to guide the morphological correlation between frames. |
Han Wang; Yanjie Wang; Yang Li; Can Huang; | arxiv-cs.CV | 2024-01-08 |
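For reference, the closed-form squared 2-Wasserstein distance between Gaussians $\mathcal{N}(m_1, \Sigma_1)$ and $\mathcal{N}(m_2, \Sigma_2)$ is $W_2^2 = \lVert m_1 - m_2 \rVert_2^2 + \mathrm{Tr}\big(\Sigma_1 + \Sigma_2 - 2(\Sigma_2^{1/2} \Sigma_1 \Sigma_2^{1/2})^{1/2}\big)$; methods in this vein typically model each text box as a 2-D Gaussian before applying it, though the paper's exact usage may differ.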
976 | InFoBench: Evaluating Instruction Following Ability in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces the Decomposed Requirements Following Ratio (DRFR), a new metric for evaluating Large Language Models’ (LLMs) ability to follow instructions. |
YIWEI QIN et. al. | arxiv-cs.CL | 2024-01-07 |
977 | CharPoet: A Chinese Classical Poetry Generation System Based on Token-free LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) improve content control by allowing unrestricted user instructions, but the token-by-token generation process frequently introduces format errors. Motivated by this, we propose CharPoet, a Chinese classical poetry generation system based on a token-free LLM, which provides effective control over both format and content. |
Chengyue Yu; Lei Zang; Jiaotuan Wang; Chenyi Zhuang; Jinjie Gu; | arxiv-cs.CL | 2024-01-07 |
978 | RoBERTurk: Adjusting RoBERTa for Turkish Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We pretrain RoBERTa on Turkish corpora using a BPE tokenizer. Our model outperforms BERTurk family models on the BOUN dataset for the POS task while resulting in underperformance … |
Nuri Tas; | arxiv-cs.CL | 2024-01-07 |
979 | Using Large Language Models to Assess Tutors’ Performance in Reacting to Students Making Math Errors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the capacity of generative AI to evaluate real-life tutors’ performance in responding to students making math errors. |
Sanjit Kakarla; Danielle Thomas; Jionghao Lin; Shivang Gupta; Kenneth R. Koedinger; | arxiv-cs.HC | 2024-01-06 |
980 | PIXAR: Auto-Regressive Language Modeling in Pixel Space Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce PIXAR, the first pixel-based autoregressive LLM that performs text generation. |
Yintao Tai; Xiyang Liao; Alessandro Suglia; Antonio Vergari; | arxiv-cs.CL | 2024-01-06 |
981 | PeFoMed: Parameter Efficient Fine-tuning of Multimodal Large Language Models for Medical Imaging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a parameter efficient framework for fine-tuning MLLMs, specifically validated on medical visual question answering (Med-VQA) and medical report generation (MRG) tasks, using public benchmark datasets. |
GANG LIU et. al. | arxiv-cs.CL | 2024-01-05 |
982 | TinyLlama: An Open-Source Small Language Model IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for approximately 3 epochs. |
Peiyuan Zhang; Guangtao Zeng; Tianduo Wang; Wei Lu; | arxiv-cs.CL | 2024-01-04 |
983 | Re-evaluating The Memory-balanced Pipeline Parallelism: BPipe Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, it suffers from imbalanced memory consumption, leading to insufficient memory utilization. The BPipe technique was proposed to address this issue and has proven effective in the GPT-3 model. |
MINCONG HUANG et. al. | arxiv-cs.LG | 2024-01-04 |
984 | Exploring Boundary of GPT-4V on Marine Analysis: A Preliminary Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we carry out a preliminary yet comprehensive case study of utilizing GPT-4V for marine analysis. |
ZIQIANG ZHENG et. al. | arxiv-cs.CL | 2024-01-04 |
985 | Shayona@SMM4H23: COVID-19 Self Diagnosis Classification Using BERT and LightGBM Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes approaches and results for shared Tasks 1 and 4 of SMM4H-23 by Team Shayona. |
Rushi Chavda; Darshan Makwana; Vraj Patel; Anupam Shukla; | arxiv-cs.CL | 2024-01-04 |
986 | Iterative Mask Filling: An Effective Text Augmentation Method Using Masked Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel text augmentation method that leverages the Fill-Mask feature of the transformer-based BERT model. |
Himmet Toprak Kesgin; Mehmet Fatih Amasyali; | arxiv-cs.CL | 2024-01-03 |
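The method as described maps directly onto the Hugging Face fill-mask pipeline: mask one position at a time, take the model's proposal, and carry the edited sentence to the next position. A sketch under that reading (top-1 selection and left-to-right order are assumptions):

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

words = "the quick brown fox jumps over the lazy dog".split()
for i in range(len(words)):
    masked = words.copy()
    masked[i] = fill.tokenizer.mask_token         # mask the i-th word
    words[i] = fill(" ".join(masked), top_k=1)[0]["token_str"]
print(" ".join(words))  # augmented variant of the original sentence
```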
987 | Close to Human-Level Agreement: Tracing Journeys of Violent Speech in Incel Posts with GPT-4-Enhanced Annotations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the prevalence of violent language on incels.is. |
Daniel Matter; Miriam Schirmer; Nir Grinberg; Jürgen Pfeffer; | arxiv-cs.SI | 2024-01-03 |
988 | MULTI-CASE: A Transformer-based Ethics-aware Multimodal Investigative Intelligence Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To tackle the challenge of multimodal analytics, we present MULTI-CASE, a holistic visual analytics framework tailored towards ethics-aware and multimodal intelligence exploration, designed in collaboration with domain experts. |
Maximilian T. Fischer; Yannick Metz; Lucas Joos; Matthias Miller; Daniel A. Keim; | arxiv-cs.HC | 2024-01-03 |
989 | MLPs Compass: What Is Learned When MLPs Are Combined with PLMs? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by recent efforts showing that Multilayer-Perceptron (MLP) modules achieve robust structural capture capabilities, even outperforming Graph Neural Networks (GNNs), this paper aims to quantify whether simple MLPs can further enhance the already potent ability of PLMs to capture linguistic information. |
LI ZHOU et. al. | arxiv-cs.CL | 2024-01-03 |
990 | A Transformer-based Network Intrusion Detection Approach for Cloud Security Related Papers Related Patents Related Grants Related Venues Related Experts View |
ZHENYUE LONG et. al. | J. Cloud Comput. | 2024-01-02 |
991 | Combating Fake News on Social Media: A Fusion Approach for Improved Detection and Interpretability Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The proliferation of fake news on social media prompted research groups to develop statistical and learning methods to combat this menace. Deep learning techniques could not model … |
Yasmine Khalid Zamil; N. M. Charkari; | IEEE Access | 2024-01-01 |
992 | SMTF: Sparse Transformer with Multiscale Contextual Fusion for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
XICHU ZHANG et. al. | Biomed. Signal Process. Control. | 2024-01-01 |
993 | Completed Part Transformer for Person Re-Identification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recently, part information of pedestrian images has been demonstrated to be effective for person re-identification (ReID), but the part interaction is ignored when using … |
Zhong Zhang; Di He; Shuang Liu; Baihua Xiao; T. Durrani; | IEEE Transactions on Multimedia | 2024-01-01 |
994 | Skip Connection Aggregation Transformer for Occluded Person Reidentification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The occlusion problem is a significant challenge for person reidentification. Recently, transformer-based methods have been introduced to solve the occlusion problem and achieve … |
Huijie Fan; Xiaotong Wang; Qiang Wang; Shengpeng Fu; Yandong Tang; | IEEE Transactions on Industrial Informatics | 2024-01-01 |
995 | MSGformer: A Multi-scale Grid Transformer Network for 12-lead ECG Arrhythmia Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
CHANGQING JI et. al. | Biomed. Signal Process. Control. | 2024-01-01 |
996 | Devising and Detecting Phishing Emails Using Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: AI programs, built using large language models, make it possible to automatically create phishing emails based on a few data points about a user. The V-Triad is a set of rules for … |
Fredrik Heiding; Bruce Schneier; Arun Vishwanath; Jeremy Bernstein; Peter S. Park; | IEEE Access | 2024-01-01 |
997 | Dual Attention Transformer Network for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhenqiu Shu; Yuyang Wang; Zhengtao Yu; | Eng. Appl. Artif. Intell. | 2024-01-01 |
998 | Transformer Fusion and Pixel-Level Contrastive Learning for RGB-D Salient Object Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Current RGB-D salient object detection (RGB-D SOD) methods mainly develop a generalizable model trained by binary cross-entropy (BCE) loss based on convolutional or Transformer … |
Jiesheng Wu; Fangwei Hao; Weiyun Liang; Jing Xu; | IEEE Transactions on Multimedia | 2024-01-01 |
999 | A & B == B & A: Triggering Logical Reasoning Failures in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce LogicAsker, an automatic approach that comprehensively evaluates and improves the logical reasoning abilities of LLMs under a set of atomic reasoning skills based on propositional and predicate logic. |
YUXUAN WAN et. al. | arxiv-cs.SE | 2024-01-01 |
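The premise of the entry above, that atomic propositional laws such as commutativity can be turned into automatically checkable test cases, can be sketched in a few lines. The question template and yes/no answer format are illustrative assumptions; the ground truth comes from truth-table enumeration.

```python
# Hedged sketch: generating atomic propositional-logic test cases with
# machine-checkable ground truth. The prompt wording and skill taxonomy
# are illustrative, not LogicAsker's exact design.
import itertools
import random

OPS = {"&": lambda a, b: a and b, "|": lambda a, b: a or b}

def make_case() -> tuple[str, bool]:
    """Build a claim like 'A & B is equivalent to B | A' plus its truth value."""
    op1, op2 = random.choice(list(OPS)), random.choice(list(OPS))
    equivalent = all(
        OPS[op1](a, b) == OPS[op2](b, a)
        for a, b in itertools.product([True, False], repeat=2)
    )
    question = f"Is 'A {op1} B' logically equivalent to 'B {op2} A'? Answer yes or no."
    return question, equivalent

for _ in range(3):
    q, truth = make_case()
    print(q, "->", "yes" if truth else "no")
```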
1000 | Improving Visual Grounding with Multi-scale Discrepancy Information and Centralized-transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jie Wu; Chunlei Wu; Fuyan Wang; Leiquan Wang; Yiwei Wei; | Expert Syst. Appl. | 2024-01-01 |
1001 | Improving RGB-infrared Object Detection with Cascade Alignment-guided Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Maoxun Yuan; Xiaorong Shi; Nan Wang; Yinyan Wang; Xingxing Wei; | Inf. Fusion | 2024-01-01 |
1002 | Counting-Stars: A Simple, Efficient, and Reasonable Strategy for Evaluating Long-Context Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: While recent research endeavors have concentrated on developing Large Language Models (LLMs) with robust long-context capabilities, due to the lack of appropriate evaluation … |
Mingyang Song; Mao Zheng; Xuan Luo; | ArXiv | 2024-01-01 |
1003 | MSViT: Training Multiscale Vision Transformers for Image Retrieval Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The recently developed vision transformer (ViT) has achieved promising results on image retrieval compared to convolutional neural networks. However, most of these vision … |
Xue Li; Jiong Yu; Shaochen Jiang; Hongchun Lu; Ziyang Li; | IEEE Transactions on Multimedia | 2024-01-01 |
1004 | GraphGST: Graph Generative Structure-Aware Transformer for Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer holds significance in deep learning (DL) research. Node embedding (NE) and positional encoding (PE) are usually two indispensable components in a Transformer. The … |
MENGYING JIANG et. al. | IEEE Transactions on Geoscience and Remote Sensing | 2024-01-01 |
1005 | Co-Training Transformer for Remote Sensing Image Classification, Segmentation, and Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Several fundamental remote sensing (RS) image processing tasks, including classification, segmentation, and detection, have been set to serve for manifold applications. In the RS … |
Qingyun Li; Yushi Chen; Xingyu He; Lin Huang; | IEEE Transactions on Geoscience and Remote Sensing | 2024-01-01 |
1006 | WF-Transformer: Learning Temporal Features for Accurate Anonymous Traffic Identification By Using Transformer Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Website Fingerprinting (WF) is a network traffic mining technique for anonymous traffic identification, which enables a local adversary to identify the target website that an … |
Qiang Zhou; Liangmin Wang; Huijuan Zhu; Tong Lu; Victor S. Sheng; | IEEE Transactions on Information Forensics and Security | 2024-01-01 |
1007 | Hierarchical Attention Transformer for Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Hyperspectral image (HSI) data contain rich spectral–spatial information, which can be useful for various applications. Many methods have been proposed to classify the HSIs. … |
Tahir Arshad; Junping Zhang; | IEEE Geoscience and Remote Sensing Letters | 2024-01-01 |
1008 | HAU-Net: Hybrid CNN-transformer for Breast Ultrasound Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
HUAIKUN ZHANG et. al. | Biomed. Signal Process. Control. | 2024-01-01 |
1009 | U²-Former: Nested U-Shaped Transformer for Image Restoration Via Multi-View Contrastive Learning IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: While Transformer has achieved remarkable performance in various high-level vision tasks, it is still challenging to exploit the full potential of Transformer in image … |
XIN FENG et. al. | IEEE Transactions on Circuits and Systems for Video … | 2024-01-01 |
1010 | Progressive Source-Aware Transformer for Generalized Source-Free Domain Adaptation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Source-free domain adaptation (SFDA) tends to forget the source domain, suffering from limitations in real-world scenarios. Recently, generalized source-free domain adaptation … |
SONG TANG et. al. | IEEE Transactions on Multimedia | 2024-01-01 |
1011 | FET-FGVC: Feature-enhanced Transformer for Fine-grained Visual Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
HUAZHEN CHEN et. al. | Pattern Recognit. | 2024-01-01 |
1012 | Transformer-Based High-Fidelity Facial Displacement Completion for Detailed 3D Face Reconstruction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, we tackle a special face completion task, facial displacement completion, which can offer a key component for many single image 3D face reconstruction systems. To … |
Renshuai Liu; Yao Cheng; Sifei Huang; Chengyang Li; Xuan Cheng; | IEEE Transactions on Multimedia | 2024-01-01 |
1013 | SecFormer: Towards Fast and Accurate Privacy-Preserving Inference for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This is largely due to the multitude of nonlinear operations in the Transformer architecture, which are not well-suited to SMPC and difficult to circumvent or optimize effectively. To address this concern, we introduce an advanced optimization framework called SecFormer, to achieve fast and accurate PPI for Transformer models. |
JINGLONG LUO et. al. | arxiv-cs.LG | 2024-01-01 |
1014 | Temporal-spatial Transformer Based Motor Imagery Classification for BCI Using Independent Component Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View |
ADEL HAMEED et. al. | Biomed. Signal Process. Control. | 2024-01-01 |
1015 | Multiscale 3-D–2-D Mixed CNN and Lightweight Attention-Free Transformer for Hyperspectral and LiDAR Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The effective combination of hyperspectral image (HSI) and light detection and ranging (LiDAR) data can be used for land cover classification. Recently, deep-learning-based … |
Le Sun; Xinyu Wang; Yuhui Zheng; Zebin Wu; Liyong Fu; | IEEE Transactions on Geoscience and Remote Sensing | 2024-01-01 |
1016 | TCCU-Net: Transformer and CNN Collaborative Unmixing Network for Hyperspectral Image Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In recent years, deep-learning-based hyperspectral unmixing techniques have garnered increasing attention and made significant advancements. However, relying solely on the use of … |
JIANFENG CHEN et. al. | IEEE Journal of Selected Topics in Applied Earth … | 2024-01-01 |
1017 | DBCTNet: Double Branch Convolution-Transformer Network for Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Currently, deep learning (DL) methods represented by convolutional neural networks (CNNs) or Transformers are of great interest in hyperspectral image (HSI) classification. And … |
RUI XU et. al. | IEEE Transactions on Geoscience and Remote Sensing | 2024-01-01 |
1018 | MRIformer: A Multi-resolution Interactive Transformer for Wind Speed Multi-step Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View |
Chengqing Yu; Guangxi Yan; Chengming Yu; Xinwei Liu; Xiwei Mi; | Inf. Sci. | 2024-01-01 |
1019 | Advancing Fake News Detection: Hybrid Deep Learning With FastText and Explainable AI Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The widespread propagation of misinformation on social media platforms poses a significant concern, prompting substantial endeavors within the research community to develop robust … |
Ehtesham Hashmi; Sule YAYILGAN YILDIRIM; M. Yamin; Subhan Ali; Mohamed Abomhara; | IEEE Access | 2024-01-01 |
1020 | ViTFSL-Baseline: A Simple Baseline of Vision Transformer Network for Few-Shot Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Few-shot image classification, whose goal is to generalize to unseen tasks with scarce labeled data, has developed rapidly over the years. However, in traditional few-shot … |
GUANGPENG WANG et. al. | IEEE Access | 2024-01-01 |
1021 | RISTRA: Recursive Image Super-Resolution Transformer With Relativistic Assessment Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Many recent image restoration methods use Transformer as the backbone network and redesign the Transformer blocks. Differently, we explore the parameter-sharing mechanism over … |
Xiaoqiang Zhou; Huaibo Huang; Zilei Wang; Ran He; | IEEE Transactions on Multimedia | 2024-01-01 |
1022 | Efficient Classification of Malicious URLs: M-BERT—A Modified BERT Variant for Enhanced Semantic Understanding Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Malicious websites present a substantial threat to the security and privacy of individuals using the internet. Traditional approaches for identifying these malicious sites have … |
BOYANG YU et. al. | IEEE Access | 2024-01-01 |
1023 | A Dual-Branch Multiscale Transformer Network for Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In recent years, convolutional neural networks (CNNs) have achieved great success in hyperspectral image (HSI) classification tasks. CNNs focus more on the local features of HSIs. … |
Cuiping Shi; Shuheng Yue; Liguo Wang; | IEEE Transactions on Geoscience and Remote Sensing | 2024-01-01 |
1024 | Refining One-class Representation: A Unified Transformer for Unsupervised Time-series Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
Guoxiang Zhong; Fagui Liu; Jun Jiang; Bin Wang; C.L. Philip Chen; | Inf. Sci. | 2024-01-01 |
1025 | Multilevel Class Token Transformer With Cross TokenMixer for Hyperspectral Images Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The transformer has become a prominent technique for hyperspectral image (HSI) classification, attributed to its capability to model global dependencies between features. … |
LEIQUAN WANG et. al. | IEEE Transactions on Geoscience and Remote Sensing | 2024-01-01 |
1026 | Modified Distance Protection for Transmission Line with Hexagonal Phase-shifting Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
F. Aboshady; | International Journal of Electrical Power & Energy Systems | |
1027 | Opening A Pandora’s Box: Things You Should Know in The Era of Custom GPTs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conduct a comprehensive analysis of the security and privacy issues arising from the custom GPT platform. |
GUANHONG TAO et. al. | arxiv-cs.CR | 2023-12-31 |
1028 | A Two-stream Hybrid CNN-Transformer Network for Skeleton-based Human Interaction Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a Two-stream Hybrid CNN-Transformer Network (THCT-Net), which exploits the local specificity of CNN and models global dependencies through the Transformer. |
Ruoqi Yin; Jianqin Yin; | arxiv-cs.CV | 2023-12-31 |
1029 | GraphGPT: Graph Learning with Generative Pre-trained Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce GraphGPT, a novel model for Graph learning by self-supervised Generative Pre-training Transformers. |
Qifang Zhao; Weidong Ren; Tianyu Li; Xiaoxiao Xu; Hong Liu; | arxiv-cs.LG | 2023-12-31 |
1030 | Trace and Edit Relation Associations in GPT Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study introduces a novel approach for analyzing and modifying entity relationships in GPT models, diverging from ROME’s entity-focused methods. We develop a relation tracing … |
Jiahang Li; Taoyu Chen; Yuanli Wang; | ArXiv | 2023-12-30 |
1031 | Advancing TTP Analysis: Harnessing The Power of Encoder-Only and Decoder-Only Language Models with Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: State-of-the-art LLMs have been shown to be prone to hallucination, providing inaccurate information, which is problematic in critical domains like cybersecurity. Therefore, we propose the use of Retrieval Augmented Generation (RAG) techniques to extract relevant contexts for each cyberattack procedure for decoder-only LLMs (without fine-tuning). |
Reza Fayyazi; Rozhina Taghdimi; Shanchieh Jay Yang; | arxiv-cs.CR | 2023-12-30 |
1032 | Red Teaming for Large Language Models At Scale: Tackling Hallucinations on Mathematics Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a framework to procedurally generate numerical questions and puzzles, and compare the results with and without the application of several red teaming techniques. |
ALEKSANDER BUSZYDLIK et. al. | arxiv-cs.CL | 2023-12-30 |
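A hedged sketch of the procedural-generation step described in the entry above: questions are instantiated from a seeded template so the ground truth is known and model answers can be scored automatically. The template and difficulty parameters below are illustrative, not the paper's.

```python
# Sketch of procedurally generating numerical questions with known
# ground truth, so hallucinated answers can be flagged automatically.
# The template and number ranges are illustrative assumptions.
import random

def make_question(seed: int) -> tuple[str, int]:
    rng = random.Random(seed)  # seeding makes every question reproducible
    a, b, c = rng.randint(2, 99), rng.randint(2, 99), rng.randint(2, 9)
    question = f"Compute ({a} + {b}) * {c}."
    return question, (a + b) * c

for s in range(3):
    q, answer = make_question(s)
    print(q, "ground truth:", answer)
```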
1033 | MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we introduce MosaicBERT, a BERT-style encoder architecture and training recipe that is empirically optimized for fast pretraining. |
JACOB PORTES et. al. | arxiv-cs.CL | 2023-12-29 |
1034 | FlashVideo: A Framework for Swift Inference in Text-to-Video Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces FlashVideo, a novel framework tailored for swift Text-to-Video generation. |
Bin Lei; Le Chen; Caiwen Ding; | arxiv-cs.CV | 2023-12-29 |
1035 | DB-GPT: Empowering Database Interactions with Private Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present DB-GPT, a revolutionary and production-ready project that integrates LLMs with traditional database systems to enhance user experience and accessibility. |
SIQIAO XUE et. al. | arxiv-cs.DB | 2023-12-28 |
1036 | SentinelLMs: Encrypted Input Adaptation and Fine-tuning of Language Models for Private and Secure Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, this introduces two fundamental risks: (a) the transmission of user inputs to the server via the network gives rise to interception vulnerabilities, and (b) privacy concerns emerge as organizations that deploy such models store user data with restricted context. To address this, we propose a novel method to adapt and fine-tune transformer-based language models on passkey-encrypted user-specific text. |
Abhijit Mishra; Mingda Li; Soham Deo; | arxiv-cs.CR | 2023-12-28 |
1037 | ClST: A Convolutional Transformer Framework for Automatic Modulation Recognition By Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, insufficient training signal data in complicated channel environments and large-scale DL models are critical factors that make DL methods difficult to deploy in practice. Aiming at these problems, we propose a novel neural network named convolution-linked signal transformer (ClST) and a novel knowledge distillation method named signal knowledge distillation (SKD). |
Dongbin Hou; Lixin Li; Wensheng Lin; Junli Liang; Zhu Han; | arxiv-cs.LG | 2023-12-28 |
1038 | BEAST: Online Joint Beat and Downbeat Tracking Based on Streaming Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose BEAt tracking Streaming Transformer (BEAST), an online joint beat and downbeat tracking system based on the streaming Transformer. |
Chih-Cheng Chang; Li Su; | arxiv-cs.SD | 2023-12-28 |
1039 | On The Rate of Convergence of An Over-parametrized Transformer Classifier Learned By Gradient Descent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here it is not only important what kind of models these networks can approximate, or how they can generalize their knowledge learned by choosing the best possible approximation to a concrete data set, but also how well optimization of such transformer networks based on a concrete data set works. In this article we consider all these three different aspects simultaneously and show a theoretical upper bound on the misclassification probability of a transformer network fitted to the observed data. |
Michael Kohler; Adam Krzyzak; | arxiv-cs.LG | 2023-12-28 |
1040 | SCTNet: Single-Branch CNN with Transformer Semantic Information for Real-Time Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the additional branch incurs undesirable computational overhead and slows inference speed. To eliminate this dilemma, we propose SCTNet, a single branch CNN with transformer semantic information for real-time segmentation. |
ZHENGZE XU et. al. | arxiv-cs.CV | 2023-12-28 |
1041 | Evaluating The Performance of Large Language Models for Spanish Language in Undergraduate Admissions Exams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates the performance of large language models, specifically GPT-3.5 and BARD (supported by Gemini Pro model), in undergraduate admissions exams proposed by the National Polytechnic Institute in Mexico. |
Sabino Miranda; Obdulia Pichardo-Lagunas; Bella Martínez-Seis; Pierre Baldi; | arxiv-cs.CL | 2023-12-28 |
1042 | Gemini Pro Defeated By GPT-4V: Evidence from Education IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study compared the classification performance of Gemini Pro and GPT-4V in educational settings. Employing visual question answering (VQA) techniques, the study examined both … |
Gyeong-Geon Lee; Ehsan Latif; Lehong Shi; Xiaoming Zhai; | ArXiv | 2023-12-27 |
1043 | Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4 IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces 26 guiding principles designed to streamline the process of querying and prompting large language models. |
Sondos Mahmoud Bsharat; Aidar Myrzakhan; Zhiqiang Shen; | arxiv-cs.CL | 2023-12-26 |
1044 | Transformer-based Few-shot Object Detection in Traffic Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View |
Erjun Sun; Di Zhou; Yan Tian; Zhaocheng Xu; Xun Wang; | Appl. Intell. | 2023-12-26 |
1045 | SecQA: A Concise Question-Answering Dataset for Evaluating Large Language Models in Computer Security Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce SecQA, a novel dataset tailored for evaluating the performance of Large Language Models (LLMs) in the domain of computer security. |
Zefang Liu; | arxiv-cs.CL | 2023-12-25 |
1046 | On The Promises and Challenges of Multimodal Foundation Models for Geographical, Environmental, Agricultural, and Urban Planning Applications Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The advent of large language models (LLMs) has heightened interest in their potential for multimodal applications that integrate language and vision. This paper explores the … |
CHENJIAO TAN et. al. | ArXiv | 2023-12-23 |
1047 | Fairness-Aware Structured Pruning in Transformers Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: The increasing size of large language models (LLMs) has introduced challenges in their training and inference. Removing model components is perceived as a solution to tackle the … |
Abdelrahman Zayed; Goncalo Mordido; Samira Shabanian; Ioana Baldini; Sarath Chandar; | arxiv-cs.CL | 2023-12-23 |
1048 | Building Real-World Meeting Summarization Systems Using Large Language Models: A Practical Perspective IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper studies how to effectively build meeting summarization systems for real-world usage using large language models (LLMs). |
Md Tahmid Rahman Laskar; Xue-Yong Fu; Cheng Chen; Shashi Bhushan TN; | emnlp | 2023-12-22 |
1049 | Zero-Shot Multi-Label Topic Inference with Sentence Encoders and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conducted a comprehensive study with the latest Sentence Encoders and Large Language Models (LLMs) on the challenging task of "definition-wild zero-shot topic inference", where users define or provide the topics of interest in real-time. |
Souvika Sarkar; Dongji Feng; Shubhra Kanti Karmaker Santu; | emnlp | 2023-12-22 |
1050 | Large Language Models Are Complex Table Parsers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to incorporate GPT-3.5 to address such challenges, in which complex tables are reconstructed into tuples and specific prompt designs are employed for dialogues. |
BOWEN ZHAO et. al. | emnlp | 2023-12-22 |
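The "tables reconstructed into tuples" step in the entry above can be sketched simply: flatten a table into (row, column, value) triples and embed them in the prompt. The tuple schema, prompt wording, and toy data below are assumptions for illustration, not the paper's exact design.

```python
# Minimal sketch of the "table -> tuples" idea before prompting an LLM.
# The schema and the dummy data are illustrative assumptions.
def table_to_tuples(header: list[str], rows: list[list[str]]) -> list[tuple]:
    tuples = []
    for row in rows:
        row_name = row[0]
        for col_name, value in zip(header[1:], row[1:]):
            tuples.append((row_name, col_name, value))
    return tuples

header = ["model", "params", "score"]          # dummy table
rows = [["model-A", "1.1B", "52.9"], ["model-B", "1.5B", "48.3"]]

prompt = (
    "The table below is given as (row, column, value) tuples:\n"
    + "\n".join(map(str, table_to_tuples(header, rows)))
    + "\nQuestion: which model has more parameters?"
)
print(prompt)
```

Linearizing the table this way removes the layout ambiguity (merged cells, nested headers) that makes complex tables hard for an LLM to parse from raw text.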
1051 | Document-Level Machine Translation with Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The study focuses on three aspects: 1) Effects of Context-Aware Prompts, where we investigate the impact of different prompts on document-level translation quality and discourse phenomena; 2) Comparison of Translation Models, where we compare the translation performance of ChatGPT with commercial MT systems and advanced document-level MT methods; 3) Analysis of Discourse Modelling Abilities, where we further probe discourse knowledge encoded in LLMs and shed light on impacts of training techniques on discourse modeling. |
LONGYUE WANG et. al. | emnlp | 2023-12-22 |
1052 | Beware of Model Collapse! Fast and Stable Test-time Adaptation for Robust Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we delve into why TTA causes model collapse and find that the imbalanced label distribution inherent in QA is the reason for it. |
Yi Su; Yixin Ji; Juntao Li; Hai Ye; Min Zhang; | emnlp | 2023-12-22 |
1053 | Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4 IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we carry out a data archaeology to infer books that are known to ChatGPT and GPT-4 using a name cloze membership inference query. |
Kent Chang; Mackenzie Cramer; Sandeep Soni; David Bamman; | emnlp | 2023-12-22 |
1054 | Gemini Vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: The rapidly evolving sector of Multi-modal Large Language Models (MLLMs) is at the forefront of integrating linguistic and visual processing in artificial intelligence. This paper … |
ZHANGYANG QI et. al. | ArXiv | 2023-12-22 |
1055 | Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Dynosaur, a dynamic growth paradigm for the automatic curation of instruction-tuning data. |
DA YIN et. al. | emnlp | 2023-12-22 |
1056 | Argue with Me Tersely: Towards Sentence-Level Counter-Argument Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present the ArgTersely benchmark for sentence-level counter-argument generation, drawing from a manually annotated dataset from the ChangeMyView debate forum. |
JIAYU LIN et. al. | emnlp | 2023-12-22 |
1057 | API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, three pivotal questions remain unanswered: (1) How effective are current LLMs in utilizing tools? (2) How can we enhance LLMs' ability to utilize tools? (3) What obstacles need to be overcome to leverage tools? To address these questions, we introduce API-Bank, a groundbreaking benchmark, specifically designed for tool-augmented LLMs. |
MINGHAO LI et. al. | emnlp | 2023-12-22 |
1058 | Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the advancements of T2I models, a common issue encountered by users is the need for repetitive editing of input prompts in order to receive a satisfactory image, which is time-consuming and labor-intensive. Given the demonstrated text generation power of large-scale language models, such as GPT-k, we investigate the potential of utilizing such models to improve the prompt editing process for T2I generation. |
WANRONG ZHU et. al. | emnlp | 2023-12-22 |
1059 | Cabbage Sweeter Than Cake? Analysing The Potential of Large Language Models for Learning Conceptual Spaces Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These quality dimensions are usually learned from human judgements, which means that applications of conceptual spaces tend to be limited to narrow domains (e.g., modelling colour or taste). Encouraged by recent findings about the ability of Large Language Models (LLMs) to learn perceptually grounded representations, we explore the potential of such models for learning conceptual spaces. |
Usashi Chatterjee; Amit Gajbhiye; Steven Schockaert; | emnlp | 2023-12-22 |
1060 | LINC: A Neurosymbolic Approach for Logical Reasoning By Combining Language Models with First-Order Logic Provers IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While many prompting-based strategies have been proposed to enable Large Language Models (LLMs) to do such reasoning more effectively, they still appear unsatisfactory, often failing in subtle and unpredictable ways. In this work, we investigate the validity of instead reformulating such tasks as modular neurosymbolic programming, which we call LINC: Logical Inference via Neurosymbolic Computation. |
THEO OLAUSSON et. al. | emnlp | 2023-12-22 |
1061 | Do Transformers Parse While Predicting The Masked Word? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Some doubts have been raised as to whether the models are doing parsing or only some computation weakly correlated with it. Concretely: (a) Is it possible to explicitly describe transformers with realistic embedding dimensions, number of heads, etc. that are capable of doing parsing, or even approximate parsing? (b) Why do pre-trained models capture parsing structure? This paper takes a step toward answering these questions in the context of generative modeling with PCFGs. We show that masked language models like BERT or RoBERTa of moderate sizes can approximately execute the Inside-Outside algorithm for the English PCFG (Marcus et al., 1993). |
Haoyu Zhao; Abhishek Panigrahi; Rong Ge; Sanjeev Arora; | emnlp | 2023-12-22 |
1062 | The Shifted and The Overlooked: A Task-oriented Investigation of User-GPT Interactions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper provides a comprehensive analysis of the divergence between academic research in NLP and the needs of real-world NLP applications via a large-scale collection of user-GPT conversations. |
SIRU OUYANG et. al. | emnlp | 2023-12-22 |
1063 | Conceptor-Aided Debiasing of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose two methods of applying conceptors: (1) bias subspace projection via post-processing with the conceptor NOT operation; and (2) a new architecture, conceptor-intervened BERT (CI-BERT), which explicitly incorporates the conceptor projection into all layers during training. |
Li Yifei; Lyle Ungar; João Sedoc; | emnlp | 2023-12-22 |
1064 | Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Thus, it is still an open question: shall we pretrain large autoregressive LMs with retrieval? To answer it, we perform a comprehensive study on a scalable pre-trained retrieval-augmented LM (i.e., RETRO) compared with standard GPT and retrieval-augmented GPT incorporated at fine-tuning or inference stages. |
BOXIN WANG et. al. | emnlp | 2023-12-22 |
1065 | INSTRUCTSCORE: Towards Explainable Text Generation Evaluation with Automatic Feedback IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although recent learned metrics show high correlation with human judgement, these metrics do not provide explicit explanation of their verdict, nor associate the scores with defects in the generated text. To address this limitation, we present INSTRUCTSCORE, a fine-grained explainable evaluation metric for text generation. |
WENDA XU et. al. | emnlp | 2023-12-22 |
1066 | Let GPT Be A Math Tutor: Teaching Math Word Problem Solvers with Customized Exercise Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel approach for distilling math word problem solving capabilities from large language models (LLMs) into smaller, more efficient student models. |
ZHENWEN LIANG et. al. | emnlp | 2023-12-22 |
1067 | Deep Natural Language Feature Learning for Interpretable Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a general method to break down a complex main task into a set of easier intermediary sub-tasks, which are formulated in natural language as binary questions related to the final target task. |
Felipe Urrutia; Cristian Calderon; Valentin Barriere; | emnlp | 2023-12-22 |
1068 | GPT-RE: In-context Learning for Relation Extraction Using Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose GPT-RE to successfully address the aforementioned issues by (1) incorporating task-aware representations in demonstration retrieval; and (2) enriching the demonstrations with gold label-induced reasoning logic. |
ZHEN WAN et. al. | emnlp | 2023-12-22 |
1069 | JASMINE: Arabic GPT Models for Few-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using our novel benchmark, we evaluate JASMINE extensively showing powerful performance intrinsically as well as in few-shot learning on a wide range of NLP tasks. We aim to responsibly release our models and evaluation benchmark with interested researchers, along with code for experimenting with them. |
El Moatez Billah Nagoudi; Muhammad Abdul-Mageed; AbdelRahim Elmadany; Alcides Inciarte; Md Tawkat Islam Khondaker; | emnlp | 2023-12-22 |
1070 | MRedditSum: A Multimodal Abstractive Summarization Dataset of Reddit Threads with Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we present mRedditSum, the first multimodal discussion summarization dataset. |
Keighley Overbay; Jaewoo Ahn; Fatemeh Pesaran zadeh; Joonsuk Park; Gunhee Kim; | emnlp | 2023-12-22 |
1071 | NAIL: Lexical Retrieval Indices with Efficient Non-Autoregressive Decoders Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the best models require dedicated hardware for serving, which is costly and often not feasible. To avoid this serving-time requirement, we present a method of capturing up to 86% of the gains of a Transformer cross-attention model with a lexicalized scoring function that only requires 10^-6% of the Transformer's FLOPs per document and can be served using commodity CPUs. |
Livio Soares; Daniel Gillick; Jeremy Cole; Tom Kwiatkowski; | emnlp | 2023-12-22 |
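The economics of a lexicalized scoring function can be sketched as follows: per-token document scores are computed once offline (in NAIL, by a non-autoregressive decoder), and query-time scoring is a cheap lookup-and-sum on CPU. The log-count weighting below is a crude stand-in for the learned, model-predicted token scores.

```python
# Sketch of a lexicalized scoring function: index documents as bags of
# per-token weights, then score queries by summation. The log-count
# weighting is a placeholder for NAIL's learned scores.
import math
from collections import Counter

def index_document(text: str) -> dict[str, float]:
    counts = Counter(text.lower().split())
    return {tok: 1.0 + math.log(c) for tok, c in counts.items()}

def score(query: str, doc_index: dict[str, float]) -> float:
    # Query-time cost is one dictionary lookup per query token.
    return sum(doc_index.get(tok, 0.0) for tok in query.lower().split())

doc = index_document("transformers for retrieval retrieval with efficient decoders")
print(score("efficient retrieval", doc))
```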
1072 | Investigating Table-to-Text Generation Capabilities of Large Language Models in Real-World Information Seeking Scenarios IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the table-to-text capabilities of different LLMs using four datasets within two real-world information seeking scenarios. |
YILUN ZHAO et. al. | emnlp | 2023-12-22 |
1073 | TheoremQA: A Theorem-driven Question Answering Dataset IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce TheoremQA, the first theorem-driven question-answering dataset designed to evaluate AI models' capabilities to apply theorems to solve challenging science problems. |
WENHU CHEN et. al. | emnlp | 2023-12-22 |
1074 | Unnatural Error Correction: GPT-4 Can Almost Perfectly Handle Unnatural Scrambled Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we present novel experimental insights into the resilience of LLMs, particularly GPT-4, when subjected to extensive character-level permutations. |
Qi Cao; Takeshi Kojima; Yutaka Matsuo; Yusuke Iwasawa; | emnlp | 2023-12-22 |
1075 | Automatic Transcription of Handwritten Old Occitan Language Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an innovative HTR approach that leverages the Transformer architecture for recognizing handwritten Old Occitan language. |
Esteban Arias; Vallari Pai; Matthias Schöffel; Christian Heumann; Matthias Aßenmacher; | emnlp | 2023-12-22 |
1076 | Towards Detecting Cascades of Biased Medical Claims on Twitter Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a machine learning framework that uses two models in tandem: RoBERTa to detect medical claims and DistilBERT to classify bias. |
Libby Tiderman; Juan Sanchez Mercedes; Fiona Romanoschi; Fabricio Murai; | arxiv-cs.SI | 2023-12-22 |
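The two-model tandem in the entry above reads directly as two chained text-classification pipelines. In the sketch below the checkpoint names and label convention are placeholders; the paper's fine-tuned RoBERTa and DistilBERT weights would be substituted in practice.

```python
# Sketch of a tandem pipeline: one classifier flags medical claims, a
# second classifies bias. The base checkpoints and label names are
# placeholders, not the paper's fine-tuned models.
from transformers import pipeline

claim_detector = pipeline("text-classification", model="roberta-base")              # placeholder
bias_classifier = pipeline("text-classification", model="distilbert-base-uncased")  # placeholder

def analyze(tweet: str) -> dict:
    claim = claim_detector(tweet)[0]
    if claim["label"] != "LABEL_1":  # assumed "is a medical claim" label
        return {"claim": False}
    bias = bias_classifier(tweet)[0]
    return {"claim": True, "bias_label": bias["label"], "score": bias["score"]}

print(analyze("Vitamin C cures the flu, doctors don't want you to know."))
```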
1077 | Bootstrapping Small & High Performance Language Models with Unmasking-Removal Training Policy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: BabyBERTa, a language model trained on small-scale child-directed speech while none of the words are unmasked during training, has been shown to achieve a level of grammaticality comparable to that of RoBERTa-base, which is trained on 6,000 times more words and 15 times more parameters. Relying on this promising result, we explore in this paper the performance of BabyBERTa-based models in downstream tasks, focusing on Semantic Role Labeling (SRL) and two Extractive Question Answering tasks, with the aim of building more efficient systems that rely on less data and smaller models. |
Yahan Yang; Elior Sulem; Insup Lee; Dan Roth; | emnlp | 2023-12-22 |
1078 | Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This ubiquitous layer of language models is often overlooked. We find that similarities between these input embeddings are highly interpretable and that the geometry of these embeddings differs between model families. |
Andrea Wen-Yi; David Mimno; | emnlp | 2023-12-22 |
1079 | Disentangling Transformer Language Models As Superposed Topic Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, due to its hypothesised superposition properties, the final logits originating from the residual path are considered uninterpretable. Therefore, we posit that we can interpret TLM as superposed NTM by proposing a novel weight-based, model-agnostic and corpus-agnostic approach to search and disentangle decoder-only TLM, potentially mapping individual neurons to multiple coherent topics. |
Jia Peng Lim; Hady Lauw; | emnlp | 2023-12-22 |
1080 | Evaluation Metrics in The Era of GPT-4: Reliably Evaluating Large Language Models on Sequence to Sequence Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We aim to improve the understanding of current models' performance by providing a preliminary and hybrid evaluation on a range of open and closed-source generative LLMs on three NLP benchmarks: text summarisation, text simplification and grammatical error correction (GEC), using both automatic and human evaluation. |
Andrea Sottana; Bin Liang; Kai Zou; Zheng Yuan; | emnlp | 2023-12-22 |
1081 | Knowledge Rumination for Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, despite the promising outcome, we empirically observe that PLMs may have already encoded rich knowledge in their pre-trained parameters but fail to fully utilize it when applied to knowledge-intensive tasks. In this paper, we propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize that related latent knowledge without retrieving it from the external corpus. |
YUNZHI YAO et. al. | emnlp | 2023-12-22 |
1082 | Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs Without Fine-tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Inference-time Policy Adapters (IPA), which efficiently tailors a language model such as GPT-3 without fine-tuning it. |
XIMING LU et. al. | emnlp | 2023-12-22 |
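The decoding-time mechanism in the entry above can be sketched as combining next-token logits from a large frozen model with those of a small adapter policy. The model pair, the additive combination rule, and greedy decoding below are illustrative assumptions; IPA additionally trains the adapter with reinforcement learning, which is not shown.

```python
# Hedged sketch of inference-time policy adaptation: steer a large
# frozen LM by adding scaled logits from a small adapter LM at each
# decoding step. GPT-2 variants are used because they share a vocabulary.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
base = AutoModelForCausalLM.from_pretrained("gpt2-large").eval()  # frozen base policy
adapter = AutoModelForCausalLM.from_pretrained("gpt2").eval()     # small tailoring policy

@torch.no_grad()
def generate(prompt: str, steps: int = 20, alpha: float = 1.0) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids
    for _ in range(steps):
        base_logits = base(ids).logits[:, -1, :]
        adapter_logits = adapter(ids).logits[:, -1, :]
        combined = base_logits + alpha * adapter_logits  # product of experts in prob space
        next_id = combined.argmax(dim=-1, keepdim=True)  # greedy step for simplicity
        ids = torch.cat([ids, next_id], dim=-1)
    return tok.decode(ids[0])

print(generate("The meeting was about"))
```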
1083 | Large Language Models Are Biased to Overestimate Profoundness Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We found a significant statement-to-statement correlation between the LLMs and humans, irrespective of the type of statements and the prompting technique used. |
Eugenio Herrera-Berg; Tomás Browne; Pablo León-Villagrá; Marc-Lluís Vives; Cristian Calderon; | emnlp | 2023-12-22 |
1084 | VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces VIEScore, a Visual Instruction-guided Explainable metric for evaluating any conditional image generation tasks. |
Max Ku; Dongfu Jiang; Cong Wei; Xiang Yue; Wenhu Chen; | arxiv-cs.CV | 2023-12-22 |
1085 | Editing Common Sense in Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate whether commonsense judgments are causally associated with localized, editable parameters in Transformers, and we provide an affirmative answer. |
ANSHITA GUPTA et. al. | emnlp | 2023-12-22 |
1086 | EELBERT: Tiny Models Through Dynamic Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce EELBERT, an approach for compression of transformer-based models (e.g., BERT), with minimal impact on the accuracy of downstream tasks. |
Gabrielle Cohn; Rishika Agarwal; Deepanshu Gupta; Siddharth Patwardhan; | emnlp | 2023-12-22 |
1087 | Refining GPT-3 Embeddings with A Siamese Structure for Technical Post Duplicate Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we attempt to employ and refine the GPT-3 embeddings for the duplicate detection task. |
Xingfang Wu; Heng Li; Nobukazu Yoshioka; Hironori Washizaki; Foutse Khomh; | arxiv-cs.SE | 2023-12-22 |
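A minimal PyTorch sketch of the Siamese refinement idea from the entry above: a shared tower projects two fixed embeddings and a similarity-based loss pulls duplicates together. The embedding dimension (1536, as with OpenAI's ada-002 embeddings), the tower shape, and the loss are all assumptions, not the paper's exact configuration.

```python
# Sketch of refining fixed LLM embeddings with a Siamese tower for
# duplicate detection. Dimensions, loss, and data are illustrative.
import torch
import torch.nn as nn

class SiameseRefiner(nn.Module):
    def __init__(self, dim: int = 1536, hidden: int = 256):
        super().__init__()
        # One shared projection is applied to both posts' embeddings.
        self.tower = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden))

    def forward(self, e1: torch.Tensor, e2: torch.Tensor) -> torch.Tensor:
        return nn.functional.cosine_similarity(self.tower(e1), self.tower(e2))

model = SiameseRefiner()
loss_fn = nn.BCEWithLogitsLoss()  # treating scaled similarity as a logit is an assumption
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# Dummy batch: pairs of precomputed embeddings with duplicate labels.
e1, e2 = torch.randn(8, 1536), torch.randn(8, 1536)
labels = torch.randint(0, 2, (8,)).float()

opt.zero_grad()
loss = loss_fn(model(e1, e2) * 5.0, labels)  # scale cosine into a wider logit range
loss.backward()
opt.step()
print(float(loss))
```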
1088 | IAG: Induction-Augmented Generation Framework for Answering Reasoning Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an Induction-Augmented Generation (IAG) framework that utilizes inductive knowledge along with the retrieved documents for implicit reasoning. |
ZHEBIN ZHANG et. al. | emnlp | 2023-12-22 |
1089 | Effects of Sub-word Segmentation on Performance of Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare GPT and BERT models trained with the statistical segmentation algorithm BPE vs. two unsupervised algorithms for morphological segmentation, Morfessor and StateMorph. |
Jue Hou; Anisia Katinskaia; Anh-Duc Vu; Roman Yangarber; | emnlp | 2023-12-22 |
1090 | Towards Reliable Misinformation Mitigation: Generalization, Uncertainty, and GPT-4 IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose focusing on generalization, uncertainty, and how to leverage recent large language models, in order to create more practical tools to evaluate information veracity in contexts where perfect classification is impossible. |
KELLIN PELRINE et. al. | emnlp | 2023-12-22 |
1091 | Explicit Planning Helps Language Models in Logical Reasoning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose LEAP, a novel system that uses language models to perform multi-step logical reasoning and incorporates explicit planning into the inference procedure. |
Hongyu Zhao; Kangrui Wang; Mo Yu; Hongyuan Mei; | emnlp | 2023-12-22 |
1092 | Unsupervised Auditory and Semantic Entrainment Models with Deep Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an unsupervised deep learning framework that derives meaningful representation from textual features for developing semantic entrainment. |
Jay Kejriwal; Stefan Benus; Lina M. Rojas-Barahona; | arxiv-cs.CL | 2023-12-22 |
1093 | SAMRank: Unsupervised Keyphrase Extraction Using Self-Attention Map in BERT and GPT-2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel unsupervised keyphrase extraction approach, called SAMRank, which uses only a self-attention map in a pre-trained language model (PLM) to determine the importance of phrases. |
Byungha Kang; Youhyun Shin; | emnlp | 2023-12-22 |
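The core of the approach above, using only a self-attention map as an importance signal, can be sketched by scoring each token by the attention it receives. SAMRank's actual global/proportional scoring and phrase-level aggregation are more involved; the layer choice and head-averaging below are assumptions.

```python
# Sketch of attention-based keyword scoring: rank tokens by the total
# attention they receive in BERT's last layer. Aggregation choices are
# illustrative, not SAMRank's full method.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True).eval()

text = "Unsupervised keyphrase extraction with self-attention maps"
inputs = tok(text, return_tensors="pt")
with torch.no_grad():
    attentions = model(**inputs).attentions  # tuple of (1, heads, seq, seq), one per layer

# Average heads in the last layer, then sum over the "from" axis to get
# the attention each token receives.
received = attentions[-1].mean(dim=1)[0].sum(dim=0)
tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
for t, s in sorted(zip(tokens, received.tolist()), key=lambda x: -x[1])[:5]:
    print(f"{t}\t{s:.3f}")
```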
1094 | Lost in Translation, Found in Spans: Identifying Claims in Multilingual Social Media Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite its importance to journalists and human fact-checkers, it remains a severely understudied problem, and the scarce research on this topic so far has only focused on English. Here we aim to bridge this gap by creating a novel dataset, X-CLAIM, consisting of 7K real-world claims collected from numerous social media platforms in five Indian languages and English. |
Shubham Mittal; Megha Sundriyal; Preslav Nakov; | emnlp | 2023-12-22 |
1095 | Sparse Universal Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This is mainly because scaling UT parameters is more compute- and memory-intensive than scaling up a VT. This paper proposes the Sparse Universal Transformer (SUT), which leverages Sparse Mixture of Experts (SMoE) to reduce UT's computation complexity while retaining its parameter efficiency and generalization ability. |
Shawn Tan; Yikang Shen; Zhenfang Chen; Aaron Courville; Chuang Gan; | emnlp | 2023-12-22 |
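A compact sketch of the Sparse Mixture-of-Experts ingredient named above: a router sends each token to its top-k expert feed-forward networks and mixes their outputs, so only k of the experts run per token. The depth-wise weight sharing that makes the model "universal", and SUT's specific router and auxiliary losses, are not reproduced.

```python
# Minimal top-k mixture-of-experts feed-forward layer. Sizes and the
# simple softmax router are illustrative, not SUT's exact design.
import torch
import torch.nn as nn

class MoEFeedForward(nn.Module):
    def __init__(self, d_model: int = 64, d_ff: int = 128, n_experts: int = 4, k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.router = nn.Linear(d_model, n_experts)
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d_model)
        gate_logits = self.router(x)                      # (tokens, n_experts)
        weights, idx = gate_logits.topk(self.k, dim=-1)   # route each token to top-k experts
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():  # only the selected experts do any work
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

layer = MoEFeedForward()
print(layer(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```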
1096 | LLM-powered Data Augmentation for Enhanced Cross-lingual Performance Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper explores the potential of leveraging Large Language Models (LLMs) for data augmentation in multilingual commonsense reasoning datasets where the available training data is extremely limited. |
Chenxi Whitehouse; Monojit Choudhury; Alham Aji; | emnlp | 2023-12-22 |
1097 | Fighting Fire with Fire: The Dual Role of LLMs in Crafting and Detecting Elusive Disinformation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: (i.e., generating large-scale harmful and misleading content). To combat this emerging risk of LLMs, we propose a novel "Fighting Fire with Fire" (F3) strategy that harnesses modern LLMs' generative and emergent reasoning capabilities to counter human-written and LLM-generated disinformation. |
JASON LUCAS et. al. | emnlp | 2023-12-22 |
1098 | Harnessing Black-Box Control to Boost Commonsense in LM's Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a computation-efficient framework that steers a frozen Pre-Trained Language Model (PTLM) towards more commonsensical generation (i.e., producing a plausible output that incorporates a list of concepts in a meaningful way). |
Yufei Tian; Felix Zhang; Nanyun Peng; | emnlp | 2023-12-22 |
1099 | Axiomatic Preference Modeling for Longform Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The contributions of this work include: training a standalone preference model that can score human- and LLM-generated answers on the same scale; developing an axiomatic framework for generating training data pairs tailored to certain principles; and showing that a small amount of axiomatic signals can help small models outperform GPT-4 in preference scoring. |
Corby Rosset; Guoqing Zheng; Victor Dibia; Ahmed Awadallah; Paul Bennett; | emnlp | 2023-12-22 |
1100 | Exploring The Boundaries of GPT-4 in Radiology IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on assessing the performance of GPT-4, the most capable LLM so far, on the text-based applications for radiology reports, comparing against state-of-the-art (SOTA) radiology-specific models. |
QIANCHU LIU et. al. | emnlp | 2023-12-22 |
1101 | Retrofitting Light-weight Language Models for Emotions Using Supervised Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a novel retrofitting method to induce emotion aspects into pre-trained language models (PLMs) such as BERT and RoBERTa. |
Sapan Shah; Sreedhar Reddy; Pushpak Bhattacharyya; | emnlp | 2023-12-22 |
1102 | ChatGPT As A Commenter to The News: Can LLMs Generate Human-like Opinions? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this research we investigate to what extent GPT-3.5 can generate human-like comments on Dutch news articles. |
Rayden Tseng; Suzan Verberne; Peter van der Putten; | arxiv-cs.CL | 2023-12-21 |
1103 | Exploiting Novel GPT-4 APIs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that fine-tuning a model on as few as 15 harmful examples or 100 benign examples can remove core safeguards from GPT-4, enabling a range of harmful outputs. |
Kellin Pelrine; Mohammad Taufeeque; Michał Zając; Euan McLean; Adam Gleave; | arxiv-cs.CR | 2023-12-21 |
1104 | Generative Pretraining at Scale: Transformer-Based Encoding of Transactional Behavior for Fraud Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce an innovative autoregressive model leveraging Generative Pretrained Transformer (GPT) architectures, tailored for fraud detection in payment systems. |
Ze Yu Zhao; Zheng Zhu; Guilin Li; Wenhan Wang; Bo Wang; | arxiv-cs.LG | 2023-12-21 |
1105 | Efficacy of Machine-Generated Instructions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large instruction-tuned language models (i.e., finetuned to respond to instructions) have demonstrated a remarkable ability to generalize zero-shot to new tasks. |
Samaksh Gulati; Anshit Verma; Manoj Parmar; Palash Chaudhary; | arxiv-cs.CL | 2023-12-21 |
1106 | Automated DevOps Pipeline Generation for Code Repositories Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a detailed investigation into the use of Large Language Models (LLMs), specifically GPT-3.5 and GPT-4, to generate and evaluate GitHub Action workflows for DevOps tasks. |
Deep Mehta; Kartik Rawool; Subodh Gujar; Bowen Xu; | arxiv-cs.SE | 2023-12-20 |
1107 | AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite their advancements, challenges in balancing code snippet generation with effective test case generation and execution persist. To address these issues, this paper introduces Multi-Agent Assistant Code Generation (AgentCoder), a novel solution comprising a multi-agent framework with specialized agents: the programmer agent, the test designer agent, and the test executor agent. |
DONG HUANG et. al. | arxiv-cs.CL | 2023-12-20 |
1108 | Benchmarking and Analyzing In-context Learning, Fine-tuning and Supervised Learning for Biomedical Knowledge Curation: A Focused Study on Chemical Entities of Biological Interest Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the era of foundational language models, this study compares and analyzes three NLP paradigms for curation tasks: in-context learning (ICL), fine-tuning (FT), and supervised learning (ML). |
EMILY GROVES et. al. | arxiv-cs.LG | 2023-12-20 |
1109 | HW-V2W-Map: Hardware Vulnerability to Weakness Mapping Framework for Root Cause Analysis with GPT-assisted Mitigation Suggestion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As the number of hardware attacks on Internet of Things (IoT) devices continues to rapidly increase, we present the Hardware Vulnerability to Weakness Mapping (HW-V2W-Map) Framework, which is a Machine Learning (ML) framework focusing on hardware vulnerabilities and IoT security. |
YU-ZHENG LIN et. al. | arxiv-cs.CR | 2023-12-20 |
1110 | Scaling Down to Scale Up: A Cost-Benefit Analysis of Replacing OpenAI’s LLM with Open Source SLMs in Production Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a systematic evaluation methodology and a characterization of modern open-source SLMs and their trade-offs when replacing proprietary LLMs for a real-world product feature. |
CHANDRA IRUGALBANDARA et. al. | arxiv-cs.SE | 2023-12-20 |
1111 | Advancing SQL Injection Detection for High-Speed Data Centers: A Novel Approach Using Cascaded NLP Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel cascade SQLi detection method, blending classical and transformer-based NLP models, achieving a 99.86% detection accuracy with significantly lower computational demands (20 times faster than using transformer-based models alone). |
KASIM TASDEMIR et. al. | arxiv-cs.CR | 2023-12-20 |
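The cascade in entry 1111 is a general cost-saving pattern: a cheap first-stage classifier answers the easy cases and defers only low-confidence inputs to the transformer. Below is a minimal sketch of that control flow with toy data; the `heavy_model` stub, threshold, and features are illustrative assumptions, not the paper's pipeline.

```python
# Two-stage cascade (assumed design): a cheap TF-IDF + logistic regression
# stage answers confident cases; the rest fall through to a heavier
# (here stubbed) transformer stage.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

train_queries = ["SELECT name FROM users WHERE id = 1",
                 "' OR '1'='1' --",
                 "UPDATE items SET price = 10 WHERE sku = 'a'",
                 "admin'; DROP TABLE users; --"]
train_labels = [0, 1, 0, 1]  # 0 = benign, 1 = SQL injection

vec = TfidfVectorizer(analyzer="char", ngram_range=(2, 4))
cheap = LogisticRegression().fit(vec.fit_transform(train_queries), train_labels)

def heavy_model(query: str) -> int:
    """Hypothetical stand-in for the expensive transformer stage."""
    return int("--" in query or "'" in query)

def cascade_predict(query: str, threshold: float = 0.9) -> int:
    proba = cheap.predict_proba(vec.transform([query]))[0]
    if proba.max() >= threshold:   # cheap stage is confident: stop here
        return int(proba.argmax())
    return heavy_model(query)      # otherwise defer to the heavy stage

print(cascade_predict("SELECT id FROM orders WHERE total > 100"))
```

Raising the threshold routes more queries to the heavy stage, trading the speedup for accuracy.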
1112 | A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a preliminary exploration of Gemini Pro’s visual understanding proficiency, which comprehensively covers four domains: fundamental perception, advanced cognition, challenging vision tasks, and various expert capacities. |
CHAOYOU FU et. al. | arxiv-cs.CV | 2023-12-19 |
1113 | Large Language Models in Medical Term Classification and Unexpected Misalignment Between Response and Reasoning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study assesses the ability of state-of-the-art large language models (LLMs) including GPT-3.5, GPT-4, Falcon, and LLaMA 2 to identify patients with mild cognitive impairment … |
XIAODAN ZHANG et. al. | ArXiv | 2023-12-19 |
1114 | Founder-GPT: Self-play to Evaluate The Founder-Idea Fit Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research introduces an innovative evaluation method for the founder-idea fit in early-stage startups, utilizing advanced large language model techniques to assess founders’ profiles against their startup ideas to enhance decision-making. |
Sichao Xiong; Yigit Ihlamur; | arxiv-cs.CL | 2023-12-19 |
1115 | Can Transformers Learn Sequential Function Classes In Context? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel sliding window sequential function class and employ toy-sized transformers with a GPT-2 architecture to conduct our experiments. |
Ryan Campbell; Emma Guo; Evan Hu; Reya Vir; Ethan Hsiao; | arxiv-cs.LG | 2023-12-19 |
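Entry 1115 leaves the sliding-window function class abstract; one plausible toy instantiation (an assumption, not the paper's definition) makes each label a fixed linear function of the previous k inputs, so an in-context learner must infer the shared weights from the prefix alone.

```python
# Toy generator for a sliding-window sequential function class (assumed
# form): y_t = w . x[t-k:t], with weights w shared across the sequence.
import numpy as np

def make_sequence(length: int = 64, k: int = 4, seed: int = 0):
    rng = np.random.default_rng(seed)
    w = rng.normal(size=k)              # weights fixed for the whole sequence
    x = rng.normal(size=length)
    y = np.array([w @ x[t - k:t] if t >= k else 0.0 for t in range(length)])
    return x, y

x, y = make_sequence()
print(x[:6].round(2), y[:6].round(2))
```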
1116 | MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, most existing methods neglect the crucial significance of LLMs utilizing external tools and model collaboration. To address these challenges, we introduce MAC-SQL, a novel LLM-based multi-agent collaborative framework. |
BING WANG et. al. | arxiv-cs.CL | 2023-12-18 |
1117 | Stronger Graph Transformer with Regularized Attention Scores Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel version of edge regularization technique that alleviates the need for Positional Encoding and ultimately alleviate GT’s out of memory issue. |
Eugene Ku; | arxiv-cs.LG | 2023-12-18 |
1118 | A Heterogeneous Chiplet Architecture for Accelerating End-to-End Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we leverage chiplet-based heterogeneous integration (HI) to design a high-performance and energy-efficient multi-chiplet platform to accelerate transformer workloads. |
Harsh Sharma; Pratyush Dhingra; Janardhan Rao Doppa; Umit Ogras; Partha Pratim Pande; | arxiv-cs.AR | 2023-12-18 |
1119 | An In-depth Look at Gemini’s Language Abilities IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we do an in-depth exploration of Gemini’s language abilities, making two contributions. First, we provide a third-party, objective comparison of the abilities of the OpenAI GPT and Google Gemini models with reproducible code and fully transparent results. |
SYEDA NAHIDA AKTER et. al. | arxiv-cs.CL | 2023-12-18 |
1120 | APIDocBooster: An Extract-Then-Abstract Framework Leveraging Large Language Models for Augmenting API Documentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce APIDocBooster, an extract-then-abstract framework that seamlessly fuses the advantages of both extractive (i.e., enabling faithful summaries without length limitation) and abstractive summarization (i.e., producing coherent and concise summaries). |
CHENGRAN YANG et. al. | arxiv-cs.SE | 2023-12-18 |
1121 | Evaluating and Enhancing Large Language Models for Conversational Reasoning on Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we evaluate the conversational reasoning capabilities of the current state-of-the-art LLM (GPT-4) on knowledge graphs (KGs). |
Yuxuan Huang; Lida Shi; Anqi Liu; Hao Xu; | arxiv-cs.CL | 2023-12-18 |
1122 | Time-Transformer: Integrating Local and Global Features for Better Time Series Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Most existing generative models have failed to effectively learn both the local and global properties of time series data. To address this open problem, we propose a novel time series generative model named ‘Time-Transformer AAE’, which consists of an adversarial autoencoder (AAE) and a newly designed architecture named ‘Time-Transformer’ within the decoder. |
YUANSAN LIU et. al. | arxiv-cs.LG | 2023-12-18 |
1123 | T2M-HiFiGPT: Generating High Quality Human Motion from Textual Descriptions with Residual Discrete Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce T2M-HiFiGPT, a novel conditional generative framework for synthesizing human motion from textual descriptions. |
Congyi Wang; | arxiv-cs.CV | 2023-12-17 |
1124 | Decoding Concerns: Multi-label Classification of Vaccine Sentiments in Social Media Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the situation involves a mix of perspectives, with skepticism towards vaccines prevailing for various reasons such as political dynamics, apprehensions about side effects, and more. The paper addresses the challenge of comprehensively understanding and categorizing these diverse concerns expressed in the context of vaccination. |
Somsubhra De; Shaurya Vats; | arxiv-cs.SI | 2023-12-17 |
1125 | Multi-Label Classification of COVID-Tweets Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We tried three different models: (a) a supervised BERT-large-uncased model, (b) a supervised HateXplain model, and (c) a zero-shot GPT-3.5 Turbo model. |
Aniket Deroy; Subhankar Maity; | arxiv-cs.CL | 2023-12-17 |
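The zero-shot arm of entry 1125 reduces to a prompt that lists candidate labels and asks for the applicable subset. A sketch with a hypothetical label set and a stubbed model reply; the study's actual labels and prompt are not reproduced.

```python
# Zero-shot multi-label classification via prompting (label set and
# reply below are invented placeholders).
import json

LABELS = ["side_effects", "politics", "efficacy", "mandates"]

def zero_shot_prompt(tweet: str) -> str:
    return (f"Labels: {', '.join(LABELS)}\n"
            f"Tweet: {tweet}\n"
            "Return a JSON list containing every label that applies.")

fake_reply = '["side_effects", "efficacy"]'   # stand-in for the API call
predicted = [label for label in json.loads(fake_reply) if label in LABELS]
print(predicted)
```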
1126 | AI Gender Bias, Disparities, and Fairness: Does Training Data Matter? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The study employs three distinct techniques for bias analysis: Scoring accuracy difference to evaluate bias, mean score gaps by gender (MSG) to evaluate disparity, and Equalized Odds (EO) to evaluate fairness. |
Ehsan Latif; Xiaoming Zhai; Lei Liu; | arxiv-cs.CY | 2023-12-17 |
1127 | An Evaluation of GPT-4V and Gemini in Online VQA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct fine-grained analysis by generating seven types of metadata for nearly 2,000 visual questions, such as image type and the required image processing capabilities. |
Mengchen Liu; Chongyan Chen; Danna Gurari; | arxiv-cs.CV | 2023-12-17 |
1128 | Can Persistent Homology Whiten Transformer-based Black-box Models? A Case Study on BERT Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we propose Optimus BERT Compression and Explainability (OBCE), a methodology to bring explainability to BERT models using persistent homology, aiming to measure the importance of each neuron by studying the topological characteristics of their outputs. |
Luis Balderas; Miguel Lastra; José M. Benítez; | arxiv-cs.LG | 2023-12-17 |
1129 | Cross-Domain Robustness of Transformer-based Keyphrase Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore the effectiveness of abstractive text summarization models for keyphrase selection. |
Anna Glazkova; Dmitry Morozov; | arxiv-cs.CL | 2023-12-17 |
1130 | DeepArt: A Benchmark to Advance Fidelity Research in AI-Generated Content Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper explores the image synthesis capabilities of GPT-4, a leading multi-modal large language model. |
Wentao Wang; Xuanyao Huang; Tianyang Wang; Swalpa Kumar Roy; | arxiv-cs.CV | 2023-12-16 |
1131 | Cross-Linguistic Offensive Language Detection: BERT-Based Analysis of Bengali, Assamese, & Bodo Conversational Hateful Content from Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we used BERT models, including XLM-RoBERTa, L3-cube, IndicBERT, BanglaBERT, and BanglaHateBERT. |
Jhuma Kabir Mim; Mourad Oussalah; Akash Singhal; | arxiv-cs.CL | 2023-12-16 |
1132 | A Comparative Analysis of Large Language Models for Code Documentation Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive comparative analysis of Large Language Models (LLMs) for generation of code documentation. |
Shubhang Shekhar Dvivedi; Vyshnav Vijay; Sai Leela Rahul Pujari; Shoumik Lodh; Dhruv Kumar; | arxiv-cs.SE | 2023-12-16 |
1133 | SPT: Fine-Tuning Transformer-based Language Models Efficiently with Sparsification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose the SPT system to fine-tune Transformer-based models efficiently by introducing sparsity. |
Yuntao Gui; Xiao Yan; Peiqi Yin; Han Yang; James Cheng; | arxiv-cs.DC | 2023-12-16 |
1134 | Exploring Automatic Text Simplification of German Narrative Documents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we apply transformer-based Natural Language Generation (NLG) techniques to the problem of text simplification. |
Thorben Schomacker; Tillmann Dönicke; Marina Tropmann-Frick; | arxiv-cs.CL | 2023-12-15 |
1135 | Red AI? Inconsistent Responses from GPT3.5 Models on Political Issues in The US and China Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We posed the same questions about high-profile political issues in the United States and China to GPT in both English and Simplified Chinese. Our analysis of the bilingual responses revealed that the political knowledge (content) and political attitudes (sentiment) of GPT’s bilingual models are significantly more inconsistent on political issues in China. |
Di Zhou; Yinxian Zhang; | arxiv-cs.CL | 2023-12-15 |
1136 | Towards Automated Regulatory Compliance Verification in Financial Auditing with Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The auditing of financial documents, historically a labor-intensive process, stands on the precipice of transformation. AI-driven solutions have made inroads into streamlining … |
ARMIN BERGER et. al. | 2023 IEEE International Conference on Big Data (BigData) | 2023-12-15 |
1137 | Distilling Large Language Models for Matching Patients to Clinical Trials Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there are significant challenges associated with using closed-source proprietary LLMs like GPT-3.5 in practical healthcare applications, such as cost, privacy and reproducibility concerns. To address these issues, this study presents the first systematic examination of the efficacy of both proprietary (GPT-3.5, and GPT-4) and open-source LLMs (LLAMA 7B,13B, and 70B) for the task of patient-trial matching. |
Mauro Nievas; Aditya Basu; Yanshan Wang; Hrituraj Singh; | arxiv-cs.AI | 2023-12-15 |
1138 | LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the possibility of Language Adaptation for LLaMA models, explicitly focusing on addressing the challenge of Italian Language coverage. |
PIERPAOLO BASILE et. al. | arxiv-cs.CL | 2023-12-15 |
1139 | Algorithms for Automatic Intents Extraction and Utterances Classification for Goal-oriented Dialogue Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A method for preprocessing dialog data sets in JSON format is described. |
Leonid Legashev; Alexander Shukhman; Vadim Badikov; | arxiv-cs.AI | 2023-12-15 |
1140 | Transformer-based LLMs in Cybersecurity: An In-depth Study on Log Anomaly Detection and Conversational Defense Mechanisms Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: With the advancement of conversational AI and Large Language Models (LLMs), interactive chatbots are emerging as pivotal assets for connecting with users across various sectors, … |
Prasasthy Balasubramanian; Justin Seby; Panos Kostakos; | 2023 IEEE International Conference on Big Data (BigData) | 2023-12-15 |
1141 | Binary Code Summarization: Benchmarking ChatGPT/GPT-4 and Other Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we present BinSum, a comprehensive benchmark and dataset of over 557K binary functions and introduce a novel method for prompt synthesis and optimization. |
Xin Jin; Jonathan Larson; Weiwei Yang; Zhiqiang Lin; | arxiv-cs.CR | 2023-12-15 |
1142 | 3DAxiesPrompts: Unleashing The 3D Spatial Task Capabilities of GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a new visual prompting method called 3DAxiesPrompts (3DAP) to unleash the capabilities of GPT-4V in performing 3D spatial tasks. |
DINGNING LIU et. al. | arxiv-cs.AI | 2023-12-15 |
1143 | Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, future superhuman models will behave in complex ways too difficult for humans to reliably evaluate; humans will only be able to weakly supervise superhuman models. We study an analogy to this problem: can weak model supervision elicit the full capabilities of a much stronger model? |
COLLIN BURNS et. al. | arxiv-cs.CL | 2023-12-14 |
1144 | Weaving Pathways for Justice with GPT: LLM-driven Automated Drafting of Interactive Legal Applications Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we describe 3 approaches to automating the completion of court forms: a generative AI approach that uses GPT-3 to iteratively prompt the user to answer questions, a constrained template-driven approach that uses GPT-4-turbo to generate a draft of questions that are subject to human review, and a hybrid method. |
Quinten Steenhuis; David Colarusso; Bryce Willey; | arxiv-cs.AI | 2023-12-14 |
1145 | TinyGSM: Achieving >80% on GSM8k with Small Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce TinyGSM, a synthetic dataset of 12.3M grade school math problems paired with Python solutions, generated fully by GPT-3.5. |
BINGBIN LIU et. al. | arxiv-cs.LG | 2023-12-14 |
1146 | VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Vision-Language Generative Pre-trained Transformer (VL-GPT), a transformer model proficient at concurrently perceiving and generating visual and linguistic data. |
JINGUO ZHU et. al. | arxiv-cs.CV | 2023-12-14 |
1147 | No-Skim: Towards Efficiency Robustness Evaluation on Skimming-based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose No-Skim, a general framework to help the owners of skimming-based LLM to understand and measure the robustness of their acceleration scheme. |
Shengyao Zhang; Mi Zhang; Xudong Pan; Min Yang; | arxiv-cs.CR | 2023-12-14 |
1148 | Heterogeneous Graph Neural Architecture Search with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a new GPT-4 based HGNAS model to improve the search efficiency and search accuracy of HGNAS. |
Haoyuan Dong; Yang Gao; Haishuai Wang; Hong Yang; Peng Zhang; | arxiv-cs.AI | 2023-12-14 |
1149 | Inter-Layer Scheduling Space Exploration for Multi-model Inference on Heterogeneous Chiplets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address increasing compute demand from recent multi-model workloads with heavy models like large language models, we propose to deploy heterogeneous chiplet-based multi-chip module (MCM)-based accelerators. |
Mohanad Odema; Hyoukjun Kwon; Mohammad Abdullah Al Faruque; | arxiv-cs.AR | 2023-12-14 |
1150 | Prompt Engineering-assisted Malware Dynamic Analysis Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Further, the concept drift phenomenon of API calls is prominent. To tackle these issues, we introduce a prompt engineering-assisted malware dynamic analysis using GPT-4. |
Pei Yan; Shunquan Tan; Miaohui Wang; Jiwu Huang; | arxiv-cs.CR | 2023-12-13 |
1151 | Assessing GPT4-V on Structured Reasoning Tasks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Multi-modality promises to unlock further uses for large language models. Recently, the state-of-the-art language model GPT-4 was enhanced with vision capabilities. We carry out a … |
Mukul Singh; J. Cambronero; Sumit Gulwani; Vu Le; Gust Verbruggen; | ArXiv | 2023-12-13 |
1152 | Dietary Assessment with Multimodal ChatGPT: A Systematic Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explore the application of multimodal ChatGPT within the realm of dietary assessment. |
FRANK P. -W. LO et. al. | arxiv-cs.CV | 2023-12-13 |
1153 | Native Language Identification with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first experiments on Native Language Identification (NLI) using LLMs such as GPT-4. |
Wei Zhang; Alexandre Salle; | arxiv-cs.CL | 2023-12-12 |
1154 | Towards Equipping Transformer with The Ability of Systematic Compositionality Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We tentatively provide a successful implementation of a multi-layer CAT on the basis of the especially popular BERT. |
Chen Huang; Peixin Qin; Wenqiang Lei; Jiancheng Lv; | arxiv-cs.CL | 2023-12-12 |
1155 | Abusive Span Detection for Vietnamese Narrative Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there are limited studies on applying natural language processing (NLP) in this field in Vietnam. Therefore, we aim to contribute by building a human-annotated Vietnamese dataset for detecting abusive content in Vietnamese narrative texts. |
Nhu-Thanh Nguyen; Khoa Thi-Kim Phan; Duc-Vu Nguyen; Ngan Luu-Thuy Nguyen; | arxiv-cs.CL | 2023-12-12 |
1156 | How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our findings delineate GPT-4V’s capability boundaries in distribution shifts, shedding light on its strengths and limitations across various scenarios. Importantly, this investigation contributes to our understanding of how AI foundation models generalize to distribution shifts, offering pivotal insights into their adaptability and robustness. |
ZHONGYI HAN et. al. | arxiv-cs.LG | 2023-12-12 |
1157 | Exploring Large Language Models to Facilitate Variable Autonomy for Human-Robot Teaming Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recognizing the significance of this development, this paper explores the integration of Large Language Models (LLMs) like the Generative Pre-trained Transformer (GPT) into human-robot teaming environments to facilitate variable autonomy through the means of verbal human-robot communication. In this paper, we introduce a novel framework for such a GPT-powered multi-robot testbed environment, based on a Unity Virtual Reality (VR) setting. |
Younes Lakhnati; Max Pascher; Jens Gerken; | arxiv-cs.HC | 2023-12-12 |
1158 | Scaling Culture in Blockchain Gaming: Generative AI and Pseudonymous Engagement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Managing rapidly growing decentralized gaming communities brings unique challenges at the nexus of cultural economics and technology. This paper introduces a streamlined analytical framework that utilizes Large Language Models (LLMs), in this instance open-access generative pre-trained transformer (GPT) models, offering an efficient solution with deeper insights into community dynamics. |
Henrik Axelsen; Sebastian Axelsen; Valdemar Licht; Jason Potts; | arxiv-cs.HC | 2023-12-12 |
1159 | Transformer Attractors for Robust and Efficient End-to-End Neural Diarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to replace EDA with a transformer-based attractor calculation (TA) module. |
Lahiru Samarakoon; Samuel J. Broughton; Marc Harkönen; Ivan Fung; | arxiv-cs.SD | 2023-12-11 |
1160 | Genixer: Empowering Multimodal Large Language Models As A Powerful Data Generator Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Genixer, a holistic data generation pipeline consisting of four key steps: (i) instruction data collection, (ii) instruction template design, (iii) empowering MLLMs, and (iv) data generation and filtering. |
Henry Hengyuan Zhao; Pan Zhou; Mike Zheng Shou; | arxiv-cs.CV | 2023-12-11 |
1161 | Pre-Trained Models for Intent Classification in Chatbot: Comparative Study and Critical Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The emergence of pre-trained models based on deep learning has considerably enhanced the development of many applications, such as chatbots. These models can be refined for … |
Adnane Souha; Charaf Ouaddi; Lamya Benaddi; Abdeslam Jakimi; | 2023 6th International Conference on Advanced Communication … | 2023-12-11 |
1162 | Revisiting The Role of Label Smoothing in Enhanced Text Sentiment Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fill in the gap, this article performs a set of in-depth analyses on eight datasets for text sentiment classification and three deep learning architectures: TextCNN, BERT, and RoBERTa, under two learning schemes: training from scratch and fine-tuning. |
Yijie Gao; Shijing Si; Hua Luo; Haixia Sun; Yugui Zhang; | arxiv-cs.CL | 2023-12-11 |
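For readers reproducing the comparison in entry 1162: label smoothing is a one-argument change in PyTorch, redistributing a small fraction of each target's probability mass over the other classes. A minimal sketch follows; the study's datasets and architectures are not reproduced here.

```python
# Label smoothing in PyTorch (the label_smoothing argument has been
# built into CrossEntropyLoss since torch 1.10).
import torch
import torch.nn as nn

logits = torch.randn(8, 3)             # batch of 8, 3 sentiment classes
targets = torch.randint(0, 3, (8,))

hard_loss = nn.CrossEntropyLoss()(logits, targets)
smooth_loss = nn.CrossEntropyLoss(label_smoothing=0.1)(logits, targets)
print(float(hard_loss), float(smooth_loss))
```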
1163 | U-MixFormer: UNet-like Transformer with Mix-Attention for Efficient Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we propose a novel transformer decoder, U-MixFormer, built upon the U-Net structure, designed for efficient semantic segmentation. |
Seul-Ki Yeom; Julian von Klitzing; | arxiv-cs.CV | 2023-12-11 |
1164 | AI Control: Improving Safety Despite Intentional Subversion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we develop and evaluate pipelines of safety techniques (protocols) that are robust to intentional subversion. |
Ryan Greenblatt; Buck Shlegeris; Kshitij Sachan; Fabien Roger; | arxiv-cs.LG | 2023-12-11 |
1165 | From Text to Motion: Grounding GPT-4 in A Humanoid Robot Alter3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We report the development of Alter3, a humanoid robot capable of generating spontaneous motion using a Large Language Model (LLM), specifically GPT-4. |
Takahide Yoshida; Atsushi Masumori; Takashi Ikegami; | arxiv-cs.RO | 2023-12-11 |
1166 | SM70: A Large Language Model for Medical Devices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We are introducing SM70, a 70 billion-parameter Large Language Model that is specifically designed for SpassMed’s medical devices under the brand name ‘JEE1’ (pronounced as G1 and means ‘Life’). |
Anubhav Bhatti; Surajsinh Parmar; San Lee; | arxiv-cs.CL | 2023-12-11 |
1167 | Generative Large Language Models Are All-purpose Text Analytics Engines: Text-to-text Learning Is All Your Need Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: To solve major clinical natural language processing (NLP) tasks using a unified text-to-text learning architecture based on a generative large language model (LLM) via prompt tuning. Methods: We formulated 7 key clinical NLP tasks as text-to-text learning and solved them using one unified generative clinical LLM, GatorTronGPT, developed using GPT-3 architecture and trained with up to 20 billion parameters. |
CHENG PENG et. al. | arxiv-cs.CL | 2023-12-10 |
1168 | Image and Data Mining in Reticular Chemistry Using GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study demonstrates the remarkable ability of GPT-4V to navigate and obtain complex data for metal-organic frameworks, especially from graphical sources. |
ZHILING ZHENG et. al. | arxiv-cs.AI | 2023-12-09 |
1169 | FP8-BERT: Post-Training Quantization for Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we empirically validate the effectiveness of FP8 as a way to do Post-Training Quantization without significant loss of accuracy, with a simple calibration and format conversion process. |
Jianwei Li; Tianchi Zhang; Ian En-Hsu Yen; Dongkuan Xu; | arxiv-cs.AI | 2023-12-09 |
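Entry 1169's flow can be simulated without FP8 hardware: calibrate a per-tensor scale, round to an FP8-like grid, and scale back. The sketch below is a heavily simplified E4M3-style round-trip (no subnormals or special values) and is not the paper's exact calibration procedure.

```python
import numpy as np

def fake_fp8_e4m3(x: np.ndarray) -> np.ndarray:
    """Round to an E4M3-like grid: 3 mantissa bits, clamped to +/-448."""
    x = np.clip(x, -448.0, 448.0)
    out = np.zeros_like(x)
    nz = x != 0
    e = np.floor(np.log2(np.abs(x[nz])))   # per-element exponent
    step = 2.0 ** (e - 3)                  # grid spacing with 3 mantissa bits
    out[nz] = np.round(x[nz] / step) * step
    return out

def ptq(tensor: np.ndarray) -> np.ndarray:
    """Per-tensor calibration: map the max to FP8 max, cast, rescale."""
    scale = 448.0 / np.abs(tensor).max()
    return fake_fp8_e4m3(tensor * scale) / scale

w = np.random.randn(4, 4).astype(np.float32)
print(np.abs(w - ptq(w)).max())   # round-trip error stays small
```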
1170 | GPT-4 and Safety Case Generation: An Exploratory Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While their potential is undeniable across various domains, this paper sets out on a captivating expedition to investigate their uncharted territory, the exploration of generating safety cases. In this paper, our primary objective is to delve into the existing knowledge base of GPT-4, focusing specifically on its understanding of the Goal Structuring Notation (GSN), a well-established notation allowing to visually represent safety cases. |
Mithila Sivakumar; Alvine Boaye Belle; Jinjun Shan; Kimya Khakzad Shahandashti; | arxiv-cs.SE | 2023-12-09 |
1171 | A Review of Hybrid and Ensemble in Deep Learning for Natural Language Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This review presents a comprehensive exploration of hybrid and ensemble deep learning models within Natural Language Processing (NLP), shedding light on their transformative potential across diverse tasks such as Sentiment Analysis, Named Entity Recognition, Machine Translation, Question Answering, Text Classification, Generation, Speech Recognition, Summarization, and Language Modeling. |
Jianguo Jia; Wen Liang; Youzhi Liang; | arxiv-cs.AI | 2023-12-09 |
1172 | Sim-GPT: Text Similarity Via GPT Annotated Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Due to the lack of a large collection of high-quality labeled sentence pairs with textual similarity scores, existing approaches for Semantic Textual Similarity (STS) mostly rely on unsupervised techniques or training signals that are only partially correlated with textual similarity, e.g., NLI-based datasets. To tackle this issue, in this paper, we propose the strategy of measuring text similarity via GPT annotated data (Sim-GPT for short). |
SHUHE WANG et. al. | arxiv-cs.CL | 2023-12-09 |
1173 | Illicit Darkweb Classification Via Natural-language Processing: Classifying Illicit Content of Webpages Based on Textual Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work aims to extend previous work on classifying illegal activities, proceeding in three distinct steps. |
Giuseppe Cascavilla; Gemma Catolino; Mirella Sangiovanni; | arxiv-cs.IR | 2023-12-08 |
1174 | Exploring The Limits of ChatGPT in Software Security Applications Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) have undergone rapid evolution and achieved remarkable results in recent times. OpenAI’s ChatGPT, backed by GPT-3.5 or GPT-4, has gained instant … |
FANGZHOU WU et. al. | ArXiv | 2023-12-08 |
1175 | Hijacking Context in Large Multi-modal Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we identify a new limitation of off-the-shelf LMMs where a small fraction of incoherent images or text descriptions mislead LMMs to only generate biased output about the hijacked context, not the originally intended context. To address this, we propose a pre-filtering method that removes irrelevant contexts via GPT-4V, based on its robustness towards distribution shift within the contexts. |
Joonhyun Jeong; | arxiv-cs.AI | 2023-12-07 |
1176 | On Sarcasm Detection with OpenAI GPT-based Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the applications of the Generative Pretrained Transformer (GPT) models, including GPT-3, InstructGPT, GPT-3.5, and GPT-4, in detecting sarcasm in natural language. |
Montgomery Gole; Williams-Paul Nwadiugwu; Andriy Miranskyy; | arxiv-cs.CL | 2023-12-07 |
1177 | Leveraging Transformer-based Language Models to Automate Requirements Satisfaction Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we leverage recent advances in natural language processing to deliver significantly more accurate results. |
Amrit Poudel; Jinfeng Lin; Jane Cleland-Huang; | arxiv-cs.SE | 2023-12-07 |
1178 | Enhancing Medical Task Performance in GPT-4V: A Comprehensive Study on Prompt Engineering Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: From our comprehensive evaluations, we distilled 10 effective prompt engineering techniques, each fortifying GPT-4V’s medical acumen. |
PENGCHENG CHEN et. al. | arxiv-cs.CL | 2023-12-07 |
1179 | User-Aware Prefix-Tuning Is A Good Learner for Personalized Image Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Therefore, they need to update all caption model parameters when encountering new samples, which is time-consuming and computation-intensive. To address this challenge, we propose a novel personalized image captioning framework that leverages user context to consider personality factors. |
Xuan Wang; Guanhong Wang; Wenhao Chai; Jiayu Zhou; Gaoang Wang; | arxiv-cs.CV | 2023-12-07 |
1180 | GPT-4V with Emotion: A Zero-shot Benchmark for Generalized Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recently, GPT-4 with Vision (GPT-4V) has demonstrated remarkable visual capabilities across various tasks, but its performance in emotion recognition has not been fully evaluated. To bridge this gap, we present the quantitative evaluation results of GPT-4V on 21 benchmark datasets covering 6 tasks: visual sentiment analysis, tweet sentiment analysis, micro-expression recognition, facial emotion recognition, dynamic facial emotion recognition, and multimodal emotion recognition. |
ZHENG LIAN et. al. | arxiv-cs.CV | 2023-12-07 |
1181 | JAMMIN-GPT: Text-based Improvisation Using LLMs in Ableton Live Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a system that allows users of Ableton Live to create MIDI-clips by naming them with musical descriptions. |
Sven Hollowell; Tashi Namgyal; Paul Marshall; | arxiv-cs.SD | 2023-12-06 |
1182 | DocBinFormer: A Two-Level Transformer Network for Effective Document Image Binarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose DocBinFormer (Document Binarization Transformer), a novel two-level vision transformer (TL-ViT) architecture based on vision transformers for effective document image binarization. |
Risab Biswas; Swalpa Kumar Roy; Ning Wang; Umapada Pal; Guang-Bin Huang; | arxiv-cs.CV | 2023-12-06 |
1183 | Exploring The Reversal Curse and Other Deductive Logical Reasoning in BERT and GPT-Based Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In our study, we examined a bidirectional LLM, BERT, and found that it is immune to the reversal curse. |
Da Wu; Jingye Yang; Kai Wang; | arxiv-cs.CL | 2023-12-06 |
1184 | Transformer-Powered Surrogates Close The ICF Simulation-Experiment Gap with Extremely Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel transformer-powered approach for enhancing prediction accuracy in multi-modal output scenarios, where sparse experimental data is supplemented with simulation data. |
MATTHEW L. OLSON et. al. | arxiv-cs.LG | 2023-12-06 |
1185 | A Text-to-Text Model for Multilingual Offensive Language Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the majority of these models are limited in their capabilities due to their encoder-only architecture, which restricts the number and types of labels in downstream tasks. Addressing these limitations, this study presents the first pre-trained model with encoder-decoder architecture for offensive language identification with text-to-text transformers (T5) trained on two large offensive language identification datasets: SOLID and CCTK. |
Tharindu Ranasinghe; Marcos Zampieri; | arxiv-cs.CL | 2023-12-06 |
1186 | Net-GPT: A LLM-Empowered Man-in-the-Middle Chatbot for Unmanned Aerial Vehicle Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the dynamic realm of AI, integrating Large Language Models (LLMs) with security systems reshape cybersecurity. LLMs bolster defense against cyber threats but also introduce … |
BRETT PIGGOTT et. al. | 2023 IEEE/ACM Symposium on Edge Computing (SEC) | 2023-12-06 |
1187 | Enhancing Novelty in ChatGPT Responses: Incorporating Random Word Brainstorming Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents a new prompting approach for increasing the novelty in ChatGPT responses. ChatGPT has proven to be effective in generating natural language responses; however, … |
Pittawat Taveekitworachai; R. Thawonmas; | Proceedings of the 13th International Conference on … | 2023-12-06 |
1188 | Empathy and Distress Detection Using Ensembles of Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents our approach for the WASSA 2023 Empathy, Emotion and Personality Shared Task. |
Tanmay Chavan; Kshitij Deshpande; Sheetal Sonawane; | arxiv-cs.CL | 2023-12-05 |
1189 | Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, it raises the concern that the current research findings only hold for GPT models but not LLMs in general. In this work, we lift this pre-condition and build, for the first time, effective listwise rerankers without any form of dependency on GPT. |
Xinyu Zhang; Sebastian Hofstätter; Patrick Lewis; Raphael Tang; Jimmy Lin; | arxiv-cs.CL | 2023-12-05 |
1190 | RankZephyr: Effective and Robust Zero-Shot Listwise Reranking Is A Breeze! IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the gap between open-source and closed models persists, with reliance on proprietary, non-transparent models constraining reproducibility. Addressing this gap, we introduce RankZephyr, a state-of-the-art, open-source LLM for listwise zero-shot reranking. |
Ronak Pradeep; Sahel Sharifymoghaddam; Jimmy Lin; | arxiv-cs.IR | 2023-12-05 |
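Entries 1189 and 1190 share the same listwise interface: the prompt shows a query plus numbered passages, and the model returns an ordering such as [2] > [1] > [3]. Below is a sketch of that prompt-and-parse loop with the LLM call stubbed out; the prompt wording is an assumption, not RankZephyr's actual template.

```python
import re

def listwise_prompt(query: str, passages: list[str]) -> str:
    numbered = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (f"Rank the passages below by relevance to the query.\n"
            f"Query: {query}\n{numbered}\n"
            f"Answer with identifiers only, e.g. [2] > [1] > [3].")

def parse_ranking(response: str, n: int) -> list[int]:
    ids = [int(m) - 1 for m in re.findall(r"\[(\d+)\]", response)]
    seen = list(dict.fromkeys(i for i in ids if 0 <= i < n))
    return seen + [i for i in range(n) if i not in seen]  # keep omissions

passages = ["cats purr", "transformers rerank text", "LLMs as rerankers"]
fake_response = "[3] > [2] > [1]"   # stand-in for an actual LLM call
print([passages[i] for i in parse_ranking(fake_response, len(passages))])
```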
1191 | A Comparative Study of AI-Generated (GPT-4) and Human-crafted MCQs in Programming Education Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We analyzed the capability of GPT-4 to produce multiple-choice questions (MCQs) aligned with specific learning objectives (LOs) from Python programming classes in higher education. |
JACOB DOUGHTY et. al. | arxiv-cs.CY | 2023-12-05 |
1192 | Jellyfish: A Large Language Model for Data Preprocessing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Whereas the use of LLMs has sparked interest in devising universal solutions to DP, recent initiatives in this domain typically rely on GPT APIs, raising inevitable data breach concerns. Unlike these approaches, we consider instruction-tuning local LLMs (7-13B models) as universal DP task solvers that operate on a local, single, and low-priced GPU, ensuring data security and enabling further customization. |
Haochen Zhang; Yuyang Dong; Chuan Xiao; Masafumi Oyamada; | arxiv-cs.AI | 2023-12-04 |
1193 | Panini: A Transformer-based Grammatical Error Correction Method for Bangla Related Papers Related Patents Related Grants Related Venues Related Experts View |
Nahid Hossain; Mehedi Hasan Bijoy; Salekul Islam; Swakkhar Shatabda; | Neural Comput. Appl. | 2023-12-04 |
1194 | Explore, Select, Derive, and Recall: Augmenting LLM with Human-like Memory for Mobile Task Automation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, due to the inherent unreliability and high operational cost of LLMs, their practical applicability is quite limited. To address these issues, this paper introduces MobileGPT, an innovative LLM-based mobile task automator equipped with a human-like app memory. |
SUNJAE LEE et. al. | arxiv-cs.HC | 2023-12-04 |
1195 | On Significance of Subword Tokenization for Low Resource and Efficient Named Entity Recognition: A Case Study in Marathi Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on NER for low-resource language and present our case study in the context of the Indian language Marathi. |
HARSH CHAUDHARI et. al. | arxiv-cs.CL | 2023-12-03 |
1196 | A Ripple in Time: A Discontinuity in American History Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this note we use the State of the Union Address (SOTU) dataset from Kaggle to make some surprising (and some not so surprising) observations pertaining to the general timeline of American history, and the character and nature of the addresses themselves. |
Alexander Kolpakov; Igor Rivin; | arxiv-cs.CL | 2023-12-02 |
1197 | Harnessing The Power of Prompt-based Techniques for Generating School-Level Questions Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach that utilizes prompt-based techniques to generate descriptive and reasoning-based questions. |
Subhankar Maity; Aniket Deroy; Sudeshna Sarkar; | arxiv-cs.CL | 2023-12-02 |
1198 | From Voices to Validity: Leveraging Large Language Models (LLMs) for Textual Analysis of Policy Stakeholder Interviews Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the integration of Large Language Models (LLMs)–like GPT-4–with human expertise to enhance text analysis of stakeholder interviews regarding K-12 education policy within one U.S. state. |
Alex Liu; Min Sun; | arxiv-cs.HC | 2023-12-02 |
1199 | Swarm-GPT: Combining Large Language Models with Safe Motion Planning for Robot Choreography Design Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents Swarm-GPT, a system that integrates large language models (LLMs) with safe swarm motion planning – offering an automated and novel approach to deployable drone swarm choreography. |
AORAN JIAO et. al. | arxiv-cs.RO | 2023-12-02 |
1200 | An Information Fusion Based Approach to Context-based Fine-tuning of GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View |
Toan Nguyen Mau; Anh-Cuong Le; Duc-Hong Pham; Van-Nam Huynh; | Inf. Fusion | 2023-12-01 |
1201 | Instruction-ViT: Multi-modal Prompts for Instruction Learning in Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
ZHE XIAO et. al. | Inf. Fusion | 2023-12-01 |
1202 | GenAI4Sustainability: GPT and Its Potentials For Achieving UN’s Sustainable Development Goals Related Papers Related Patents Related Grants Related Venues Related Experts View |
Rui Wang; Chaojie Li; Xiangyu Li; Rong Deng; Z. Dong; | IEEE CAA J. Autom. Sinica | 2023-12-01 |
1203 | Transformer-based Hierarchical Dynamic Decoders for Salient Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
QINGPING ZHENG et. al. | Knowl. Based Syst. | 2023-12-01 |
1204 | Graphformer: Adaptive Graph Correlation Transformer for Multivariate Long Sequence Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yijie Wang; Hao Long; Linjiang Zheng; Jiaxing Shang; | Knowl. Based Syst. | 2023-12-01 |
1205 | GhostFormer: Efficiently Amalgamated CNN-transformer Architecture for Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xin Xie; Dengquan Wu; Mingye Xie; Zixi Li; | Pattern Recognit. | 2023-12-01 |
1206 | MCRformer: Morphological Constraint Reticular Transformer for 3D Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
JUN YU LI et. al. | Expert Syst. Appl. | 2023-12-01 |
1207 | TNN-IDS: Transformer Neural Network-based Intrusion Detection System for MQTT-enabled IoT Networks Related Papers Related Patents Related Grants Related Venues Related Experts View |
SAFI ULLAH et. al. | Comput. Networks | 2023-12-01 |
1208 | Dual-resolution Transformer Combined with Multi-layer Separable Convolution Fusion Network for Real-time Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Kaidi Hu; Zongxia Xie; Qinghua Hu; | Comput. Graph. | 2023-12-01 |
1209 | Quick Back-Translation for Unsupervised Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a two-for-one improvement to Transformer back-translation: Quick Back-Translation (QBT). |
Benjamin Brimacombe; Jiawei Zhou; | arxiv-cs.CL | 2023-12-01 |
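Entry 1209 speeds up back-translation; the generic loop being accelerated pairs target-side monolingual text with machine-generated sources for training the forward model. The sketch shows only that generic pattern with a placeholder reverse model, not QBT's specific encoder-based shortcut.

```python
def translate_tgt_to_src(sentence: str) -> str:
    """Placeholder reverse model; a real system runs a trained translator."""
    return " ".join(reversed(sentence.split()))

def back_translate(monolingual_tgt: list[str]) -> list[tuple[str, str]]:
    """Build synthetic (source, target) pairs for forward-model training."""
    return [(translate_tgt_to_src(t), t) for t in monolingual_tgt]

print(back_translate(["the cat sat", "on the mat"]))
```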
1210 | AI Chatbot for Tourist Recommendations: A Case Study in Vietnam Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Living standards are rising due to a more developed society, and recreation, particularly tourism, is becoming more critical. Expanding the tourist industry is one of the most … |
Hai Thanh Nguyen; Thien Thanh Tran; Phat Tan Nham; Nhi Uyen Bui Nguyen; A. D. Le; | Applied Computer Systems | 2023-12-01 |
1211 | Transformer-based Multi-attention Hybrid Networks for Skin Lesion Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhiwei Dong; Jinjiang Li; Zhen Hua; | Expert Syst. Appl. | 2023-12-01 |
1212 | QTN: Quaternion Transformer Network for Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Numerous state-of-the-art transformer-based techniques with self-attention mechanisms have recently been demonstrated to be quite effective in the classification of hyperspectral … |
Xiaofei Yang; Weijia Cao; Yao Lu; Yicong Zhou; | IEEE Transactions on Circuits and Systems for Video … | 2023-12-01 |
1213 | Strawberry Ripeness Detection Based on YOLOv8 Algorithm Fused with LW-Swin Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
Shizhong Yang; Wei Wang; Sheng Gao; Zhaopeng Deng; | Comput. Electron. Agric. | 2023-12-01 |
1214 | AMDGT: Attention Aware Multi-modal Fusion Using A Dual Graph Transformer for Drug-disease Associations Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View |
JUNKAI LIU et. al. | Knowl. Based Syst. | 2023-12-01 |
1215 | GIFT: Generative Interpretable Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Generative Interpretable Fine-Tuning (GIFT) for parameter-efficient fine-tuning of pretrained Transformer backbones, which can be formulated as a simple factorized matrix multiplication in the parameter space or equivalently in the activation/representation space, and thus embraces built-in interpretability. |
Chinmay Savadikar; Xi Song; Tianfu Wu; | arxiv-cs.CV | 2023-12-01 |
1216 | Autonomous Agents in Software Development: A Vision Paper Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We have shown in our initial experimental analysis for simple software (e.g., Snake Game, Tic-Tac-Toe, Notepad) that multiple GPT agents can produce high-quality code and document it carefully. |
ZEESHAN RASHEED et. al. | arxiv-cs.SE | 2023-11-30 |
1217 | Applying Large Language Models and Chain-of-Thought for Automatic Scoring IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the application of large language models (LLMs), specifically GPT-3.5 and GPT-4, with Chain-of-Thought (CoT) in the automatic scoring of student-written responses to science assessments. |
Gyeong-Geon Lee; Ehsan Latif; Xuansheng Wu; Ninghao Liu; Xiaoming Zhai; | arxiv-cs.CL | 2023-11-30 |
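The setup in entry 1217 amounts to a rubric-grounded prompt that asks the model to reason before emitting a score. A sketch with invented rubric text and score scale; the study's actual prompts are not reproduced.

```python
def cot_scoring_prompt(rubric: str, response: str) -> str:
    """Build a chain-of-thought scoring prompt (format is an assumption)."""
    return ("You are grading a student science response.\n"
            f"Rubric: {rubric}\n"
            f"Student response: {response}\n"
            "Reason step by step about which rubric criteria are met, "
            "then output a final line of the form 'Score: <0-3>'.")

print(cot_scoring_prompt(
    rubric="Award points for naming a variable, a control, and a prediction.",
    response="I would change the water amount and keep light the same.",
))  # send the printed prompt to GPT-3.5/4 via your API client of choice
```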
1218 | CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Since the natural language processing (NLP) community started to make large language models (LLMs), such as GPT-4, act as a critic to evaluate the quality of generated texts, most … |
PEI KE et. al. | ArXiv | 2023-11-30 |
1219 | MultiResFormer: Transformer with Adaptive Multi-Resolution Modeling for General Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose MultiResFormer, which dynamically models temporal variations by adaptively choosing optimal patch lengths. |
LINFENG DU et. al. | arxiv-cs.LG | 2023-11-30 |
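Entry 1219's adaptive patch lengths presuppose a basic patching operation over the input series; below is a minimal sketch of that operation at several candidate resolutions (the paper's rule for choosing among resolutions is not reproduced).

```python
import numpy as np

def patch(series: np.ndarray, patch_len: int) -> np.ndarray:
    """Split a 1-D series into non-overlapping patches, dropping the tail."""
    n = len(series) // patch_len
    return series[: n * patch_len].reshape(n, patch_len)

series = np.sin(np.linspace(0, 8 * np.pi, 96))
for plen in (8, 16, 24):                 # candidate patch lengths
    print(plen, patch(series, plen).shape)
```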
1220 | TransOpt: Transformer-based Representation Learning for Optimization Problem Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a representation of optimization problem instances using a transformer-based neural network architecture trained for the task of problem classification of the 24 problem classes from the Black-box Optimization Benchmarking (BBOB) benchmark. |
Gjorgjina Cenikj; Gašper Petelin; Tome Eftimov; | arxiv-cs.LG | 2023-11-29 |
1221 | Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This ubiquitous layer of language models is often overlooked. We find that similarities between these input embeddings are highly interpretable and that the geometry of these embeddings differs between model families. |
Andrea W Wen-Yi; David Mimno; | arxiv-cs.CL | 2023-11-29 |
1222 | Extrapolatable Transformer Pre-training for Ultra Long Time-Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present Timely Generative Pre-trained Transformer (TimelyGPT). |
Ziyang Song; Qincheng Lu; Hao Xu; David L. Buckeridge; Yue Li; | arxiv-cs.LG | 2023-11-29 |
1223 | TurkishBERTweet: Fast and Reliable Large Language Model for Social Media Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Wide use of this language on social media platforms such as Twitter, Instagram, or TikTok, together with the country's strategic position in world politics, makes it appealing to social network researchers and industry. To address this need, we introduce TurkishBERTweet, the first large-scale pre-trained language model for Turkish social media, built using almost 900 million tweets. |
Ali Najafi; Onur Varol; | arxiv-cs.CL | 2023-11-29 |
1224 | LLVMs4Protest: Harnessing The Power of Large Language and Vision Models for Deciphering Protests in The News Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The UCLA-protest project contains labeled imagery data with annotations such as protest, violence, and signs. |
Yongjun Zhang; | arxiv-cs.CV | 2023-11-29 |
1225 | Improving The Robustness of Transformer-based Large Language Models with Dynamic Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel method called dynamic attention, tailored for the transformer architecture, to enhance the inherent robustness of the model itself against various adversarial attacks. |
LUJIA SHEN et. al. | arxiv-cs.CL | 2023-11-29 |
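Entry 1225's highlight does not spell out the mechanism; one simplified reading (an assumption, not the paper's method) drops a random fraction of attention weights at inference and renormalizes each row, so that no single token can dominate.

```python
import numpy as np

def dynamic_attention(scores: np.ndarray, drop_frac: float = 0.2,
                      seed: int = 0) -> np.ndarray:
    """Toy row-wise attention with random masking (illustrative only)."""
    rng = np.random.default_rng(seed)
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)               # ordinary softmax
    w = w * (rng.random(w.shape) > drop_frac)   # drop ~20% of entries
    w += 1e-12                                  # guard fully masked rows
    return w / w.sum(-1, keepdims=True)         # renormalize rows

scores = np.random.randn(4, 4)
print(dynamic_attention(scores).sum(-1))        # each row sums to 1
```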
1226 | RoKEPG: RoBERTa and Knowledge Enhancement for Prescription Generation of Traditional Chinese Medicine Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a RoBERTa and Knowledge Enhancement model for Prescription Generation of Traditional Chinese Medicine (RoKEPG). |
Hua Pu; Jiacong Mi; Shan Lu; Jieyue He; | arxiv-cs.CL | 2023-11-28 |
1227 | Biomedical Knowledge Graph-optimized Prompt Generation for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we introduce a token-optimized and robust Knowledge Graph-based Retrieval Augmented Generation (KG-RAG) framework by leveraging a massive biomedical KG (SPOKE) with LLMs such as Llama-2-13b, GPT-3.5-Turbo and GPT-4, to generate meaningful biomedical text rooted in established knowledge. |
KARTHIK SOMAN et. al. | arxiv-cs.CL | 2023-11-28 |
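Entry 1227's flow retrieves triples for the entities mentioned in a question and packs them into the prompt under a token budget. A sketch with a toy two-triple graph standing in for SPOKE; the retrieval, budgeting, and prompt wording are all illustrative assumptions.

```python
# Minimal KG-RAG prompt assembly (toy graph, assumed prompt format).
KG = {
    "metformin": [("metformin", "treats", "type 2 diabetes"),
                  ("metformin", "interacts_with", "AMPK")],
}

def kg_context(entity: str, budget: int = 2) -> str:
    triples = KG.get(entity.lower(), [])[:budget]   # cap triples for tokens
    return "\n".join(f"{s} {p} {o}." for s, p, o in triples)

def build_prompt(question: str, entity: str) -> str:
    return (f"Context:\n{kg_context(entity)}\n\n"
            "Answer using only the context above.\n"
            f"Question: {question}")

print(build_prompt("What does metformin treat?", "metformin"))
```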
1228 | General-Purpose Vs. Domain-Adapted Large Language Models for Extraction of Structured Data from Chest Radiology Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, variability in style limits usage. This study compares a system using a domain-adapted language model (RadLing) with a general-purpose LLM (GPT-4) in extracting relevant features from chest radiology reports and standardizing them to common data elements (CDEs). |
ALI H. DHANALIWALA et. al. | arxiv-cs.CL | 2023-11-28 |
1229 | SEED-Bench-2: Benchmarking Multimodal Large Language Models IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Multimodal large language models (MLLMs), building upon the foundation of powerful large language models (LLMs), have recently demonstrated exceptional capabilities in generating … |
BOHAO LI et. al. | ArXiv | 2023-11-28 |
1230 | GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper does not present a novel method. |
WENHAO WU et. al. | arxiv-cs.CV | 2023-11-27 |
1231 | BERT Goes Off-Topic: Investigating The Domain Transfer Challenge Using Genre Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For example, a genre classifier trained on political topics often fails when tested on documents about sport or medicine. In this work, we quantify this phenomenon empirically with a large corpus and a large set of topics. |
Dmitri Roussinov; Serge Sharoff; | arxiv-cs.CL | 2023-11-27 |
1232 | MEDITRON-70B: Scaling Medical Pretraining for Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we improve access to large-scale medical LLMs by releasing MEDITRON: a suite of open-source LLMs with 7B and 70B parameters adapted to the medical domain. |
ZEMING CHEN et. al. | arxiv-cs.CL | 2023-11-27 |
1233 | OccWorld: Learning A 3D Occupancy World Model for Autonomous Driving IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore a new framework of learning a world model, OccWorld, in the 3D Occupancy space to simultaneously predict the movement of the ego car and the evolution of the surrounding scenes. |
WENZHAO ZHENG et. al. | arxiv-cs.CV | 2023-11-27 |
1234 | Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Medprompt, based on a composition of several prompting strategies. |
HARSHA NORI et. al. | arxiv-cs.CL | 2023-11-27 |
1235 | Comparative Analysis of ChatGPT, GPT-4, and Microsoft Bing Chatbots for GRE Test Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research paper presents an analysis of how well three artificial intelligence chatbots (Bing, ChatGPT, and GPT-4) perform when answering questions from standardized tests. |
Mohammad Abu-Haifa; Bara’a Etawi; Huthaifa Alkhatatbeh; Ayman Ababneh; | arxiv-cs.CL | 2023-11-26 |
1236 | Machine-Generated Text Detection Using Deep Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our research focuses on the crucial challenge of discerning text produced by Large Language Models (LLMs) from human-generated text, which holds significance for various applications. With ongoing discussions about attaining a model with such functionality, we present supporting evidence regarding the feasibility of such models. |
Raghav Gaggar; Ashish Bhagchandani; Harsh Oza; | arxiv-cs.CL | 2023-11-26 |
1237 | FlowMind: Automatic Workflow Generation with LLMs Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The rapidly evolving field of Robotic Process Automation (RPA) has made significant strides in automating repetitive processes, yet its effectiveness diminishes in scenarios … |
ZHEN ZENG et. al. | Proceedings of the Fourth ACM International Conference on … | 2023-11-25 |
1238 | AutoEval-Video: An Automatic Benchmark for Assessing Large Vision Language Models in Open-Ended Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel and challenging benchmark, AutoEval-Video, to comprehensively evaluate large vision-language models in open-ended video question answering. |
Xiuyuan Chen; Yuan Lin; Yuchen Zhang; Weiran Huang; | arxiv-cs.CV | 2023-11-24 |
1239 | CMed-GPT: Prompt Tuning for Entity-Aware Chinese Medical Dialogue Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing medical dialogue models are mostly based on BERT and pre-trained on English corpora, but there is a lack of high-performing models on the task of Chinese medical dialogue generation. To solve the above problem, this paper proposes CMed-GPT, which is the GPT pre-training language model based on Chinese medical domain text. |
Zhijie Qu; Juan Li; Zerui Ma; Jianqiang Li; | arxiv-cs.CL | 2023-11-24 |
1240 | LLamol: A Dynamic Multi-Conditional Generative Transformer for De Novo Molecular Design Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Generative models have demonstrated substantial promise in Natural Language Processing (NLP) and have found application in designing molecules, as seen in General Pretrained Transformer (GPT) models. In our efforts to develop such a tool for exploring the organic chemical space in search of potentially electro-active compounds, we present LLamol, a single novel generative transformer model based on the LLama 2 architecture, which was trained on a 13M superset of organic compounds drawn from diverse public sources. |
Niklas Dobberstein; Astrid Maass; Jan Hamaekers; | arxiv-cs.LG | 2023-11-24 |
1241 | GPT Struct Me: Probing GPT Models on Narrative Entity Extraction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Such effectiveness raises a pertinent question: Can these models be leveraged for the extraction of structured information? In this work, we address this question by evaluating the capabilities of two state-of-the-art language models — GPT-3 and GPT-3.5, commonly known as ChatGPT — in the extraction of narrative entities, namely events, participants, and temporal expressions. |
Hugo Sousa; Nuno Guimarães; Alípio Jorge; Ricardo Campos; | arxiv-cs.CL | 2023-11-24 |
1242 | GPT-4V Takes The Wheel: Promises and Challenges for Pedestrian Behavior Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate GPT-4V(ision) on publicly available pedestrian datasets: JAAD and WiDEVIEW. |
Jia Huang; Peng Jiang; Alvika Gautam; Srikanth Saripalli; | arxiv-cs.CV | 2023-11-24 |
1243 | Benchmarking Large Language Models for Log Analysis, Security, and Interpretation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The best-performing fine-tuned sequence classification model (DistilRoBERTa) outperforms the current state-of-the-art, with an average F1-Score of 0.998 across six datasets from both web application and system log sources. To achieve this, we propose and implement a new experimentation pipeline (LLM4Sec) which leverages LLMs for log analysis experimentation, evaluation, and analysis. |
Egil Karlsen; Xiao Luo; Nur Zincir-Heywood; Malcolm Heywood; | arxiv-cs.NI | 2023-11-24 |
1244 | Evaluating GPT-4’s Vision Capabilities on Brazilian University Admission Exams Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing studies often overlook questions that require the integration of visual comprehension, thus compromising the full spectrum and complexity inherent in real-world scenarios. To address this gap, we present a comprehensive framework to evaluate language models on entrance exams, which incorporates both textual and visual elements. |
Ramon Pires; Thales Sales Almeida; Hugo Abonizio; Rodrigo Nogueira; | arxiv-cs.CL | 2023-11-23 |
1245 | Towards Explainable Strategy Templates Using NLP Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By leveraging traditional Natural Language Processing (NLP) techniques and Large Language Models (LLMs) equipped with Transformers, we outline how DRL strategies composed of parts within strategy templates can be transformed into user-friendly, human-like English narratives. To achieve this, we present a top-level algorithm that involves parsing mathematical expressions of strategy templates, semantically interpreting variables and structures, generating rule-based primary explanations, and utilizing a Generative Pre-trained Transformer (GPT) model to refine and contextualize these explanations. |
Pallavi Bagga; Kostas Stathis; | arxiv-cs.AI | 2023-11-23 |
1246 | Forecasting Cryptocurrency Prices Using Deep Learning: Integrating Financial, Blockchain, and Text Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper explores the application of Machine Learning (ML) and Natural Language Processing (NLP) techniques in cryptocurrency price forecasting, specifically Bitcoin (BTC) and … |
Vincent Gurgul; Stefan Lessmann; W. Härdle; | ArXiv | 2023-11-23 |
1247 | Cultural Bias and Cultural Alignment of Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct a disaggregated evaluation of cultural bias for five widely used large language models (OpenAI’s GPT-4o/4-turbo/4/3.5-turbo/3) by comparing the models’ responses to nationally representative survey data. |
Yan Tao; Olga Viberg; Ryan S. Baker; Rene F. Kizilcec; | arxiv-cs.CL | 2023-11-23 |
1248 | MLLM-Bench, Evaluating Multi-modal LLMs Using GPT-4V Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: In the pursuit of Artificial General Intelligence (AGI), the integration of vision in language models has marked a significant milestone. The advent of vision-language models … |
WENTAO GE et. al. | ArXiv | 2023-11-23 |
1249 | A Cross Attention Approach to Diagnostic Explainability Using Clinical Practice Guidelines for Depression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an application-specific language model called ProcesS knowledge-infused cross ATtention (PSAT), which incorporates CPGs when computing attention. |
SUMIT DALAL et. al. | arxiv-cs.AI | 2023-11-23 |
1250 | Current Topological and Machine Learning Applications for Bias Detection in Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, few previous studies investigate large language model embeddings and geometric models of biased text data to understand geometry’s impact on bias modeling accuracy. To overcome this issue, this study utilizes the RedditBias database to analyze textual biases. |
COLLEEN FARRELLY et. al. | arxiv-cs.CY | 2023-11-22 |
1251 | Surpassing GPT-4 Medical Coding with A Two-Stage Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the GPT-4 LLM predicts an excessive number of ICD codes for medical coding tasks, leading to high recall but low precision. To tackle this challenge, we introduce LLM-codex, a two-stage approach to predict ICD codes that first generates evidence proposals using an LLM and then employs an LSTM-based verification stage. |
Zhichao Yang; Sanjit Singh Batra; Joel Stremmel; Eran Halperin; | arxiv-cs.CL | 2023-11-22 |
1252 | Transformer-based Named Entity Recognition in Construction Supply Chain Risk Management in Australia Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper employs different transformer models and trains them for Named Entity Recognition (NER) in the context of Australian construction SCRM. |
Milad Baghalzadeh Shishehgarkhaneh; Robert C. Moehler; Yihai Fang; Amer A. Hijazi; Hamed Aboutorab; | arxiv-cs.CL | 2023-11-22 |
1253 | Comparison of Pipeline, Sequence-to-sequence, and GPT Models for End-to-end Relation Extraction: Experiments with The Rare Disease Use-case Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we aim to compare three prevailing paradigms for E2ERE using a complex dataset focused on rare diseases involving discontinuous and nested entities. |
Shashank Gupta; Xuguang Ai; Ramakanth Kavuluru; | arxiv-cs.CL | 2023-11-22 |
1254 | Detecting Out-of-distribution Text Using Topological Features of Transformer-based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We evaluate our proposed TDA-based approach for out-of-distribution detection on BERT, a transformer-based language model, and compare it to a more traditional OOD approach based on BERT CLS embeddings. We found that our TDA approach outperforms the CLS embedding approach at distinguishing in-distribution data (politics and entertainment news articles from HuffPost) from far out-of-domain samples (IMDB reviews), but its effectiveness deteriorates with near out-of-domain (CNN/Dailymail) or same-domain (business news articles from HuffPost) datasets. |
Andres Pollano; Anupam Chaudhuri; Anj Simmons; | arxiv-cs.CL | 2023-11-21 |
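A minimal sketch of the CLS-embedding baseline that entry 1254 compares against, assuming a standard bert-base-uncased checkpoint and a simple cosine-distance-to-centroid score (the paper's exact scoring rule may differ):

```python
# Score a text by how far its BERT [CLS] embedding lies from the centroid of
# in-distribution [CLS] embeddings; a higher score suggests out-of-distribution.
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased").eval()

def cls_embedding(texts):
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = bert(**batch)
    return out.last_hidden_state[:, 0]  # [CLS] vectors, shape (n, 768)

# placeholder in-distribution texts (e.g. politics/entertainment articles)
centroid = cls_embedding(["a politics article", "an entertainment article"]).mean(0, keepdim=True)

def ood_score(text):
    return 1 - torch.cosine_similarity(cls_embedding([text]), centroid).item()

print(ood_score("a film review that reads like it came from IMDB"))
```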
1255 | AI and Veterinary Medicine: Performance of Large Language Models on The North American Licensing Examination Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study aimed to assess the performance of Large Language Models on the North American Veterinary Licensing Examination (NAVLE) and to analyze the impact of artificial … |
MIRANA ANGEL et. al. | 2023 Tenth International Conference on Social Networks … | 2023-11-21 |
1256 | NERIF: GPT-4V for Automatic Scoring of Drawn Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We randomly selected a set of balanced data (N = 900) that includes student-drawn models for six modeling assessment tasks. |
Gyeong-Geon Lee; Xiaoming Zhai; | arxiv-cs.AI | 2023-11-21 |
1257 | Visual Analytics for Generative Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a novel visual analytical framework to support the analysis of transformer-based generative networks. |
RAYMOND LI et. al. | arxiv-cs.CL | 2023-11-21 |
1258 | Interpretation of The Transformer and Improvement of The Extractor Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: It has been over six years since the Transformer architecture was put forward. Surprisingly, the vanilla Transformer architecture is still widely used today. One reason is that … |
Zhe Chen; | arxiv-cs.LG | 2023-11-21 |
1259 | InterPrompt: Interpretable Prompting for Interrelated Interpersonal Risk Factors in Reddit Posts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce an Interpretable Prompting (InterPrompt) method to boost the attention mechanism by fine-tuning the GPT-3 model. |
MSVPJ Sathvik; Surjodeep Sarkar; Chandni Saxena; Sunghwan Sohn; Muskan Garg; | arxiv-cs.CL | 2023-11-21 |
1260 | From Concept to Manufacturing: Evaluating Vision-Language Models for Engineering Design IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This gap is addressed with the release of multimodal vision language models, such as GPT-4V, enabling AI to impact many more types of tasks. In light of these advancements, this paper presents a comprehensive evaluation of GPT-4V, a vision language model, across a wide spectrum of engineering design tasks, categorized into four main areas: Conceptual Design, System-Level and Detailed Design, Manufacturing and Inspection, and Engineering Education Tasks. |
CYRIL PICARD et. al. | arxiv-cs.AI | 2023-11-21 |
1261 | GPT4Motion: Scripting Physical Motions in Text-to-Video Generation Via Blender-Oriented GPT Planning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they usually encounter high computational costs and often struggle to produce videos with coherent physical motions. To tackle these issues, we propose GPT4Motion, a training-free framework that leverages the planning capability of large language models such as GPT, the physical simulation strength of Blender, and the excellent image generation ability of text-to-image diffusion models to enhance the quality of video synthesis. |
JIAXI LV et. al. | arxiv-cs.CV | 2023-11-21 |
1262 | Looped Transformers Are Better at Learning Learning Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the absence of an inherent iterative structure in the transformer architecture presents a challenge in emulating the iterative algorithms, which are commonly employed in traditional machine learning methods. To address this, we propose the utilization of looped transformer architecture and its associated training methodology, with the aim of incorporating iterative characteristics into the transformer architectures. |
Liu Yang; Kangwook Lee; Robert Nowak; Dimitris Papailiopoulos; | arxiv-cs.LG | 2023-11-21 |
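Entry 1262's looping idea reduces to reusing one weight-tied transformer block across depth while re-injecting the input, which gives the forward pass an explicitly iterative structure. A minimal sketch with hypothetical dimensions, not the authors' training methodology:

```python
import torch
import torch.nn as nn

class LoopedTransformer(nn.Module):
    """One weight-tied block applied n_loops times, emulating an iterative algorithm."""
    def __init__(self, d_model=64, nhead=4, n_loops=12):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.n_loops = n_loops

    def forward(self, x):
        h = torch.zeros_like(x)
        for _ in range(self.n_loops):
            h = self.block(h + x)  # input injection at every iteration
        return h

out = LoopedTransformer()(torch.randn(2, 10, 64))  # (batch, seq, d_model)
print(out.shape)
```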
1263 | Extracting Definienda in Mathematical Scholarly Articles with Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the development of transformer-based natural language processing applications, we pose the problem as (a) a token-level classification task using fine-tuned pre-trained transformers; and (b) a question-answering task using a generalist large language model (GPT). |
Shufan Jiang; Pierre Senellart; | arxiv-cs.AI | 2023-11-21 |
1264 | PhayaThaiBERT: Enhancing A Pretrained Thai Language Model with Unassimilated Loanwords Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While WangchanBERTa has become the de facto standard in transformer-based Thai language modeling, it still has shortcomings in regard to the understanding of foreign words, most notably English words, which are often borrowed without orthographic assimilation into Thai in many contexts. We identify the lack of foreign vocabulary in WangchanBERTa’s tokenizer as the main source of these shortcomings. |
Panyut Sriwirote; Jalinee Thapiang; Vasan Timtong; Attapol T. Rutherford; | arxiv-cs.CL | 2023-11-21 |
1265 | White-Box Transformers Via Sparse Rate Reduction: Compression Is All There Is? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we contend that a natural objective of representation learning is to compress and transform the distribution of the data, say sets of tokens, towards a low-dimensional Gaussian mixture supported on incoherent subspaces. |
YAODONG YU et. al. | arxiv-cs.LG | 2023-11-21 |
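For context on entry 1265, the rate-reduction quantities behind this line of work take roughly the following form; this is a reconstruction from the rate-reduction literature, and the paper's exact objective and notation may differ:

```latex
% Coding rate of token representations Z (d x n), and the sparse
% rate reduction objective over a representation map f with Z = f(X):
\[
R(Z) = \tfrac{1}{2}\log\det\!\Big(I + \tfrac{d}{n\epsilon^{2}}\,ZZ^{\top}\Big),
\qquad
\max_{f}\; \underbrace{R(Z) - R^{c}\big(Z;\,U_{[K]}\big)}_{\Delta R} \;-\; \lambda\,\lVert Z\rVert_{0},
\]
% Delta R measures how much Z is compressed toward the K incoherent
% subspaces U_[K]; the l0 penalty promotes sparse representations.
```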
1266 | A Novel Transformer-based Approach for Soil Temperature Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a novel approach using transformer models for soil temperature forecasting. |
Muhammet Mucahit Enes Yurtsever; Ayhan Kucukmanisa; Zeynep Hilal Kilimci; | arxiv-cs.LG | 2023-11-20 |
1267 | Which AI Technique Is Better to Classify Requirements? An Experiment with SVM, LSTM, and ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, Large Language Models like ChatGPT have demonstrated remarkable proficiency in various Natural Language Processing tasks. |
Abdelkarim El-Hajjami; Nicolas Fafin; Camille Salinesi; | arxiv-cs.AI | 2023-11-20 |
1268 | Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Transformer-based Large Language Models (LLMs) have been applied in diverse areas such as knowledge bases, human interfaces, and dynamic agents, marking a stride towards achieving Artificial General Intelligence (AGI). |
YUNPENG HUANG et. al. | arxiv-cs.CL | 2023-11-20 |
1269 | GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a pipeline that enhances a general-purpose Vision Language Model, GPT-4V(ision), to facilitate one-shot visual teaching for robotic manipulation. |
Naoki Wake; Atsushi Kanehira; Kazuhiro Sasabuchi; Jun Takamatsu; Katsushi Ikeuchi; | arxiv-cs.RO | 2023-11-20 |
1270 | Financial Sentiment Analysis: Classic Methods Vs. Deep Learning Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Sentiment Analysis, also known as Opinion Mining, gained prominence in the early 2000s alongside the emergence of internet forums, blogs, and social media platforms. Researchers … |
A. Karanikola; Gregory Davrazos; C. M. Liapis; S. Kotsiantis; | Intell. Decis. Technol. | 2023-11-20 |
1271 | GPT in Data Science: A Practical Exploration of Model Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our objective is to elucidate and express the factors and assumptions guiding GPT-4’s model selection recommendations. |
Nathalia Nascimento; Cristina Tavares; Paulo Alencar; Donald Cowan; | arxiv-cs.AI | 2023-11-19 |
1272 | Assessing Prompt Injection Risks in 200+ Custom GPTs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Through prompt injection, an adversary can not only extract the customized system prompts but also access the uploaded files. This paper provides a first-hand analysis of the prompt injection, alongside the evaluation of the possible mitigation of such attacks. |
JIAHAO YU et. al. | arxiv-cs.CR | 2023-11-19 |
1273 | GPT for The Metaverse in Smart Cities Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The metaverse is a virtual space that blends elements of augmented reality, virtual reality, and many other technologies, offering a tailored and immersive experience where … |
RAMALINGAM M et. al. | 2023 26th International Symposium on Wireless Personal … | 2023-11-19 |
1274 | Zero-Shot Question Answering Over Financial Documents Using Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We introduce a large language model (LLM) based approach to answer complex questions requiring multi-hop numerical reasoning over financial reports. While LLMs have exhibited … |
Karmvir Singh Phogat; Chetan Harsha; Sridhar Dasaratha; Shashishekar Ramakrishna; Sai Akhil Puranam; | ArXiv | 2023-11-19 |
1275 | Inspecting Explainability of Transformer Models with Additional Statistical Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The Transformer has become more popular in the vision domain in recent years, so there is a need for an effective way to interpret the Transformer model by visualizing it. |
Hoang C. Nguyen; Haeil Lee; Junmo Kim; | arxiv-cs.CV | 2023-11-19 |
1276 | Bias A-head? Analyzing Bias in Transformer-Based Language Model Attention Heads Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on attention heads, a major component of the Transformer architecture, and propose a bias analysis framework to explore and identify a small set of biased heads that are found to contribute to a PLM’s stereotypical bias. |
Yi Yang; Hanyu Duan; Ahmed Abbasi; John P. Lalor; Kar Yan Tam; | arxiv-cs.CL | 2023-11-17 |
1277 | A Full-Scale Connected CNN-Transformer Network for Remote Sensing Image Change Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies have introduced transformer modules into convolutional neural networks (CNNs) to solve the inherent limitations of CNNs in global modeling and have achieved … |
MIN CHEN et. al. | Remote. Sens. | 2023-11-16 |
1278 | On Retrieval Augmentation and The Limitations of Language Model Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we rule out one previously posited possibility — the softmax bottleneck. |
TING-RUI CHIANG et. al. | arxiv-cs.CL | 2023-11-16 |
1279 | Diagnosing and Debiasing Corpus-Based Political Bias and Insults in GPT2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We aim to contribute to the ongoing effort of investigating the ethical and social implications of human-AI interaction. |
Ambri Ma; Arnav Kumar; Brett Zeligson; | arxiv-cs.CL | 2023-11-16 |
1280 | Towards Autonomous Hypothesis Verification Via Language Models with Minimal Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Research automation efforts usually employ AI as a tool to automate specific tasks within the research process. |
Shiro Takagi; Ryutaro Yamauchi; Wataru Kumagai; | arxiv-cs.AI | 2023-11-16 |
1281 | Self-Contradictory Reasoning Evaluation and Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate self-contradictory (Self-Contra) reasoning, where the model reasoning does not support predictions. |
Ziyi Liu; Isabelle Lee; Yongkang Du; Soumya Sanyal; Jieyu Zhao; | arxiv-cs.CL | 2023-11-16 |
1282 | ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Despite Large Language Models (LLMs) like GPT-4 achieving impressive results in function-level code generation, they struggle with repository-scale code understanding (e.g., … |
XIANGRU TANG et. al. | arxiv-cs.CL | 2023-11-16 |
1283 | TransCrimeNet: A Transformer-Based Model for Text-Based Crime Prediction in Criminal Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents TransCrimeNet, a novel transformer-based model for predicting future crimes in criminal networks from textual data. Criminal network analysis has become vital … |
Chen Yang; | ArXiv | 2023-11-16 |
1284 | Understanding The Effectiveness of Large Language Models in Detecting Security Vulnerabilities IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Security vulnerabilities in modern software are prevalent and harmful. While automated vulnerability detection tools have made promising progress, their scalability and … |
AVISHREE KHARE et. al. | ArXiv | 2023-11-16 |
1285 | Advancements in Generative AI: A Comprehensive Review of GANs, GPT, Autoencoders, Diffusion Model, and Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This advancement in Generative AI presents a wealth of exciting opportunities and, simultaneously, unprecedented challenges. Throughout this paper, we have explored these state-of-the-art models, the diverse array of tasks they can accomplish, the challenges they pose, and the promising future of Generative Artificial Intelligence. |
STAPHORD BENGESI et. al. | arxiv-cs.LG | 2023-11-16 |
1286 | Multi-View Spectrogram Transformer for Respiratory Sound Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, a Multi-View Spectrogram Transformer (MVST) is proposed to embed different views of time-frequency characteristics into the vision transformer. |
Wentao He; Yuchen Yan; Jianfeng Ren; Ruibin Bai; Xudong Jiang; | arxiv-cs.SD | 2023-11-16 |
1287 | Generative AI for Hate Speech Detection: Evaluation and Findings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this chapter, we provide a review of relevant methods, experimental setups and evaluation of this approach. |
Sagi Pendzel; Tomer Wullach; Amir Adler; Einat Minkov; | arxiv-cs.CL | 2023-11-16 |
1288 | We Demand Justice!: Towards Social Context Grounding of Political Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose two challenging datasets that require an understanding of the real-world context of the text. |
Rajkumar Pujari; Chengfei Wu; Dan Goldwasser; | arxiv-cs.CL | 2023-11-15 |
1289 | Empowering Propaganda Detection in Resource-Restraint Languages: A Transformer-Based Framework for Classifying Hindi News Articles Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Misinformation, fake news, and various propaganda techniques are increasingly used in digital media. It becomes challenging to uncover propaganda as it works with the systematic … |
Deptii D. Chaudhari; A. V. Pawar; | Big Data Cogn. Comput. | 2023-11-15 |
1290 | Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present a holistic end-to-end solution for annotating the factuality of LLM-generated responses, which encompasses a multi-stage annotation scheme designed to yield detailed labels concerning the verifiability and factual inconsistencies found in LLM outputs. |
YUXIA WANG et. al. | arxiv-cs.CL | 2023-11-15 |
1291 | MELA: Multilingual Evaluation of Linguistic Acceptability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present the largest benchmark to date on linguistic acceptability: Multilingual Evaluation of Linguistic Acceptability — MELA, with 46K samples covering 10 languages from a diverse set of language families. |
ZIYIN ZHANG et. al. | arxiv-cs.CL | 2023-11-15 |
1292 | LOKE: Linked Open Knowledge Extraction for Automated Knowledge Graph Construction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through this analysis and a qualitative analysis of sentence extractions via all methods, we found that LOKE-GPT extractions are of high utility for the KGC task and suitable for use in semi-automated extraction settings. |
Jamie McCusker; | arxiv-cs.CL | 2023-11-15 |
1293 | Llamas Know What GPTs Don’t Show: Surrogate Models for Confidence Estimation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: To maintain user trust, large language models (LLMs) should signal low confidence on examples where they are incorrect, instead of misleading the user. The standard approach of … |
Vaishnavi Shrivastava; Percy Liang; Ananya Kumar; | arxiv-cs.CL | 2023-11-15 |
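One concrete reading of entry 1293: when a black-box API exposes no usable token probabilities, score its answer with an open surrogate model's probability of that answer. A minimal sketch, where ask_gpt is a hypothetical stand-in for the black-box call and GPT-2 plays the surrogate:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
surrogate = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def ask_gpt(question: str) -> str:
    return "Paris"  # stand-in for a call to a black-box LLM API

def surrogate_confidence(question: str, answer: str) -> float:
    prompt_ids = tok(f"Q: {question}\nA:", return_tensors="pt").input_ids
    answer_ids = tok(" " + answer, return_tensors="pt").input_ids
    ids = torch.cat([prompt_ids, answer_ids], dim=1)
    with torch.no_grad():
        logprobs = torch.log_softmax(surrogate(ids).logits[0, :-1], dim=-1)
    # mean log-probability of the answer tokens under the surrogate
    span = range(prompt_ids.shape[1] - 1, ids.shape[1] - 1)
    token_lp = torch.stack([logprobs[p, ids[0, p + 1]] for p in span])
    return token_lp.mean().exp().item()

q = "What is the capital of France?"
a = ask_gpt(q)
print(a, surrogate_confidence(q, a))
```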
1294 | Jailbreaking GPT-4V Via Self-Adversarial Attacks with System Prompts IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through carefully designed dialogue, we successfully extract the internal system prompts of GPT-4V. This finding indicates potential exploitable security risks in MLLMs; 2) Based on the acquired system prompts, we propose a novel MLLM jailbreaking attack method termed SASP (Self-Adversarial Attack via System Prompt). |
Yuanwei Wu; Xiang Li; Yixin Liu; Pan Zhou; Lichao Sun; | arxiv-cs.CR | 2023-11-15 |
1295 | Transformers in The Service of Description Logic-based Contexts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we construct the natural language dataset DELTA_D using the description logic language ALCQ. |
Angelos Poulis; Eleni Tsalapati; Manolis Koubarakis; | arxiv-cs.CL | 2023-11-15 |
1296 | Can Large Language Models Follow Concept Annotation Guidelines? A Case Study on Scientific and Financial Domains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Importantly, only proprietary models such as GPT-3.5 and GPT-4 can recognize nonsensical guidelines, which we hypothesize is due to more sophisticated alignment methods. |
Marcio Fonseca; Shay B. Cohen; | arxiv-cs.CL | 2023-11-15 |
1297 | How Good Are Large Language Models on African Languages? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an analysis of four popular large language models (mT0, Aya, LLaMa 2, and GPT-4) on six tasks (topic classification, sentiment classification, machine translation, summarization, question answering, and named entity recognition) across 60 African languages, spanning different language families and geographical regions. |
Jessica Ojo; Kelechi Ogueji; Pontus Stenetorp; David Ifeoluwa Adelani; | arxiv-cs.CL | 2023-11-14 |
1298 | Secure Transformer Inference Protocol Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Drawing insights from our hands-on experience in developing two real-world Transformer-based services, we identify the inherent efficiency bottleneck in the two-party assumption. To overcome this limitation, we propose a novel three-party threat model. |
Mu Yuan; Lan Zhang; Xiang-Yang Li; | arxiv-cs.CR | 2023-11-14 |
1299 | Automated Title and Abstract Screening for Scoping Reviews Using The GPT-4 Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This manuscript introduces GPTscreenR, a package for the R statistical programming language that uses the GPT-4 Large Language Model (LLM) to automatically screen sources. |
David Wilkins; | arxiv-cs.CL | 2023-11-14 |
1300 | Exploring Semi-supervised Hierarchical Stacked Encoder for Legal Judgement Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The lengthy and non-uniform document structure poses an even greater challenge in extracting information for decision prediction. In this work, we explore and propose a two-level classification mechanism, both supervised and unsupervised: domain-specific pre-trained BERT extracts sentence embeddings from long documents, a transformer encoder layer further processes them, and unsupervised clustering extracts hidden labels from these embeddings to better predict the judgment of a legal case. |
Nishchal Prasad; Mohand Boughanem; Taoufiq Dkaki; | arxiv-cs.CL | 2023-11-14 |
1301 | Memory-efficient Stochastic Methods for Memory-based Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel two-phase training mechanism and a novel regularization technique to improve the training efficiency of memory-based transformers, which are often used for long-range context problems. |
Vishwajit Kumar Vishnu; C. Chandra Sekhar; | arxiv-cs.LG | 2023-11-14 |
1302 | Transformer Network with Decoupled Spatial–temporal Embedding for Traffic Flow Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
WEI SUN et. al. | Applied Intelligence | 2023-11-13 |
1303 | Language Model-In-The-Loop: Data Optimal Approach to Learn-To-Recommend Actions in Text Games Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore and evaluate updating the LLM used for candidate recommendation during the learning of the text-based game, to mitigate the reliance on human-annotated gameplays, which are costly to acquire. |
Arjun Vaithilingam Sudhakar; Prasanna Parthasarathi; Janarthanan Rajendran; Sarath Chandar; | arxiv-cs.CL | 2023-11-13 |
1304 | Towards Understanding The Geospatial Skills of ChatGPT: Taking A Geographic Information Systems (GIS) Exam Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper examines the performance of ChatGPT, a large language model (LLM), in a geographic information systems (GIS) exam. As LLMs like ChatGPT become increasingly prevalent in … |
Peter Mooney; Wencong Cui; Boyuan Guan; L. Juhász; | Proceedings of the 6th ACM SIGSPATIAL International … | 2023-11-13 |
1305 | Evaluating LLMs on Document-Based QA: Exact Answer Selection and Numerical Extraction Using Cogtale Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While some existing works focus on evaluating large language models' performance on retrieving and answering questions from documents, their performance on QA types that require exact answer selection from predefined options and numerical extraction has yet to be fully assessed. In this paper, we specifically focus on this underexplored context and conduct empirical analysis of LLMs (GPT-4 and GPT-3.5) on question types, including single-choice, yes-no, multiple-choice, and number extraction questions from documents in zero-shot setting. |
ZAFARYAB RASOOL et. al. | arxiv-cs.IR | 2023-11-13 |
1306 | Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the abstract reasoning abilities of text-only and multimodal versions of GPT-4, using the ConceptARC benchmark [10], which is designed to evaluate robust understanding and reasoning with core-knowledge concepts. |
Melanie Mitchell; Alessandro B. Palmarini; Arseny Moskvichev; | arxiv-cs.AI | 2023-11-13 |
1307 | A Comprehensive Evaluation of GPT-4V on Knowledge-Intensive Visual Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, the true challenge lies in the domain of knowledge-intensive VQA tasks, which necessitate not just recognition of visual elements, but also a deep comprehension of the visual information in conjunction with a vast repository of learned knowledge. To uncover such capabilities of MLMs, particularly the newly introduced GPT-4V, we provide an in-depth evaluation from three perspectives: 1) Commonsense Knowledge, which assesses how well models can understand visual cues and connect to general knowledge; 2) Fine-grained World Knowledge, which tests the model’s skill in reasoning out specific knowledge from images, showcasing their proficiency across various specialized fields; 3) Comprehensive Knowledge with Decision-making Rationales, which examines model’s capability to provide logical explanations for its inference, facilitating a deeper analysis from the interpretability perspective. |
YUNXIN LI et. al. | arxiv-cs.CL | 2023-11-13 |
1308 | Interaction Is All You Need? A Study of Robots Ability to Understand and Execute Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper aims to address a critical challenge in robotics, which is enabling them to operate seamlessly in human environments through natural language interactions. |
Kushal Koshti; Nidhir Bhavsar; | arxiv-cs.RO | 2023-11-13 |
1309 | Speech-based Slot Filling Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the potential application of LLMs to slot filling with noisy ASR transcriptions, via both in-context learning and task-specific fine-tuning. |
GUANGZHI SUN et. al. | arxiv-cs.CL | 2023-11-13 |
1310 | MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Several new LLMs have been introduced recently, necessitating their evaluation on non-English languages. This study aims to perform a thorough evaluation of the non-English capabilities of SoTA LLMs (GPT-3.5-Turbo, GPT-4, PaLM2, Gemini-Pro, Mistral, Llama2, and Gemma) by comparing them on the same set of multilingual datasets. |
SANCHIT AHUJA et. al. | arxiv-cs.CL | 2023-11-13 |
1311 | The Impact of Large Language Models on Scientific Discovery: A Preliminary Study Using GPT-4 IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this report, we delve into the performance of LLMs within the context of scientific discovery, focusing on GPT-4, the state-of-the-art language model. |
Microsoft Research AI4Science; Microsoft Azure Quantum; | arxiv-cs.CL | 2023-11-13 |
1312 | GPT-4V(ision) As A Social Media Analysis Engine IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore GPT-4V(ision)’s capabilities for social multimedia analysis. |
HANJIA LYU et. al. | arxiv-cs.CV | 2023-11-13 |
1313 | GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present MM-Navigator, a GPT-4V-based agent for the smartphone graphical user interface (GUI) navigation task. |
AN YAN et. al. | arxiv-cs.CV | 2023-11-13 |
1314 | LT-ViT: A Vision Transformer for Multi-label Chest X-ray Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we have developed LT-ViT, a transformer that utilizes combined attention between image tokens and randomly initialized auxiliary tokens that represent labels. |
Umar Marikkar; Sara Atito; Muhammad Awais; Adam Mahdi; | arxiv-cs.CV | 2023-11-13 |
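The label-token mechanism described in entry 1314 can be sketched as follows, assuming hypothetical dimensions (e.g. 14 chest X-ray labels) rather than the authors' implementation: K learnable label tokens are concatenated with the image tokens so that attention mixes the two streams, and each label token yields one multi-label logit.

```python
import torch
import torch.nn as nn

class LabelTokenHead(nn.Module):
    def __init__(self, d_model=192, nhead=3, n_labels=14, depth=4):
        super().__init__()
        self.label_tokens = nn.Parameter(torch.randn(1, n_labels, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.to_logit = nn.Linear(d_model, 1)  # shared head across label tokens

    def forward(self, img_tokens):  # img_tokens: (B, N, d_model) patch tokens
        lab = self.label_tokens.expand(img_tokens.shape[0], -1, -1)
        z = self.encoder(torch.cat([img_tokens, lab], dim=1))
        lab_out = z[:, -lab.shape[1]:]             # read back the K label tokens
        return self.to_logit(lab_out).squeeze(-1)  # (B, K) multi-label logits

logits = LabelTokenHead()(torch.randn(2, 196, 192))  # e.g. 14x14 patches from a ViT stem
print(logits.shape)  # torch.Size([2, 14])
```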
1315 | SpectralGPT: Spectral Remote Sensing Foundation Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While most foundation models are tailored to effectively process RGB images for various visual tasks, there is a noticeable gap in research focused on spectral data, which offers valuable information for scene understanding, especially in remote sensing (RS) applications. To fill this gap, we created for the first time a universal RS foundation model, named SpectralGPT, which is purpose-built to handle spectral RS images using a novel 3D generative pretrained transformer (GPT). |
DANFENG HONG et. al. | arxiv-cs.CV | 2023-11-13 |
1316 | Evaluation of GPT-4 for Chest X-ray Impression Generation: A Reader Study on Performance and Perception Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our study we explored and analyzed the generative abilities of GPT-4 for Chest X-ray impression generation. |
SEBASTIAN ZIEGELMAYER et. al. | arxiv-cs.CL | 2023-11-12 |
1317 | Retrieval and Generative Approaches for A Pregnancy Chatbot in Nepali with Stemmed and Non-Stemmed Data : A Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To provide pregnancy-related information, a health-domain chatbot has been proposed, and this work explores two different NLP-based approaches for developing it. |
Sujan Poudel; Nabin Ghimire; Bipesh Subedi; Saugat Singh; | arxiv-cs.CL | 2023-11-12 |
1318 | TransHSI: A Hybrid CNN-Transformer Method for Disjoint Sample-Based Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Hyperspectral images’ (HSIs) classification research has seen significant progress with the use of convolutional neural networks (CNNs) and Transformer blocks. However, these … |
Ping Zhang; Haiyang Yu; Pengao Li; Ruili Wang; | Remote. Sens. | 2023-11-12 |
1319 | NewsGPT: ChatGPT Integration for Robot-Reporter Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper a novel system is proposed that integrates AI’s generative pretrained transformer (GPT) model with the Pepper robot, with the aim of improving the robot’s natural language understanding and response generation capabilities for enhanced social interactions. |
Abdelhadi Hireche; Abdelkader Nasreddine Belkacem; Sadia Jamil; Chao Chen; | arxiv-cs.RO | 2023-11-11 |
1320 | Traffic Sign Recognition Using Local Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a new novel model that blends the advantages of both convolutional and transformer-based networks for traffic sign recognition. |
Ali Farzipour; Omid Nejati Manzari; Shahriar B. Shokouhi; | arxiv-cs.CV | 2023-11-11 |
1321 | Controllable Topic-Focused Abstractive Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new Transformer-based architecture capable of producing topic-focused summaries. |
Seyed Ali Bahrainian; Martin Jaggi; Carsten Eickhoff; | arxiv-cs.CL | 2023-11-11 |
1322 | ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose ChiMed-GPT, a new benchmark LLM designed explicitly for the Chinese medical domain, with context length enlarged to 4,096 tokens, which undergoes a comprehensive training regime of pre-training, SFT, and RLHF. |
Yuanhe Tian; Ruyi Gan; Yan Song; Jiaxing Zhang; Yongdong Zhang; | arxiv-cs.CL | 2023-11-10 |
1323 | Hiformer: Heterogeneous Feature Interactions Learning with Transformers for Recommender Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose to leverage a Transformer-based architecture with attention layers to automatically capture feature interactions. |
HUAN GUI et. al. | arxiv-cs.IR | 2023-11-10 |
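Entry 1323's core move, attention over per-field feature embeddings, can be sketched as below with hypothetical field and vocabulary sizes; the paper's heterogeneous attention layer is more elaborate than this homogeneous stand-in:

```python
import torch
import torch.nn as nn

class AttentionInteraction(nn.Module):
    def __init__(self, field_vocab_sizes, d_model=32, nhead=4):
        super().__init__()
        # one embedding table per categorical field
        self.embeds = nn.ModuleList(nn.Embedding(v, d_model) for v in field_vocab_sizes)
        self.attn = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.out = nn.Linear(d_model, 1)

    def forward(self, x):  # x: (B, n_fields) integer feature ids
        tokens = torch.stack([emb(x[:, i]) for i, emb in enumerate(self.embeds)], dim=1)
        z = self.attn(tokens)  # attention captures cross-field interactions
        return self.out(z.mean(dim=1)).squeeze(-1)  # CTR-style logit

model = AttentionInteraction([1000, 50, 7])  # e.g. user id, item category, weekday
print(torch.sigmoid(model(torch.tensor([[3, 10, 2], [42, 5, 6]]))))
```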
1324 | Heaps’ Law in GPT-Neo Large Language Model Emulated Corpora Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While this law has been validated in diverse human-authored text corpora, its applicability to large language model generated text remains unexplored. This study addresses this gap, focusing on the emulation of corpora using the suite of GPT-Neo large language models. |
Uyen Lai; Gurjit S. Randhawa; Paul Sheridan; | arxiv-cs.CL | 2023-11-10 |
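Heaps' law, the subject of entry 1324, states that vocabulary size grows as V(n) ≈ K·n^β in the number of tokens n. A self-contained sketch of the standard log-log fit, run here on a toy token stream rather than the study's GPT-Neo-emulated corpora:

```python
import numpy as np

def heaps_curve(tokens):
    """Vocabulary size after each of the first n tokens."""
    seen, curve = set(), []
    for t in tokens:
        seen.add(t)
        curve.append(len(seen))
    return np.arange(1, len(tokens) + 1), np.array(curve)

tokens = ("the cat sat on the mat and the dog sat on the log " * 200).split()
n, v = heaps_curve(tokens)
beta, log_k = np.polyfit(np.log(n), np.log(v), 1)  # fit log V = log K + beta log n
print(f"K ~= {np.exp(log_k):.2f}, beta ~= {beta:.2f}")  # toy corpus saturates, so beta is small
```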
1325 | Argumentation Element Annotation Modeling Using XLNet Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study demonstrates the effectiveness of XLNet, a transformer-based language model, for annotating argumentative elements in persuasive essays. |
Christopher Ormerod; Amy Burkhardt; Mackenzie Young; Sue Lottridge; | arxiv-cs.CL | 2023-11-10 |
1326 | Holistic Evaluation of GPT-4V for Biomedical Imaging Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, we present a large-scale evaluation probing GPT-4V’s capabilities and limitations for biomedical image analysis. GPT-4V represents a breakthrough in artificial … |
ZHENG LIU et. al. | ArXiv | 2023-11-10 |
1327 | Fine-tuning Pretrained Transformer Encoders for Sequence-to-sequence Learning Related Papers Related Patents Related Grants Related Venues Related Experts View |
HANGBO BAO et. al. | Int. J. Mach. Learn. Cybern. | 2023-11-10 |
1328 | Establishing Performance Baselines in Fine-Tuning, Retrieval-Augmented Generation and Soft-Prompting for Non-Specialist LLM Users Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we tested an unmodified version of GPT 3.5, a fine-tuned version, and the same unmodified model when given access to a vectorised RAG database, both in isolation and in combination with a basic, non-algorithmic soft prompt. |
JENNIFER DODGSON et. al. | arxiv-cs.IR | 2023-11-10 |
1329 | On The Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We explore the model’s abilities to understand and reason about driving scenes, make decisions, and ultimately act in the capacity of a driver. |
LICHENG WEN et. al. | arxiv-cs.CV | 2023-11-09 |
1330 | Accuracy of A Vision-Language Model on Challenging Medical Cases IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Background: General-purpose large language models that utilize both text and images have not been evaluated on a diverse array of challenging medical cases. |
Thomas Buckley; James A. Diao; Adam Rodman; Arjun K. Manrai; | arxiv-cs.CV | 2023-11-09 |
1331 | LogShield: A Transformer-based APT Detection System Leveraging Self-Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, existing state-of-the-art techniques that use system provenance graphs lack a data processing framework generalized across datasets for optimal performance. To mitigate this limitation and to explore the effectiveness of transformer-based language models, this paper proposes LogShield, a framework designed to detect APT attack patterns by leveraging the power of self-attention in transformers. |
Sihat Afnan; Mushtari Sadia; Shahrear Iqbal; Anindya Iqbal; | arxiv-cs.CR | 2023-11-09 |
1332 | Deep Natural Language Feature Learning for Interpretable Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a general method to break down a main complex task into a set of intermediary easier sub-tasks, which are formulated in natural language as binary questions related to the final target task. |
Felipe Urrutia; Cristian Buc; Valentin Barriere; | arxiv-cs.CL | 2023-11-09 |
1333 | Large Language Models and Prompt Engineering for Biomedical Query Focused Multi-Document Summarisation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper reports on the use of prompt engineering and GPT-3.5 for biomedical query-focused multi-document summarisation. |
Diego Mollá; | arxiv-cs.CL | 2023-11-09 |
1334 | Large GPT-like Models Are Bad Babies: A Closer Look at The Relationship Between Linguistic Competence and Psycholinguistic Measures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find a positive correlation between LM size and performance on all three challenge tasks, with different preferences for model width and depth in each of the tasks. |
Julius Steuer; Marius Mosbach; Dietrich Klakow; | arxiv-cs.CL | 2023-11-08 |
1335 | NLQxform: A Language Model-based Question to SPARQL Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: In recent years, scholarly data has grown dramatically in terms of both scale and complexity. It becomes increasingly challenging to retrieve information from scholarly knowledge … |
Ruijie Wang; Zhiruo Zhang; Luca Rossetto; Florian Ruosch; Abraham Bernstein; | ArXiv | 2023-11-08 |
1336 | Future Lens: Anticipating Subsequent Tokens from A Single Hidden State IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: More concretely, in this paper we ask: Given a hidden (internal) representation of a single token at position t in an input, can we reliably anticipate the tokens that will appear at positions ≥ t + 2? |
Koyena Pal; Jiuding Sun; Andrew Yuan; Byron C. Wallace; David Bau; | arxiv-cs.CL | 2023-11-08 |
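The question posed in entry 1336 can be made concrete with a probe. The sketch below is an assumed setup, not the paper's method: it fits a linear map from GPT-2's final hidden state at position t to the identity of the token at t+2, overfitting a single demo sentence.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

ids = tok("The quick brown fox jumps over the lazy dog", return_tensors="pt").input_ids
with torch.no_grad():
    hidden = lm(ids, output_hidden_states=True).hidden_states[-1][0]  # (L, 768)

X, y = hidden[:-2], ids[0, 2:]  # hidden state at t -> token id at t+2
probe = torch.nn.Linear(hidden.shape[-1], lm.config.vocab_size)
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
for _ in range(100):  # tiny demo fit
    loss = torch.nn.functional.cross_entropy(probe(X), y)
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())
```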
1337 | GeoFormer: Predicting Human Mobility Using Generative Pre-trained Transformer (GPT) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present the GeoFormer, a decoder-only transformer model adapted from the GPT architecture to forecast human mobility. |
Aivin V. Solatorio; | arxiv-cs.LG | 2023-11-08 |
1338 | Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We extend this research by analyzing and comparing circuits for similar sequence continuation tasks, which include increasing sequences of Arabic numerals, number words, and months. |
Michael Lan; Phillip Torr; Fazl Barez; | arxiv-cs.CL | 2023-11-07 |
1339 | Modelling Sentiment Analysis: LLMs and Data Augmentation Techniques Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper provides different approaches for a binary sentiment classification on a small training dataset. |
Guillem Senabre Prades; | arxiv-cs.CL | 2023-11-07 |
1340 | Evaluating Multiple Large Language Models in Pediatric Ophthalmology Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: DESIGN, SETTING, AND PARTICIPANTS This survey study assessed three LLMs, namely ChatGPT (GPT-3.5), GPT-4, and PaLM2, alongside three human cohorts: medical students, postgraduate students, and attending physicians, in their ability to answer questions related to pediatric ophthalmology. |
JASON HOLMES et. al. | arxiv-cs.CL | 2023-11-07 |
1341 | Neuro-GPT: Towards A Foundation Model for EEG Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To handle the scarcity and heterogeneity of electroencephalography (EEG) data for Brain-Computer Interface (BCI) tasks, and to harness the power of large publicly available data sets, we propose Neuro-GPT, a foundation model consisting of an EEG encoder and a GPT model. |
WENHUI CUI et. al. | arxiv-cs.LG | 2023-11-07 |
1342 | Exploring Recommendation Capabilities of GPT-4V(ision): A Preliminary Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our objective is to explore the potential of extending LMMs from vision and language tasks to recommendation tasks. |
PEILIN ZHOU et. al. | arxiv-cs.IR | 2023-11-07 |
1343 | DeepInception: Hypnotize Large Language Model to Be Jailbreaker IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, inspired by the Milgram experiment on the power of authority to incite harmful behavior, we disclose a lightweight method, termed DeepInception, which can hypnotize an LLM into being a jailbreaker. |
XUAN LI et. al. | arxiv-cs.LG | 2023-11-06 |
1344 | Nexus at ArAIEval Shared Task: Fine-Tuning Arabic Language Models for Propaganda and Disinformation Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The ArAIEval shared task aims to further research on these particular issues within the context of the Arabic language. In this paper, we discuss our participation in these shared tasks. |
Yunze Xiao; Firoj Alam; | arxiv-cs.CL | 2023-11-06 |
1345 | Towards A Transformer-Based Reverse Dictionary Model for Quality Estimation of Definitions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare different transformer-based models for solving the reverse dictionary task and explore their use in the context of a serious game called The Dictionary Game. |
Julien Guité-Vinet; Alexandre Blondin Massé; Fatiha Sadat; | arxiv-cs.CL | 2023-11-06 |
1346 | Unraveling Downstream Gender Bias from Large Language Models: A Study on AI Educational Writing Assistance Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate how bias transfers through an AI writing support pipeline. |
THIEMO WAMBSGANSS et. al. | arxiv-cs.CL | 2023-11-06 |
1347 | Holistic Analysis of Hallucination in GPT-4V(ision): Bias and Interference Challenges IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While GPT-4V(ision) impressively models both visual and textual information simultaneously, its hallucination behavior has not been systematically assessed. To bridge this gap, we introduce a new benchmark, namely, the Bias and Interference Challenges in Visual Language Models (Bingo). |
CHENHANG CUI et. al. | arxiv-cs.LG | 2023-11-06 |
1348 | A Simple Yet Efficient Ensemble Approach for AI-generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, it is essential to build automated approaches capable of distinguishing between artificially generated text and human-authored text. In this paper, we propose a simple yet efficient solution to this problem by ensembling predictions from multiple constituent LLMs. |
HARIKA ABBURI et. al. | arxiv-cs.CL | 2023-11-06 |
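The ensembling in entry 1348 reduces, in its simplest form, to averaging each constituent classifier's machine-generated probability. A minimal sketch with placeholder checkpoint names; the paper's constituent LLMs and combination scheme may differ:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CHECKPOINTS = ["detector-a", "detector-b"]  # placeholders for real detector models

def machine_prob(ckpt: str, text: str) -> float:
    tok = AutoTokenizer.from_pretrained(ckpt)
    clf = AutoModelForSequenceClassification.from_pretrained(ckpt).eval()
    with torch.no_grad():
        logits = clf(**tok(text, return_tensors="pt", truncation=True)).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()  # assumes class 1 = machine

def ensemble_prob(text: str) -> float:
    return sum(machine_prob(c, text) for c in CHECKPOINTS) / len(CHECKPOINTS)

# usage: ensemble_prob("a passage to score") -> probability in [0, 1]
```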
1349 | Exploring Grounding Potential of VQA-oriented GPT-4V for Zero-shot Anomaly Detection IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Multimodal Model (LMM) GPT-4V(ision) endows GPT-4 with visual grounding capabilities, making it possible to handle certain tasks through the Visual Question Answering (VQA) … |
JIANGNING ZHANG et. al. | ArXiv | 2023-11-05 |
1350 | Evaluating The Potential of Leading Large Language Models in Reasoning Biology Questions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent advances in Large Language Models (LLMs) have presented new opportunities for integrating Artificial General Intelligence (AGI) into biological research and education. This … |
XINYU GONG et. al. | ArXiv | 2023-11-05 |
1351 | Extraction of Atypical Aspects from Customer Reviews: Datasets and Experiments with Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Correspondingly, in this paper we introduce the task of detecting atypical aspects in customer reviews. |
Smita Nannaware; Erfan Al-Hossami; Razvan Bunescu; | arxiv-cs.CL | 2023-11-05 |
1352 | GPT-4V-AD: Exploring Grounding Potential of VQA-oriented GPT-4V for Zero-shot Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper explores the potential of VQA-oriented GPT-4V in the recently popular visual Anomaly Detection (AD) and is the first to conduct qualitative and quantitative evaluations on the popular MVTec AD and VisA datasets. Considering that this task requires both image-/pixel-level evaluations, the proposed GPT-4V-AD framework contains three components: 1) Granular Region Division, 2) Prompt Designing, and 3) Text2Segmentation for easy quantitative evaluation; we have also made several different attempts at comparative analysis. |
JIANGNING ZHANG et. al. | arxiv-cs.CV | 2023-11-05 |
1353 | Tailoring Self-Rationalizers with Multi-Reward Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we enable small-scale LMs (approx. 200x smaller than GPT-3) to generate rationales that not only improve downstream task performance, but are also more plausible, consistent, and diverse, assessed both by automatic and human evaluation. |
SAHANA RAMNATH et. al. | arxiv-cs.CL | 2023-11-05 |
1354 | Towards Generic Anomaly Detection and Understanding: Large-scale Visual-linguistic Model (GPT-4V) Takes The Lead IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study explores the use of GPT-4V(ision), a powerful visual-linguistic model, to address anomaly detection tasks in a generic manner. |
Yunkang Cao; Xiaohao Xu; Chen Sun; Xiaonan Huang; Weiming Shen; | arxiv-cs.CV | 2023-11-05 |
1355 | Rotation Invariant Transformer for Recognizing Object in UAVs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In particular, our solution won first place in the UAV-based person re-recognition track of the Multi-Modal Video Reasoning and Analyzing Competition held at ICCV 2021. |
Shuoyi Chen; Mang Ye; Bo Du; | arxiv-cs.CV | 2023-11-04 |
1356 | Ultra-Long Sequence Distributed Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel and efficient distributed training method, the Long Short-Sequence Transformer (LSS Transformer), for training transformers with long sequences. |
XIAO WANG et. al. | arxiv-cs.DC | 2023-11-04 |
1357 | Grounded Intuition of GPT-Vision’s Abilities with Scientific Images Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We use our technique to examine alt text generation for scientific figures, finding that GPT-Vision is particularly sensitive to prompting, counterfactual text in images, and relative spatial relationships. |
Alyssa Hwang; Andrew Head; Chris Callison-Burch; | arxiv-cs.CL | 2023-11-03 |
1358 | Simplifying Transformer Blocks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we ask: to what extent can the standard transformer block be simplified? |
Bobby He; Thomas Hofmann; | arxiv-cs.LG | 2023-11-03 |
1359 | An Empirical Study of Benchmarking Chinese Aspect Sentiment Quad Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To expand capacity, we construct two large Chinese ASQP datasets crawled from multiple online platforms. |
Junxian Zhou; Haiqin Yang; Ye Junpeng; Yuxuan He; Hao Mou; | arxiv-cs.CL | 2023-11-03 |
1360 | TCM-GPT: Efficient Pre-training of Large Language Models for Domain Adaptation in Traditional Chinese Medicine Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, their effectiveness in specialized domains, such as Traditional Chinese Medicine, requires comprehensive evaluation. To address the above issues, we propose TCMDA (TCM Domain Adaptation), a novel domain-specific approach based on efficient pre-training with a domain-specific corpus. |
Guoxing Yang; Jianyu Shi; Zan Wang; Xiaohong Liu; Guangyu Wang; | arxiv-cs.CL | 2023-11-03 |
1361 | Not All Layers Are Equally As Important: Every Layer Counts BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel modification of the transformer architecture, tailored for the data-efficient pretraining of language models. |
Lucas Georges Gabriel Charpentier; David Samuel; | arxiv-cs.CL | 2023-11-03 |
1362 | Inclusiveness Matters: A Large-Scale Analysis of User Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we leverage user feedback from three popular online sources, Reddit, Google Play Store, and Twitter, for 50 of the most popular apps in the world to reveal the inclusiveness-related concerns from end users. |
Nowshin Nawar Arony; Ze Shi Li; Bowen Xu; Daniela Damian; | arxiv-cs.SE | 2023-11-02 |
1363 | Measuring Five Accountable Talk Moves to Improve Instruction at Scale Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Providing consistent, individualized feedback to teachers on their instruction can improve student learning outcomes. Such feedback can especially benefit novice instructors who … |
Ashlee Kupor; Candice Morgan; Dorottya Demszky; | ArXiv | 2023-11-02 |
1364 | Multi-scale Feature Flow Alignment Fusion with Transformer for The Microscopic Images Segmentation of Activated Sludge Related Papers Related Patents Related Grants Related Venues Related Experts View |
LIJIE ZHAO et. al. | Signal Image Video Process. | 2023-11-02 |
1365 | GPT-4V(ision) As A Generalist Evaluator for Vision-Language Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We employ two evaluation methods, single-answer grading and pairwise comparison, using GPT-4V. |
XINLU ZHANG et. al. | arxiv-cs.CV | 2023-11-02 |
1366 | Copilot4D: Learning Unsupervised World Models for Autonomous Driving Via Discrete Diffusion IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, we propose Copilot4D, a novel world modeling approach that first tokenizes sensor observations with VQVAE, then predicts the future via discrete diffusion. |
LUNJUN ZHANG et. al. | arxiv-cs.CV | 2023-11-02 |
1367 | Efficient Vision Transformer for Accurate Traffic Sign Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this task is complicated by suboptimal traffic images affected by factors such as camera movement, adverse weather conditions, and inadequate lighting. This study specifically focuses on traffic sign detection methods and introduces the application of the Transformer model, particularly the Vision Transformer variants, to tackle this task. |
Javad Mirzapour Kaleybar; Hooman Khaloo; Avaz Naghipour; | arxiv-cs.CV | 2023-11-02 |
1368 | MGen: A Framework for Energy-Efficient In-ReRAM Acceleration of Multi-Task BERT Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recently, multiple transformer models, such as BERT, have been utilized together to support multiple natural language processing (NLP) tasks in a system, also known as multi-task … |
Myeonggu Kang; Hyein Shin; Junkyum Kim; L. Kim; | IEEE Transactions on Computers | 2023-11-01 |
1369 | Vibration-Signal-Based Deep Noisy Filtering Model for Online Transformer Diagnosis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Machine learning methods are effective for the diagnosis of power transformer faults. However, influenced by uncertainty and noise in data, machine-learning-based diagnostic … |
ZHIKAI XING et. al. | IEEE Transactions on Industrial Informatics | 2023-11-01 |
1370 | VPCFormer: A Transformer-based Multi-view Finger Vein Recognition Model and A New Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View |
PENGYANG ZHAO et. al. | Pattern Recognit. | 2023-11-01 |
1371 | An Adaptive N-gram Transformer for Multi-scale Scene Text Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xueming Yan; Zhihang Fang; Yaochu Jin; | Knowl. Based Syst. | 2023-11-01 |
1372 | Enhancing Social Network Hate Detection Using Back Translation and GPT-3 Augmentations During Training and Test-time Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View |
SEFFI COHEN et. al. | Inf. Fusion | 2023-11-01 |
1373 | Offline Handwritten Mathematical Expression Recognition with Graph Encoder and Transformer Decoder Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jianglong Tang; Hong-Yu Guo; Jin-Wen Wu; Fei Yin; Lin-Lin Huang; | Pattern Recognit. | 2023-11-01 |
1374 | A Vision Transformer-based Automated Human Identification Using Ear Biometrics Related Papers Related Patents Related Grants Related Venues Related Experts View |
Ravishankar Mehta; Sindhuja Shukla; Jitesh Pradhan; K. K. Singh; Abhinav Kumar; | J. Inf. Secur. Appl. | 2023-11-01 |
1375 | A Feature Selection and Ensemble Learning Based Methodology for Transformer Fault Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts View |
Shaowei Rao; Guoping Zou; Shiyou Yang; S. Barmada; | Appl. Soft Comput. | 2023-11-01 |
1376 | DPENet: Dual-path Extraction Network Based on CNN and Transformer for Accurate Building and Road Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View |
ZI-XING CHEN et. al. | Int. J. Appl. Earth Obs. Geoinformation | 2023-11-01 |
1377 | FTransCNN: Fusing Transformer and A CNN Based on Fuzzy Logic for Uncertain Medical Image Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
WEIPING DING et. al. | Inf. Fusion | 2023-11-01 |
1378 | Dyformer: A Dynamic Transformer-based Architecture for Multivariate Time Series Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Chao Yang; Xianzhi Wang; L. Yao; Guodong Long; Guandong Xu; | Inf. Sci. | 2023-11-01 |
1379 | Optimal Model Partitioning with Low-Overhead Profiling on The PIM-based Platform for Deep Learning Inference Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recently, Processing-in-Memory (PIM) has become a promising solution to achieve energy-efficient computation in data-intensive applications by placing computation near or inside … |
S. KIM et. al. | ACM Transactions on Design Automation of Electronic Systems | 2023-11-01 |
1380 | Fine-tuning GPT-3 for Legal Rule Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Davide Liga; Livio Robaldo; | Comput. Law Secur. Rev. | 2023-11-01 |
1381 | XAI Transformer Based Approach for Interpreting Depressed and Suicidal User Behavior on Online Social Networks Related Papers Related Patents Related Grants Related Venues Related Experts View |
Anshu Malhotra; Rajni Jindal; | Cognitive Systems Research | 2023-11-01 |
1382 | KD-Former: Kinematic and Dynamic Coupled Transformer Network for 3D Human Motion Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View |
JU DAI et. al. | Pattern Recognit. | 2023-11-01 |
1383 | Causal-ViT: Robust Vision Transformer By Causal Intervention Related Papers Related Patents Related Grants Related Venues Related Experts View |
Wei Li; Zhixin Li; Xiwei Yang; Huifang Ma; | Eng. Appl. Artif. Intell. | 2023-11-01 |
1384 | External Knowledge-assisted Transformer for Image Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhixin Li; Qiang Su; Tianyu Chen; | Image Vis. Comput. | 2023-11-01 |
1385 | ADCT-Net: Adaptive Traffic Forecasting Neural Network Via Dual-graphic Cross-fused Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
JIANLEI KONG et. al. | Inf. Fusion | 2023-11-01 |
1386 | Power Transformer Fault Diagnosis Based on A Self-strengthening Offline Pre-training Model Related Papers Related Patents Related Grants Related Venues Related Experts View |
MINGWEI ZHONG et. al. | Eng. Appl. Artif. Intell. | 2023-11-01 |
1387 | Robust Facial Expression Recognition with Transformer Block Enhancement Module Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yuanlun Xie; Wenhong Tian; Zitong Yu; | Eng. Appl. Artif. Intell. | 2023-11-01 |
1388 | Advances in Embodied Navigation Using Large Language Models: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: A comprehensive list of studies in this survey is available at https://github.com/Rongtao-Xu/Awesome-LLM-EN. |
JINZHOU LIN et. al. | arxiv-cs.AI | 2023-11-01 |
1389 | Reciprocal Transformer for Hyperspectral and Multispectral Image Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View |
Qing Ma; Junjun Jiang; Xianming Liu; Jiayi Ma; | Inf. Fusion | 2023-11-01 |
1390 | Window-based Transformer Generative Adversarial Network for Autonomous Underwater Image Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View |
MEHNAZ UMMAR et. al. | Eng. Appl. Artif. Intell. | 2023-11-01 |
1391 | Attention Alignment and Flexible Positional Embeddings Improve Transformer Length Extrapolation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To alleviate the issue, we propose two attention alignment strategies via temperature scaling. |
Ta-Chung Chi; Ting-Han Fan; Alexander I. Rudnicky; | arxiv-cs.CL | 2023-11-01 |
1392 | Are Large Language Models Reliable Judges? A Study on The Factuality Evaluation Capabilities of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we delve into the potential of LLMs as reliable assessors of factual consistency in summaries generated by text-generation models. |
Xue-Yong Fu; Md Tahmid Rahman Laskar; Cheng Chen; Shashi Bhushan TN; | arxiv-cs.CL | 2023-11-01 |
1393 | A Large Scale Digital Elevation Model Super-resolution Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
ZHUOXIAO LI et. al. | Int. J. Appl. Earth Obs. Geoinformation | 2023-11-01 |
1394 | Diagnosis of Photovoltaic Faults Using Digital Twin and PSO-optimized Shifted Window Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Ying-Yi Hong; Rolando A. Pula; | Appl. Soft Comput. | 2023-11-01 |
1395 | Breaking The Token Barrier: Chunking and Convolution for Efficient Long Text Classification with BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a relatively simple extension to the vanilla BERT architecture, called ChunkBERT, that allows finetuning of any pretrained model to perform inference on arbitrarily long text. |
Aman Jaiswal; Evangelos Milios; | arxiv-cs.CL | 2023-10-31 |
1396 | Increasing The Performance of Cognitively Inspired Data-Efficient Language Models Via Implicit Structure Building Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we describe our submission to the BabyLM Challenge 2023 shared task on data-efficient language model (LM) pretraining (Warstadt et al., 2023). |
Omar Momen; David Arps; Laura Kallmeyer; | arxiv-cs.CL | 2023-10-31 |
1397 | A Systematic Review for Transformer-based Long-term Series Forecasting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The emergence of deep learning has yielded noteworthy advancements in time series forecasting (TSF). Transformer architectures, in particular, have witnessed broad utilization and … |
LIYILEI SU et. al. | arxiv-cs.LG | 2023-10-31 |
1398 | Is GPT Powerful Enough to Analyze The Emotions of Memes? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This project aims to explore the capabilities of GPT-3.5, a leading example of LLMs, in performing sentiment analysis of Internet memes. |
Jingjing Wang; Joshua Luo; Grace Yang; Allen Hong; Feng Luo; | arxiv-cs.CL | 2023-10-31 |
1399 | Does GPT-4 Pass The Turing Test? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: AI models with the ability to masquerade as humans could have widespread societal consequences, and we analyse the effectiveness of different strategies and criteria for judging humanlikeness. |
Cameron R. Jones; Benjamin K. Bergen; | arxiv-cs.AI | 2023-10-31 |
1400 | A Systematic Evaluation of GPT-4V’s Multimodal Capability for Medical Image Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work conducts an evaluation of GPT-4V’s multimodal capability for medical image analysis, with a focus on three representative tasks of radiology report generation, medical visual question answering, and medical visual grounding. |
YINGSHU LI et. al. | arxiv-cs.CV | 2023-10-31 |
1401 | Do Large Language Models Solve Verbal Analogies Like Children Do? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates whether large language models (LLMs) solve verbal analogies in A:B::C:? |
Claire E. Stevenson; Mathilde ter Veen; Rochelle Choenni; Han L. J. van der Maas; Ekaterina Shutova; | arxiv-cs.CL | 2023-10-31 |
1402 | Syntactic Inductive Bias in Transformer Language Models: Especially Helpful for Low-Resource Languages? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: But such methods are often tested for high-resource languages such as English. In this work, we investigate whether these methods can compensate for data sparseness in low-resource languages, hypothesizing that they ought to be more effective for low-resource languages. |
Luke Gessler; Nathan Schneider; | arxiv-cs.CL | 2023-10-31 |
1403 | TS-Fastformer: Fast Transformer for Time-series Forecasting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Many real-world applications require precise and fast time-series forecasting. Recent trends in time-series forecasting models are shifting from LSTM-based models to … |
Sangwon Lee; Junho Hong; Ling Liu; Wonik Choi; | ACM Transactions on Intelligent Systems and Technology | 2023-10-30 |
1404 | Efficient Classification of Student Help Requests in Programming Courses Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The accurate classification of student help requests with respect to the type of help being sought can enable the tailoring of effective responses. Automatically classifying such requests is non-trivial, but large language models (LLMs) appear to offer an accessible, cost-effective solution. |
Jaromir Savelka; Paul Denny; Mark Liffiton; Brad Sheese; | arxiv-cs.CY | 2023-10-30 |
1405 | Partial Tensorized Transformers for Natural Language Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the effect of tensor-train decomposition on improving the accuracy and compressing transformer vision-language neural networks, namely BERT and ViT. |
Subhadra Vadlamannati; Ryan Solgi; | arxiv-cs.CL | 2023-10-30 |
1406 | MM-VID: Advancing Video Understanding with GPT-4V(ision) IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We present MM-VID, an integrated system that harnesses the capabilities of GPT-4V, combined with specialized tools in vision, audio, and speech, to facilitate advanced video … |
KEVIN LIN et. al. | ArXiv | 2023-10-30 |
1407 | Extracting User Needs with Chat-GPT for Dialogue Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In most conventional interactive recommendation systems, the language model is used only as a dialogue model, and there is a separate recommendation system. |
Yugen Sato; Taisei Nakajima; Tatsuki Kawamoto; Tomohiro Takagi; | arxiv-cs.CY | 2023-10-30 |
1408 | Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a new pipeline using ChatGPT instead of human experts to generate high-quality feedback data for improving factual consistency in the clinical note summarization task. |
PRAKAMYA MISHRA et. al. | arxiv-cs.CL | 2023-10-30 |
1409 | MIST: Medical Image Segmentation Transformer with Convolutional Attention Mixing (CAM) Decoder Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite being successful in medical image segmentation, transformers face limitations in capturing local contexts of pixels in multimodal dimensions. We propose a Medical Image Segmentation Transformer (MIST) incorporating a novel Convolutional Attention Mixing (CAM) decoder to address this issue. |
Md Motiur Rahman; Shiva Shokouhmand; Smriti Bhatt; Miad Faezipour; | arxiv-cs.CV | 2023-10-30 |
1410 | Program Synthesis with Generative Pre-trained Transformers and Grammar-Guided Genetic Programming Grammar Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Grammar-Guided Genetic Programming (G3P) is widely recognised as one of the most successful approaches to program synthesis. Using a set of input/output tests, G3P evolves … |
Ning Tao; Anthony Ventresque; Takfarinas Saber; | 2023 IEEE Latin American Conference on Computational … | 2023-10-29 |
1411 | Multimodal ChatGPT for Medical Applications: An Experimental Study of GPT-4V IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we critically evaluate the capabilities of the state-of-the-art multimodal large language model, i.e., GPT-4 with Vision (GPT-4V), on the Visual Question Answering (VQA) task. |
ZHILING YAN et. al. | arxiv-cs.CV | 2023-10-29 |
1412 | Bipartite Graph Pre-training for Unsupervised Extractive Summarization with Graph Convolutional Auto-Encoders Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To do so, we propose a novel graph pre-training auto-encoder to obtain sentence embeddings by explicitly modelling intra-sentential distinctive features and inter-sentential cohesive features through sentence-word bipartite graphs. |
QIANREN MAO et. al. | arxiv-cs.CL | 2023-10-29 |
1413 | From Chatbots to PhishBots? – Preventing Phishing Scams Created Using ChatGPT, Google Bard and Claude Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The advanced capabilities of Large Language Models (LLMs) have made them invaluable across various applications, from conversational agents and content creation to data analysis, … |
S. Roy; Poojitha Thota; Krishna Vamsi Naragam; Shirin Nilizadeh; | ArXiv | 2023-10-29 |
1414 | From Chatbots to PhishBots? — Preventing Phishing Scams Created Using ChatGPT, Google Bard and Claude Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the potential of using four popular commercially available LLMs, i.e., ChatGPT (GPT 3.5 Turbo), GPT 4, Claude, and Bard, to generate functional phishing attacks using a series of malicious prompts. |
Sayak Saha Roy; Poojitha Thota; Krishna Vamsi Naragam; Shirin Nilizadeh; | arxiv-cs.CR | 2023-10-29 |
1415 | Prompt-Engineering and Transformer-based Question Generation and Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we finetuned a pretrained distilBERT model on the SQuAD question answering dataset to generate questions. |
Rubaba Amyeen; | arxiv-cs.CL | 2023-10-28 |
1416 | Data Ambiguity Strikes Back: How Documentation Improves GPT’s Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We have identified prevalent data ambiguities of value consistency, data coverage, and data granularity that affect tasks. |
Zezhou Huang; Pavan Kalyan Damalapati; Eugene Wu; | arxiv-cs.DB | 2023-10-28 |
1417 | CViT: A Convolution Vision Transformer for Video Abnormal Behavior Detection and Localization Related Papers Related Patents Related Grants Related Venues Related Experts View |
Sanjay Roka; M. Diwakar; | SN Computer Science | 2023-10-28 |
1418 | OpinSummEval: Revisiting Automated Evaluation for Opinion Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present OpinSummEval, a dataset comprising human judgments and outputs from 14 opinion summarization models. |
Yuchen Shen; Xiaojun Wan; | arxiv-cs.CL | 2023-10-27 |
1419 | OffMix-3L: A Novel Code-Mixed Dataset in Bangla-English-Hindi for Offensive Language Identification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce OffMix-3L, a novel offensive language identification dataset containing code-mixed data from three different languages. |
Dhiman Goswami; Md Nishat Raihan; Antara Mahmud; Antonios Anastasopoulos; Marcos Zampieri; | arxiv-cs.CL | 2023-10-27 |
1420 | OffMix-3L: A Novel Code-Mixed Test Dataset in Bangla-English-Hindi for Offensive Language Identification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Code-mixing is a well-studied linguistic phenomenon in which two or more languages are mixed in text or speech. Several works have been conducted on building datasets and performing … |
Dhiman Goswami; Md. Nishat Raihan; Antara Mahmud; Antonios Anastasopoulos; Marcos Zampieri; | ArXiv | 2023-10-27 |
1421 | FP8-LM: Training FP8 Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore FP8 low-bit data formats for efficient training of large language models (LLMs). |
HOUWEN PENG et. al. | arxiv-cs.LG | 2023-10-27 |
1422 | SentMix-3L: A Novel Code-Mixed Test Dataset in Bangla-English-Hindi for Sentiment Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Code-mixing is a well-studied linguistic phenomenon in which two or more languages are mixed in text or speech. Several datasets have been built with the goal of training … |
Md. Nishat Raihan; Dhiman Goswami; Antara Mahmud; Antonios Anastasopoulos; Marcos Zampieri; | ArXiv | 2023-10-27 |
1423 | GPT-4 Vision on Medical Image Classification — A Case Study on COVID-19 Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This technical report delves into the application of GPT-4 Vision (GPT-4V) in the nuanced realm of COVID-19 image classification, leveraging the transformative potential of … |
RUIBO CHEN et. al. | arxiv-cs.CV | 2023-10-27 |
1424 | GPT-4 Vision on Medical Image Classification – A Case Study on COVID-19 Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This technical report delves into the application of GPT-4 Vision (GPT-4V) in the nuanced realm of COVID-19 image classification, leveraging the transformative potential of … |
RUIBO CHEN et. al. | ArXiv | 2023-10-27 |
1425 | MultiScale Spectral-Spatial Convolutional Transformer for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a multiscale spectral-spatial convolutional Transformer (MultiscaleFormer) for hyperspectral image classification. |
Zhiqiang Gong; Xian Zhou; Wen Yao; | arxiv-cs.CV | 2023-10-27 |
1426 | SentMix-3L: A Bangla-English-Hindi Code-Mixed Dataset for Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce SentMix-3L, a novel dataset for sentiment analysis containing code-mixed data across three languages: Bangla, English, and Hindi. |
Md Nishat Raihan; Dhiman Goswami; Antara Mahmud; Antonios Anastasopoulos; Marcos Zampieri; | arxiv-cs.CL | 2023-10-27 |
1427 | Large Language Models for Aspect-based Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large language models (LLMs) offer unprecedented text completion capabilities. |
Paul F. Simmering; Paavo Huoviala; | arxiv-cs.CL | 2023-10-27 |
1428 | Harnessing GPT-3.5-turbo for Rhetorical Role Prediction in Legal Cases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a comprehensive study of one-stage elicitation techniques for querying a large pre-trained generative transformer (GPT-3.5-turbo) in the rhetorical role prediction task of legal cases. |
Anas Belfathi; Nicolas Hernandez; Laura Monceaux; | arxiv-cs.CL | 2023-10-26 |
1429 | ZeroQuant-HERO: Hardware-Enhanced Robust Optimized Post-Training Quantization Framework for W8A8 Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing solutions, such as ZeroQuant, offer dynamic quantization for models like BERT and GPT but overlook crucial memory-bounded operators and the complexities of per-token quantization. Addressing these gaps, we present a novel, fully hardware-enhanced robust optimized post-training W8A8 quantization framework, ZeroQuant-HERO. |
ZHEWEI YAO et. al. | arxiv-cs.LG | 2023-10-26 |
1430 | An Ensemble Method Based on The Combination of Transformers with Convolutional Neural Networks to Detect Artificially Generated Text Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Thanks to the state-of-the-art Large Language Models (LLMs), language generation has reached outstanding levels. These models are capable of generating high quality content, thus … |
Vijini Liyanage; Davide Buscaldi; | arxiv-cs.CL | 2023-10-26 |
1431 | DecoderTracker: Decoder-Only Method for Multiple-Object Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Decoder-only models, such as GPT, have demonstrated superior performance in many areas compared to traditional encoder-decoder transformer models. |
Liao Pan; Yang Feng; Wu Di; Liu Bo; Zhang Xingle; | arxiv-cs.CV | 2023-10-26 |
1432 | LightLM: A Lightweight Deep and Narrow Language Model for Generative Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents LightLM, a lightweight Transformer-based language model for generative recommendation. |
Kai Mei; Yongfeng Zhang; | arxiv-cs.IR | 2023-10-26 |
1433 | Can Large Language Models Replace Humans in The Systematic Review Process? Evaluating GPT-4’s Efficacy in Screening and Extracting Data from Peer-reviewed and Grey Literature in Multiple Languages IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Systematic reviews are vital for guiding practice, research, and policy, yet they are often slow and labour-intensive. Large language models (LLMs) could offer a way to speed up and automate systematic reviews, but their performance in such tasks has not been comprehensively evaluated against humans, and no study has tested GPT-4, the biggest LLM so far. |
Qusai Khraisha; Sophie Put; Johanna Kappenberg; Azza Warraitch; Kristin Hadfield; | arxiv-cs.CL | 2023-10-26 |
1434 | BOOST: Harnessing Black-Box Control to Boost Commonsense in LMs’ Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a computation-efficient framework that steers a frozen Pre-Trained Language Model (PTLM) towards more commonsensical generation (i.e., producing a plausible output that incorporates a list of concepts in a meaningful way). |
Yufei Tian; Felix Zhang; Nanyun Peng; | arxiv-cs.CL | 2023-10-25 |
1435 | Improving Clothing Product Quality and Reducing Waste Based on Consumer Review Using RoBERTa and BERTopic Language Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The disposability of clothing has emerged as a critical concern, precipitating waste accumulation due to product quality degradation. Such consequences exert significant pressure … |
A. Alamsyah; Nadhif Ditertian Girawan; | Big Data Cogn. Comput. | 2023-10-25 |
1436 | Divide Et Impera: Multi-Transformer Architectures for Complex NLP-Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an approach in which complex tasks are divided into simpler subtasks. |
Solveig Helland; Elena Gavagnin; Alexandre de Spindler; | arxiv-cs.CL | 2023-10-25 |
1437 | Exploring OCR Capabilities of GPT-4V(ision) : A Quantitative and In-depth Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a comprehensive evaluation of the Optical Character Recognition (OCR) capabilities of the recently released GPT-4V(ision), a Large Multimodal Model (LMM). |
YONGXIN SHI et. al. | arxiv-cs.CV | 2023-10-25 |
1438 | CLEX: Continuous Length Extrapolation for Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Length extrapolation methods, although theoretically capable of extending the context window beyond the training sequence length, often underperform in practical long-context applications. To address these challenges, we propose Continuous Length EXtrapolation (CLEX) for LLMs. |
Guanzheng Chen; Xin Li; Zaiqiao Meng; Shangsong Liang; Lidong Bing; | arxiv-cs.CL | 2023-10-25 |
1439 | Can GPT Models Follow Human Summarization Guidelines? Evaluating ChatGPT and GPT-4 for Dialogue Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the capabilities of prompt-driven Large Language Models (LLMs) like ChatGPT and GPT-4 in adhering to human guidelines for dialogue summarization. |
Yongxin Zhou; Fabien Ringeval; François Portet; | arxiv-cs.CL | 2023-10-25 |
1440 | Decoding Stumpers: Large Language Models Vs. Human Problem-Solvers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the problem-solving capabilities of Large Language Models (LLMs) by evaluating their performance on stumpers, unique single-step intuition problems that pose challenges for human solvers but are easily verifiable. |
Alon Goldstein; Miriam Havin; Roi Reichart; Ariel Goldstein; | arxiv-cs.CL | 2023-10-25 |
1441 | Data Augmentation for Emotion Detection in Small Imbalanced Text Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Certain existing datasets are small, follow different emotion taxonomies and display imbalance in their emotion distribution. In this work, we studied the impact of data augmentation techniques precisely when applied to small imbalanced datasets, for which current state-of-the-art models (such as RoBERTa) under-perform. |
Anna Koufakou; Diego Grisales; Ragy Costa de jesus; Oscar Fox; | arxiv-cs.CL | 2023-10-25 |
1442 | How Well Can Machine-generated Texts Be Identified and Can Language Models Be Trained to Avoid Identification? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We refined five separate language models to generate synthetic tweets, uncovering that shallow learning classification algorithms, like Naive Bayes, achieve detection accuracy between 0.6 and 0.8. |
Sinclair Schneider; Florian Steuber; Joao A. G. Schneider; Gabi Dreo Rodosek; | arxiv-cs.CL | 2023-10-25 |
1443 | Multiscale Convolutional Neural-based Transformer Network for Time Series Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhixing Wang; Yepeng Guan; | Signal Image Video Process. | 2023-10-25 |
1444 | Enhancing Document Information Analysis with Multi-Task Pre-training: A Robust Approach for Information Extraction in Visually-Rich Documents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a deep learning model tailored for document information analysis, emphasizing document classification, entity relation extraction, and document visual question answering. |
Tofik Ali; Partha Pratim Roy; | arxiv-cs.CV | 2023-10-25 |
1445 | An Early Evaluation of GPT-4V(ision) IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we evaluate different abilities of GPT-4V including visual understanding, language understanding, visual puzzle solving, and understanding of other modalities such as depth, thermal, video, and audio. |
YANG WU et. al. | arxiv-cs.CL | 2023-10-25 |
1446 | Fast Attention Requires Bounded Entries IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate whether faster algorithms are possible by implicitly making use of the matrix $A$. |
Josh Alman; Zhao Song; | nips | 2023-10-24 |
1447 | From Parameter-Efficient to Memory-Efficient Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we first investigate what makes existing PEFT methods successful, and find that it is essential to preserve the PLM’s starting point when initializing a PEFT method. Building on this finding, we propose memory-efficient fine-tuning (MEFT), which inserts adapters into a PLM, preserving the PLM’s starting point and making it reversible without additional pre-training. |
Baohao Liao; Shaomu Tan; Christof Monz; | nips | 2023-10-24 |
1448 | Dissecting In-Context Learning of Translations in GPTs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we try to better understand the role of demonstration attributes for the in-context learning of translations through perturbations of high-quality, in-domain demonstrations. |
Vikas Raunak; Hany Hassan Awadalla; Arul Menezes; | arxiv-cs.CL | 2023-10-24 |
1449 | An LLM-based Framework for Fingerprinting Internet-connected Devices Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, we propose the use of large language models (LLMs) for characterizing, clustering, and fingerprinting raw text obtained from network measurements. To this end, we … |
Armin Sarabi; Tongxin Yin; Mingyan Liu; | Proceedings of the 2023 ACM on Internet Measurement … | 2023-10-24 |
1450 | Towards Efficient Pre-Trained Language Model Via Feature Correlation Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, our analysis indicates that the different relations within self-attention, as adopted in other works, involve more computational complexity and can easily be constrained by the number of heads, potentially leading to suboptimal solutions. To address these issues, we propose a novel approach that builds relationships directly from output features. |
Kun Huang; Xin Guo; Meng Wang; | nips | 2023-10-24 |
1451 | ZipLM: Inference-Aware Structured Pruning of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The breakthrough performance of large language models (LLMs) comes with major computational footprints and high deployment costs. In this paper, we progress towards resolving this problem by proposing a novel structured compression approach for LLMs, called ZipLM. |
Eldar Kurtić; Elias Frantar; Dan Alistarh; | nips | 2023-10-24 |
1452 | TART: A Plug-and-play Transformer Module for Task-agnostic Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This raises an intriguing question: Are LLMs actually capable of learning how to reason in a task-agnostic manner? We answer this in the affirmative and, as a proof of concept, propose TART, which generically improves an LLM’s reasoning abilities using a synthetically trained reasoning module. |
Kush Bhatia; Avanika Narayan; Christopher De Sa; Christopher Ré; | nips | 2023-10-24 |
1453 | LoTR: Logic-Guided Transformer Reasoner for Human-Object Interaction Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present LoTR: Logic-Guided Transformer Reasoner, a novel approach for HOI detection that leverages Transformer as the reasoner to infer feasible interactions between entities. |
Liulei Li; Jianan Wei; Wenguan Wang; Yi Yang; | nips | 2023-10-24 |
1454 | Global-correlated 3D-decoupling Transformer for Clothed Avatar Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Current methods exhibit limitations in performance, largely attributable to their dependence on insufficient 2D image features and inconsistent query methods. To address this, we present the Global-correlated 3D-decoupling Transformer for clothed Avatar reconstruction (GTA), a novel transformer-based architecture that reconstructs clothed human avatars from monocular images. |
Zechuan Zhang; Li Sun; Zongxin Yang; Ling Chen; Yi Yang; | nips | 2023-10-24 |
1455 | Event Stream GPT: A Data Pre-processing and Modeling Library for Generative, Pre-trained Transformers Over Continuous-time Sequences of Complex Events Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite their potential, the adoption of foundation models in these domains has been hampered by the lack of suitable tools for model construction and evaluation. To bridge this gap, we introduce Event Stream GPT (ESGPT), an open-source library designed to streamline the end-to-end process for building GPTs for continuous-time event sequences. |
Matthew McDermott; Bret Nestor; Peniel Argaw; Isaac S Kohane; | nips | 2023-10-24 |
1456 | Visual Instruction Tuning IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first attempt to use language-only GPT-4 to generate multimodal language-image instruction-following data. By instruction tuning on such generated data, we introduce LLaVA: Large Language and Vision Assistant, an end-to-end trained large multimodal model that connects a vision encoder and an LLM for general-purpose visual and language understanding. |
Haotian Liu; Chunyuan Li; Qingyang Wu; Yong Jae Lee; | nips | 2023-10-24 |
1457 | NAR-Former V2: Rethinking Transformer for Universal Neural Network Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, graph neural network (GNN) based approaches still dominate the field of learning representation for the entire network. In this paper, we revisit the Transformer and compare it with GNNs to analyse their different architectural characteristics. |
Yun Yi; Haokui Zhang; Rong Xiao; Nannan Wang; Xiaoyu Wang; | nips | 2023-10-24 |
1458 | Unlimiformer: Long-Range Transformers with Unlimited Length Input IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Unlimiformer: a general approach that wraps any existing pretrained encoder-decoder transformer, and offloads the cross-attention computation to a single k-nearest-neighbor (kNN) index, while the returned kNN distances are the attention dot-product scores. |
Amanda Bertsch; Uri Alon; Graham Neubig; Matthew Gormley; | nips | 2023-10-24 |
1459 | DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this work proposes a comprehensive trustworthiness evaluation for large language models with a focus on GPT-4 and GPT-3.5, considering diverse perspectives – including toxicity, stereotype bias, adversarial robustness, out-of-distribution robustness, robustness on adversarial demonstrations, privacy, machine ethics, and fairness. |
BOXIN WANG et. al. | nips | 2023-10-24 |
1460 | A Language Model with Limited Memory Capacity Captures Interference in Human Sentence Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the present work, we develop a recurrent neural language model with a single self-attention head, which more closely parallels the memory system assumed by cognitive theories. |
William Timkey; Tal Linzen; | arxiv-cs.CL | 2023-10-24 |
1461 | De Novo Drug Design Using Reinforcement Learning with Multiple GPT Agents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Although technologies such as transformer models and reinforcement learning have been applied in drug design, their potential has not been fully realized. Therefore, we propose MolRL-MGPT, a reinforcement learning algorithm with multiple GPT agents for drug molecular generation. |
Xiuyuan Hu; Hao Zhang; Yang Zhao; | nips | 2023-10-24 |
1462 | Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we explore Monarch Mixer (M2), a new architecture that uses the same sub-quadratic primitive along both sequence length and model dimension. |
DAN FU et. al. | nips | 2023-10-24 |
1463 | Evaluating Cognitive Maps in Large Language Models: No Emergent Planning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we make two major contributions. First, we propose CogEval, a Cognitive Science-Inspired protocol for Measurement and Evaluation for Large Language Models. Second, we use CogEval to systematically evaluate hypothesized latent abilities, cognitive maps and planning, across a number of LLMs (OpenAI GPT-4, GPT-3.5, and davinci-003, Anthropic Claude-1, Alpaca-7B, LLaMA-7B, and Bard) using tasks with established construct validity and absent from LLM training sets. |
IDA MOMENNEJAD et. al. | nips | 2023-10-24 |
1464 | On Efficient Training Algorithms For Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we revisit three algorithms: layer stacking, layer dropping, and selective backpropagation. |
Jean Kaddour; Oscar Key; Piotr Nawrot; Pasquale Minervini; Matt Kusner; | nips | 2023-10-24 |
1465 | Scaling Laws for Language Encoding Models in FMRI IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here we tested whether larger open-source models such as those from the OPT and LLaMA families are better at predicting brain responses recorded using fMRI. |
Richard Antonello; Aditya Vaidya; Alexander Huth; | nips | 2023-10-24 |
1466 | The Cambridge Law Corpus: A Corpus for Legal AI Research Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the Cambridge Law Corpus (CLC), a corpus for legal AI research. |
ANDREAS ÖSTLING et. al. | nips | 2023-10-24 |
1467 | Using GPT-4 to Augment Unbalanced Data for Automatic Scoring IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Machine learning-based automatic scoring can be challenging if students’ responses are unbalanced across scoring categories, as it introduces uncertainty in the machine training process. To meet this challenge, we introduce a novel text data augmentation framework using GPT-4, a generative large language model, specifically tailored for unbalanced datasets in automatic scoring. |
Luyang Fang; Gyeong-Geon Lee; Xiaoming Zhai; | arxiv-cs.CL | 2023-10-24 |
1468 | Is ChatGPT A Good Multi-Party Conversation Solver? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we delve into the potential of generative LLMs such as ChatGPT and GPT-4 within the context of MPCs. |
Chao-Hong Tan; Jia-Chen Gu; Zhen-Hua Ling; | arxiv-cs.CL | 2023-10-24 |
1469 | Self-Refine: Iterative Refinement with Self-Feedback IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement. |
AMAN MADAAN et. al. | nips | 2023-10-24 |
1470 | CAPP-130: A Dataset of Chinese Application Privacy Policy Summarization and Interpretations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, research on Chinese application privacy policy summarization is currently almost nonexistent, and there is a lack of a high-quality corpus suitable for addressing readability issues. To tackle these challenges, we introduce a fine-grained CAPP-130 corpus and a TCSI-pp framework. |
JINFEI LIU et. al. | nips | 2023-10-24 |
1471 | AI-enhanced Auto-correction of Programming Exercises: How Effective Is GPT-3.5? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Timely formative feedback is considered one of the most important drivers of effective learning. Delivering timely and individualized feedback is particularly challenging in … |
Imen Azaiz; Oliver Deckarm; Sven Strickroth; | Int. J. Eng. Pedagog. | 2023-10-24 |
1472 | Block-State Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a hybrid layer named Block-State Transformer (*BST*), that internally combines an SSM sublayer for long-range contextualization, and a Block Transformer sublayer for short-term representation of sequences. |
JONATHAN PILAULT et. al. | nips | 2023-10-24 |
1473 | Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose GRACE, a Lifelong Model Editing method, which implements spot-fixes on streaming errors of a deployed model, ensuring minimal impact on unrelated inputs. |
Thomas Hartvigsen; Swami Sankaranarayanan; Hamid Palangi; Yoon Kim; Marzyeh Ghassemi; | nips | 2023-10-24 |
1474 | PointGPT: Auto-regressively Generative Pre-training from Point Clouds IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the advancements of the GPT, we present PointGPT, a novel approach that extends the concept of GPT to point clouds, addressing the challenges associated with disorder properties, low information density, and task gaps. |
GUANGYAN CHEN et. al. | nips | 2023-10-24 |
1475 | Mathematical Capabilities of ChatGPT IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We investigate the mathematical capabilities of two iterations of ChatGPT (released 9-January-2023 and 30-January-2023) and of GPT-4 by testing them on publicly available datasets, as well as hand-crafted ones, using a novel methodology. |
SIMON FRIEDER et. al. | nips | 2023-10-24 |
1476 | Transformer-based Planning for Symbolic Regression IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these models primarily rely on supervised pretraining goals borrowed from text generation and overlook equation-specific objectives like accuracy and complexity. To address this, we propose TPSR, a Transformer-based Planning strategy for Symbolic Regression that incorporates Monte Carlo Tree Search into the transformer decoding process. |
Parshin Shojaee; Kazem Meidani; Amir Barati Farimani; Chandan Reddy; | nips | 2023-10-24 |
1477 | What Indeed Can GPT Models Do in Chemistry? A Comprehensive Benchmark on Eight Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, rather than pursuing state-of-the-art performance, we aim to evaluate the capabilities of LLMs in a wide range of tasks across the chemistry domain. |
TAICHENG GUO et. al. | nips | 2023-10-24 |
1478 | Making Scalable Meta Learning Practical Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on making scalable meta learning practical by introducing SAMA, which combines advances in both implicit differentiation algorithms and systems. |
SANG CHOE et. al. | nips | 2023-10-24 |
1479 | Understanding Code Semantics: An Evaluation of Transformer Models in Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Ultimately, our research aims to offer valuable insights into the inner workings of transformer-based LMs, enhancing their ability to understand code and contributing to more efficient software development practices and maintenance workflows. |
Debanjan Mondal; Abhilasha Lodha; Ankita Sahoo; Beena Kumari; | arxiv-cs.LG | 2023-10-24 |
1480 | H3T: Efficient Integration of Memory Optimization and Parallelism for Large-scale Transformer Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a framework to automatically find an efficient integration of memory optimization and parallelism for High-Throughput Transformer Training (named H3T), which is rarely considered by existing efforts for training big Transformer-based models. |
YUZHONG WANG et. al. | nips | 2023-10-24 |
1481 | Transformers As Statisticians: Provable In-Context Learning with In-Context Algorithm Selection IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We establish this in theory by explicit constructions, and also observe this phenomenon experimentally. In theory, we construct two general mechanisms for algorithm selection with concrete examples: (1) Pre-ICL testing, where the transformer determines the right task for the given sequence (such as choosing between regression and classification) by examining certain summary statistics of the input sequence; (2) Post-ICL validation, where the transformer selects—among multiple base ICL algorithms (such as ridge regression with multiple regularization strengths)—a near-optimal one for the given sequence using a train-validation split. |
Yu Bai; Fan Chen; Huan Wang; Caiming Xiong; Song Mei; | nips | 2023-10-24 |
1482 | Likelihood-Based Diffusion Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we take the first steps towards closing the perplexity gap between autoregressive and diffusion-based language models, with the goal of building and releasing a diffusion model which outperforms the smallest widely-adopted autoregressive model (GPT-2 124M). |
Ishaan Gulrajani; Tatsunori Hashimoto; | nips | 2023-10-24 |
1483 | RapidBERT: Pretraining BERT from Scratch for $20 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we introduce RapidBERT, a BERT-style encoder architecture and training recipe that is empirically optimized for fast pretraining. |
JACOB PORTES et. al. | nips | 2023-10-24 |
1484 | Spike-driven Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we incorporate the spike-driven paradigm into Transformer by the proposed Spike-driven Transformer with four unique properties: (1) Event-driven, no calculation is triggered when the input of Transformer is zero; (2) Binary spike communication, all matrix multiplications associated with the spike matrix can be transformed into sparse additions; (3) Self-attention with linear complexity at both token and channel dimensions; (4) The operations between spike-form Query, Key, and Value are mask and addition. |
MAN YAO et. al. | nips | 2023-10-24 |
1485 | SwiFT: Swin 4D FMRI Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The modeling of spatiotemporal brain dynamics from high-dimensional data, such as 4D functional MRI, is a formidable task in neuroscience. To address this challenge, we present SwiFT (Swin 4D fMRI Transformer), a Swin Transformer architecture that can learn brain dynamics directly from 4D functional brain MRI data in a memory and computation-efficient manner. |
PETER KIM et. al. | nips | 2023-10-24 |
1486 | Neural Data Transformer 2: Multi-context Pretraining for Neural Spiking Activity Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We thus develop Neural Data Transformer 2 (NDT2), a spatiotemporal Transformer for neural spiking activity, and demonstrate pretraining can leverage motor BCI datasets that span sessions, subjects, and experimental tasks. |
Joel Ye; Jennifer Collinger; Leila Wehbe; Robert Gaunt; | nips | 2023-10-24 |
1487 | RealTime QA: What’s The Answer Right Now? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce RealTime QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis (weekly in this version). |
JUNGO KASAI et. al. | nips | 2023-10-24 |
1488 | Geometric Transformer with Interatomic Positional Encoding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, by designing Interatomic Positional Encoding (IPE), which parameterizes atomic environments as the Transformer’s positional encodings, we propose Geoformer, a novel geometric Transformer that effectively models molecular structures for various molecular property prediction tasks. |
YUSONG WANG et. al. | nips | 2023-10-24 |
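The paper's IPE parameterization is not reproduced here; as a generic sketch of the idea of injecting interatomic geometry into attention, the code below expands pairwise distances in Gaussian radial basis functions and adds the result as a bias to simplified, projection-free attention logits. The shapes, distance range, and weights are all illustrative.

```python
import numpy as np

def rbf_expand(dist, centers):
    """Expand pairwise distances in Gaussian radial basis functions."""
    return np.exp(-(dist[..., None] - centers) ** 2)

def attention_with_pair_bias(h, pos, w_bias, scale=1.0):
    """Self-attention over atoms with an additive pairwise distance bias.

    h: (N, C) atom features; pos: (N, 3) coordinates;
    w_bias: (K,) weights mapping K RBF features to a scalar bias per pair.
    Q/K/V projections are omitted for brevity (Q = K = V = h).
    """
    dist = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)  # (N, N)
    centers = np.linspace(0.0, 5.0, w_bias.shape[0])             # Angstrom grid
    bias = rbf_expand(dist, centers) @ w_bias                    # (N, N)
    logits = (h @ h.T) * scale + bias
    attn = np.exp(logits - logits.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)
    return attn @ h

rng = np.random.default_rng(0)
out = attention_with_pair_bias(rng.normal(size=(5, 8)),
                               rng.normal(size=(5, 3)),
                               rng.normal(size=(16,)))
print(out.shape)  # (5, 8)
```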
1489 | 3M-TRANSFORMER: A Multi-Stage Multi-Stream Multimodal Transformer for Embodied Turn-Taking Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building on this research, we propose a new multimodal transformer-based architecture for predicting turn-taking in embodied, synchronized multi-perspective data. |
Mehdi Fatan; Emanuele Mincato; Dimitra Pintzou; Mariella Dimiccoli; | arxiv-cs.CV | 2023-10-23 |
1490 | Design of A Modified Transformer Architecture Based on Relative Position Coding IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
WENFENG ZHENG et. al. | International Journal of Computational Intelligence Systems | 2023-10-23 |
1491 | GPT-3-Powered Type Error Debugging: Investigating The Use of Large Language Models for Code Repair Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Type systems are responsible for assigning types to terms in programs. That way, they enforce the actions that can be taken and can, consequently, detect type errors during … |
Francisco Ribeiro; José Nuno Macedo; Kanae Tsushima; Rui Abreu; João Saraiva; | Proceedings of the 16th ACM SIGPLAN International … | 2023-10-23 |
1492 | LINC: A Neurosymbolic Approach for Logical Reasoning By Combining Language Models with First-Order Logic Provers IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While many prompting-based strategies have been proposed to enable Large Language Models (LLMs) to do such reasoning more effectively, they still appear unsatisfactory, often failing in subtle and unpredictable ways. In this work, we investigate the validity of instead reformulating such tasks as modular neurosymbolic programming, which we call LINC: Logical Inference via Neurosymbolic Computation. |
THEO X. OLAUSSON et. al. | arxiv-cs.CL | 2023-10-23 |
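A LINC-style pipeline has two halves: a language model translates premises and a conclusion into first-order logic, and an off-the-shelf prover checks entailment. In the sketch below, the LLM half is mocked by a hypothetical lookup function (`llm_to_fol`), and NLTK's resolution prover stands in for the first-order logic provers used in the paper.

```python
from nltk.sem import Expression
from nltk.inference import ResolutionProver

read_expr = Expression.fromstring

def llm_to_fol(sentence):
    """Hypothetical stand-in for the LLM translation step: in a LINC-style
    pipeline, a language model maps natural language to FOL formulas."""
    lookup = {
        "All humans are mortal.": "all x.(human(x) -> mortal(x))",
        "Socrates is a human.": "human(socrates)",
        "Socrates is mortal.": "mortal(socrates)",
    }
    return read_expr(lookup[sentence])

premises = [llm_to_fol("All humans are mortal."),
            llm_to_fol("Socrates is a human.")]
conclusion = llm_to_fol("Socrates is mortal.")

# The symbolic half: an off-the-shelf prover checks whether the
# premises entail the conclusion.
print(ResolutionProver().prove(conclusion, premises))  # True
```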
1493 | TRAMS: Training-free Memory Selection for Long-range Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we present a plug-and-play strategy, known as TRAining-free Memory Selection (TRAMS), that selects tokens participating in attention calculation based on one simple metric. |
Haofei Yu; Cunxiang Wang; Yue Zhang; Wei Bi; | arxiv-cs.CL | 2023-10-23 |
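The highlight leaves the metric unspecified, so the sketch below uses the L2 norm of each cached key as a placeholder importance score: rank the memory tokens once, keep the top-k, and attend only over the kept ones. This reproduces the general shape of training-free memory selection, not TRAMS's exact metric.

```python
import numpy as np

def select_memory(mem_keys, mem_values, k):
    """Training-free memory selection: rank cached tokens by a simple,
    query-independent metric and keep the top-k. The L2 key norm used
    here is a placeholder score, not necessarily TRAMS's metric."""
    scores = np.linalg.norm(mem_keys, axis=-1)
    keep = np.argsort(scores)[-k:]
    return mem_keys[keep], mem_values[keep]

def attend(q, keys, values):
    logits = keys @ q / np.sqrt(q.shape[0])
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ values

rng = np.random.default_rng(0)
mem_k = rng.normal(size=(1024, 64))   # long-range cached keys
mem_v = rng.normal(size=(1024, 64))
k_sel, v_sel = select_memory(mem_k, mem_v, k=128)
out = attend(rng.normal(size=64), k_sel, v_sel)
print(out.shape)  # (64,)
```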
1494 | Exploring The Boundaries of GPT-4 in Radiology IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on assessing the performance of GPT-4, the most capable LLM so far, on the text-based applications for radiology reports, comparing against state-of-the-art (SOTA) radiology-specific models. |
QIANCHU LIU et. al. | arxiv-cs.CL | 2023-10-23 |
1495 | GPT-4 As An Effective Zero-Shot Evaluator for Scientific Figure Captions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates using large language models (LLMs) as a cost-effective, reference-free method for evaluating figure captions. |
TING-YAO HSU et. al. | arxiv-cs.CL | 2023-10-23 |
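The general shape of such a reference-free, LLM-as-judge evaluation is a single scoring prompt per caption. The sketch below uses the OpenAI v1 Python client; the model name, rubric, and output format are placeholders rather than the paper's actual prompt.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def score_caption(figure_mentions, caption, model="gpt-4"):
    """Reference-free caption scoring in the LLM-as-judge style.
    The rubric and 1-5 scale are illustrative placeholders."""
    prompt = (
        "You are grading a scientific figure caption.\n"
        f"Text mentioning the figure:\n{figure_mentions}\n\n"
        f"Caption:\n{caption}\n\n"
        "Rate how helpful the caption is on a 1-5 scale. "
        "Reply with the number only."
    )
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return int(resp.choices[0].message.content.strip())

print(score_caption("Figure 3 shows accuracy rising with model size.",
                    "Accuracy vs. model size on the test set."))
```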
1496 | Evaluating The Knowledge Base Completion Potential of GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we perform a careful evaluation of GPT’s potential to complete the largest public KB: Wikidata. |
Blerta Veseli; Simon Razniewski; Jan-Christoph Kalo; Gerhard Weikum; | arxiv-cs.CL | 2023-10-23 |
1497 | Generative Pre-trained Transformer for Vietnamese Community-based COVID-19 Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel approach, conducting a comparative analysis of different Transformer models against SOTA models on a community-based COVID-19 question answering dataset. |
Tam Minh Vo; Khiem Vinh Tran; | arxiv-cs.CL | 2023-10-23 |
1498 | InstructExcel: A Benchmark for Natural Language Instruction in Excel Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To do so, we introduce a new large-scale benchmark, InstructExcel, created by leveraging the ‘Automate’ feature in Excel to automatically generate OfficeScripts from users’ actions. |
JUSTIN PAYAN et. al. | arxiv-cs.CL | 2023-10-22 |
1499 | Towards Harmful Erotic Content Detection Through Coreference-Driven Contextual Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a hybrid neural and rule-based context-aware system that leverages coreference resolution to identify harmful contextual cues in erotic content. |
Inez Okulska; Emilia Wiśnios; | arxiv-cs.CL | 2023-10-22 |
1500 | Attention-Enhancing Backdoor Attacks Against BERT-based Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we directly target the interior structure of neural networks and the backdoor mechanism. |
Weimin Lyu; Songzhu Zheng; Lu Pang; Haibin Ling; Chao Chen; | arxiv-cs.LG | 2023-10-22 |
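One plausible reading of an attention-enhancing backdoor is an auxiliary loss, applied while poisoning, that rewards attention heads for concentrating mass on trigger tokens. The PyTorch sketch below illustrates that idea only; it is not claimed to be the paper's actual objective.

```python
import torch

def trigger_attention_loss(attn, trigger_mask):
    """Auxiliary loss rewarding heads for concentrating attention on
    trigger positions. attn: (batch, heads, seq, seq) attention maps;
    trigger_mask: (batch, seq) with 1 at trigger token positions.
    An illustrative reading of 'attention-enhancing', not necessarily
    the paper's exact objective."""
    # Attention mass each query sends to trigger positions.
    mass = (attn * trigger_mask[:, None, None, :]).sum(dim=-1)  # (B, H, S)
    return -mass.mean()  # minimizing this maximizes trigger-directed attention

attn = torch.softmax(torch.randn(2, 12, 16, 16), dim=-1)
mask = torch.zeros(2, 16)
mask[:, 3] = 1.0  # token 3 is the (hypothetical) trigger
print(trigger_attention_loss(attn, mask).item())
```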