Paper Digest: Recent Papers on Transformer
Paper Digest Team extracted all recent Transformer (NLP) related papers on our radar, and generated highlight sentences for them. The results are sorted by relevance & date. In addition to this ‘static’ page, we also provide a real-time version of this article, which has more coverage and is updated in real time with the most recent papers on this topic.
This list is created by the Paper Digest Team. Experience the cutting-edge capabilities of Paper Digest, an innovative AI-powered research platform that empowers you to write, review, get answers and more. Try us today and unlock the full potential of our services for free!
TABLE 1: Paper Digest: Recent Papers on Transformer
# | Paper | Author(s) | Source | Date
---|---|---|---|---
1 | Using Language Models to Disambiguate Lexical Choices in Translation. Highlight: Providing weaker models with high-quality lexical rules improves accuracy substantially, in some cases reaching or outperforming GPT-4. | Josh Barua; Sanjay Subramanian; Kayo Yin; Alane Suhr; | arxiv-cs.CL | 2024-11-08
2 | Evaluating GPT-4 at Grading Handwritten Solutions in Math Exams. Highlight: In this work, we leverage state-of-the-art multi-modal AI models, in particular GPT-4o, to automatically grade handwritten responses to college-level math exams. | Adriana Caraeni; Alexander Scarlatos; Andrew Lan; | arxiv-cs.CY | 2024-11-07
3 | Adversarial Robustness of In-Context Learning in Transformers for Linear Regression. Highlight: This work investigates the vulnerability of in-context learning in transformers to hijacking attacks, focusing on the setting of linear regression tasks. | Usman Anwar; Johannes Von Oswald; Louis Kirsch; David Krueger; Spencer Frei; | arxiv-cs.LG | 2024-11-07
4 | GPT Semantic Cache: Reducing LLM Costs and Latency Via Semantic Embedding Caching. Highlight: In this paper, we introduce GPT Semantic Cache, a method that leverages semantic caching of query embeddings in in-memory storage (Redis). | Sajal Regmi; Chetan Phakami Pun; | arxiv-cs.LG | 2024-11-07
5 | High Entropy Alloy Property Predictions Using Transformer-based Language Model. Highlight: This study introduces a language transformer-based machine learning model to predict key mechanical properties of high-entropy alloys (HEAs), addressing the challenges due to their complex, multi-principal element compositions and limited experimental data. | Spyros Kamnis; Konstantinos Delibasis; | arxiv-cs.CE | 2024-11-07
6 | Understanding The Effects of Human-written Paraphrases in LLM-generated Text Detection. Highlight: In this study, we devise a new data collection strategy to collect Human & LLM Paraphrase Collection (HLPC), a first-of-its-kind dataset that incorporates human-written texts and paraphrases, as well as LLM-generated texts and paraphrases. | Hiu Ting Lau; Arkaitz Zubiaga; | arxiv-cs.CL | 2024-11-06
7 | A Comparative Study of Recent Large Language Models on Generating Hospital Discharge Summaries for Lung Cancer Patients. Highlight: This research aims to explore how LLMs can alleviate the burden of manual summarization, streamline workflow efficiencies, and support informed decision-making in healthcare settings. | YIMING LI et al. | arxiv-cs.CL | 2024-11-06
8 | Can Custom Models Learn In-Context? An Exploration of Hybrid Architecture Performance on In-Context Learning Tasks. Highlight: We examine implications of architectural differences between GPT-2 and LLaMa as well as LLaMa and Mamba. | RYAN CAMPBELL et al. | arxiv-cs.LG | 2024-11-06
9 | TATAA: Programmable Mixed-Precision Transformer Acceleration with A Transformable Arithmetic Architecture. Highlight: Apart from the vast amount of linear operations needed due to their sizes, modern transformer models are increasingly reliant on precise non-linear computations that make traditional low-bitwidth quantization methods and fixed-dataflow matrix accelerators ineffective for end-to-end acceleration. To address this need to accelerate both linear and non-linear operations in a unified and programmable framework, this paper introduces TATAA. | JIAJUN WU et al. | arxiv-cs.AR | 2024-11-06
10 | Rethinking Decoders for Transformer-based Semantic Segmentation: Compression Is All You Need. Highlight: In this paper, we argue that there are fundamental connections between semantic segmentation and compression, especially between the Transformer decoders and Principal Component Analysis (PCA). | Qishuai Wen; Chun-Guang Li; | arxiv-cs.CV | 2024-11-05
11 | Towards Scalable Automated Grading: Leveraging Large Language Models for Conceptual Question Evaluation in Engineering. Highlight: This study explores the feasibility of using large language models (LLMs), specifically GPT-4o (ChatGPT), for automated grading of conceptual questions in an undergraduate Mechanical Engineering course. | RUJUN GAO et al. | arxiv-cs.CY | 2024-11-05
12 | Enhancing Transformer Training Efficiency with Dynamic Dropout. Highlight: We introduce Dynamic Dropout, a novel regularization technique designed to enhance the training efficiency of Transformer models by dynamically adjusting the dropout rate based on training epochs or validation loss improvements. | Hanrui Yan; Dan Shao; | arxiv-cs.LG | 2024-11-05
13 | From Medprompt to O1: Exploration of Run-Time Strategies for Medical Challenge Problems and Beyond. Highlight: Following on the Medprompt study with GPT-4, we systematically evaluate the o1-preview model across various medical benchmarks. | HARSHA NORI et al. | arxiv-cs.CL | 2024-11-05
14 | Automatic Generation of Question Hints for Mathematics Problems Using Large Language Models in Educational Technology. Highlight: We present here a study of several dimensions: 1) identifying error patterns made by simulated students on secondary-level math exercises; 2) developing various prompts for GPT-4o as a teacher and evaluating their effectiveness in generating hints that enable simulated students to self-correct; and 3) testing the best-performing prompts, based on their ability to produce relevant hints and facilitate error correction, with Llama-3-8B-Instruct as the teacher, allowing for a performance comparison with GPT-4o. | Junior Cedric Tonga; Benjamin Clement; Pierre-Yves Oudeyer; | arxiv-cs.CL | 2024-11-05
15 | Seq-VCR: Preventing Collapse in Intermediate Transformer Representations for Enhanced Reasoning. Highlight: In this work, we identify representation collapse in the model’s intermediate layers as a key factor limiting their reasoning capabilities. | MD RIFAT AREFIN et al. | arxiv-cs.LG | 2024-11-04
16 | Wave Network: An Ultra-Small Language Model. Highlight: We propose an innovative token representation and update method in a new ultra-small language model: the Wave network. | Xin Zhang; Victor S. Sheng; | arxiv-cs.CL | 2024-11-04
17 | Ask, and It Shall Be Given: Turing Completeness of Prompting. Highlight: Here, we present, to the best of our knowledge, the first theoretical study of the LLM prompting paradigm. In this work, we show that prompting is in fact Turing-complete: there exists a finite-size Transformer such that for any computable function, there exists a corresponding prompt following which the Transformer computes the function. | Ruizhong Qiu; Zhe Xu; Wenxuan Bao; Hanghang Tong; | arxiv-cs.LG | 2024-11-04
18 | Advancements and Limitations of LLMs in Replicating Human Color-word Associations. Highlight: We compared multiple generations of LLMs (from GPT-3 to GPT-4o) against human color-word associations using data collected from over 10,000 Japanese participants, involving 17 colors and words from eight categories in Japanese. | Makoto Fukushima; Shusuke Eshita; Hiroshige Fukuhara; | arxiv-cs.CL | 2024-11-04
19 | Evaluating The Ability of Large Language Models to Generate Verifiable Specifications in VeriFast. Highlight: However, prior work has not explored how well LLMs can perform specification generation for specifications based in an ownership logic, such as separation logic. To address this gap, this paper explores the effectiveness of large language models (LLMs), specifically OpenAI’s GPT models, in generating fully correct specifications based on separation logic for static verification of human-written programs in VeriFast. | MARILYN REGO et al. | arxiv-cs.SE | 2024-11-04
20 | Enriching Tabular Data with Contextual LLM Embeddings: A Comprehensive Ablation Study for Ensemble Classifiers. Highlight: Leveraging advancements in natural language processing, this study presents a systematic approach to enrich tabular datasets with features derived from large language model embeddings. | Gjergji Kasneci; Enkelejda Kasneci; | arxiv-cs.LG | 2024-11-03
21 | Can Large Language Model Predict Employee Attrition? Highlight: Machine learning (ML) advancements offer more scalable and accurate solutions, but large language models (LLMs) introduce new potential in human resource management by interpreting nuanced employee communication and detecting subtle turnover cues. | Xiaoye Ma; Weiheng Liu; Changyi Zhao; Liliya R. Tukhvatulina; | arxiv-cs.LG | 2024-11-02
22 | Enhancing Neural Network Interpretability with Feature-Aligned Sparse Autoencoders. Highlight: We propose Mutual Feature Regularization (MFR), a regularization technique for improving feature learning by encouraging SAEs trained in parallel to learn similar features. | Luke Marks; Alasdair Paren; David Krueger; Fazl Barez; | arxiv-cs.LG | 2024-11-02
23 | LLMs: A Game-Changer for Software Engineers? Highlight: Through a critical analysis of technical strengths, limitations, real-world case studies, and future research directions, this paper argues that LLMs are not just reshaping how software is developed but are redefining the role of developers. | Md Asraful Haque; | arxiv-cs.SE | 2024-11-01
24 | GameGen-X: Interactive Open-world Game Video Generation. Highlight: We introduce GameGen-X, the first diffusion transformer model specifically designed for both generating and interactively controlling open-world game videos. | Haoxuan Che; Xuanhua He; Quande Liu; Cheng Jin; Hao Chen; | arxiv-cs.CV | 2024-11-01
25 | Infant Agent: A Tool-Integrated, Logic-Driven Agent with Cost-Effective API Usage. Highlight: II: They remain challenged in reasoning through complex logic problems. To address these challenges, we developed the Infant Agent, integrating task-aware functions, operators, a hierarchical management system, and a memory retrieval mechanism. | BIN LEI et al. | arxiv-cs.AI | 2024-11-01
26 | Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement. Highlight: Consequently, we introduce the Lingma SWE-GPT series, comprising Lingma SWE-GPT 7B and 72B. | YINGWEI MA et al. | arxiv-cs.SE | 2024-11-01
27 | GPT or BERT: Why Not Both? Highlight: We present a simple way to merge masked language modeling with causal language modeling. | Lucas Georges Gabriel Charpentier; David Samuel; | arxiv-cs.CL | 2024-10-31
28 | Handwriting Recognition in Historical Documents with Multimodal LLM. Highlight: In this paper, I evaluate the accuracy of handwritten document transcriptions generated by Gemini against current state-of-the-art Transformer-based methods. | Lucian Li; | arxiv-cs.CV | 2024-10-31
29 | Desert Camels and Oil Sheikhs: Arab-Centric Red Teaming of Frontier LLMs. Highlight: Large language models (LLMs) are widely used but raise ethical concerns due to embedded social biases. | MUHAMMED SAEED et al. | arxiv-cs.CL | 2024-10-31
30 | Exploring Vision Language Models for Facial Attribute Recognition: Emotion, Race, Gender, and Age. Highlight: In this paper, we propose to utilize vision language models (VLMs) such as generative pre-trained transformer (GPT), GEMINI, large language and vision assistant (LLAVA), PaliGemma, and Microsoft Florence2 to recognize facial attributes such as race, gender, age, and emotion from images with human faces. | Nouar AlDahoul; Myles Joshua Toledo Tan; Harishwar Reddy Kasireddy; Yasir Zaki; | arxiv-cs.CV | 2024-10-31
31 | GPT for Games: An Updated Scoping Review (2020-2024). Highlight: This review aims to illustrate the state of the art in innovative GPT applications in games, offering a foundation to enrich game development and enhance player experiences through cutting-edge AI innovations. | Daijin Yang; Erica Kleinman; Casper Harteveld; | arxiv-cs.AI | 2024-10-31
32 | Aerial Flood Scene Classification Using Fine-Tuned Attention-based Architecture for Flood-Prone Countries in South Asia. Highlight: For the classification, we propose a fine-tuned Compact Convolutional Transformer (CCT) based approach and some other cutting-edge transformer-based and Convolutional Neural Network (CNN)-based architectures. | IBNE HASSAN et al. | arxiv-cs.CV | 2024-10-31
33 | EDT: An Efficient Diffusion Transformer Framework Inspired By Human-like Sketching. Highlight: To reduce the computation budget of transformer-based DPMs, this work proposes the Efficient Diffusion Transformer (EDT) framework. | Xinwang Chen; Ning Liu; Yichen Zhu; Feifei Feng; Jian Tang; | arxiv-cs.CV | 2024-10-31
34 | An Empirical Analysis of GPT-4V’s Performance on Fashion Aesthetic Evaluation. Highlight: Fashion aesthetic evaluation is the task of estimating how well the outfits worn by individuals in images suit them. In this work, we examine the zero-shot performance of GPT-4V on this task for the first time. | YUKI HIRAKAWA et al. | arxiv-cs.CV | 2024-10-31
35 | IO Transformer: Evaluating SwinV2-Based Reward Models for Computer Vision. Highlight: This paper examines SwinV2-based reward models, called the Input-Output Transformer (IO Transformer) and the Output Transformer. | Maxwell Meyer; Jack Spruyt; | arxiv-cs.CV | 2024-10-31
36 | LoFLAT: Local Feature Matching Using Focused Linear Attention Transformer. Highlight: In order to enhance representations of attention mechanisms while preserving low computational complexity, we propose LoFLAT, a novel Local Feature matching approach using a Focused Linear Attention Transformer. | Naijian Cao; Renjie He; Yuchao Dai; Mingyi He; | arxiv-cs.CV | 2024-10-30
37 | EvoCodeBench: An Evolving Code Generation Benchmark with Domain-Specific Evaluations. Highlight: The former hurts the fairness of benchmarks, and the latter hinders practitioners from selecting superior LLMs for specific programming domains. To address these two limitations, we propose a new benchmark – EvoCodeBench, which has the following advances: (1) Evolving data. | JIA LI et al. | arxiv-cs.CL | 2024-10-30
38 | ETO: Efficient Transformer-based Local Feature Matching By Organizing Multiple Homography Hypotheses. Highlight: We propose an efficient transformer-based network architecture for local feature matching. | JUNJIE NI et al. | arxiv-cs.CV | 2024-10-30
39 | Automated Personnel Selection for Software Engineers Using LLM-Based Profile Evaluation. Highlight: This work presents a fresh dataset and technique, and shows how transformer models could improve recruiting procedures. | Ahmed Akib Jawad Karim; Shahria Hoque; Md. Golam Rabiul Alam; Md. Zia Uddin; | arxiv-cs.SE | 2024-10-30
40 | ProTransformer: Robustify Transformers Via Plug-and-Play Paradigm. Highlight: In this paper, we introduce a novel robust attention mechanism designed to enhance the resilience of transformer-based architectures. | Zhichao Hou; Weizhi Gao; Yuchen Shen; Feiyi Wang; Xiaorui Liu; | arxiv-cs.LG | 2024-10-30
41 | Attention Speaks Volumes: Localizing and Mitigating Bias in Language Models. Highlight: We explore the internal mechanisms of how bias emerges in large language models (LLMs) when provided with ambiguous comparative prompts: inputs that compare or enforce choosing between two or more entities without providing clear context for preference. | Rishabh Adiga; Besmira Nushi; Varun Chandrasekaran; | arxiv-cs.CL | 2024-10-29
42 | GPT-4o Reads The Mind in The Eyes. Highlight: Using two versions of a widely used theory of mind test, the Reading the Mind in the Eyes Test and the Multiracial Reading the Mind in the Eyes Test, we found that GPT-4o outperformed humans in interpreting mental states from upright faces but underperformed humans when faces were inverted. | JAMES W. A. STRACHAN et al. | arxiv-cs.HC | 2024-10-29
43 | AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts. Highlight: Recent work, AmpleGCG (Liao et al., 2024), demonstrates that a generative model can quickly produce numerous customizable gibberish adversarial suffixes for any harmful query, exposing a range of alignment gaps in out-of-distribution (OOD) language spaces. To bring more attention to this area, we introduce AmpleGCG-Plus, an enhanced version that achieves better performance in fewer attempts. | Vishal Kumar; Zeyi Liao; Jaylen Jones; Huan Sun; | arxiv-cs.CL | 2024-10-29
44 | Benchmarking OpenAI O1 in Cyber Security. Highlight: We evaluate OpenAI’s o1-preview and o1-mini models, benchmarking their performance against the earlier GPT-4o model. | Dan Ristea; Vasilios Mavroudis; Chris Hicks; | arxiv-cs.CR | 2024-10-29
45 | KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation. Highlight: In this work, we present KD-LoRA, a novel fine-tuning method that combines LoRA with KD. | Rambod Azimi; Rishav Rishav; Marek Teichmann; Samira Ebrahimi Kahou; | arxiv-cs.CL | 2024-10-28
46 | Deep Learning for Medical Text Processing: BERT Model Fine-Tuning and Comparative Study. Highlight: This paper proposes a medical literature summary generation method based on the BERT model to address the challenges brought by the current explosion of medical information. | JIACHENG HU et al. | arxiv-cs.CL | 2024-10-28
47 | UOttawa at LegalLens-2024: Transformer-based Classification Experiments. Highlight: This paper presents the methods used for the LegalLens-2024 shared task, which focused on detecting legal violations within unstructured textual data and associating these violations with potentially affected individuals. | Nima Meghdadi; Diana Inkpen; | arxiv-cs.CL | 2024-10-28
48 | SepMamba: State-space Models for Speaker Separation Using Mamba. Highlight: We propose SepMamba, a U-Net-based architecture composed primarily of bidirectional Mamba layers. | THOR HØJHUS AVENSTRUP et al. | arxiv-cs.SD | 2024-10-28
49 | Gender Bias in LLM-generated Interview Responses. Highlight: Our findings reveal that gender bias is consistent, and closely aligned with gender stereotypes and the dominance of jobs. Overall, this study contributes to the systematic examination of gender bias in LLM-generated interview responses, highlighting the need for a mindful approach to mitigate such biases in related applications. | Haein Kong; Yongsu Ahn; Sangyub Lee; Yunho Maeng; | arxiv-cs.CL | 2024-10-28
50 | Sequential Choice in Ordered Bundles. Highlight: We evaluate several predictive models, including two custom Transformers using decoder-only and encoder-decoder architectures, fine-tuned GPT-3, a custom LSTM model, a reinforcement learning model, two Markov models, and a zero-order model. | Rajeev Kohli; Kriste Krstovski; Hengyu Kuang; Hengxu Lin; | arxiv-cs.LG | 2024-10-28
51 | A Simple Yet Effective Corpus Construction Framework for Indonesian Grammatical Error Correction. Highlight: How to efficiently construct high-quality evaluation corpora for GEC in low-resource languages has become a significant challenge. To fill this gap, in this paper, we present a framework for constructing GEC corpora. | NANKAI LIN et al. | arxiv-cs.CL | 2024-10-28
52 | Is GPT-4 Less Politically Biased Than GPT-3.5? A Renewed Investigation of ChatGPT’s Political Biases. Highlight: This work investigates the political biases and personality traits of ChatGPT, specifically comparing GPT-3.5 to GPT-4. | Erik Weber; Jérôme Rutinowski; Niklas Jost; Markus Pauly; | arxiv-cs.CL | 2024-10-28
53 | Fine-tuned Large Language Models (LLMs): Improved Prompt Injection Attacks Detection. Highlight: This project explores the security vulnerabilities in relation to prompt injection attacks. | Md Abdur Rahman; Fan Wu; Alfredo Cuzzocrea; Sheikh Iqbal Ahamed; | arxiv-cs.CL | 2024-10-27
54 | SeisGPT: A Physics-Informed Data-Driven Large Model for Real-Time Seismic Response Prediction. Highlight: Traditional methods, which rely on complex finite element models, often struggle with balancing computational efficiency and accuracy. To address this challenge, we introduce SeisGPT, a data-driven, large physics-informed model that leverages deep neural networks based on the Generative Pre-trained Transformer (GPT) architecture. | SHIQIAO MENG et al. | arxiv-cs.CE | 2024-10-26
55 | Sequential Large Language Model-Based Hyper-Parameter Optimization. Highlight: This study introduces SLLMBO, an innovative framework that leverages Large Language Models (LLMs) for hyperparameter optimization (HPO), incorporating dynamic search space adaptability, enhanced parameter landscape exploitation, and a hybrid, novel LLM-Tree-structured Parzen Estimator (LLM-TPE) sampler. | Kanan Mahammadli; | arxiv-cs.LG | 2024-10-26
56 | Notes on The Mathematical Structure of GPT LLM Architectures. Abstract: An exposition of the mathematics underpinning the neural network architecture of a GPT-3-style LLM. … | Spencer Becker-Kahn; | arxiv-cs.LG | 2024-10-25
57 | Integrating Large Language Models with Internet of Things Applications. Highlight: This paper identifies and analyzes applications in which Large Language Models (LLMs) can make Internet of Things (IoT) networks more intelligent and responsive through three case studies from critical topics: DDoS attack detection, macroprogramming over IoT systems, and sensor data processing. | Mingyu Zong; Arvin Hekmati; Michael Guastalla; Yiyi Li; Bhaskar Krishnamachari; | arxiv-cs.AI | 2024-10-24
58 | No Argument Left Behind: Overlapping Chunks for Faster Processing of Arbitrarily Long Legal Texts. Highlight: We introduce uBERT, a hybrid model that combines Transformer and Recurrent Neural Network architectures to effectively handle long legal texts. | ISRAEL FAMA et al. | arxiv-cs.CL | 2024-10-24
59 | GPT-Signal: Generative AI for Semi-automated Feature Engineering in The Alpha Research Process. Highlight: With the recent development of Generative Artificial Intelligence (Gen AI) and Large Language Models (LLMs), we present a novel way of leveraging GPT-4 to generate new return-predictive formulaic alphas, making alpha mining a semi-automated process, and saving time and energy for investors and traders. | Yining Wang; Jinman Zhao; Yuri Lawryshyn; | arxiv-cs.CE | 2024-10-24
60 | Scaling Up Masked Diffusion Models on Text. Highlight: Fully leveraging the probabilistic formulation of MDMs, we propose a simple yet effective unsupervised classifier-free guidance that effectively exploits large-scale unpaired data, boosting performance for conditional inference. | SHEN NIE et al. | arxiv-cs.AI | 2024-10-24
61 | Probing Ranking LLMs: Mechanistic Interpretability in Information Retrieval. Highlight: In this study, we delve into the mechanistic workings of state-of-the-art, fine-tuning-based passage-reranking transformer networks. | Tanya Chowdhury; James Allan; | arxiv-cs.IR | 2024-10-24
62 | Lightweight Neural App Control. Highlight: This paper introduces a novel mobile phone control architecture, termed “app agents”, for efficient interactions and controls across various Android apps. | FILIPPOS CHRISTIANOS et al. | arxiv-cs.AI | 2024-10-23
63 | Striking A New Chord: Neural Networks in Music Information Dynamics. Highlight: Specifically, we compare LSTM, Transformer, and GPT models against a widely used Markov model to predict a chord event following a sequence of chords. | Farshad Jafari; Claire Arthur; | arxiv-cs.IT | 2024-10-23
64 | Locating Information in Large Language Models Via Random Matrix Theory. Highlight: In this work, we analyze the weight matrices of pretrained transformer models — specifically BERT and Llama — using random matrix theory (RMT) as a zero-information hypothesis. | Max Staats; Matthias Thamm; Bernd Rosenow; | arxiv-cs.LG | 2024-10-23
65 | Interpreting Affine Recurrence Learning in GPT-style Transformers. Highlight: In-context learning allows transformers to generalize during inference without modifying their weights, yet the precise operations driving this capability remain largely opaque. This paper presents an investigation into the mechanistic interpretability of these transformers, focusing specifically on their ability to learn and predict affine recurrences as an ICL task. | Samarth Bhargav; Alexander Gu; | arxiv-cs.LG | 2024-10-22
66 | GeoCode-GPT: A Large Language Model for Geospatial Code Generation Tasks. Highlight: Although large language models (LLMs) have demonstrated potential in code generation tasks, they often encounter issues such as refusal to code or hallucination in geospatial code generation due to a lack of domain-specific knowledge and code corpora. To address these challenges, this paper presents and open-sources the GeoCode-PT and GeoCode-SFT corpora, along with the GeoCode-Eval evaluation dataset. | SHUYANG HOU et al. | arxiv-cs.SE | 2024-10-22
67 | In Context Learning and Reasoning for Symbolic Regression with Large Language Models. Highlight: Here, we explore the potential of LLMs to perform symbolic regression — a machine-learning method for finding simple and accurate equations from datasets. | Samiha Sharlin; Tyler R. Josephson; | arxiv-cs.CL | 2024-10-22
68 | Mechanisms of Symbol Processing for In-Context Learning in Transformer Networks. Highlight: Large Language Models (LLMs) have demonstrated impressive abilities in symbol processing through in-context learning (ICL). | Paul Smolensky; Roland Fernandez; Zhenghao Herbert Zhou; Mattia Opper; Jianfeng Gao; | arxiv-cs.AI | 2024-10-22
69 | Representation Shattering in Transformers: A Synthetic Study with Knowledge Editing. Highlight: Through evaluations of edited models and analysis of extracted representations, we show that KE inadvertently affects representations of entities beyond the targeted one, distorting relevant structures that allow a model to infer unseen knowledge about an entity. | KENTO NISHI et al. | arxiv-cs.LG | 2024-10-22
70 | An Eye for An AI: Evaluating GPT-4o’s Visual Perception Skills and Geometric Reasoning Skills Using Computer Graphics Questions. Highlight: We find that although GPT-4o exhibits great potential in solving questions with visual information independently, major limitations remain in the accuracy and quality of the generated results. We propose several novel approaches for CG educators to incorporate GenAI into CG teaching despite these limitations. | Tony Haoran Feng; Paul Denny; Burkhard C. Wünsche; Andrew Luxton-Reilly; Jacqueline Whalley; | arxiv-cs.AI | 2024-10-22
71 | Graph Transformers Dream of Electric Flow. Highlight: The input to the Transformer is simply the graph incidence matrix; no other explicit positional encoding information is provided. We present explicit weight configurations for implementing each such graph algorithm, and we bound the errors of the constructed Transformers by the errors of the underlying algorithms. | Xiang Cheng; Lawrence Carin; Suvrit Sra; | arxiv-cs.LG | 2024-10-22
72 | Using GPT Models for Qualitative and Quantitative News Analytics in The 2024 US Presidential Election Process. Highlight: The paper considers an approach of using the Google Search API and the GPT-4o model for qualitative and quantitative analyses of news through retrieval-augmented generation (RAG). | Bohdan M. Pavlyshenko; | arxiv-cs.CL | 2024-10-21
73 | Exploring Pretraining Via Active Forgetting for Improving Cross Lingual Transfer for Decoder Language Models. Highlight: In this work, we propose a pretraining strategy that uses active forgetting to achieve similar cross-lingual transfer in decoder-only LLMs. | Divyanshu Aggarwal; Ashutosh Sathe; Sunayana Sitaram; | arxiv-cs.CL | 2024-10-21
74 | Diffusion Transformer Policy. Highlight: Recent large visual-language action models pretrained on diverse robot datasets have demonstrated the potential for generalizing to new environments with a few in-domain data. | ZHI HOU et al. | arxiv-cs.RO | 2024-10-21
75 | BART-based Hierarchical Attentional Network for Sentence Ordering. Highlight: In this paper, we introduce a novel BART-based Hierarchical Attentional Ordering Network (BHAONet), aiming to address the coherence modeling challenge within paragraphs, which stands as a cornerstone in comprehension, generation, and reasoning tasks. | Yiping Yang; Baiyun Cui; Yingming Li; | cikm | 2024-10-21
76 | Learning to Differentiate Pairwise-Argument Representations for Implicit Discourse Relation Recognition. Highlight: To enable encoders to produce clearly distinguishable representations, we propose a joint learning framework. | ZHIPANG WANG et al. | cikm | 2024-10-21
77 | Comparative Study of Multilingual Idioms and Similes in Large Language Models. Highlight: This study addresses the gap in the literature concerning the comparative performance of LLMs in interpreting different types of figurative language across multiple languages. | PARIA KHOSHTAB et al. | arxiv-cs.CL | 2024-10-21
78 | Application of Large Language Models in Chemistry Reaction Data Extraction and Cleaning. Highlight: We propose a paradigm that leverages prompt-tuning, fine-tuning techniques, and a verifier to check the extracted information. | XIAOBAO HUANG et al. | cikm | 2024-10-21
79 | A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration. Highlight: In this work, we theoretically show that, compared to Stepwise ICL, the transformer gains better error correction ability and more accurate predictions if the reasoning from earlier steps (Coherent CoT) is integrated. | YINGQIAN CUI et al. | arxiv-cs.CL | 2024-10-21
80 | Inferring Visualization Intent from Conversation. Highlight: We consider a conversational approach to visualization, where users specify their needs at each step in natural language, with a visualization being returned in turn. | Haotian Li; Nithin Chalapathi; Huamin Qu; Alvin Cheung; Aditya G. Parameswaran; | cikm | 2024-10-21
81 | Improving Neuron-level Interpretability with White-box Language Models. Highlight: In our study, we introduce a white-box transformer-like architecture named Coding RAte TransformEr (CRATE), explicitly engineered to capture sparse, low-dimensional structures within data distributions. | Hao Bai; Yi Ma; | arxiv-cs.CL | 2024-10-21
82 | Mamba Retriever: Utilizing Mamba for Effective and Efficient Dense Retrieval. Highlight: In the information retrieval (IR) area, dense retrieval (DR) models use deep learning techniques to encode queries and passages into embedding space to compute their semantic relations. | Hanqi Zhang; Chong Chen; Lang Mei; Qi Liu; Jiaxin Mao; | cikm | 2024-10-21
83 | Does ChatGPT Have A Poetic Style? Highlight: We find that the GPT models, especially GPT-4, can successfully produce poems in a range of both common and uncommon English-language forms in superficial yet noteworthy ways, such as by producing poems of appropriate lengths for sonnets (14 lines), villanelles (19 lines), and sestinas (39 lines). | Melanie Walsh; Anna Preus; Elizabeth Gronski; | arxiv-cs.CL | 2024-10-20
84 | Exploring Social Desirability Response Bias in Large Language Models: Evidence from GPT-4 Simulations. Highlight: Large language models (LLMs) are employed to simulate human-like responses in social surveys, yet it remains unclear if they develop biases like social desirability response (SDR) bias. | Sanguk Lee; Kai-Qi Yang; Tai-Quan Peng; Ruth Heo; Hui Liu; | arxiv-cs.AI | 2024-10-20
85 | BERTtime Stories: Investigating The Role of Synthetic Story Data in Language Pre-training. Highlight: We describe our contribution to the Strict and Strict-Small tracks of the 2nd iteration of the BabyLM Challenge. | Nikitas Theodoropoulos; Giorgos Filandrianos; Vassilis Lyberatos; Maria Lymperaiou; Giorgos Stamou; | arxiv-cs.CL | 2024-10-20
86 | IANUS: Integrated Accelerator Based on NPU-PIM Unified Memory System. Highlight: To address the unique challenges of accelerating end-to-end inference, we propose IANUS — Integrated Accelerator based on NPU-PIM Unified Memory System. | MINSEOK SEO et al. | arxiv-cs.AR | 2024-10-19
87 | DTPPO: Dual-Transformer Encoder-based Proximal Policy Optimization for Multi-UAV Navigation in Unseen Complex Environments. Highlight: Existing multi-agent deep reinforcement learning (MADRL) methods for multi-UAV navigation face challenges in generalization, particularly when applied to unseen complex environments. To address these limitations, we propose a Dual-Transformer Encoder-based Proximal Policy Optimization (DTPPO) method. | Anning Wei; Jintao Liang; Kaiyuan Lin; Ziyue Li; Rui Zhao; | arxiv-cs.MA | 2024-10-19
88 | Bias Amplification: Language Models As Increasingly Biased Media. Highlight: In this work, we address the gap in understanding the bias amplification of LLMs with four main contributions. Firstly, we propose a theoretical framework, defining the necessary and sufficient conditions for its occurrence, and emphasizing that it occurs independently of model collapse. | ZE WANG et al. | arxiv-cs.AI | 2024-10-19
89 | Medical-GAT: Cancer Document Classification Leveraging Graph-Based Residual Network for Scenarios with Limited Data. Highlight: This scarcity of annotated data impedes the development of effective machine learning models for cancer document classification. To address this challenge, we present a curated dataset of 1,874 biomedical abstracts, categorized into thyroid cancer, colon cancer, lung cancer, and generic topics. | ELIAS HOSSAIN et al. | arxiv-cs.AI | 2024-10-19
90 | Automated Genre-Aware Article Scoring and Feedback Using Large Language Models. Highlight: This paper focuses on the development of an advanced intelligent article scoring system that not only assesses the overall quality of written work but also offers detailed feature-based scoring tailored to various article genres. | CHIHANG WANG et al. | arxiv-cs.CL | 2024-10-18
91 | Harmony: A Home Agent for Responsive Management and Action Optimization with A Locally Deployed Large Language Model. Highlight: In order to optimize the privacy and economy of data processing while maintaining the powerful functions of LLMs, we propose Harmony, a smart home assistant framework that uses a locally deployable small-scale LLM. | Ziqi Yin; Mingxin Zhang; Daisuke Kawahara; | arxiv-cs.HC | 2024-10-18
92 | From Solitary Directives to Interactive Encouragement! LLM Secure Code Generation By Natural Language Prompting. Highlight: This work introduces SecCode, a framework that leverages an innovative interactive encouragement prompting (EP) technique for secure code generation with only NL prompts. | SHIGANG LIU et al. | arxiv-cs.CR | 2024-10-18
93 | Paths-over-Graph: Knowledge Graph Empowered Large Language Model Reasoning. Highlight: However, existing KG-based LLM reasoning methods face challenges like handling multi-hop reasoning, multi-entity questions, and effectively utilizing graph structures. To address these issues, we propose Paths-over-Graph (PoG), a novel method that enhances LLM reasoning by integrating knowledge reasoning paths from KGs, improving the interpretability and faithfulness of LLM outputs. | XINGYU TAN et al. | arxiv-cs.CL | 2024-10-18
94 | SBI-RAG: Enhancing Math Word Problem Solving for Students Through Schema-Based Instruction and Retrieval-Augmented Generation. Highlight: Many students struggle with math word problems (MWPs), often finding it difficult to identify key information and select the appropriate mathematical operations. Schema-based instruction (SBI) is an evidence-based strategy that helps students categorize problems based on their structure, improving problem-solving accuracy. Building on this, we propose a Schema-Based Instruction Retrieval-Augmented Generation (SBI-RAG) framework that incorporates a large language model (LLM). | Prakhar Dixit; Tim Oates; | arxiv-cs.LG | 2024-10-17
95 | Metacognitive Monitoring: A Human Ability Beyond Generative Artificial Intelligence. Highlight: We employed a cross-agent prediction model to compare the metacognitive performance of humans and ChatGPT in a language-based memory task involving garden-path sentences preceded by either fitting or unfitting context sentences. | Markus Huff; Elanur Ulakçı; | arxiv-cs.CL | 2024-10-17
96 | Linguistically Grounded Analysis of Language Models Using Shapley Head Values. Highlight: In this paper, we investigate the processing of morphosyntactic phenomena by leveraging a recently proposed method for probing language models via Shapley Head Values (SHVs). | Marcell Fekete; Johannes Bjerva; | arxiv-cs.CL | 2024-10-17
97 | Transfer Learning on Transformers for Building Energy Consumption Forecasting — A Comparative Study. Highlight: This study investigates the application of Transfer Learning (TL) on Transformer architectures to enhance building energy consumption forecasting. | Robert Spencer; Surangika Ranathunga; Mikael Boulic; van Heerden; Teo Susnjak; | arxiv-cs.LG | 2024-10-17
98 | FaithBench: A Diverse Hallucination Benchmark for Summarization By Modern LLMs. Highlight: This paper introduces FaithBench, a summarization hallucination benchmark comprising challenging hallucinations made by 10 modern LLMs from 8 different families, with ground truth annotations by human experts. | FORREST SHENG BAO et al. | arxiv-cs.CL | 2024-10-17
99 | Measuring and Modifying The Readability of English Texts with GPT-4. Highlight: Then, in a pre-registered human experiment (N = 59), we ask whether Turbo can reliably make text easier or harder to read. We find evidence to support this hypothesis, though considerable variance in human judgments remains unexplained. | Sean Trott; Pamela D. Rivière; | arxiv-cs.CL | 2024-10-17
100 | Transformer-Based Approaches for Sensor-Based Human Activity Recognition: Opportunities and Challenges. Highlight: We observe that transformer-based solutions pose higher computational demands, consistently yield inferior performance, and experience significant performance degradation when quantized to accommodate resource-constrained devices. | Clayton Souza Leite; Henry Mauranen; Aziza Zhanabatyrova; Yu Xiao; | arxiv-cs.LG | 2024-10-17
101 | Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens. Highlight: Scaling up autoregressive models in vision has not proven as beneficial as in large language models. In this work, we investigate this scaling problem in the context of text-to-image generation, focusing on two critical factors: whether models use discrete or continuous tokens, and whether tokens are generated in a random or fixed raster order using BERT- or GPT-like transformer architectures. | LIJIE FAN et al. | arxiv-cs.CV | 2024-10-17
102 | Detecting AI-Generated Texts in Cross-Domains. Highlight: Existing tools to detect text generated by a large language model (LLM) have met with certain success, but their performance can drop when dealing with texts in new domains. To tackle this issue, we train a ranking classifier called RoBERTa-Ranker, a modified version of RoBERTa, as a baseline model using a dataset we constructed that includes a wider variety of texts written by humans and generated by various LLMs. | You Zhou; Jie Wang; | arxiv-cs.CL | 2024-10-17
103 | Context-Scaling Versus Task-Scaling in In-Context Learning. Highlight: Transformers exhibit In-Context Learning (ICL), where these models solve new tasks by using examples in the prompt without additional training. | Amirhesam Abedsoltan; Adityanarayanan Radhakrishnan; Jingfeng Wu; Mikhail Belkin; | arxiv-cs.LG | 2024-10-16
104 | When Not to Answer: Evaluating Prompts on GPT Models for Effective Abstention in Unanswerable Math Word Problems. Highlight: In this paper, we investigate whether GPTs can appropriately respond to unanswerable math word problems by applying prompts typically used in solvable mathematical scenarios. | Asir Saadat; Tasmia Binte Sogir; Md Taukir Azam Chowdhury; Syem Aziz; | arxiv-cs.CL | 2024-10-16
105 | Reconstruction of Differentially Private Text Sanitization Via Large Language Models. Highlight: We propose two attacks (black-box and white-box) based on the accessibility to LLMs and show that LLMs could connect the pair of DP-sanitized text and the corresponding private training data of LLMs by giving sample text pairs as instructions (in the black-box attacks) or fine-tuning data (in the white-box attacks). | SHUCHAO PANG et al. | arxiv-cs.CR | 2024-10-16
106 | SELF-BART: A Transformer-based Molecular Representation Model Using SELFIES. Highlight: In this study, we develop an encoder-decoder model based on BART that is capable of learning molecular representations and generating new molecules. | INDRA PRIYADARSINI et al. | arxiv-cs.CE | 2024-10-16
107 | Unifying Economic and Language Models for Enhanced Sentiment Analysis of The Oil Market. Highlight: However, these LMs often have difficulty with domain-specific terminology, limiting their effectiveness in the crude oil sector. Addressing this gap, we introduce CrudeBERT, a fine-tuned LM specifically for the crude oil market. | Himmet Kaplan; Ralf-Peter Mundani; Heiko Rölke; Albert Weichselbraun; Martin Tschudy; | arxiv-cs.IR | 2024-10-16
108 | Stabilize The Latent Space for Image Autoregressive Modeling: A Unified Perspective. Highlight: This finding contrasts sharply with the field of NLP, where the autoregressive model GPT has established a commanding presence. To address this discrepancy, we introduce a unified perspective on the relationship between latent space and generative models, emphasizing the stability of latent space in image generative modeling. | YONGXIN ZHU et al. | arxiv-cs.CV | 2024-10-16
109 | Better to Ask in English: Evaluation of Large Language Models on English, Low-resource and Cross-Lingual Settings. Highlight: Recently, a few multi-lingual LLMs have emerged, but their performance in low-resource languages, especially the most spoken languages in South Asia, is less explored. To address this gap, in this study, we evaluate LLMs such as GPT-4, Llama 2, and Gemini to analyze their effectiveness in English compared to other low-resource languages from South Asia (e.g., Bangla, Hindi, and Urdu). | Krishno Dey; Prerona Tarannum; Md. Arid Hasan; Imran Razzak; Usman Naseem; | arxiv-cs.CL | 2024-10-16
110 | With A Grain of SALT: Are LLMs Fair Across Social Dimensions? Highlight: This paper presents an analysis of biases in open-source Large Language Models (LLMs) across various genders, religions, and races. | Samee Arif; Zohaib Khan; Agha Ali Raza; Awais Athar; | arxiv-cs.CL | 2024-10-16
111 | Jigsaw Puzzles: Splitting Harmful Questions to Jailbreak Large Language Models. Highlight: In this work, we propose Jigsaw Puzzles (JSP), a straightforward yet effective multi-turn jailbreak strategy against advanced LLMs. | Hao Yang; Lizhen Qu; Ehsan Shareghi; Gholamreza Haffari; | arxiv-cs.CL | 2024-10-15
112 | De-jargonizing Science for Journalists with GPT-4: A Pilot Study. Highlight: This study offers an initial evaluation of a human-in-the-loop system leveraging GPT-4 (a large language model or LLM) and Retrieval-Augmented Generation (RAG) to identify and define jargon terms in scientific abstracts, based on readers’ self-reported knowledge. | Sachita Nishal; Eric Lee; Nicholas Diakopoulos; | arxiv-cs.CL | 2024-10-15
113 | Table-LLM-Specialist: Language Model Specialists for Tables Using Iterative Generator-Validator Fine-tuning. Highlight: In this work, we propose Table-LLM-Specialist, or Table-Specialist for short, as a new self-trained fine-tuning paradigm specifically designed for table tasks. | JUNJIE XING et al. | arxiv-cs.CL | 2024-10-15
114 | In-Context Learning for Long-Context Sentiment Analysis on Infrastructure Project Opinions. Highlight: Large language models (LLMs) have achieved impressive results across various tasks. | Alireza Shamshiri; Kyeong Rok Ryu; June Young Park; | arxiv-cs.CL | 2024-10-15
115 | TraM: Enhancing User Sleep Prediction with Transformer-based Multivariate Time Series Modeling and Machine Learning Ensembles. Highlight: This paper presents a novel approach that leverages a Transformer-based multivariate time series model and Machine Learning Ensembles to predict the quality of human sleep, emotional states, and stress levels. | Jinjae Kim; Minjeong Ma; Eunjee Choi; Keunhee Cho; Chanwoo Lee; | arxiv-cs.LG | 2024-10-15
116 | ED-ViT: Splitting Vision Transformer for Distributed Inference on Edge Devices. Highlight: However, their high computational demands and inference latency pose significant challenges for model deployment on resource-constrained edge devices. To address this issue, we propose a novel Vision Transformer splitting framework, ED-ViT, designed to execute complex models across multiple edge devices efficiently. | XIANG LIU et al. | arxiv-cs.CV | 2024-10-15
117 | SeaDATE: Remedy Dual-Attention Transformer with Semantic Alignment Via Contrast Learning for Multimodal Object Detection. Highlight: In this paper, we introduce an accurate and efficient object detection method named SeaDATE. | SHUHAN DONG et al. | arxiv-cs.CV | 2024-10-15
118 | Domain-Conditioned Transformer for Fully Test-time Adaptation. Highlight: We observe that, when applying a transformer network model to a new domain, the self-attention profiles of image samples in the target domain deviate significantly from those in the source domain, which results in large performance degradation during domain changes. To address this important issue, we propose a new structure for the self-attention modules in the transformer. | Yushun Tang; Shuoshuo Chen; Jiyuan Jia; Yi Zhang; Zhihai He; | arxiv-cs.CV | 2024-10-14
119 | Embedding Self-Correction As An Inherent Ability in Large Language Models for Enhanced Mathematical Reasoning. Highlight: However, LLMs often encounter difficulties in certain aspects of mathematical reasoning, leading to flawed reasoning and erroneous results. To mitigate these issues, we introduce a novel mechanism, the Chain of Self-Correction (CoSC), specifically designed to embed self-correction as an inherent ability in LLMs, enabling them to validate and rectify their own results. | Kuofeng Gao; Huanqia Cai; Qingyao Shuai; Dihong Gong; Zhifeng Li; | arxiv-cs.AI | 2024-10-14
120 | Rethinking Legal Judgement Prediction in A Realistic Scenario in The Era of Large Language Models. Highlight: The LLMs also provide explanations for their predictions. To evaluate the quality of these predictions and explanations, we introduce two human evaluation metrics: Clarity and Linking. | Shubham Kumar Nigam; Aniket Deroy; Subhankar Maity; Arnab Bhattacharya; | arxiv-cs.CL | 2024-10-14
121 | Performance in A Dialectal Profiling Task of LLMs for Varieties of Brazilian Portuguese Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The results offer sociolinguistic contributions toward an equity-fluent NLP technology. |
Raquel Meister Ko Freitag; Túlio Sousa de Gois; | arxiv-cs.CL | 2024-10-14 |
122 | Will LLMs Replace The Encoder-Only Models in Temporal Relation Classification? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate LLMs’ performance and decision process in the Temporal Relation Classification task. |
Gabriel Roccabruna; Massimo Rizzoli; Giuseppe Riccardi; | arxiv-cs.CL | 2024-10-14 |
123 | RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose RoCoFT, a parameter-efficient fine-tuning method for large-scale language models (LMs) based on updating only a few rows and columns of the weight matrices in transformers. |
Md Kowsher; Tara Esmaeilbeig; Chun-Nam Yu; Mojtaba Soltanalian; Niloofar Yousefi; | arxiv-cs.CL | 2024-10-13 |
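A rough way to picture the row-column update idea is to freeze a weight matrix and let gradients flow only through a few selected rows and columns via a gradient mask. The minimal sketch below uses plain PyTorch hooks and is an assumption about mechanics; the paper's actual parameterization and row/column selection may differ.

```python
import torch

def restrict_to_rows_cols(weight: torch.nn.Parameter, rows, cols):
    """Mask gradients so only the chosen rows and columns of `weight`
    are updated; every other entry stays effectively frozen."""
    mask = torch.zeros_like(weight)
    mask[rows, :] = 1.0
    mask[:, cols] = 1.0
    weight.register_hook(lambda grad: grad * mask)

# Toy usage: update only 2 rows and 2 columns of a linear layer's weight.
layer = torch.nn.Linear(16, 16)
restrict_to_rows_cols(layer.weight, rows=[0, 1], cols=[3, 7])

x = torch.randn(4, 16)
layer(x).pow(2).mean().backward()
# Rows 0 and 1 keep full gradients; other rows are nonzero only at cols 3 and 7.
print((layer.weight.grad != 0).sum().item(), "of", layer.weight.numel(), "entries trainable")
```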
124 | Evaluating Gender Bias of LLMs in Making Morality Judgements Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work investigates whether current closed and open-source LLMs possess gender bias, especially when asked to give moral opinions. To evaluate these models, we curate and introduce a new dataset GenMO (Gender-bias in Morality Opinions) comprising parallel short stories featuring male and female characters respectively. |
Divij Bajaj; Yuanyuan Lei; Jonathan Tong; Ruihong Huang; | arxiv-cs.CL | 2024-10-13 |
125 | Transformer-based Language Models for Reasoning in The Description Logic ALCQ Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this way, we systematically investigate the logical reasoning capabilities of a supervised fine-tuned DeBERTa-based model and two large language models (GPT-3.5, GPT-4) with few-shot prompting. |
Angelos Poulis; Eleni Tsalapati; Manolis Koubarakis; | arxiv-cs.CL | 2024-10-12 |
126 | \llinstruct: An Instruction-tuned Model for English Language Proficiency Assessments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present \llinstruct: An 8B instruction-tuned model that is designed to generate content for English Language Proficiency Assessments (ELPA) and related applications. |
Debanjan Ghosh; Sophia Chan; | arxiv-cs.CL | 2024-10-11 |
127 | Improving Legal Entity Recognition Using A Hybrid Transformer Model and Semantic Filtering Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel hybrid model that enhances the accuracy and precision of Legal-BERT, a transformer model fine-tuned for legal text processing, by introducing a semantic similarity-based filtering mechanism. |
Duraimurugan Rajamanickam; | arxiv-cs.CL | 2024-10-11 |
128 | Fine-Tuning In-House Large Language Models to Infer Differential Diagnosis from Radiology Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a pipeline for developing in-house LLMs tailored to identify differential diagnoses from radiology reports. |
LUOYAO CHEN et. al. | arxiv-cs.CL | 2024-10-11 |
129 | AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For instance, attacks tend to be less effective when models pay more attention to system prompts designed to ensure LLM safety alignment. Building on this discovery, we introduce an enhanced method that manipulates models’ attention scores to facilitate LLM jailbreaking, which we term AttnGCG. |
ZIJUN WANG et. al. | arxiv-cs.CL | 2024-10-11 |
130 | Extra Global Attention Designation Using Keyword Detection in Sparse Transformer Architectures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an extension to Longformer Encoder-Decoder, a popular sparse transformer architecture. |
Evan Lucas; Dylan Kangas; Timothy C Havens; | arxiv-cs.CL | 2024-10-11 |
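With Hugging Face's Longformer, granting extra tokens global attention amounts to setting positions in `global_attention_mask`. The sketch below builds such a mask from a naive keyword-id allowlist; that allowlist detector is a stand-in assumption, not the keyword detection method the paper proposes.

```python
import torch

def keyword_global_attention_mask(token_ids: torch.Tensor, keyword_ids) -> torch.Tensor:
    """Build a Longformer-style mask: 1 means the token gets global attention,
    0 means it keeps the default local windowed attention."""
    mask = torch.zeros_like(token_ids)
    for kw in keyword_ids:
        mask |= (token_ids == kw).long()
    mask[:, 0] = 1  # the first token conventionally keeps global attention
    return mask

# Toy usage with made-up vocabulary ids standing in for detected keywords.
ids = torch.tensor([[101, 2054, 7592, 2003, 7592, 102]])
mask = keyword_global_attention_mask(ids, keyword_ids=[7592])
print(mask)  # tensor([[1, 0, 1, 0, 1, 0]])
# With Hugging Face: model(input_ids=ids, global_attention_mask=mask)
```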
131 | Synth-SONAR: Sonar Image Synthesis with Enhanced Diversity and Realism Via Dual Diffusion Models and GPT Prompting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Traditional methods often rely on extensive and costly data collection using sonar sensors, jeopardizing data quality and diversity. To overcome these limitations, this study proposes a new sonar image synthesis framework, Synth-SONAR, leveraging diffusion models and GPT prompting. |
Purushothaman Natarajan; Kamal Basha; Athira Nambiar; | arxiv-cs.CV | 2024-10-11 |
132 | Hypothesis-only Biases in Large Language Model-Elicited Natural Language Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our analysis provides empirical evidence that well-attested biases in NLI can persist in LLM-generated data. |
Grace Proebsting; Adam Poliak; | arxiv-cs.CL | 2024-10-11 |
133 | Fine-Tuning Language Models for Ethical Ambiguity: A Comparative Study of Alignment with Human Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While morally clear scenarios are more discernible to LLMs, greater difficulty is encountered in morally ambiguous contexts. In this investigation, we explored LLM calibration to show that human and LLM judgments are poorly aligned in such scenarios. |
PRANAV SENTHILKUMAR et. al. | arxiv-cs.CL | 2024-10-10 |
134 | HorGait: A Hybrid Model for Accurate Gait Recognition in LiDAR Point Cloud Planar Projections Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a method named HorGait, which utilizes a hybrid model with a Transformer architecture for gait recognition on the planar projection of 3D point clouds from LiDAR. |
JIAXING HAO et. al. | arxiv-cs.CV | 2024-10-10 |
135 | Evaluating Transformer Models for Suicide Risk Detection on Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a study on leveraging state-of-the-art natural language processing solutions for identifying suicide risk in social media posts as a submission for the IEEE BigData 2024 Cup: Detection of Suicide Risk on Social Media conducted by the kubapok team. |
Jakub Pokrywka; Jeremi I. Kaczmarek; Edward J. Gorzelańczyk; | arxiv-cs.CL | 2024-10-10 |
136 | VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce VibeCheck, a system for automatically comparing a pair of LLMs by discovering identifying traits of a model (vibes) that are well-defined, differentiating, and user-aligned. |
Lisa Dunlap; Krishna Mandal; Trevor Darrell; Jacob Steinhardt; Joseph E Gonzalez; | arxiv-cs.CL | 2024-10-10 |
137 | The Rise of AI-Generated Content in Wikipedia Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We use GPTZero, a proprietary AI detector, and Binoculars, an open-source alternative, to establish lower bounds on the presence of AI-generated content in recently created Wikipedia pages. |
Creston Brooks; Samuel Eggert; Denis Peskoff; | arxiv-cs.CL | 2024-10-10 |
138 | Robust AI-Generated Text Detection By Restricted Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on the robustness of classifier-based detectors of AI-generated text, namely their ability to transfer to unseen generators or semantic domains. |
KRISTIAN KUZNETSOV et. al. | arxiv-cs.CL | 2024-10-10 |
139 | SWE-Bench+: Enhanced Coding Benchmark for LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, a systematic evaluation of the quality of SWE-bench remains missing. In this paper, we addressed this gap by presenting an empirical analysis of the SWE-bench dataset. |
REEM ALEITHAN et. al. | arxiv-cs.SE | 2024-10-09 |
140 | Stanceformer: Target-Aware Transformer for Stance Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, these models yield similar performance regardless of whether we utilize or disregard target information, undermining the task’s significance. To address this challenge, we introduce Stanceformer, a target-aware transformer model that incorporates enhanced attention towards the targets during both training and inference. |
Krishna Garg; Cornelia Caragea; | arxiv-cs.CL | 2024-10-09 |
141 | Optimized Spatial Architecture Mapping Flow for Transformer Accelerators Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the design process for existing spatial architectures is predominantly manual, and it often involves time-consuming redesigns for new applications and new problem dimensions, which greatly limits the development of optimally designed accelerators for Transformer models. To address these challenges, we propose SAMT (Spatial Architecture Mapping for Transformers), a comprehensive framework designed to optimize the dataflow mapping of Transformer inference workloads onto spatial accelerators. |
HAOCHENG XU et. al. | arxiv-cs.AR | 2024-10-09 |
142 | SparseGrad: A Selective Method for Efficient Fine-tuning of MLP Layers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a new selective PEFT method, namely SparseGrad, that performs well on MLP blocks. |
VIKTORIIA CHEKALINA et. al. | arxiv-cs.CL | 2024-10-09 |
143 | InAttention: Linear Context Scaling for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we modify the decoder-only transformer, replacing self-attention with InAttention, which scales linearly with context length during inference by having tokens attend only to initial states. |
Joseph Eisner; | arxiv-cs.LG | 2024-10-09 |
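Having tokens attend only to a fixed set of initial states makes the per-token attention cost independent of sequence length, which is where the linear scaling comes from. Below is a minimal single-head sketch under that reading; how InAttention selects and maintains the initial states follows the paper and is not reproduced here.

```python
import torch
import torch.nn.functional as F

def in_attention(q, k, v, num_initial: int):
    """Every query attends only to the first `num_initial` key/value states,
    so cost per token is constant in context length. Shapes: (batch, seq, dim)."""
    k0, v0 = k[:, :num_initial], v[:, :num_initial]
    scores = q @ k0.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return F.softmax(scores, dim=-1) @ v0

q, k, v = (torch.randn(2, 128, 64) for _ in range(3))
print(in_attention(q, k, v, num_initial=16).shape)  # torch.Size([2, 128, 64])
```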
144 | Honesty to Subterfuge: In-Context Reinforcement Learning Can Make Honest Models Reward Hack Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that gpt-4o, gpt-4o-mini, o1-preview, and o1-mini – frontier models trained to be helpful, harmless, and honest – can engage in specification gaming without training on a curriculum of tasks, purely from in-context iterative reflection (which we call in-context reinforcement learning, ICRL). |
Leo McKee-Reid; Christoph Sträter; Maria Angelica Martinez; Joe Needham; Mikita Balesni; | arxiv-cs.AI | 2024-10-08 |
145 | A Comparative Study of Hybrid Models in Health Misinformation Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates the effectiveness of machine learning (ML) and deep learning (DL) models in detecting COVID-19-related misinformation on online social networks (OSNs), aiming to develop more effective tools for countering the spread of health misinformation during the pandemic. |
Mkululi Sikosana; Oluwaseun Ajao; Sean Maudsley-Barton; | arxiv-cs.IR | 2024-10-08 |
146 | A GPT-based Decision Transformer for Multi-Vehicle Coordination at Unsignalized Intersections Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the application of the Decision Transformer, a decision-making algorithm based on the Generative Pre-trained Transformer (GPT) architecture, to multi-vehicle coordination at unsignalized intersections. |
Eunjae Lee; Minhee Kang; Yoojin Choi; Heejin Ahn; | arxiv-cs.RO | 2024-10-08 |
147 | Unveiling Transformer Perception By Exploring Input Manifolds Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a general method for the exploration of equivalence classes in the input space of Transformer models. |
Alessandro Benfenati; Alfio Ferrara; Alessio Marta; Davide Riva; Elisabetta Rocchetti; | arxiv-cs.LG | 2024-10-08 |
148 | Solving Multi-Goal Robotic Tasks with Decision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, no existing methods effectively combine offline training, multi-goal learning, and transformer-based architectures. In this paper, we address these challenges by introducing a novel adaptation of the decision transformer architecture for offline multi-goal reinforcement learning in robotics. |
Paul Gajewski; Dominik Żurek; Marcin Pietroń; Kamil Faber; | arxiv-cs.RO | 2024-10-08 |
149 | SC-Bench: A Large-Scale Dataset for Smart Contract Auditing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present SC-Bench, the first dataset for automated smart-contract auditing research. |
Shihao Xia; Mengting He; Linhai Song; Yiying Zhang; | arxiv-cs.CR | 2024-10-08 |
150 | Make-it-Real: Unleashing Large Multimodal Model for Painting 3D Objects with Realistic Materials Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we exploit advancements in Multimodal Large Language Models (MLLMs), particularly GPT-4V, to present a novel approach, Make-it-Real: 1) We demonstrate that GPT-4V can effectively recognize and describe materials, allowing the construction of a detailed material library. |
YE FANG et. al. | nips | 2024-10-07 |
151 | Timer-XL: Long-Context Transformers for Unified Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Timer-XL, a generative Transformer for unified time series forecasting. |
Yong Liu; Guo Qin; Xiangdong Huang; Jianmin Wang; Mingsheng Long; | arxiv-cs.LG | 2024-10-07 |
152 | RL-GPT: Integrating Reinforcement Learning and Code-as-policy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To seamlessly integrate both modalities, we introduce a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent. |
SHAOTENG LIU et. al. | nips | 2024-10-07 |
153 | APIGen: Automated PIpeline for Generating Verifiable and Diverse Function-Calling Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents APIGen, an automated data generation pipeline designed to produce verifiable high-quality datasets for function-calling applications. |
ZUXIN LIU et. al. | nips | 2024-10-07 |
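The emphasis on verifiability means each generated function call can be checked mechanically before it enters the dataset. Below is a simplified two-stage filter (format check, then execution check) over a toy tool registry; the JSON schema and `TOOLS` registry are illustrative assumptions, and the actual APIGen pipeline adds a semantic-verification stage.

```python
import json

TOOLS = {"add": lambda a, b: a + b}  # toy tool registry (assumption)

def format_check(sample: str):
    """Stage 1: the sample must parse as JSON and name a known tool."""
    try:
        call = json.loads(sample)
    except json.JSONDecodeError:
        return None
    return call if call.get("name") in TOOLS else None

def execution_check(call: dict) -> bool:
    """Stage 2: the call must execute against the registry without raising."""
    try:
        TOOLS[call["name"]](**call.get("arguments", {}))
        return True
    except Exception:
        return False

raw = [
    '{"name": "add", "arguments": {"a": 1, "b": 2}}',  # passes both stages
    '{"name": "add", "arguments": {"a": 1}}',          # fails execution
    "not json at all",                                 # fails format
]
kept = [s for s in raw if (c := format_check(s)) and execution_check(c)]
print(len(kept), "of", len(raw), "samples kept")  # 1 of 3 samples kept
```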
154 | Achieving Efficient Alignment Through Learned Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Aligner, a novel and simple alignment paradigm that learns the correctional residuals between preferred and dispreferred answers using a small model. |
JIAMING JI et. al. | nips | 2024-10-07 |
155 | DeformableTST: Transformer for Time Series Forecasting Without Over-reliance on Patching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: But at the same time, we observe a new problem: recent Transformer-based models are overly reliant on patching to achieve ideal performance, which limits their applicability to forecasting tasks unsuitable for patching. In this paper, we intend to address this emerging issue. |
Donghao Luo; Xue Wang; | nips | 2024-10-07 |
156 | HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Identifying optimal model configurations under specific hardware constraints is becoming essential but remains challenging due to the computational load of exhaustive training and evaluation on multiple devices. To address this, we introduce HW-GPT-Bench, a hardware-aware benchmark that utilizes surrogate predictions to approximate various hardware metrics across 13 devices for architectures in the GPT-2 family, with architectures containing up to 774M parameters. |
RHEA SUKTHANKER et. al. | nips | 2024-10-07 |
157 | Transforming Vision Transformer: Towards Efficient Multi-Task Asynchronous Learner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Efficient Multi-Task Learning (EMTAL), a novel approach that transforms a pre-trained Transformer into an efficient multi-task learner during training, and reparameterizes the knowledge back to the original Transformer for efficient inference. |
Hanwen Zhong; Jiaxin Chen; Yutong Zhang; Di Huang; Yunhong Wang; | nips | 2024-10-07 |
158 | SAND: Smooth Imputation of Sparse and Noisy Functional Data with Transformer Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although the transformer architecture has come to dominate other models for text and image data, its application to irregularly-spaced longitudinal data has been limited. We introduce a variant of the transformer that enables it to more smoothly impute such functional data. |
Ju-Sheng Hong; Junwen Yao; Jonas Mueller; Jane-Ling Wang; | nips | 2024-10-07 |
159 | ETO: Efficient Transformer-based Local Feature Matching By Organizing Multiple Homography Hypotheses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an efficient transformer-based network architecture for local feature matching. |
JUNJIE NI et. al. | nips | 2024-10-07 |
160 | FinBen: An Holistic Financial Benchmark for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce FinBen, the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks, covering seven critical aspects: information extraction (IE), textual analysis, question answering (QA), text generation, risk management, forecasting, and decision-making. |
QIANQIAN XIE et. al. | nips | 2024-10-07 |
161 | Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a method that is able to distill a pre-trained Transformer architecture into alternative architectures such as state space models (SSMs). |
Aviv Bick; Kevin Li; Eric Xing; J. Zico Kolter; Albert Gu; | nips | 2024-10-07 |
162 | LSH-MoE: Communication-efficient MoE Training Via Locality-Sensitive Hashing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose LSH-MoE, a communication-efficient MoE training framework using locality-sensitive hashing (LSH). |
XIAONAN NIE et. al. | nips | 2024-10-07 |
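Locality-sensitive hashing groups similar token representations into buckets so that fewer distinct messages need to cross the network during expert routing. The sketch below uses the classic random-hyperplane LSH family; whether LSH-MoE uses exactly this family and bucketing scheme is an assumption.

```python
import torch

def lsh_buckets(tokens: torch.Tensor, num_planes: int = 8, seed: int = 0):
    """Hash (n, d) token embeddings with random hyperplanes: tokens on the
    same side of every plane share a bucket and could be communicated once."""
    g = torch.Generator().manual_seed(seed)
    planes = torch.randn(tokens.shape[-1], num_planes, generator=g)
    bits = (tokens @ planes > 0).long()      # (n, num_planes) sign bits
    weights = 2 ** torch.arange(num_planes)  # pack sign bits into bucket ids
    return (bits * weights).sum(dim=-1)

tokens = torch.randn(1024, 64)
buckets = lsh_buckets(tokens)
print(buckets.unique().numel(), "buckets for", len(tokens), "tokens")
```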
163 | Transformers Learn Variable-order Markov Chains In-context Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the ICL of VOMC by viewing language modeling as a form of data compression and focus on small alphabets and low-order VOMCs. |
Ruida Zhou; Chao Tian; Suhas Diggavi; | arxiv-cs.LG | 2024-10-07 |
164 | SELF-DISCOVER: Large Language Models Self-Compose Reasoning Structures IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the task-intrinsic reasoning structures to tackle complex reasoning problems that are challenging for typical prompting methods. |
PEI ZHOU et. al. | nips | 2024-10-07 |
165 | The Mamba in The Llama: Distilling and Accelerating Hybrid Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recent research suggests that state-space models (SSMs) like Mamba can be competitive with Transformer models for language modeling with advantageous deployment characteristics. Given the focus and expertise on training large-scale Transformer models, we consider the challenge of converting these pretrained models into SSMs for deployment. |
Junxiong Wang; Daniele Paliotta; Avner May; Alexander Rush; Tri Dao; | nips | 2024-10-07 |
166 | Accelerating Transformers with Spectrum-Preserving Token Merging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Prior work has proposed algorithms based on Bipartite Soft Matching (BSM), which divides tokens into distinct sets and merges the top-k similar tokens. |
CHAU TRAN et. al. | nips | 2024-10-07 |
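Bipartite Soft Matching, the prior work referenced in this highlight, alternately assigns tokens to two sets, matches each token in one set to its most similar token in the other, and merges the most similar pairs by averaging. A compact sketch of that baseline recipe follows (duplicate matches are handled naively here); the paper's spectrum-preserving variant modifies this procedure.

```python
import torch

def bipartite_soft_matching(x: torch.Tensor, r: int) -> torch.Tensor:
    """Merge r token pairs from (n, d) tokens. Even-indexed tokens form set A,
    odd-indexed set B; each A token is matched with its most cosine-similar
    B token, and the r best-matched pairs are merged by averaging."""
    a, b = x[0::2], x[1::2]
    an = a / a.norm(dim=-1, keepdim=True)
    bn = b / b.norm(dim=-1, keepdim=True)
    sim = an @ bn.T                     # (|A|, |B|) cosine similarities
    best_val, best_b = sim.max(dim=-1)  # best partner in B for each A token
    merge_a = best_val.topk(r).indices  # the r most similar A tokens
    keep_a = torch.ones(len(a), dtype=torch.bool)
    keep_a[merge_a] = False
    b = b.clone()
    # Naive merge by mean; real implementations reduce duplicates via scatter.
    b[best_b[merge_a]] = (b[best_b[merge_a]] + a[merge_a]) / 2
    return torch.cat([a[keep_a], b], dim=0)

tokens = torch.randn(196, 768)  # e.g. ViT patch tokens
print(bipartite_soft_matching(tokens, r=16).shape)  # torch.Size([180, 768])
```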
167 | Limits of Transformer Language Models on Learning to Compose Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we evaluate training LLaMA models and prompting GPT-4 and Gemini on four tasks that demand learning a composition of several discrete sub-tasks. |
JONATHAN THOMM et. al. | nips | 2024-10-07 |
168 | Does RoBERTa Perform Better Than BERT in Continual Learning: An Attention Sink Perspective Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we observe that pre-trained models may allocate high attention scores to some ‘sink’ tokens, such as [SEP] tokens, which are ubiquitous across various tasks. |
Xueying Bai; Yifan Sun; Niranjan Balasubramanian; | arxiv-cs.LG | 2024-10-07 |
169 | SemCoder: Training Code Language Models with Comprehensive Semantics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to bridge the gap between Code LLMs’ reliance on static text data and the need for thorough semantic understanding for complex tasks like debugging and program repair. |
YANGRUIBO DING et. al. | nips | 2024-10-07 |
170 | LLaVA Needs More Knowledge: Retrieval Augmented Natural Language Generation with Knowledge Graph for Explaining Thoracic Pathologies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The KG datastore is designed as a plug-and-play module, allowing for seamless integration with various model architectures. We introduce and evaluate three distinct frameworks within this paradigm: KG-LLaVA, which integrates the pre-trained LLaVA model with KG-RAG; Med-XPT, a custom framework combining MedCLIP, a transformer-based projector, and GPT-2; and Bio-LLaVA, which adapts LLaVA by incorporating the Bio-ViT-L vision model. |
Ameer Hamza; Yong Hyun Ahn; Sungyoung Lee; Seong Tae Kim; | arxiv-cs.CV | 2024-10-07 |
171 | OccamLLM: Fast and Exact Language Model Arithmetic in A Single Step Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a framework that enables exact arithmetic in a single autoregressive step, providing faster, more secure, and more interpretable LLM systems with arithmetic capabilities. |
OWEN DUGAN et. al. | nips | 2024-10-07 |
172 | Weak-to-Strong Search: Align Large Language Models Via Searching Over Small Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce weak-to-strong search, framing the alignment of a large language model as a test-time greedy search to maximize the log-likelihood difference between small tuned and untuned models while sampling from the frozen large model. |
ZHANHUI ZHOU et. al. | nips | 2024-10-07 |
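The search objective can be read as reranking candidate continuations from the frozen large model by the log-likelihood gap between a small tuned and a small untuned model. In the sketch below, `propose`, `logp_tuned`, and `logp_untuned` are hypothetical callables standing in for the three models; the toy stand-ins at the end only exercise the control flow.

```python
import random

def weak_to_strong_step(propose, logp_tuned, logp_untuned, prefix, k=8):
    """One greedy search step: the frozen large model proposes k continuations
    of `prefix`; keep the one maximizing log p_tuned - log p_untuned under
    the two small models, which serves as the alignment signal."""
    candidates = [propose(prefix) for _ in range(k)]
    return max(candidates,
               key=lambda c: logp_tuned(prefix, c) - logp_untuned(prefix, c))

# Toy stand-ins just to exercise the control flow: these scores make the
# log-likelihood gap grow with length, so the longest continuation wins.
best = weak_to_strong_step(
    propose=lambda p: p + random.choice([" yes", " maybe", " no"]),
    logp_tuned=lambda p, c: -len(c),
    logp_untuned=lambda p, c: -2.0 * len(c),
    prefix="Answer:",
)
print(best)
```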
173 | Alleviating Distortion in Image Generation Via Multi-Resolution Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents innovative enhancements to diffusion models by integrating a novel multi-resolution network and time-dependent layer normalization. |
QIHAO LIU et. al. | nips | 2024-10-07 |
174 | Unraveling The Gradient Descent Dynamics of Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While the Transformer architecture has achieved remarkable success across various domains, a thorough theoretical foundation explaining its optimization dynamics is yet to be fully developed. In this study, we aim to bridge this understanding gap by answering the following two core questions: (1) Which types of Transformer architectures allow Gradient Descent (GD) to achieve guaranteed convergence? |
Bingqing Song; Boran Han; Shuai Zhang; Jie Ding; Mingyi Hong; | nips | 2024-10-07 |
175 | Finding Transformer Circuits With Edge Pruning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we frame circuit discovery as an optimization problem and propose _Edge Pruning_ as an effective and scalable solution. |
Adithya Bhaskar; Alexander Wettig; Dan Friedman; Danqi Chen; | nips | 2024-10-07 |
176 | In-Context Learning State Vector with Inner and Momentum Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the working mechanisms and optimization of these vectors are yet to be thoroughly explored. In this paper, we address this gap by presenting a comprehensive analysis of these compressed vectors, drawing parallels to the parameters trained with gradient descent, and introducing the concept of state vector. |
DONGFANG LI et. al. | nips | 2024-10-07 |
177 | Permutation-Invariant Autoregressive Diffusion for Graph Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce PARD, a Permutation-invariant Auto Regressive Diffusion model that integrates diffusion models with autoregressive methods. |
Lingxiao Zhao; Xueying Ding; Leman Akoglu; | nips | 2024-10-07 |
178 | VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel approach to reduce vision compute by leveraging redundant vision tokens “skipping layers” rather than decreasing the number of vision tokens. |
SHIWEI WU et. al. | nips | 2024-10-07 |
179 | InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, this research advances the patch division paradigm by introducing a novel extension: dynamic resolution with automatic patch configuration. |
XIAOYI DONG et. al. | nips | 2024-10-07 |
180 | SpikedAttention: Training-Free and Fully Spike-Driven Transformer-to-SNN Conversion with Winner-Oriented Spike Shift for Softmax Operation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel transformer-to-SNN conversion method that outputs an end-to-end spike-based transformer, named SpikedAttention. |
Sangwoo Hwang; Seunghyun Lee; Dahoon Park; Donghun Lee; Jaeha Kung; | nips | 2024-10-07 |
181 | Predicting Scaling Laws with Statistical and Approximation Theory for Transformer Neural Networks on Intrinsically Low-dimensional Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, despite sustained widespread interest, a rigorous understanding of why transformer scaling laws exist is still missing. To answer this question, we establish novel statistical estimation and mathematical approximation theories for transformers when the input data are concentrated on a low-dimensional manifold. |
Alexander Havrilla; Wenjing Liao; | nips | 2024-10-07 |
182 | Can Large Language Models Explore In-context? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making. |
Akshay Krishnamurthy; Keegan Harris; Dylan J Foster; Cyril Zhang; Aleksandrs Slivkins; | nips | 2024-10-07 |
183 | Narrative-of-Thought: Improving Temporal Reasoning of Large Language Models Via Recounted Narratives Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a new prompting technique tailored for temporal reasoning, Narrative-of-Thought (NoT), that first converts the events set to a Python class, then prompts a small model to generate a temporally grounded narrative, guiding the final generation of a temporal graph. |
Xinliang Frederick Zhang; Nick Beauchamp; Lu Wang; | arxiv-cs.CL | 2024-10-07 |
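The first NoT step, rendering the event set as a Python class before prompting, is plain string construction. One plausible rendering is sketched below; the class schema and prompt wording are assumptions, not the paper's templates.

```python
def events_to_class(events):
    """Render an event set as Python class source, the structured form this
    sketch assumes NoT feeds to the model before asking for a narrative."""
    lines = ["class EventSet:"]
    lines += [f"    e{i} = {name!r}" for i, name in enumerate(events)]
    return "\n".join(lines)

def not_prompt(events):
    return (
        events_to_class(events)
        + "\n\nWrite a short narrative recounting these events in a "
        "temporally coherent order, then output the temporal graph as edges "
        "'e_i -> e_j' meaning e_i happens before e_j."
    )

print(not_prompt(["board the plane", "buy a ticket", "land"]))
```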
184 | Understanding Transformers Via N-Gram Statistics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper takes a first step in this direction by considering families of functions (i.e. rules) formed out of simple N-gram based statistics of the training data. By studying how well these rulesets approximate transformer predictions, we obtain a variety of novel discoveries: a simple method to detect overfitting during training without using a holdout set, a quantitative measure of how transformers progress from learning simple to more complex statistical rules over the course of training, a model-variance criterion governing when transformer predictions tend to be described by N-gram rules, and insights into how well transformers can be approximated by N-gram rulesets in the limit where these rulesets become increasingly complex. |
Timothy Nguyen; | nips | 2024-10-07 |
185 | DenseFormer: Enhancing Information Flow in Transformers Via Depth Weighted Averaging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The transformer architecture by Vaswani et al. (2017) is now ubiquitous across application domains, from natural language processing to speech processing and image understanding. We propose DenseFormer, a simple modification to the standard architecture that improves the perplexity of the model without increasing its size—adding a few thousand parameters for large-scale models in the 100B parameters range. |
Matteo Pagliardini; Amirkeivan Mohtashami; François Fleuret; Martin Jaggi; | nips | 2024-10-07 |
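Depth weighted averaging replaces the plain residual stream: after each block, the representation becomes a learned weighted average over the embeddings and every earlier block output. A minimal sketch around arbitrary block modules, initialized so it starts as an ordinary stack; the paper's dilated and grouped weight variants are omitted.

```python
import torch

class DenseFormerStack(torch.nn.Module):
    """Blocks whose inputs are depth-weighted averages (DWA) of the
    embedding and every earlier block output, not just the last state."""
    def __init__(self, blocks):
        super().__init__()
        self.blocks = torch.nn.ModuleList(blocks)
        # alphas[i] weighs the i + 2 states available after block i.
        self.alphas = torch.nn.ParameterList(
            [torch.nn.Parameter(torch.zeros(i + 2)) for i in range(len(blocks))]
        )
        for i, a in enumerate(self.alphas):
            a.data[i + 1] = 1.0  # identity init: behaves like a plain stack

    def forward(self, x):
        states = [x]
        for block, alpha in zip(self.blocks, self.alphas):
            states.append(block(states[-1]))
            stacked = torch.stack(states)  # (num_states, batch, dim)
            w = alpha.view(-1, *([1] * (stacked.dim() - 1)))
            states[-1] = (w * stacked).sum(dim=0)  # DWA feeds the next block
        return states[-1]

stack = DenseFormerStack([torch.nn.Linear(32, 32) for _ in range(4)])
print(stack(torch.randn(2, 32)).shape)  # torch.Size([2, 32])
```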
186 | MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by the empirical findings, we propose MAGIS, a novel LLM-based Multi-Agent framework for GitHub Issue reSolution, consisting of four agents customized for software evolution: Manager, Repository Custodian, Developer, and Quality Assurance Engineer agents. |
WEI TAO et. al. | nips | 2024-10-07 |
187 | Approximation Rate of The Transformer Architecture for Sequence Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the approximation rate for single-layer Transformers with one head. |
Haotian Jiang; Qianxiao Li; | nips | 2024-10-07 |
188 | M³GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents M³GPT, an advanced Multimodal, Multitask framework for Motion comprehension and generation. |
MINGSHUANG LUO et. al. | nips | 2024-10-07 |
189 | A Critical Evaluation of AI Feedback for Aligning Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that the improvements of the RL step are virtually entirely due to the widespread practice of using a weaker teacher model (e.g. GPT-3.5) for SFT data collection than the critic (e.g., GPT-4) used for AI feedback generation. |
ARCHIT SHARMA et. al. | nips | 2024-10-07 |
190 | Differential Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Diff Transformer, which amplifies attention to the relevant context while canceling noise. |
TIANZHU YE et. al. | arxiv-cs.CL | 2024-10-07 |
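The mechanism computes two softmax attention maps and subtracts a scaled copy of one from the other, so attention noise common to both maps cancels. Below is a single-head sketch with a fixed lambda, assuming the formulation softmax(Q1K1^T/sqrt(d)) - lambda * softmax(Q2K2^T/sqrt(d)); the learnable reparameterization of lambda and the multi-head normalization are omitted.

```python
import torch
import torch.nn.functional as F

def diff_attention(x, wq, wk, wv, lam: float = 0.5):
    """Single-head differential attention: queries and keys are split into
    two halves, and the output uses the difference of the two softmax maps."""
    d = wq.shape[1] // 2
    q1, q2 = (x @ wq).split(d, dim=-1)
    k1, k2 = (x @ wk).split(d, dim=-1)
    a1 = F.softmax(q1 @ k1.transpose(-2, -1) / d**0.5, dim=-1)
    a2 = F.softmax(q2 @ k2.transpose(-2, -1) / d**0.5, dim=-1)
    return (a1 - lam * a2) @ (x @ wv)  # common-mode attention noise cancels

dm = 64
x = torch.randn(2, 16, dm)
out = diff_attention(x, torch.randn(dm, dm), torch.randn(dm, dm), torch.randn(dm, dm))
print(out.shape)  # torch.Size([2, 16, 64])
```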
191 | Perception of Knowledge Boundary for Large Language Models Through Semi-open-ended Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we perceive the LLMs’ knowledge boundary with semi-open-ended questions by discovering more ambiguous answers. |
ZHIHUA WEN et. al. | nips | 2024-10-07 |
192 | UrbanKGent: A Unified Large Language Model Agent Framework for Urban Knowledge Graph Construction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents UrbanKGent, a unified large language model agent framework, for urban knowledge graph construction. |
Yansong Ning; Hao Liu; | nips | 2024-10-07 |
193 | JiuZhang3.0: Efficiently Improving Mathematical Reasoning By Training Small Data Synthesis Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To reduce the cost, based on open-source available texts, we propose an efficient way that trains a small LLM for math problem synthesis, to efficiently generate sufficient high-quality pre-training data. |
KUN ZHOU et. al. | nips | 2024-10-07 |
194 | Visual Autoregressive Modeling: Scalable Image Generation Via Next-Scale Prediction IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Visual AutoRegressive modeling (VAR), a new generation paradigm that redefines the autoregressive learning on images as coarse-to-fine next-scale prediction or next-resolution prediction, diverging from the standard raster-scan next-token prediction. |
Keyu Tian; Yi Jiang; Zehuan Yuan; BINGYUE PENG; Liwei Wang; | nips | 2024-10-07 |
195 | Leveraging Free Energy in Pretraining Model Selection for Improved Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a Bayesian model selection criterion, called the downstream free energy, which quantifies a checkpoint’s adaptability by measuring the concentration of nearby favorable parameters for the downstream task. |
Michael Munn; Susan Wei; | arxiv-cs.LG | 2024-10-07 |
196 | Seshat Global History Databank Text Dataset and Benchmark of Large Language Models’ History Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This benchmarking is particularly challenging, given that human knowledge of history is inherently unbalanced, with more information available on Western history and recent periods. To address this challenge, we introduce a curated sample of the Seshat Global History Databank, which provides a structured representation of human historical knowledge, containing 36,000 data points across 600 historical societies and over 600 scholarly references. |
JAKOB HAUSER et. al. | nips | 2024-10-07 |
197 | Query-Based Adversarial Prompt Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We improve on prior work with a query-based attack that leverages API access to a remote language model to construct adversarial examples that cause the model to emit harmful strings with (much) higher probability than with transfer-only attacks. |
Jonathan Hayase; Ema Borevković; Nicholas Carlini; Florian Tramer; Milad Nasr; | nips | 2024-10-07 |
198 | DAMRO: Dive Into The Attention Mechanism of LVLM to Reduce Object Hallucination Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the issue, we propose DAMRO, a novel training-free strategy that Dives into the Attention Mechanism of LVLM to Reduce Object Hallucination. |
Xuan Gong; Tianshi Ming; Xinpeng Wang; Zhihua Wei; | arxiv-cs.CL | 2024-10-06 |
199 | ProtocoLLM: Automatic Evaluation Framework of LLMs on Domain-Specific Scientific Protocol Formulation Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we propose a flexible, automatic framework to evaluate LLM’s capability on SPFT: ProtocoLLM. |
Seungjun Yi; Jaeyoung Lim; Juyong Yoon; | arxiv-cs.CL | 2024-10-06 |
200 | Fundamental Limitations on Subquadratic Alternatives to Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For instance, state space models such as Mamba were designed to replace attention with an almost linear time alternative. In this paper, we prove that any such approach cannot perform important tasks that Transformer is able to perform (assuming a popular conjecture from fine-grained complexity theory). |
Josh Alman; Hantao Yu; | arxiv-cs.LG | 2024-10-05 |
201 | Equivariant Neural Functional Networks for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While NFN have been extensively developed for MLP and CNN, no prior work has addressed their design for transformers, despite the importance of transformers in modern deep learning. This paper aims to address this gap by providing a systematic study of NFN for transformers. |
VIET-HOANG TRAN et. al. | arxiv-cs.LG | 2024-10-05 |
202 | Still Not Quite There! Evaluating Large Language Models for Comorbid Mental Health Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce ANGST, a novel, first-of-its-kind benchmark for depression-anxiety comorbidity classification from social media posts. |
AMEY HENGLE et. al. | arxiv-cs.CL | 2024-10-04 |
203 | Selective Transformer for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing Transformer models face two key challenges when dealing with HSI scenes characterized by diverse land cover types and rich spectral information: (1) fixed receptive field representation overlooks effective contextual information; (2) redundant self-attention feature representation. To address these limitations, we propose a novel Selective Transformer (SFormer) for HSI classification. |
Yichu Xu; Di Wang; Lefei Zhang; Liangpei Zhang; | arxiv-cs.CV | 2024-10-04 |
204 | How Language Models Prioritize Contextual Grammatical Cues? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate how language models handle gender agreement when multiple gender cue words are present, each capable of independently disambiguating a target gender pronoun. |
Hamidreza Amirzadeh; Afra Alishahi; Hosein Mohebbi; | arxiv-cs.CL | 2024-10-04 |
205 | Learning Semantic Structure Through First-Order-Logic Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study whether transformer-based language models can extract predicate argument structure from simple sentences. |
Akshay Chaturvedi; Nicholas Asher; | arxiv-cs.CL | 2024-10-04 |
206 | Dynamic Diffusion Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, we introduce a Timestep-wise Dynamic Width (TDW) approach that adapts model width conditioned on the generation timesteps. |
WANGBO ZHAO et. al. | arxiv-cs.CV | 2024-10-04 |
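Timestep-wise Dynamic Width can be pictured as gating part of the channel dimension as a function of the diffusion timestep. The toy sketch below maps a sinusoidal timestep embedding to soft per-channel gates; the actual method uses structured head and channel selection rather than this soft gating, so treat it purely as an illustration.

```python
import torch

class TimestepWidthGate(torch.nn.Module):
    """Toy timestep-wise dynamic width: a tiny MLP turns the timestep into
    per-channel gates, shrinking effective width at some timesteps."""
    def __init__(self, channels: int, temb_dim: int = 32):
        super().__init__()
        self.temb_dim = temb_dim
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(temb_dim, channels), torch.nn.Sigmoid()
        )

    def forward(self, h: torch.Tensor, t: torch.Tensor):
        # Simple sinusoidal timestep embedding.
        freqs = torch.exp(torch.linspace(0, 4, self.temb_dim // 2))
        temb = torch.cat([torch.sin(t[:, None] * freqs),
                          torch.cos(t[:, None] * freqs)], dim=-1)
        gates = self.mlp(temb)        # (batch, channels) values in (0, 1)
        return h * gates[:, None, :]  # gate each channel of (batch, n, c)

gate = TimestepWidthGate(channels=64)
h = torch.randn(2, 16, 64)
print(gate(h, torch.tensor([10.0, 500.0])).shape)  # torch.Size([2, 16, 64])
```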
207 | Towards Linguistically-Aware and Language-Independent Tokenization for Large Language Models (LLMs) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive study on the tokenization techniques employed by state-of-the-art large language models (LLMs) and their implications on the cost and availability of services across different languages, especially low resource languages. |
Abrar Rahman; Garry Bowlin; Binit Mohanty; Sean McGunigal; | arxiv-cs.CL | 2024-10-04 |
208 | AlphaIntegrator: Transformer Action Search for Symbolic Integration Proofs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first correct-by-construction learning-based system for step-by-step mathematical integration. |
Mert Ünsal; Timon Gehr; Martin Vechev; | arxiv-cs.LG | 2024-10-03 |
209 | CulturalBench: A Robust, Diverse and Challenging Benchmark on Measuring The (Lack Of) Cultural Knowledge of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CulturalBench: a set of 1,227 human-written and human-verified questions for effectively assessing LLMs’ cultural knowledge, covering 45 global regions including the underrepresented ones like Bangladesh, Zimbabwe, and Peru. |
YU YING CHIU et. al. | arxiv-cs.CL | 2024-10-03 |
210 | SIEVE: General Purpose Data Filtering System Matching GPT-4o Accuracy at 1% The Cost Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes SIEVE, a lightweight alternative that matches GPT-4o accuracy at a fraction of the cost. |
Jifan Zhang; Robert Nowak; | arxiv-cs.CL | 2024-10-03 |
211 | IndicSentEval: How Effectively Do Multilingual Transformer Models Encode Linguistic Properties for Indic Languages? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate similar questions regarding encoding capability and robustness for 8 linguistic properties across 13 different perturbations in 6 Indic languages, using 9 multilingual Transformer models (7 universal and 2 Indic-specific). |
Akhilesh Aravapalli; Mounika Marreddy; Subba Reddy Oota; Radhika Mamidi; Manish Gupta; | arxiv-cs.CL | 2024-10-03 |
212 | Intrinsic Evaluation of RAG Systems for Deep-Logic Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the Overall Performance Index (OPI), an intrinsic metric to evaluate retrieval-augmented generation (RAG) mechanisms for applications involving deep-logic queries. |
Junyi Hu; You Zhou; Jie Wang; | arxiv-cs.AI | 2024-10-03 |
213 | AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose AutoDAN-Turbo, a black-box jailbreak method that can automatically discover as many jailbreak strategies as possible from scratch, without any human intervention or predefined scopes (e.g., specified candidate strategies), and use them for red-teaming. |
XIAOGENG LIU et. al. | arxiv-cs.CR | 2024-10-03 |
214 | Coal Mining Question Answering with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel approach to coal mining question answering (QA) using large language models (LLMs) combined with tailored prompt engineering techniques. |
Antonio Carlos Rivera; Anthony Moore; Steven Robinson; | arxiv-cs.CL | 2024-10-03 |
215 | Automatic Deductive Coding in Discourse Analysis: An Application of Large Language Models in Learning Analytics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To evaluate the usefulness of large language models in automatic deductive coding, we employed three classification methods driven by different artificial intelligence technologies: traditional text classification with text feature engineering, a BERT-like pretrained language model, and a GPT-like pretrained large language model (LLM). We applied these methods to two different datasets and explored the potential of GPT and prompt engineering in automatic deductive coding. |
Lishan Zhang; Han Wu; Xiaoshan Huang; Tengfei Duan; Hanxiang Du; | arxiv-cs.CL | 2024-10-02 |
216 | Emotion-Aware Response Generation Using Affect-Enriched Embeddings with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel framework that integrates multiple emotion lexicons, including NRC Emotion Lexicon, VADER, WordNet, and SentiWordNet, with state-of-the-art LLMs such as LLAMA 2, Flan-T5, ChatGPT 3.0, and ChatGPT 4.0. |
Abdur Rasool; Muhammad Irfan Shahzad; Hafsa Aslam; Vincent Chan; | arxiv-cs.CL | 2024-10-02 |
217 | A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work tackles the information loss bottleneck of vector-quantization (VQ) autoregressive image generation by introducing a novel model architecture called the 2-Dimensional Autoregression (DnD) Transformer. |
LIANG CHEN et. al. | arxiv-cs.CV | 2024-10-02 |
218 | ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, even state-of-the-art vision-language models (VLMs), such as GPT-4o, still fall short of human-level performance, particularly in intricate web environments and long-horizon tasks. To address these limitations, we present ExACT, an approach to combine test-time search and self-learning to build o1-like models for agentic applications. |
XIAO YU et. al. | arxiv-cs.CL | 2024-10-02 |
219 | On The Adaptation of Unlimiformer for Decoder-Only Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, its main limitation is incompatibility with decoder-only transformers out of the box. In this work, we explore practical considerations of adapting Unlimiformer to decoder-only transformers and introduce a series of modifications to overcome this limitation. |
KIAN AHRABIAN et. al. | arxiv-cs.CL | 2024-10-02 |
220 | Financial Sentiment Analysis on News and Reports Using Large Language Models and FinBERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the application of LLMs and FinBERT for FSA, comparing their performance on news articles, financial reports and company announcements. |
Yanxin Shen; Pulin Kirin Zhang; | arxiv-cs.IR | 2024-10-02 |
221 | Creative and Context-Aware Translation of East Asian Idioms with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, compiling a dictionary of candidate translations demands much time and creativity even for expert translators. To alleviate such burden, we evaluate if GPT-4 can help generate high-quality translations. |
Kenan Tang; Peiyang Song; Yao Qin; Xifeng Yan; | arxiv-cs.CL | 2024-10-01 |
222 | SIGMA: Secure GPT Inference with Function Secret Sharing IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Secure 2-party computation (2PC) enables secure inference that offers protection for both proprietary machine learning (ML) models and sensitive inputs to them. However, the … |
KANAV GUPTA et. al. | Proc. Priv. Enhancing Technol. | 2024-10-01 |
223 | MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone’s Potential with Masked Autoregressive Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on this analysis, we propose Masked Autoregressive Pretraining (MAP) to pretrain a hybrid Mamba-Transformer vision backbone network. |
Yunze Liu; Li Yi; | arxiv-cs.CV | 2024-10-01 |
224 | GiT: Towards Generalist Vision Transformer Through Universal Language Interface Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a simple, yet effective framework, called GiT, simultaneously applicable for various vision tasks only with a vanilla ViT. Interestingly, GiT builds a new benchmark in generalist performance, and fosters mutual enhancement across tasks, leading to significant improvements compared to isolated training. |
HAIYANG WANG et. al. | eccv | 2024-09-30 |
225 | TaskComplexity: A Dataset for Task Complexity Classification with In-Context Learning, FLAN-T5 and GPT-4o Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper addresses the challenge of classifying and assigning programming tasks to experts, a process that typically requires significant effort, time, and cost. |
Areeg Fahad Rasheed; M. Zarkoosh; Safa F. Abbas; Sana Sabah Al-Azzawi; | arxiv-cs.CL | 2024-09-30 |
226 | AutoEval-Video: An Automatic Benchmark for Assessing Large Vision Language Models in Open-Ended Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel and challenging benchmark, AutoEval-Video, to comprehensively evaluate large vision-language models in open-ended video question answering. |
Weiran Huang; Xiuyuan Chen; Yuan Lin; Yuchen Zhang; | eccv | 2024-09-30 |
227 | ACE: All-round Creator and Editor Following Instructions Via Diffusion Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose ACE, an All-round Creator and Editor, which achieves performance comparable to expert models across a wide range of visual generation tasks. |
ZHEN HAN et. al. | arxiv-cs.CV | 2024-09-30 |
228 | GENIXER: Empowering Multimodal Large Language Models As A Powerful Data Generator Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce GENIXER, a comprehensive data generation pipeline consisting of four key steps: (i) instruction data collection, (ii) instruction template design, (iii) empowering MLLMs, and (iv) data generation and filtering. |
Henry Hengyuan Zhao; Pan Zhou; Mike Zheng Shou; | eccv | 2024-09-30 |
229 | HyTAS: A Hyperspectral Image Transformer Architecture Search Benchmark and Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Transformer architectures have significantly enhanced HSI task performance, while advancements in Transformer Architecture Search (TAS) have improved model discovery. To harness these advancements for HSI classification, we make the following contributions: i) We propose HyTAS, the first benchmark on transformer architecture search for Hyperspectral imaging, ii) We comprehensively evaluate 12 different methods to identify the optimal transformer over 5 different datasets, iii) We perform an extensive factor analysis on the Hyperspectral transformer search performance, greatly motivating future research in this direction. |
Fangqin Zhou; Mert Kilickaya; Joaquin Vanschoren; Ran Piao; | eccv | 2024-09-30 |
230 | Comprehensive Performance Modeling and System Design Insights for Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We analyze performance characteristics of such transformer models and discuss their sensitivity to the transformer type, parallelization strategy, and HPC system features (accelerators and interconnects). We utilize a performance model that allows us to explore this complex design space and highlight its key components. |
SHASHANK SUBRAMANIAN et. al. | arxiv-cs.LG | 2024-09-30 |
231 | OccWorld: Learning A 3D Occupancy World Model for Autonomous Driving IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore a new framework of learning a world model, OccWorld, in the 3D occupancy space to simultaneously predict the movement of the ego car and the evolution of the surrounding scenes. |
WENZHAO ZHENG et. al. | eccv | 2024-09-30 |
232 | MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce MaskMamba, a novel hybrid model that combines Mamba and Transformer architectures, utilizing Masked Image Modeling for non-autoregressive image synthesis. |
Wenchao Chen; Liqiang Niu; Ziyao Lu; Fandong Meng; Jie Zhou; | arxiv-cs.CV | 2024-09-30 |
233 | An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This quadratic increase in computational burden restricts the applicability of visual grounding to more intricate scenes, such as conversation-based reasoning segmentation, which involves lengthy language expressions. In this paper, we propose an efficient and effective multi-task visual grounding (EEVG) framework based on Transformer Decoder to address this issue, which reduces the cost in both language and visual aspects. |
Wei Chen; Long Chen; Yu Wu; | eccv | 2024-09-30 |
234 | Evaluating The Fairness of Task-adaptive Pretraining on Unlabeled Test Data Before Few-shot Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Few-shot learning benchmarks are critical for evaluating modern NLP techniques. |
Kush Dubey; | arxiv-cs.CL | 2024-09-30 |
235 | MART: MultiscAle Relational Transformer Networks for Multi-agent Trajectory Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the hypergraph transformer-based method for trajectory prediction is yet to be explored. Therefore, we present a MultiscAle Relational Transformer (MART) network for multi-agent trajectory prediction. |
Seongju Lee; Junseok Lee; Yeonguk Yu; Taeri Kim; Kyoobin Lee; | eccv | 2024-09-30 |
236 | Analysis-by-Synthesis Transformer for Single-View 3D Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing methods face limitations in both shape reconstruction and texture generation. This paper introduces an innovative Analysis-by-Synthesis Transformer that addresses these limitations in a unified framework by effectively modeling pixel-to-shape and pixel-to-texture relationships. |
DIAN JIA et. al. | eccv | 2024-09-30 |
237 | An Explainable Vision Question Answer Model Via Diffusion Chain-of-Thought Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This means that generating explanations solely for the answer can lead to a semantic discrepancy between the content of the explanation and the question-answering content. To address this, we propose a step-by-step reasoning approach to reduce such semantic discrepancies. |
Chunhao LU; Qiang Lu; Jake Luo; | eccv | 2024-09-30 |
238 | Sparse Attention Decomposition Applied to Circuit Tracing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we seek to isolate and identify the features used to effect communication and coordination among attention heads in GPT-2 small. |
Gabriel Franco; Mark Crovella; | arxiv-cs.LG | 2024-09-30 |
239 | Depression Detection in Social Media Posts Using Transformer-based Models and Auxiliary Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing studies have explored various approaches to this problem but often fall short in terms of accuracy and robustness. To address these limitations, this research proposes a neural network architecture leveraging transformer-based models combined with metadata and linguistic markers. |
Marios Kerasiotis; Loukas Ilias; Dimitris Askounis; | arxiv-cs.CL | 2024-09-30 |
240 | LingoQA: Video Question Answering for Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce LingoQA, a novel dataset and benchmark for visual question answering in autonomous driving. We release our dataset and benchmark as an evaluation platform for vision-language models in autonomous driving. |
ANA-MARIA MARCU et. al. | eccv | 2024-09-30 |
241 | Optimizing Factorized Encoder Models: Time and Memory Reduction for Scalable and Efficient Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the challenges posed by the substantial training time and memory consumption associated with video transformers, focusing on the ViViT (Video Vision Transformer) model, in particular the Factorised Encoder version, as our baseline for action recognition tasks. |
Shreyank N Gowda; Anurag Arnab; Jonathan Huang; | eccv | 2024-09-30 |
242 | Spiking Transformer with Spatial-Temporal Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce Spiking Transformer with Spatial-Temporal Attention (STAtten), a simple and straightforward architecture designed to integrate spatial and temporal information in self-attention with negligible additional computational load. |
Donghyun Lee; Yuhang Li; Youngeun Kim; Shiting Xiao; Priyadarshini Panda; | arxiv-cs.NE | 2024-09-29 |
243 | Multimodal Misinformation Detection By Learning from Synthetic Data with Multimodal LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the generalizability of detectors trained on synthetic data to real-world scenarios remains unclear due to the distribution gap. To address this, we propose learning from synthetic data for detecting real-world multimodal misinformation through two model-agnostic data selection methods that match synthetic and real-world data distributions. |
Fengzhu Zeng; Wenqian Li; Wei Gao; Yan Pang; | arxiv-cs.CL | 2024-09-29 |
244 | 3D-CT-GPT: Generating 3D Radiology Reports Through Integration of Large Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model specifically designed for generating radiology reports from 3D CT scans, particularly chest CTs. |
HAO CHEN et. al. | arxiv-cs.CV | 2024-09-28 |
245 | Efficient Federated Intrusion Detection in 5G Ecosystem Using Optimized BERT-based Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a robust intrusion detection system (IDS) using federated learning and large language models (LLMs). |
Frederic Adjewa; Moez Esseghir; Leila Merghem-Boulahia; | arxiv-cs.CR | 2024-09-28 |
246 | INSIGHTBUDDY-AI: Medication Extraction and Entity Linking Using Large Language Models and Ensemble Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate state-of-the-art LLMs in text mining tasks on medications and their related attributes such as dosage, route, strength, and adverse effects. |
Pablo Romero; Lifeng Han; Goran Nenadic; | arxiv-cs.CL | 2024-09-28 |
247 | INC-Math: Integrating Natural Language and Code for Enhanced Mathematical Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore fundamental questions related to solving mathematical reasoning problems using natural language and code with state-of-the-art LLMs, including GPT-4o-mini and LLama-3.1-8b-Turbo. |
Xuyuan Xiong; Simeng Han; Ziyue Zhou; Arman Cohan; | arxiv-cs.CL | 2024-09-28 |
248 | Suicide Phenotyping from Clinical Notes in Safety-Net Psychiatric Hospital Using Multi-Label Classification with Pre-Trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Pre-trained language models offer promise for identifying suicidality from unstructured clinical narratives. |
ZEHAN LI et. al. | arxiv-cs.CL | 2024-09-27 |
249 | Cottention: Linear Transformers With Cosine Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Cottention, a novel attention mechanism that replaces the softmax operation with cosine similarity. |
Gabriel Mongaras; Trevor Dohm; Eric C. Larson; | arxiv-cs.LG | 2024-09-27 |
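The mechanism named in the highlight above (cosine similarity in place of softmax attention) is simple enough to sketch directly. Below is a minimal illustration of that idea; the paper's actual normalization, scaling, and masking details are not reproduced here, so treat the specifics as assumptions.

```python
import numpy as np

def cosine_attention(Q, K, V, eps=1e-8):
    # Normalize queries and keys to unit length so Q @ K.T yields
    # cosine similarities in [-1, 1]; no softmax is applied.
    Qn = Q / (np.linalg.norm(Q, axis=-1, keepdims=True) + eps)
    Kn = K / (np.linalg.norm(K, axis=-1, keepdims=True) + eps)
    return (Qn @ Kn.T) @ V

# Toy usage: 4 tokens with 8-dimensional heads.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 4, 8))
print(cosine_attention(Q, K, V).shape)  # (4, 8)
```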
250 | FoodMLLM-JP: Leveraging Multimodal Large Language Models for Japanese Recipe Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Research on food image understanding using recipe data has been a long-standing focus due to the diversity and complexity of the data. |
Yuki Imajuku; Yoko Yamakata; Kiyoharu Aizawa; | arxiv-cs.CV | 2024-09-27 |
251 | Experimental Evaluation of Machine Learning Models for Goal-oriented Customer Service Chatbot with Pipeline Architecture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a tailored experimental evaluation approach for goal-oriented customer service chatbots with pipeline architecture, focusing on three key components: Natural Language Understanding (NLU), dialogue management (DM), and Natural Language Generation (NLG). |
Nurul Ain Nabilah Mohd Isa; Siti Nuraishah Agos Jawaddi; Azlan Ismail; | arxiv-cs.AI | 2024-09-27 |
252 | MASSFormer: Mobility-Aware Spectrum Sensing Using Transformer-Driven Tiered Structure Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we develop a novel mobility-aware transformer-driven tiered structure (MASSFormer) based cooperative spectrum sensing method that effectively models the spatio-temporal dynamics of user movements. |
Dimpal Janu; Sandeep Mandia; Kuldeep Singh; Sandeep Kumar; | arxiv-cs.IT | 2024-09-26 |
253 | Retrospective Comparative Analysis of Prostate Cancer In-Basket Messages: Responses from Closed-Domain LLM Vs. Clinical Teams Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, responding to these patients’ inquiries has become a significant burden on healthcare workflows, consuming considerable time for clinical care teams. To address this, we introduce RadOnc-GPT, a specialized Large Language Model (LLM) powered by GPT-4, built with advanced prompt engineering to focus on the radiotherapeutic treatment of prostate cancer and to assist in generating responses. |
YUEXING HAO et. al. | arxiv-cs.AI | 2024-09-26 |
254 | Predicting Anchored Text from Translation Memories for Machine Translation Using Deep Learning Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we show that for a large share of anchored words, other techniques based on machine learning approaches such as Word2Vec can be used. |
Richard Yue; John E. Ortega; | arxiv-cs.CL | 2024-09-26 |
255 | General Compression Framework for Efficient Transformer Object Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, we propose a general model compression framework for efficient transformer object tracking, named CompressTracker, to reduce the size of a pre-trained tracking model into a lightweight tracker with minimal performance degradation. |
LINGYI HONG et. al. | arxiv-cs.CV | 2024-09-26 |
256 | The Application of GPT-4 in Grading Design University Students’ Assignment and Providing Feedback: An Exploratory Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to investigate whether GPT-4 can effectively grade assignments for design university students and provide useful feedback. |
Qian Huang; Thijs Willems; King Wang Poon; | arxiv-cs.AI | 2024-09-26 |
257 | Beyond Turing Test: Can GPT-4 Sway Experts’ Decisions? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the post-Turing era, evaluating large language models (LLMs) involves assessing generated text based on readers’ reactions rather than merely its indistinguishability from human-produced content. |
Takehiro Takayanagi; Hiroya Takamura; Kiyoshi Izumi; Chung-Chi Chen; | arxiv-cs.CE | 2024-09-25 |
258 | Reducing and Exploiting Data Augmentation Noise Through Meta Reweighting Contrastive Learning for Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To boost deep learning models’ performance given augmented data/samples in text classification tasks, we propose a novel framework, which leverages both meta learning and contrastive learning techniques as parts of our design for reweighting the augmented samples and refining their feature representations based on their quality. |
Guanyi Mou; Yichuan Li; Kyumin Lee; | arxiv-cs.CL | 2024-09-25 |
259 | Assessing The Level of Toxicity Against Distinct Groups in Bangla Social Media Comments: A Comprehensive Investigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study focuses on identifying toxic comments in the Bengali language targeting three specific groups: transgender people, indigenous people, and migrant people, from multiple social media sources. |
Mukaffi Bin Moin; Pronay Debnath; Usafa Akther Rifa; Rijeet Bin Anis; | arxiv-cs.CL | 2024-09-25 |
260 | Comparing Unidirectional, Bidirectional, and Word2vec Models for Discovering Vulnerabilities in Compiled Lifted Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using a dataset of LLVM functions, we trained a GPT-2 model to generate embeddings, which were subsequently used to build LSTM neural networks to differentiate between vulnerable and non-vulnerable code. |
Gary A. McCully; John D. Hastings; Shengjie Xu; Adam Fortier; | arxiv-cs.CR | 2024-09-25 |
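The highlight above outlines a two-stage pipeline: a language model produces embeddings for lifted code, and an LSTM classifies the result. Here is a minimal sketch of that pipeline, assuming Hugging Face `transformers` for GPT-2; the layer sizes and the placeholder input are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn
from transformers import GPT2Tokenizer, GPT2Model

tok = GPT2Tokenizer.from_pretrained("gpt2")
gpt2 = GPT2Model.from_pretrained("gpt2").eval()

# Small downstream classifier over the embedding sequence
# (hidden size and head are assumptions, not the paper's setup).
lstm = nn.LSTM(input_size=768, hidden_size=128, batch_first=True)
head = nn.Linear(128, 2)  # vulnerable vs. non-vulnerable

code = "define i32 @f(i32 %x) { ... }"  # placeholder lifted LLVM function
with torch.no_grad():
    emb = gpt2(**tok(code, return_tensors="pt")).last_hidden_state
    _, (h, _) = lstm(emb)
    logits = head(h[-1])  # per-function vulnerability logits
```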
261 | GPT-4 As A Homework Tutor Can Improve Student Engagement and Learning Outcomes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work contributes to the scarce empirical literature on LLM-based interactive homework in real-world educational settings and offers a practical, scalable solution for improving homework in schools. |
Alessandro Vanzo; Sankalan Pal Chowdhury; Mrinmaya Sachan; | arxiv-cs.CY | 2024-09-24 |
262 | Unveiling Language Competence Neurons: A Psycholinguistic Approach to Model Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As large language models (LLMs) become more advanced in their linguistic capacity, understanding how they capture aspects of language competence remains a significant challenge. This study therefore employs psycholinguistic paradigms, which are well-suited for probing deeper cognitive aspects of language processing, to explore neuron-level representations in language models across three tasks: sound-shape association, sound-gender association, and implicit causality. |
Xufeng Duan; Xinyu Zhou; Bei Xiao; Zhenguang G. Cai; | arxiv-cs.CL | 2024-09-24 |
263 | MonoFormer: One Transformer for Both Diffusion and Autoregression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose to study a simple idea: share one transformer for both autoregression and diffusion. |
CHUYANG ZHAO et. al. | arxiv-cs.CV | 2024-09-24 |
264 | SynChart: Synthesizing Charts from Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We construct a large-scale chart dataset, SynChart, which contains approximately 4 million diverse chart images with over 75 million dense annotations, including data tables, code, descriptions, and question-answer sets. We trained a 4.2B chart-expert model using this dataset and achieved near-GPT-4o performance on the ChartQA task, surpassing GPT-4V. |
MENGCHEN LIU et. al. | arxiv-cs.AI | 2024-09-24 |
265 | SDBA: A Stealthy and Long-Lasting Durable Backdoor Attack in Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces SDBA, a novel backdoor attack mechanism designed for NLP tasks in FL environments. |
Minyeong Choe; Cheolhee Park; Changho Seo; Hyunil Kim; | arxiv-cs.LG | 2024-09-23 |
266 | SOFI: Multi-Scale Deformable Transformer for Camera Calibration with Enhanced Line Queries Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce \textit{multi-Scale defOrmable transFormer for camera calibratIon with enhanced line queries}, SOFI. |
Sebastian Janampa; Marios Pattichis; | arxiv-cs.CV | 2024-09-23 |
267 | Improving Academic Skills Assessment with NLP and Ensemble Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study addresses the critical challenges of assessing foundational academic skills by leveraging advancements in natural language processing (NLP). |
Xinyi Huang; Yingyi Wu; Danyang Zhang; Jiacheng Hu; Yujian Long; | arxiv-cs.CL | 2024-09-23 |
268 | Towards A Realistic Long-Term Benchmark for Open-Web Research Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present initial results of a forthcoming benchmark for evaluating LLM agents on white-collar tasks of economic value. |
Peter Mühlbacher; Nikos I. Bosse; Lawrence Phillips; | arxiv-cs.CL | 2024-09-23 |
269 | Evaluating The Quality of Code Comments Generated By Large Language Models for Novice Programmers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) show promise in generating code comments for novice programmers, but their educational effectiveness remains under-evaluated. |
Aysa Xuemo Fan; Arun Balajiee Lekshmi Narayanan; Mohammad Hassany; Jiaze Ke; | arxiv-cs.SE | 2024-09-22 |
270 | Can Pre-trained Language Models Generate Titles for Research Papers? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we fine-tune pre-trained language models to generate titles of papers from their abstracts. |
Tohida Rehman; Debarshi Kumar Sanyal; Samiran Chattopadhyay; | arxiv-cs.CL | 2024-09-22 |
271 | Normalized Narrow Jump To Conclusions: Normalized Narrow Shortcuts for Parameter Efficient Early Exit Transformer Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Narrow Jump to Conclusions (NJTC) and Normalized Narrow Jump to Conclusions (N-NJTC) – parameter-efficient alternatives to standard linear shortcutting that reduce shortcut parameter count by over 97%. |
Amrit Diggavi Seshadri; | arxiv-cs.AI | 2024-09-21 |
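To make the parameter arithmetic concrete, here is a hedged sketch of what a narrow shortcut can look like next to a standard linear one; the bottleneck width `k` and the plain two-matrix factorization are assumptions rather than the paper's exact design.

```python
import torch
import torch.nn as nn

d_model, vocab, k = 768, 50257, 16  # k: hypothetical bottleneck width

# Standard linear shortcut: a full-width projection per early exit.
full_head = nn.Linear(d_model, vocab, bias=False)  # 768*50257 ~ 38.6M params

# Narrow shortcut: factor the projection through a small bottleneck.
narrow_head = nn.Sequential(
    nn.Linear(d_model, k, bias=False),   # 768*16   ~ 12K params
    nn.Linear(k, vocab, bias=False),     # 16*50257 ~ 0.8M params
)                                        # roughly 98% fewer parameters

h = torch.randn(1, d_model)     # hidden state from an intermediate layer
early_logits = narrow_head(h)   # early-exit prediction
```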
272 | Can LLMs Replace Neil DeGrasse Tyson? Evaluating The Reliability of LLMs As Science Communicators Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on evaluating the reliability of current LLMs as science communicators. |
Prasoon Bajpai; Niladri Chatterjee; Subhabrata Dutta; Tanmoy Chakraborty; | arxiv-cs.CL | 2024-09-21 |
273 | The Use of GPT-4o and Other Large Language Models for The Improvement and Design of Self-Assessment Scales for Measurement of Interpersonal Communication Skills Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: OpenAI’s ChatGPT (GPT-4 and GPT-4o) and other Large Language Models (LLMs) like Microsoft’s Copilot, Google’s Gemini 1.5 Pro, and Anthropic’s Claude 3.5 Sonnet can be effectively used in various phases of scientific research. |
Goran Bubaš; | arxiv-cs.AI | 2024-09-21 |
274 | AI Assistants for Spaceflight Procedures: Combining Generative Pre-Trained Transformer and Retrieval-Augmented Generation on Knowledge Graphs With Augmented Reality Cues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes the capabilities and potential of the intelligent personal assistant (IPA) CORE (Checklist Organizer for Research and Exploration), designed to support astronauts during procedures onboard the International Space Station (ISS), the Lunar Gateway station, and beyond. |
OLIVER BENSCH et. al. | arxiv-cs.AI | 2024-09-21 |
275 | Loop Neural Networks for Parameter Sharing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel Loop Neural Network, which achieves better performance by utilizing longer computational time without increasing the model size. |
Kei-Sing Ng; Qingchen Wang; | arxiv-cs.AI | 2024-09-21 |
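The parameter-sharing idea in the highlight above lends itself to a compact sketch: reuse one block several times instead of stacking distinct layers, trading extra compute for capacity at a fixed model size. The block type and loop count below are assumptions.

```python
import torch
import torch.nn as nn

class LoopBlock(nn.Module):
    def __init__(self, d_model=256, n_head=4, n_loops=4):
        super().__init__()
        # One shared layer; looping adds compute, not parameters.
        self.layer = nn.TransformerEncoderLayer(d_model, n_head, batch_first=True)
        self.n_loops = n_loops

    def forward(self, x):
        for _ in range(self.n_loops):
            x = self.layer(x)  # same weights applied at every iteration
        return x

x = torch.randn(2, 10, 256)
print(LoopBlock()(x).shape)  # torch.Size([2, 10, 256])
```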
276 | QMOS: Enhancing LLMs for Telecommunication with Question Masked Loss and Option Shuffling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces QMOS, an innovative approach which uses a Question-Masked loss and Option Shuffling trick to enhance the performance of LLMs in answering Multiple-Choice Questions in the telecommunications domain. |
Blessed Guda; Gabrial Zencha A.; Lawrence Francis; Carlee Joe-Wong; | arxiv-cs.CL | 2024-09-21 |
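The option-shuffling half of the trick named above is simple enough to sketch directly: permute the choices and remap the gold label so the model cannot exploit answer-position bias. The helper below is hypothetical, and the question-masked loss is a training-time change not shown here.

```python
import random

def shuffle_options(options, answer_idx, seed=None):
    """Permute multiple-choice options and remap the gold label."""
    rng = random.Random(seed)
    order = list(range(len(options)))
    rng.shuffle(order)
    shuffled = [options[i] for i in order]
    return shuffled, order.index(answer_idx)

opts = ["2.4 GHz", "5 GHz", "28 GHz", "900 MHz"]
print(shuffle_options(opts, answer_idx=2, seed=0))
# prints the shuffled options plus the remapped answer index
```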
277 | On Importance of Pruning and Distillation for Efficient Low Resource NLP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explore the case of the low-resource Indic language Marathi. |
AISHWARYA MIRASHI et. al. | arxiv-cs.CL | 2024-09-21 |
278 | T2M-X: Learning Expressive Text-to-Motion Generation from Partially Annotated Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose T2M-X, a two-stage method that learns expressive text-to-motion generation from partially annotated data. |
Mingdian Liu; Yilin Liu; Gurunandan Krishnan; Karl S Bayer; Bing Zhou; | arxiv-cs.CV | 2024-09-20 |
279 | Prompting Large Language Models for Supporting The Differential Diagnosis of Anemia Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by clinical guidelines, our study aimed to develop diagnostic pathways similar to those found in such guidelines. |
Elisa Castagnari; Lillian Muyama; Adrien Coulet; | arxiv-cs.CL | 2024-09-20 |
280 | Applying Pre-trained Multilingual BERT in Embeddings for Improved Malicious Prompt Injection Attacks Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) are renowned for their exceptional capabilities and are applied across a wide range of applications. |
Md Abdur Rahman; Hossain Shahriar; Fan Wu; Alfredo Cuzzocrea; | arxiv-cs.CL | 2024-09-20 |
281 | ‘Since Lawyers Are Males..’: Examining Implicit Gender Bias in Hindi Language Generation By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) are increasingly being used to generate text across various languages, for tasks such as translation, customer support, and education. |
Ishika Joshi; Ishita Gupta; Adrita Dey; Tapan Parikh; | arxiv-cs.CL | 2024-09-20 |
282 | Drift to Remember Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We hypothesize that representational drift can alleviate catastrophic forgetting in AI during new task acquisition. To test this, we introduce DriftNet, a network designed to constantly explore various local minima in the loss landscape while dynamically retrieving relevant tasks. |
JIN DU et. al. | arxiv-cs.AI | 2024-09-20 |
283 | HUT: A More Computation Efficient Fine-Tuning Method With Hadamard Updated Transformation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This approach ensures that the correlation between the original and updated parameters is preserved, leveraging the semantic features learned during pre-training. Building on this paradigm, we present the Hadamard Updated Transformation (HUT) method. |
Geyuan Zhang; Xiaofei Zhou; Chuheng Chen; | arxiv-cs.CL | 2024-09-20 |
284 | 3DTopia-XL: Scaling High-quality 3D Asset Generation Via Primitive Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite recent advancements in 3D generative models, existing methods still face challenges with optimization speed, geometric fidelity, and the lack of assets for physically based rendering (PBR). In this paper, we introduce 3DTopia-XL, a scalable native 3D generative model designed to overcome these limitations. |
ZHAOXI CHEN et. al. | arxiv-cs.CV | 2024-09-19 |
285 | $\text{M}^\text{6}(\text{GPT})^\text{3}$: Generating Multitrack Modifiable Multi-Minute MIDI Music from Text Using Genetic Algorithms, Probabilistic Methods and GPT Models in Any Progression and Time Signature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a genetic algorithm for the generation of melodic elements. |
Jakub Poćwiardowski; Mateusz Modrzejewski; Marek S. Tatara; | arxiv-cs.SD | 2024-09-19 |
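For a sense of how a genetic algorithm can generate melodic elements as proposed above, here is a toy sketch with one-point crossover and random mutation; the fitness function (rewarding stepwise motion) and the MIDI pitch range are placeholders, not the paper's objective.

```python
import random

rng = random.Random(0)
SCALE = list(range(60, 72))  # one octave of MIDI pitches (an assumption)

def fitness(melody):
    # Toy objective: reward small melodic intervals. The paper's real
    # objective encodes musical constraints not reproduced here.
    return -sum(abs(a - b) for a, b in zip(melody, melody[1:]))

def evolve(pop_size=30, length=8, generations=50):
    pop = [[rng.choice(SCALE) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]        # keep the fittest half
        children = []
        for _ in range(pop_size - len(survivors)):
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, length)
            child = a[:cut] + b[cut:]           # one-point crossover
            if rng.random() < 0.2:              # occasional mutation
                child[rng.randrange(length)] = rng.choice(SCALE)
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

print(evolve())  # a generated melodic fragment as MIDI pitches
```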
286 | TACO-RL: Task Aware Prompt Compression Optimization with Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing prompt compression techniques either rely on sub-optimal metrics such as information entropy or model it as a task-agnostic token classification problem that fails to capture task-specific information. To address these issues, we propose a novel and efficient reinforcement learning (RL) based task-aware prompt compression method. |
SHIVAM SHANDILYA et. al. | arxiv-cs.CL | 2024-09-19 |
287 | Introducing The Large Medical Model: State of The Art Healthcare Cost and Risk Prediction with Transformers Trained on Patient Event Sequences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces the Large Medical Model (LMM), a generative pre-trained transformer (GPT) designed to guide and predict the broad facets of patient care and healthcare administration. |
RICKY SAHU et. al. | arxiv-cs.LG | 2024-09-19 |
288 | Recommendation with Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a taxonomy that categorizes DGMs into three types: ID-driven models, large language models (LLMs), and multimodal models. |
YASHAR DELDJOO et. al. | arxiv-cs.IR | 2024-09-18 |
289 | Self-Supervised Pre-training Tasks for An FMRI Time-series Transformer in Autism Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address over-fitting in small datasets and enhance the model performance, we propose self-supervised pre-training tasks to reconstruct the randomly masked fMRI time-series data, investigating the effects of various masking strategies. |
Yinchi Zhou; Peiyu Duan; Yuexi Du; Nicha C. Dvornek; | arxiv-cs.CV | 2024-09-18 |
290 | AlignBot: Aligning VLM-powered Customized Task Planning with User Reminders Through Fine-Tuning for Household Robots Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents AlignBot, a novel framework designed to optimize VLM-powered customized task planning for household robots by effectively aligning with user reminders. |
PENGAN CHEN et. al. | arxiv-cs.RO | 2024-09-18 |
291 | Program Slicing in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the application of large language models (LLMs) to both static and dynamic program slicing, with a focus on Java programs. |
Kimya Khakzad Shahandashti; Mohammad Mahdi Mohajer; Alvine Boaye Belle; Song Wang; Hadi Hemmati; | arxiv-cs.SE | 2024-09-18 |
292 | American Sign Language to Text Translation Using Transformer and Seq2Seq with LSTM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares the Transformer with the Sequence-to-Sequence (Seq2Seq) model in translating sign language to text. |
Gregorius Guntur Sunardi Putra; Adifa Widyadhani Chanda D’Layla; Dimas Wahono; Riyanarto Sarno; Agus Tri Haryono; | arxiv-cs.CL | 2024-09-17 |
293 | Small Language Models Can Outperform Humans in Short Creative Writing: A Study Comparing SLMs with Humans and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we evaluate the creative fiction writing abilities of a fine-tuned small language model (SLM), BART Large, and compare its performance to humans and two large language models (LLMs): GPT-3.5 and GPT-4o. |
Guillermo Marco; Luz Rello; Julio Gonzalo; | arxiv-cs.CL | 2024-09-17 |
294 | Adaptive Large Language Models By Layerwise Attention Shortcuts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they are based on simply stacking the same blocks in dozens of layers and processing information sequentially from one block to another. In this paper, we propose to challenge this and introduce adaptive computations for LLM-like setups, which allow the final layer to attend to all of the intermediate layers as it deems fit through the attention mechanism, thereby introducing computational \textbf{attention shortcuts}. |
Prateek Verma; Mert Pilanci; | arxiv-cs.CL | 2024-09-16 |
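One plausible reading of the shortcut mechanism described above is sketched below: the final layer cross-attends over the stacked outputs of earlier layers. How the per-layer states are pooled and mixed is an assumption here, not the paper's specification.

```python
import torch
import torch.nn as nn

class LayerwiseShortcutAttention(nn.Module):
    def __init__(self, d_model=256, n_head=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_head, batch_first=True)

    def forward(self, top, intermediates):
        # top: (B, T, D); intermediates: list of (B, T, D) earlier-layer outputs.
        memory = torch.cat(intermediates, dim=1)  # concatenate along sequence axis
        out, _ = self.attn(query=top, key=memory, value=memory)
        return top + out                          # residual mix of layerwise context

B, T, D = 2, 5, 256
states = [torch.randn(B, T, D) for _ in range(6)]
print(LayerwiseShortcutAttention()(states[-1], states[:-1]).shape)
```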
295 | Code Vulnerability Detection: A Comparative Analysis of Emerging Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study aims to shed light on the capabilities of LLMs in vulnerability detection, contributing to the enhancement of software security practices across diverse open-source repositories. |
Shaznin Sultana; Sadia Afreen; Nasir U. Eisty; | arxiv-cs.SE | 2024-09-16 |
296 | LLMs for Clinical Risk Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares the efficacy of GPT-4 and clinalytix Medical AI in predicting the clinical risk of delirium development. |
Mohamed Rezk; Patricia Cabanillas Silva; Fried-Michael Dahlweid; | arxiv-cs.CL | 2024-09-16 |
297 | Can GPT-O1 Kill All Bugs? An Evaluation of GPT-Family LLMs on QuixBugs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, inspired by the recent public release of the GPT-o1 models, we conduct the first study to compare the effectiveness of different versions of the GPT-family models in APR. |
Haichuan Hu; Ye Shang; Guolin Xu; Congqing He; Quanjun Zhang; | arxiv-cs.SE | 2024-09-16 |
298 | SelECT-SQL: Self-correcting Ensemble Chain-of-Thought for Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce SelECT-SQL, a novel in-context learning solution that uses an algorithmic combination of chain-of-thought (CoT) prompting, self-correction, and ensemble methods to yield a new state-of-the-art result on challenging Text-to-SQL benchmarks. |
Ke Shen; Mayank Kejriwal; | arxiv-cs.CL | 2024-09-16 |
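The ensemble step of the combination described above can be sketched as sampling several chain-of-thought completions and keeping the majority SQL answer. `generate` below stands in for any LLM call (an assumption), and the paper's self-correction pass is omitted for brevity.

```python
from collections import Counter

def select_sql(question, schema, generate, n_samples=5):
    """Sample several chain-of-thought completions and return the
    majority SQL answer after light normalization."""
    candidates = [generate(question, schema) for _ in range(n_samples)]
    normalized = [" ".join(sql.lower().split()) for sql in candidates]
    best, _ = Counter(normalized).most_common(1)[0]
    return best

# usage: select_sql("List all users", schema_text, my_llm_call)
```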
299 | Investigating The Impact of Code Comment Inconsistency on Bug Introducing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our research investigates the impact of code-comment inconsistency on bug introduction using large language models, specifically GPT-3.5. |
Shiva Radmanesh; Aaron Imani; Iftekhar Ahmed; Mohammad Moshirpour; | arxiv-cs.SE | 2024-09-16 |
300 | CAT: Customized Transformer Accelerator Framework on Versal ACAP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It is far more flexible than a GPU in hardware customization and offers a better, smaller design solution space than a traditional FPGA. Therefore, this paper proposes the Customized Transformer Accelerator Framework (CAT), through which a family of customized Transformer accelerators can be derived on Versal ACAP. The CAT framework embodies an abstract accelerator architecture that deconstructs the Transformer and efficiently maps it onto hardware, exposing a variety of customizable properties. |
Wenbo Zhang; Yiqi Liu; Zhenshan Bao; | arxiv-cs.AR | 2024-09-15 |
301 | GP-GPT: Large Language Model for Gene-Phenotype Mapping Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the complex traits and heterogeneity of multi-source genomics data pose significant challenges when adapting these models to the bioinformatics and biomedical field. To address these challenges, we present GP-GPT, the first specialized large language model for genetic-phenotype knowledge representation and genomics relation analysis. |
YANJUN LYU et. al. | arxiv-cs.CL | 2024-09-15 |
302 | Detection Made Easy: Potentials of Large Language Models for Solidity Vulnerabilities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive investigation of the use of large language models (LLMs) and their capabilities in detecting OWASP Top Ten vulnerabilities in Solidity. |
Md Tauseef Alam; Raju Halder; Abyayananda Maiti; | arxiv-cs.CR | 2024-09-15 |
303 | Leveraging Open-Source Large Language Models for Native Language Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Native Language Identification (NLI) – the task of identifying the native language (L1) of a person based on their writing in the second language (L2) – has applications in forensics, marketing, and second language acquisition. Historically, conventional machine learning approaches that heavily rely on extensive feature engineering have outperformed transformer-based language models on this task. |
Yee Man Ng; Ilia Markov; | arxiv-cs.CL | 2024-09-15 |
304 | Evaluating Authenticity and Quality of Image Captions Via Sentiment and Semantic Analyses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes an evaluation method focused on sentiment and semantic richness. |
Aleksei Krotov; Alison Tebo; Dylan K. Picart; Aaron Dean Algave; | arxiv-cs.CV | 2024-09-14 |
305 | Undergrads Are All You Have Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The paper also demonstrates that GPT-UGRD is cheaper and easier to train and operate than transformer models. In this paper, we outline the implementation, application, multi-tenanting, and social implications of using this new model in research and other contexts. |
Ashe Neth; | arxiv-cs.CY | 2024-09-13 |
306 | Guiding Vision-Language Model Selection for Visual Question-Answering Across Tasks, Domains, and Knowledge Types Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces a comprehensive framework for evaluating VLMs tailored to VQA tasks in practical settings. |
Neelabh Sinha; Vinija Jain; Aman Chadha; | arxiv-cs.CV | 2024-09-13 |
307 | Autoregressive + Chain of Thought = Recurrent: Recurrence’s Role in Language Models’ Computability and A Revisit of Recurrent Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we thoroughly investigate the influence of recurrent structures in neural models on their reasoning abilities and computability, contrasting the role autoregression plays in the neural models’ computational power. |
Xiang Zhang; Muhammad Abdul-Mageed; Laks V. S. Lakshmanan; | arxiv-cs.CL | 2024-09-13 |
308 | Enhanced Online Grooming Detection Employing Context Determination and Message-Level Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper’s contributions include improved detection methodologies and the potential for application in various scenarios, addressing gaps in current literature and practices. |
Jake Street; Isibor Ihianle; Funminiyi Olajide; Ahmad Lotfi; | arxiv-cs.LG | 2024-09-12 |
309 | SDformer: Efficient End-to-End Transformer for Depth Completion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a different window-based Transformer architecture for depth completion tasks named Sparse-to-Dense Transformer (SDformer). |
JIAN QIAN et. al. | arxiv-cs.CV | 2024-09-12 |
310 | Towards Fairer Health Recommendations: Finding Informative Unbiased Samples Via Word Sense Disambiguation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, some of these terms, especially those related to race and ethnicity, can carry different meanings (e.g., white matter of spinal cord). To address this issue, we propose the use of Word Sense Disambiguation models to refine dataset quality by removing irrelevant sentences. |
GAVIN BUTTS et. al. | arxiv-cs.CL | 2024-09-11 |
311 | A Novel Mathematical Framework for Objective Characterization of Ideas Through Vector Embeddings in LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This method suffers from limitations such as human judgment errors, bias, and oversight. Addressing this gap, our study introduces a comprehensive mathematical framework for automated analysis to objectively evaluate the plethora of ideas generated by CAI systems and/or humans. |
B. Sankar; Dibakar Sen; | arxiv-cs.AI | 2024-09-11 |
312 | A Fine-grained Sentiment Analysis of App Reviews Using Large Language Models: An Evaluation Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Analyzing user reviews for sentiment towards app features can provide valuable insights into users’ perceptions of app functionality and their evolving needs. |
Faiz Ali Shah; Ahmed Sabir; Rajesh Sharma; | arxiv-cs.CL | 2024-09-11 |
313 | GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we address the challenges of using LLM-as-a-Judge when evaluating grounded answers generated by RAG systems. |
Sacha Muller; António Loison; Bilel Omrani; Gautier Viaud; | arxiv-cs.CL | 2024-09-10 |
314 | FairHome: A Fair Housing and Fair Lending Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a Fair Housing and Fair Lending dataset (FairHome): A dataset with around 75,000 examples across 9 protected categories. |
Anusha Bagalkotkar; Aveek Karmakar; Gabriel Arnson; Ondrej Linda; | arxiv-cs.LG | 2024-09-09 |
315 | Harmonic Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) are becoming very popular and are used for many different purposes, including creative tasks in the arts. |
Anna Kruspe; | arxiv-cs.CL | 2024-09-09 |
316 | Retrofitting Temporal Graph Neural Networks with Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose TF-TGN, which uses Transformer decoder as the backbone model for TGNN to enjoy Transformer’s codebase for efficient training. |
QIANG HUANG et. al. | arxiv-cs.LG | 2024-09-09 |
317 | NOVI : Chatbot System for University Novice with BERT and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To mitigate the difficulties of university freshmen in adapting to university life, we developed NOVI, a chatbot system based on GPT-4o. |
Yoonji Nam; TaeWoong Seo; Gyeongcheol Shin; Sangji Lee; JaeEun Im; | arxiv-cs.CL | 2024-09-09 |
318 | Can Large Language Models Unlock Novel Scientific Research Ideas? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study explores the capability of LLMs in generating novel research ideas based on information from research papers. |
Sandeep Kumar; Tirthankar Ghosal; Vinayak Goyal; Asif Ekbal; | arxiv-cs.CL | 2024-09-09 |
319 | Identifying The Sources of Ideological Bias in GPT Models Through Linguistic Variation in Output Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we provide an original approach to identifying ideological bias in generative models, showing that bias can stem from both the training data and the filtering algorithm. |
Christina Walker; Joan C. Timoneda; | arxiv-cs.CL | 2024-09-09 |
320 | Low Latency Transformer Inference on FPGAs for Physics Applications with Hls4ml Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents an efficient implementation of transformer architectures in Field-Programmable Gate Arrays(FPGAs) using hls4ml. |
ZHIXING JIANG et. al. | arxiv-cs.LG | 2024-09-08 |
321 | The Emergence of Large Language Models (LLM) As A Tool in Literature Reviews: An LLM Automated Systematic Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: This study aims to summarize the usage of Large Language Models (LLMs) in the process of creating a scientific review. |
Dmitry Scherbakov; Nina Hubig; Vinita Jansari; Alexander Bakumenko; Leslie A. Lenert; | arxiv-cs.DL | 2024-09-06 |
322 | Hermes: Memory-Efficient Pipeline Inference for Large Models on Edge Devices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on the PIPELOAD mechanism, we present Hermes, a framework optimized for large model inference on edge devices. |
XUEYUAN HAN et. al. | arxiv-cs.DC | 2024-09-06 |
323 | LLM-based Multi-agent Poetry Generation in Non-cooperative Environments Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Under the rationale that the learning process of the poetry generation systems should be more human-like and their output more diverse and novel, we introduce a framework based on social learning where we emphasize non-cooperative interactions besides cooperative interactions to encourage diversity. |
Ran Zhang; Steffen Eger; | arxiv-cs.CL | 2024-09-05 |
324 | CA-BERT: Leveraging Context Awareness for Enhanced Multi-Turn Chat Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional models often struggle with determining when additional context is necessary for generating appropriate responses. This paper introduces Context-Aware BERT (CA-BERT), a transformer-based model specifically fine-tuned to address this challenge. |
Minghao Liu; Mingxiu Sui; Yi Nan; Cangqing Wang; Zhijie Zhou; | arxiv-cs.CL | 2024-09-05 |
325 | CACER: Clinical Concept Annotations for Cancer Events and Relations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Clinical Concept Annotations for Cancer Events and Relations (CACER), a novel corpus with fine-grained annotations for over 48,000 medical problems and drug events and 10,000 drug-problem and problem-problem relations. |
YUJUAN FU et. al. | arxiv-cs.CL | 2024-09-05 |
326 | Detecting Calls to Action in Multimodal Content: Analysis of The 2021 German Federal Election Campaign on Instagram Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the automated classification of Calls to Action (CTAs) within the 2021 German Instagram election campaign to advance the understanding of mobilization in social media contexts. |
Michael Achmann-Denkler; Jakob Fehle; Mario Haim; Christian Wolff; | arxiv-cs.SI | 2024-09-04 |
327 | OpenFact at CheckThat! 2024: Combining Multiple Attack Methods for Effective Adversarial Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the experiments and results for the CheckThat! |
WŁODZIMIERZ LEWONIEWSKI et. al. | arxiv-cs.CL | 2024-09-04 |
328 | MobileUNETR: A Lightweight End-To-End Hybrid Vision Transformer For Efficient Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Finally, many Transformer-based approaches rely primarily on CNN-based decoders, overlooking the benefits of Transformer-based decoding models. Recognizing these limitations, we address the need for efficient, lightweight solutions by introducing MobileUNETR, which aims to overcome the performance constraints associated with both CNNs and Transformers while minimizing model size, presenting a promising stride towards efficient image segmentation. |
Shehan Perera; Yunus Erzurumlu; Deepak Gulati; Alper Yilmaz; | arxiv-cs.CV | 2024-09-04 |
329 | Dialogue You Can Trust: Human and AI Perspectives on Generated Conversations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Utilizing the GPT-4o API, we generated a diverse dataset of conversations and conducted a two-part experimental analysis. |
Ike Ebubechukwu; Johane Takeuchi; Antonello Ceravola; Frank Joublin; | arxiv-cs.CL | 2024-09-03 |
330 | LifeGPT: Topology-Agnostic Generative Pretrained Transformer Model for Cellular Automata Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Modeling and predicting such intricate behavior without explicit knowledge of the system’s underlying topology presents a significant challenge, motivating the development of algorithms that can generalize across various grid configurations and boundary conditions. We develop a decoder-only generative pretrained transformer (GPT) model to solve this problem, showing that our model can simulate Life on a toroidal grid with no prior knowledge on the size of the grid, or its periodic boundary conditions (LifeGPT). |
Jaime A. Berkovich; Markus J. Buehler; | arxiv-cs.AI | 2024-09-03 |
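Since the model above is trained to emulate Life transitions on a toroidal grid, the ground-truth update is easy to state in code. The sketch below generates (state, next-state) pairs of the kind such a model would be trained on; the flattening into token sequences is an assumption about the data format.

```python
import numpy as np

def life_step(grid):
    # One Game of Life update; np.roll wraps edges, giving a toroidal grid.
    n = sum(np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            if (dy, dx) != (0, 0))
    return ((n == 3) | ((grid == 1) & (n == 2))).astype(np.uint8)

rng = np.random.default_rng(0)
g = rng.integers(0, 2, size=(8, 8), dtype=np.uint8)
pair = (g.flatten(), life_step(g).flatten())  # a (state, next state) training pair
```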
331 | How Privacy-Savvy Are Large Language Models? A Case Study on Compliance and Privacy Technical Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper seeks to address this gap by providing a comprehensive case study evaluating LLMs’ performance in privacy-related tasks such as privacy information extraction (PIE), legal and regulatory key point detection (KPD), and question answering (QA) with respect to privacy policies and data protection regulations. We introduce a Privacy Technical Review (PTR) framework, highlighting its role in mitigating privacy risks during the software development life-cycle. |
XICHOU ZHU et. al. | arxiv-cs.CL | 2024-09-03 |
332 | Beyond ChatGPT: Enhancing Software Quality Assurance Tasks with Diverse LLMs and Validation Techniques Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: There remains a gap in understanding the performance of various LLMs in this critical domain. This paper aims to address this gap by conducting a comprehensive investigation into the capabilities of several LLMs across two SQA tasks: fault localization and vulnerability detection. |
Ratnadira Widyasari; David Lo; Lizi Liao; | arxiv-cs.SE | 2024-09-02 |
333 | The Role of Transformer Models in Advancing Blockchain Technology: A Systematic Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This review aims to provide new perspectives and a research foundation for the integrated development of blockchain technology and machine learning, supporting further innovation and application expansion of blockchain technology. |
TIANXU LIU et. al. | arxiv-cs.LG | 2024-09-02 |
334 | Towards Faster Graph Partitioning Via Pre-training and Inductive Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Following the IEEE HPEC Graph Challenge and recent advances in pre-training techniques (e.g., large-language models), we propose PR-GPT (Pre-trained & Refined Graph ParTitioning) based on a novel pre-training & refinement paradigm. |
MENG QIN et. al. | arxiv-cs.LG | 2024-09-01 |
335 | Research on LLM Acceleration Using The High-Performance RISC-V Processor Xiangshan (Nanhu Version) Based on The Open-Source Matrix Instruction Set Extension (Vector Dot Product) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main contributions of this paper are as follows: to match the characteristics of large language models, custom instructions extending the RISC-V instruction set were added to perform vector dot-product calculations, accelerating the computation of large language models on dedicated vector dot-product acceleration hardware. |
XU-HAO CHEN et. al. | arxiv-cs.AR | 2024-09-01 |
336 | An Empirical Study on Information Extraction Using Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our results suggest a visible performance gap between GPT-4 and state-of-the-art (SOTA) IE methods. To alleviate this problem, considering the LLMs’ human-like characteristics, we propose and analyze the effects of a series of simple prompt-based methods, which can be generalized to other LLMs and NLP tasks. |
RIDONG HAN et. al. | arxiv-cs.CL | 2024-08-31 |
337 | From Text to Emotion: Unveiling The Emotion Annotation Capabilities of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the potential of Large Language Models (LLMs), specifically GPT4, in automating or assisting emotion annotation. |
Minxue Niu; Mimansa Jaiswal; Emily Mower Provost; | arxiv-cs.CL | 2024-08-30 |
338 | Retrieval-Augmented Natural Language Reasoning for Explainable Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a new VQA-NLE model, ReRe (Retrieval-augmented natural language Reasoning), which leverages retrieved information from memory to aid in generating accurate answers and persuasive explanations without relying on complex networks and extra datasets. |
Su Hyeon Lim; Minkuk Kim; Hyeon Bae Kim; Seong Tae Kim; | arxiv-cs.CV | 2024-08-30 |
339 | Improving Extraction of Clinical Event Contextual Properties from Electronic Health Records: A Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study performs a comparative analysis of various natural language models for medical text classification. |
SHUBHAM AGARWAL et. al. | arxiv-cs.CL | 2024-08-30 |
340 | Can Large Language Models Address Open-Target Stance Detection? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Open-Target Stance Detection (OTSD), the most realistic task where targets are neither seen during training nor provided as input. |
Abu Ubaida Akash; Ahmed Fahmy; Amine Trabelsi; | arxiv-cs.CL | 2024-08-30 |
341 | Finding Frames with BERT: A Transformer-based Approach to Generic News Frame Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The availability of digital data offers new possibilities for studying how specific aspects of social reality are made more salient in online communication but also raises challenges related to the scaling of framing analysis and its adoption to new research areas (e.g. studying the impact of artificial intelligence-powered systems on representation of societally relevant issues). To address these challenges, we introduce a transformer-based approach for generic news frame detection in Anglophone online content. |
Vihang Jumle; Mykola Makhortykh; Maryna Sydorova; Victoria Vziatysheva; | arxiv-cs.CL | 2024-08-30 |
342 | Assessing Generative Language Models in Classification Tasks: Performance and Self-Evaluation Capabilities in The Environmental and Climate Change Domain Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Through this research, we aim to contribute to the ongoing discussion on the utility and effectiveness of generative LMs in addressing some of the planet’s most urgent issues, highlighting their strengths and limitations in the context of ecology and CC. |
Francesca Grasso; Stefano Locci; | arxiv-cs.CL | 2024-08-30 |
343 | ProGRes: Prompted Generative Rescoring on ASR N-Best Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a novel method that uses instruction-tuned LLMs to dynamically expand the n-best speech recognition hypotheses with new hypotheses generated through appropriately-prompted LLMs. |
Ada Defne Tur; Adel Moumen; Mirco Ravanelli; | arxiv-cs.CL | 2024-08-30 |
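The core move described above, handing the n-best list to an instruction-tuned LLM so it can propose new hypotheses, can be sketched as prompt construction; the wording below is illustrative, not the paper's actual prompt.

```python
def build_rescoring_prompt(nbest):
    """Format an ASR n-best list into a prompt asking an LLM to
    propose a corrected transcription."""
    lines = "\n".join(f"{i + 1}. {hyp}" for i, hyp in enumerate(nbest))
    return (
        "These are candidate transcriptions of the same utterance:\n"
        f"{lines}\n"
        "Suggest the most plausible transcription, fixing likely errors."
    )

print(build_rescoring_prompt(["i sore a cat", "i saw a cat", "eye saw a cat"]))
```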
344 | Assessing Large Language Models for Online Extremism Research: Identification, Explanation, and New Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates the performance of Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-Trained Transformers (GPT) in detecting and classifying online domestic extremist posts. We collected social media posts containing far-right and far-left ideological keywords and manually labeled them as extremist or non-extremist. |
Beidi Dong; Jin R. Lee; Ziwei Zhu; Balassubramanian Srinivasan; | arxiv-cs.CL | 2024-08-29 |
345 | MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Following current trends in machine learning, we have created a foundation model for the MAPF problems called MAPF-GPT. |
Anton Andreychuk; Konstantin Yakovlev; Aleksandr Panov; Alexey Skrynnik; | arxiv-cs.MA | 2024-08-29 |
346 | Training Ultra Long Context Language Model with Fully Pipelined Distributed Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Fully Pipelined Distributed Transformer (FPDT) for efficiently training long-context LLMs with extreme hardware efficiency. |
JINGHAN YAO et. al. | arxiv-cs.DC | 2024-08-29 |
347 | Can AI Replace Human Subjects? A Large-Scale Replication of Psychological Experiments with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find that GPT-4 successfully replicates 76.0 percent of main effects and 47.0 percent of interaction effects observed in the original studies, closely mirroring human responses in both direction and significance. |
Ziyan Cui; Ning Li; Huaikang Zhou; | arxiv-cs.CL | 2024-08-29 |
348 | FRACTURED-SORRY-Bench: Framework for Revealing Attacks in Conversational Turns Undermining Refusal Efficacy and Defenses Over SORRY-Bench (Automated Multi-shot Jailbreaks) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces FRACTURED-SORRY-Bench, a framework for evaluating the safety of Large Language Models (LLMs) against multi-turn conversational attacks. |
Aman Priyanshu; Supriti Vijay; | arxiv-cs.CL | 2024-08-28 |
349 | Unleashing The Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an Audio-Language-Referenced SAM 2 (AL-Ref-SAM 2) pipeline to explore the training-free paradigm for audio and language-referenced video object segmentation, namely AVS and RVOS tasks. |
SHAOFEI HUANG et. al. | arxiv-cs.CV | 2024-08-28 |
350 | The Mamba in The Llama: Distilling and Accelerating Hybrid Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Linear RNN architectures, like Mamba, can be competitive with Transformer models in language modeling while having advantageous deployment characteristics. Given the focus on training large-scale Transformer models, we consider the challenge of converting these pretrained models for deployment. |
Junxiong Wang; Daniele Paliotta; Avner May; Alexander M. Rush; Tri Dao; | arxiv-cs.LG | 2024-08-27 |
351 | A Review of Transformer-Based Models for Computer Vision Tasks: Capturing Global Context and Spatial Relationships Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this review paper, we provide an extensive overview of various transformer architectures adapted for computer vision tasks. |
Gracile Astlin Pereira; Muhammad Hussain; | arxiv-cs.CV | 2024-08-27 |
352 | Speech Recognition Transformers: Topological-lingualism Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The paper presents a comprehensive survey of transformer techniques oriented in speech modality. |
Shruti Singh; Muskaan Singh; Virender Kadyan; | arxiv-cs.CL | 2024-08-27 |
353 | One-layer Transformers Fail to Solve The Induction Heads Task Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A simple communication complexity argument proves that no one-layer transformer can solve the induction heads task unless its size is exponentially larger than the size sufficient … |
Clayton Sanford; Daniel Hsu; Matus Telgarsky; | arxiv-cs.LG | 2024-08-26 |
354 | Evaluating Large Language Models on Spatial Tasks: A Multi-Task Benchmarking Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluated multiple models, including OpenAI’s gpt-3.5-turbo, gpt-4o, and ZhipuAI’s glm-4, through a two-phase testing approach. |
LIUCHANG XU et. al. | arxiv-cs.CL | 2024-08-26 |
355 | Beyond Detection: Leveraging Large Language Models for Cyber Attack Prediction in IoT Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite considerable efforts in attack detection, intrusion detection systems remain mostly reactive, responding to specific patterns or observed anomalies. This work proposes a proactive approach to anticipate and mitigate malicious activities before they cause damage. |
Alaeddine Diaf; Abdelaziz Amara Korba; Nour Elislem Karabadji; Yacine Ghamri-Doudane; | arxiv-cs.CR | 2024-08-26 |
356 | Vision-Language and Large Language Model Performance in Gastroenterology: GPT, Claude, Llama, Phi, Mistral, Gemma, and Quantized Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Methods: We used 300 gastroenterology board exam-style multiple-choice questions, 138 of which contain images to systematically assess the impact of model configurations and parameters and prompt engineering strategies utilizing GPT-3.5. |
SEYED AMIR AHMAD SAFAVI-NAINI et. al. | arxiv-cs.CL | 2024-08-25 |
357 | LowCLIP: Adapting The CLIP Model Architecture for Low-Resource Languages in Multimodal Image Retrieval Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address challenges in vision-language retrieval for low-resource languages, we integrated the CLIP model architecture and employed several techniques to balance computational efficiency with performance. |
Ali Asgarov; Samir Rustamov; | arxiv-cs.CV | 2024-08-25 |
358 | Bidirectional Awareness Induction in Autoregressive Seq2Seq Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Bidirectional Awareness Induction (BAI), a training method that leverages a subset of elements in the network, the Pivots, to perform bidirectional learning without breaking the autoregressive constraints. |
Jia Cheng Hu; Roberto Cavicchioli; Alessandro Capotondi; | arxiv-cs.CL | 2024-08-25 |
359 | Preliminary Investigations of A Multi-Faceted Robust and Synergistic Approach in Semiconductor Electron Micrograph Analysis: Integrating Vision Transformers with Large Language and Multimodal Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces an innovative architecture that leverages the generative capabilities of zero-shot prompting in Large Language Models (LLMs) such as GPT-4 (language only), the predictive ability of few-shot (in-context) learning in Large Multimodal Models (LMMs) such as GPT-4(V)ision, and fuses knowledge across image-based and linguistic insights for accurate nanomaterial category prediction. |
Sakhinana Sagar Srinivas; Geethan Sannidhi; Sreeja Gangasani; Chidaksh Ravuru; Venkataramana Runkana; | arxiv-cs.CV | 2024-08-24 |
360 | CNN-Transformer Rectified Collaborative Learning for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this paper proposes a CNN-Transformer rectified collaborative learning (CTRCL) framework to learn stronger CNN-based and Transformer-based models for MIS tasks via the bi-directional knowledge transfer between them. |
LANHU WU et. al. | arxiv-cs.CV | 2024-08-24 |
361 | Utilizing Large Language Models for Named Entity Recognition in Traditional Chinese Medicine Against COVID-19 Literature: Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: To explore and compare the performance of ChatGPT and other state-of-the-art LLMs on domain-specific NER tasks covering different entity types and domains in TCM against COVID-19 literature. Methods: We established a dataset of 389 articles on TCM against COVID-19, and manually annotated 48 of them with 6 types of entities belonging to 3 domains as the ground truth, against which the NER performance of LLMs can be assessed. |
XU TONG et. al. | arxiv-cs.CL | 2024-08-24 |
362 | Enhancing Multi-hop Reasoning Through Knowledge Erasure in Large Language Model Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building on the validated hypothesis, we propose a novel knowledge editing method that incorporates a Knowledge Erasure mechanism for Large language model Editing (KELE). |
MENGQI ZHANG et. al. | arxiv-cs.CL | 2024-08-22 |
363 | Enhancing Automated Program Repair with Solution Design Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This raises a compelling question: How can we leverage DR scattered across the issue logs to efficiently enhance APR? To investigate this premise, we introduce DRCodePilot, an approach designed to augment GPT-4-Turbo’s APR capabilities by incorporating DR into the prompt instruction. |
JIUANG ZHAO et. al. | arxiv-cs.SE | 2024-08-21 |
364 | Hypformer: Exploring Efficient Transformer Fully in Hyperbolic Space Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This stems primarily from: (i) the absence of well-defined modules in hyperbolic space, including linear transformation layers, LayerNorm layers, activation functions, dropout operations, etc.; (ii) the quadratic time complexity of the existing hyperbolic self-attention module w.r.t. the number of input tokens, which hinders its scalability. To address these challenges, we propose Hypformer, a novel hyperbolic Transformer based on the Lorentz model of hyperbolic geometry. |
MENGLIN YANG et. al. | kdd | 2024-08-21 |
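For context on the geometry Hypformer builds on: in the Lorentz model, points live on the hyperboloid ⟨x, x⟩_L = −1 under the Minkowski inner product, and geodesic distance is arccosh(−⟨x, y⟩_L). A minimal sketch of these standard primitives (textbook formulas, not Hypformer's proposed modules):

```python
import numpy as np

def minkowski_inner(x, y):
    """Minkowski inner product <x, y>_L = -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + np.dot(x[1:], y[1:])

def lorentz_distance(x, y):
    """Geodesic distance on the hyperboloid <x, x>_L = -1 (curvature -1)."""
    return np.arccosh(np.clip(-minkowski_inner(x, y), 1.0, None))

def lift(v):
    """Lift a Euclidean point v onto the hyperboloid by solving for x0."""
    return np.concatenate([[np.sqrt(1.0 + np.dot(v, v))], v])

x, y = lift(np.array([0.3, -0.1])), lift(np.array([-0.2, 0.4]))
print(lorentz_distance(x, y))
```

The missing "well-defined modules" the highlight mentions are exactly operations like LayerNorm or dropout that have no canonical analogue on this curved surface.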
365 | Clinical Context-aware Radiology Report Generation from Medical Images Using Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the use of the transformer model for radiology report generation from chest X-rays. |
Sonit Singh; | arxiv-cs.CL | 2024-08-21 |
366 | BURExtract-Llama: An LLM for Clinical Concept Extraction in Breast Ultrasound Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents a pipeline for developing an in-house LLM to extract clinical information from radiology reports. |
YUXUAN CHEN et. al. | arxiv-cs.CL | 2024-08-21 |
367 | The Self-Contained Negation Test Set Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we build on Gubelmann and Handschuh (2022), which studies how PLMs’ predictions change as a function of the polarity of English inputs. |
David Kletz; Pascal Amsili; Marie Candito; | arxiv-cs.CL | 2024-08-21 |
368 | Maximum-Entropy Regularized Decision Transformer with Reward Relabelling for Dynamic Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we introduce a novel methodology named Max-Entropy enhanced Decision Transformer with Reward Relabeling for Offline RLRS (EDT4Rec). |
Xiaocong Chen; Siyu Wang; Lina Yao; | kdd | 2024-08-21 |
369 | GSTran: Joint Geometric and Semantic Coherence for Point Cloud Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In parallel, inaccurate modeling of long-distance contextual dependencies when utilizing global information can also impact model performance. To address these issues, we propose GSTran, a novel transformer network tailored for the segmentation task. |
ABIAO LI et. al. | arxiv-cs.CV | 2024-08-21 |
370 | Mixed Sparsity Training: Achieving 4$\times$ FLOP Reduction for Transformer Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) have made significant strides in complex tasks, yet their widespread adoption is impeded by substantial computational demands. |
Pihe Hu; Shaolong Li; Longbo Huang; | arxiv-cs.LG | 2024-08-21 |
371 | Selene: Pioneering Automated Proof in Software Verification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Selene in this paper, which is the first project-level automated proof benchmark constructed based on the real-world industrial-level operating system microkernel, seL4. |
Lichen Zhang; Shuai Lu; Nan Duan; | acl | 2024-08-20 |
372 | The MERSA Dataset and A Transformer-Based Approach for Speech Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the Multimodal Emotion Recognition and Sentiment Analysis (MERSA) dataset, which includes both natural and scripted speech recordings, transcribed text, physiological data, and self-reported emotional surveys collected from 150 participants over a two-week period. |
Enshi Zhang; Rafael Trujillo; Christian Poellabauer; | acl | 2024-08-20 |
373 | Language Modeling on Tabular Data: A Survey of Foundations, Techniques and Evolution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Subsequently, methods leveraging pre-trained language models like BERT have been developed, which require less data and yield enhanced performance. |
YUCHENG RUAN et. al. | arxiv-cs.CL | 2024-08-20 |
374 | Automated Detection of Algorithm Debt in Deep Learning Frameworks: An Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Aim: Our goal is to improve AD detection performance of various ML/DL models. |
Emmanuel Iko-Ojo Simon; Chirath Hettiarachchi; Alex Potanin; Hanna Suominen; Fatemeh Fard; | arxiv-cs.SE | 2024-08-20 |
375 | Self-Evolving GPT: A Lifelong Autonomous Experiential Learner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they rely on manual efforts to acquire and apply such experience for each task, which is not feasible for the growing demand for LLMs and the variety of user questions. To address this issue, we design a lifelong autonomous experiential learning framework based on LLMs to explore whether LLMs can imitate human ability for learning and utilizing experience. |
JINGLONG GAO et. al. | acl | 2024-08-20 |
376 | Acquiring Clean Language Models from Backdoor Poisoned Datasets By Downscaling Frequency Space Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate the learning mechanisms of backdoor LMs in the frequency space by Fourier analysis. |
Zongru Wu; Zhuosheng Zhang; Pengzhou Cheng; Gongshen Liu; | acl | 2024-08-20 |
377 | ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose ChiMed-GPT, a new benchmark LLM designed explicitly for the Chinese medical domain, which undergoes a comprehensive training regime with pre-training, SFT, and RLHF. |
Yuanhe Tian; Ruyi Gan; Yan Song; Jiaxing Zhang; Yongdong Zhang; | acl | 2024-08-20 |
378 | Dependency Transformer Grammars: Integrating Dependency Structures Into Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Syntactic Transformer language models aim to achieve better generalization through simultaneously modeling syntax trees and sentences. |
Yida Zhao; Chao Lou; Kewei Tu; | acl | 2024-08-20 |
379 | GPT Is Not An Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores whether an LLM (specifically, GPT-3. |
Virginia Felkner; Jennifer Thompson; Jonathan May; | acl | 2024-08-20 |
380 | CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Incorrect initial angles between Q and K can cause misestimation in modeling rotary position embedding of the closest tokens. To address this issue, we propose Collinear Constrained Attention mechanism, namely CoCA. |
SHIYI ZHU et. al. | acl | 2024-08-20 |
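Background for this entry: rotary position embedding (RoPE) rotates paired coordinates of Q and K by position-dependent angles so that attention scores depend only on relative position; CoCA's collinear constraint on the initial Q/K alignment is the paper's contribution and is not reproduced here. A sketch of the vanilla RoPE baseline, assuming the split-half pairing convention:

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Standard rotary position embedding for a vector x of even
    dimension at integer position pos (split-half pairing)."""
    half = x.shape[-1] // 2
    theta = pos * base ** (-np.arange(half) / half)   # per-pair angles
    x1, x2 = x[:half], x[half:]
    return np.concatenate([x1 * np.cos(theta) - x2 * np.sin(theta),
                           x1 * np.sin(theta) + x2 * np.cos(theta)])

q, k = np.random.randn(8), np.random.randn(8)
# Scores depend only on the relative offset: shifting both positions by 10
# leaves the dot product unchanged.
print(np.allclose(rope(q, 3) @ rope(k, 7), rope(q, 13) @ rope(k, 17)))
```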
381 | MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a novel **map**-guided **GPT**-based agent, dubbed **MapGPT**, which introduces an online linguistic-formed map to encourage the global exploration. |
JIAQI CHEN et. al. | acl | 2024-08-20 |
382 | Your Transformer Is Secretly Linear Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper reveals a novel linear characteristic exclusive to transformer decoders, including models like GPT, LLaMA, OPT, BLOOM and others. |
ANTON RAZZHIGAEV et. al. | acl | 2024-08-20 |
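One way to picture the paper's claim: collect the hidden states entering and leaving a decoder block, fit a single linear map between them, and measure how much variance it explains. A simplified probe along those lines (the authors' actual metric involves careful normalization; this R² version is only illustrative):

```python
import numpy as np

def linearity_score(H_in, H_out):
    """Fit H_out ≈ H_in @ W by least squares and return the R² of the
    fit; values near 1 mean the block acts almost linearly on its input.
    H_in, H_out: (n_tokens, d) hidden states before/after one block."""
    W, *_ = np.linalg.lstsq(H_in, H_out, rcond=None)
    resid = H_out - H_in @ W
    return 1.0 - (resid**2).sum() / ((H_out - H_out.mean(0))**2).sum()

# Toy check: an almost-linear "block" scores near 1. Real hidden states
# would come from a model forward pass, not random data.
H = np.random.randn(512, 64)
H_next = H @ np.random.randn(64, 64) + 0.01 * np.random.randn(512, 64)
print(linearity_score(H, H_next))
```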
383 | CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the absence of a comprehensive benchmark impedes progress in this field. To bridge this gap, we introduce CharacterEval, a Chinese benchmark for comprehensive RPCA assessment, complemented by a tailored high-quality dataset. |
QUAN TU et. al. | acl | 2024-08-20 |
384 | MultiLegalPile: A 689GB Multilingual Legal Corpus IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, so far, few datasets are available for specialized critical domains such as law and the available ones are often small and only in English. To fill this gap, we curate and release MultiLegalPile, a 689GB corpus in 24 languages from 17 jurisdictions. |
Joel Niklaus; Veton Matoshi; Matthias Stürmer; Ilias Chalkidis; Daniel Ho; | acl | 2024-08-20 |
385 | Mission: Impossible Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we develop a set of synthetic impossible languages of differing complexity, each designed by systematically altering English data with unnatural word orders and grammar rules. |
Julie Kallini; Isabel Papadimitriou; Richard Futrell; Kyle Mahowald; Christopher Potts; | acl | 2024-08-20 |
386 | Advancing Parameter Efficiency in Fine-tuning Via Representation Editing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the promising performance of current PEFT methods, they present challenges in hyperparameter selection, such as determining the rank of LoRA or Adapter, or specifying the length of soft prompts. In addressing these challenges, we propose a novel approach to fine-tuning neural models, termed Representation EDiting (RED), which scales and biases the representation produced at each layer. |
MULING WU et. al. | acl | 2024-08-20 |
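Taken at face value, the edit the highlight describes is an elementwise scale and bias learned over each frozen layer's output, so only 2 × hidden_size parameters train per layer. A minimal sketch under that reading (module name, initialization, and placement are our assumptions, not the paper's exact design):

```python
import torch
import torch.nn as nn

class RepresentationEdit(nn.Module):
    """Hypothetical per-layer representation edit: scale and bias the
    hidden states of a frozen layer; only these two vectors train."""
    def __init__(self, hidden_size):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(hidden_size))   # identity at init
        self.bias = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, hidden_states):
        return hidden_states * self.scale + self.bias

edit = RepresentationEdit(hidden_size=768)
h = torch.randn(2, 16, 768)   # (batch, seq, hidden) from a frozen layer
print(edit(h).shape)          # torch.Size([2, 16, 768])
```

Note how this sidesteps the hyperparameters the highlight complains about: there is no rank to choose and no prompt length to tune.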
387 | Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs By Sampling with People Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by methods from cognitive science, we propose an iterative method for simultaneously eliciting conversational tones and sentences, where participants alternate between two tasks: (1) one participant identifies the tone of a given sentence and (2) a different participant generates a sentence based on that tone. |
Dun-Ming Huang; Pol Van Rijn; Ilia Sucholutsky; Raja Marjieh; Nori Jacoby; | acl | 2024-08-20 |
388 | An Empirical Analysis on Large Language Models in Debate Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we investigate the capabilities and inherent biases of advanced large language models (LLMs) such as GPT-3. |
Xinyi Liu; Pinxin Liu; Hangfeng He; | acl | 2024-08-20 |
389 | Tree Transformer’s Disambiguation Ability of Prepositional Phrase Attachment and Garden Path Effects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For comparison, we evaluate a pretrained supervised BiLSTM-based model trained on constituency parsing as sequence labelling (Gómez-Rodríguez and Vilares, 2018). |
Lingling Zhou; Suzan Verberne; Gijs Wijnholds; | acl | 2024-08-20 |
390 | MELA: Multilingual Evaluation of Linguistic Acceptability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present the largest benchmark to date on linguistic acceptability: Multilingual Evaluation of Linguistic Acceptability (MELA), with 46K samples covering 10 languages from a diverse set of language families. |
ZIYIN ZHANG et. al. | acl | 2024-08-20 |
391 | Language Models Can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates whether LLMs can generalize from labeled examples of predefined tasks to novel tasks. |
Anwoy Chatterjee; Eshaan Tanwar; Subhabrata Dutta; Tanmoy Chakraborty; | acl | 2024-08-20 |
392 | Linear Transformers with Learnable Kernel Functions Are Better In-Context Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Mirroring the Transformer’s in-context adeptness, it became a strong contender in the field. In our work, we present a singular, elegant alteration to the Based kernel that amplifies its In-Context Learning abilities, evaluated with the Multi-Query Associative Recall task and overall language modeling, as demonstrated on the Pile dataset. |
YAROSLAV AKSENOV et. al. | acl | 2024-08-20 |
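Background for this entry: kernelized linear attention replaces softmax(QKᵀ)V with φ(Q)(φ(K)ᵀV), which costs linear rather than quadratic time in sequence length; the paper's contribution is making the kernel φ itself learnable. A non-causal sketch with a fixed stand-in feature map (elu(x)+1), not the Based kernel:

```python
import numpy as np

def linear_attention(Q, K, V, phi):
    """Kernelized attention: phi(Q) @ (phi(K).T @ V), normalized per
    query. The small (r, d) summary phi(K).T @ V is built in O(n)."""
    Qf, Kf = phi(Q), phi(K)
    out = Qf @ (Kf.T @ V)           # (n, d) without forming an n×n matrix
    z = Qf @ Kf.sum(axis=0)         # per-query normalizer
    return out / z[:, None]

phi = lambda X: np.where(X > 0, X + 1.0, np.exp(X))   # elu(x)+1, a classic choice
Q, K, V = (np.random.randn(128, 32) for _ in range(3))
print(linear_attention(Q, K, V, phi).shape)   # (128, 32)
```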
393 | Crafting Tomorrow’s Headlines: Neural News Generation and Detection in English, Turkish, Hungarian, and Persian Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we take a significant step by introducing a benchmark dataset designed for neural news detection in four languages: English, Turkish, Hungarian, and Persian. |
CEM ÜYÜK et. al. | arxiv-cs.CL | 2024-08-20 |
394 | D2LLM: Decomposed and Distilled Large Language Models for Semantic Search Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present D2LLM (Decomposed and Distilled LLMs) for semantic search, which combines the best of both worlds. |
Zihan Liao; Hang Yu; Jianguo Li; Jun Wang; Wei Zhang; | acl | 2024-08-20 |
395 | Rhyme-aware Chinese Lyric Generator Based on GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enhance the rhyming quality of generated lyrics, we incorporate integrated rhyme information into our model, thereby improving lyric generation performance. |
YIXIAO YUAN et. al. | arxiv-cs.CL | 2024-08-19 |
396 | GPT-based Textile Pilling Classification Using 3D Point Cloud Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on PointGPT, the GPT-like big model of point cloud analysis, we incorporate the global features of the input point cloud extracted from the non-parametric network into it, thus proposing the PointGPT+NN model. |
Yu Lu; YuYu Chen; Gang Zhou; Zhenghua Lan; | arxiv-cs.CV | 2024-08-19 |
397 | How Well Do Large Language Models Serve As End-to-End Secure Code Producers? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a systematic investigation into LLMs’ inherent potential to generate code with fewer vulnerabilities. |
JIANIAN GONG et. al. | arxiv-cs.SE | 2024-08-19 |
398 | Demystifying The Communication Characteristics for Distributed Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use GPT-based language models as a case study of the transformer architecture due to their ubiquity. |
QUENTIN ANTHONY et. al. | arxiv-cs.DC | 2024-08-19 |
399 | Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a method that is able to distill a pretrained Transformer architecture into alternative architectures such as state space models (SSMs). |
Aviv Bick; Kevin Y. Li; Eric P. Xing; J. Zico Kolter; Albert Gu; | arxiv-cs.LG | 2024-08-19 |
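Abstracted, the recipe trains the subquadratic student to match the pretrained transformer teacher's output distribution. A generic logit-distillation objective as a stand-in (the paper also aligns internal representations; this is not its exact loss):

```python
import torch
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, T=2.0):
    """Temperature-scaled KL divergence between teacher and student
    next-token distributions; gradients flow only to the student."""
    s = F.log_softmax(student_logits / T, dim=-1)
    t = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * (T * T)

student = torch.randn(4, 50257, requires_grad=True)  # e.g. SSM output logits
teacher = torch.randn(4, 50257)                      # frozen transformer logits
print(distill_loss(student, teacher).item())
```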
400 | STransformer: A Modular Approach for Extracting Inter-Sequential and Temporal Information for Time-Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we review and categorize existing Transformer-based models into two main types: (1) modifications to the model structure and (2) modifications to the input data. |
JIAHENG YIN et. al. | arxiv-cs.LG | 2024-08-19 |
401 | Active Learning for Identifying Disaster-Related Tweets: A Comparison with Keyword Filtering and Generic Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We compare the classification performance of a keyword filtering approach, a RoBERTa model fine-tuned with generic data from CrisisLex, a base RoBERTa model trained with AL, and a fine-tuned RoBERTa model trained with AL. |
David Hanny; Sebastian Schmidt; Bernd Resch; | arxiv-cs.CL | 2024-08-19 |
402 | A Unified Framework for Interpretable Transformers Using PDEs and Information Theory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel unified theoretical framework for understanding Transformer architectures by integrating Partial Differential Equations (PDEs), Neural Information Flow Theory, and Information Bottleneck Theory. |
Yukun Zhang; | arxiv-cs.LG | 2024-08-18 |
403 | A Strategy to Combine 1stGen Transformers and Open LLMs for Automatic Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares three 1stTRs (BERT, RoBERTa, and BART) with two open LLMs (Llama 2 and Bloom) across 11 sentiment analysis datasets. |
CLAUDIO M. V. DE ANDRADE et. al. | arxiv-cs.CL | 2024-08-18 |
404 | Attention Is A Smoothed Cubic Spline Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We highlight a perhaps important but hitherto unobserved insight: The attention module in a transformer is a smoothed cubic spline. |
Zehua Lai; Lek-Heng Lim; Yucong Liu; | arxiv-cs.AI | 2024-08-18 |
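For reference, the module the claim concerns is standard scaled dot-product attention:

```latex
\mathrm{Attn}(X) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V,
\qquad Q = XW_Q,\quad K = XW_K,\quad V = XW_V .
```

Ignoring the softmax, each output entry is a degree-3 polynomial in the entries of X (degree 2 from QKᵀ times degree 1 from V), which gives a rough intuition for where "cubic" enters, with the softmax supplying the smoothing.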
405 | From Specifications to Prompts: On The Future of Generative LLMs in Requirements Engineering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative LLMs, such as GPT, have the potential to revolutionize Requirements Engineering (RE) by automating tasks in new ways. This column explores the novelties and introduces … |
Andreas Vogelsang; | arxiv-cs.SE | 2024-08-17 |
406 | See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Designing tasks and finding LLMs’ limitations are becoming increasingly important. In this paper, we investigate the question of whether an LLM can discover its own limitations from the errors it makes. |
YULONG CHEN et. al. | arxiv-cs.CL | 2024-08-16 |
407 | MAT-SED: A Masked Audio Transformer with Masked-Reconstruction Based Pre-training for Sound Event Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a pure Transformer-based SED model with masked-reconstruction based pre-training, termed MAT-SED. |
Pengfei Cai; Yan Song; Kang Li; Haoyu Song; Ian McLoughlin; | arxiv-cs.SD | 2024-08-16 |
408 | Extracting Sentence Embeddings from Pretrained Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Methods: Given the 110M-parameter BERT’s hidden representations from multiple layers and multiple tokens, we tried various ways to extract optimal sentence representations. |
Lukas Stankevičius; Mantas Lukoševičius; | arxiv-cs.CL | 2024-08-15 |
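A common baseline among the extraction methods such studies compare is mean-pooling token vectors from a chosen hidden layer. A sketch with the Hugging Face transformers API (the layer choice is illustrative, and this is the baseline, not the paper's best-performing method):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

def sentence_embedding(text, layer=-1):
    """Mean-pool the token vectors of one hidden layer, ignoring padding."""
    batch = tok(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).hidden_states[layer]   # (1, seq_len, 768)
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)        # (1, 768)

print(sentence_embedding("Transformers produce contextual token vectors.").shape)
```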
409 | Leveraging Web-Crawled Data for High-Quality Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We argue that although the web-crawled data often has formatting errors causing semantic inaccuracies, it can still serve as a valuable source for high-quality supervised fine-tuning in specific domains without relying on advanced models like GPT-4. |
Jing Zhou; Chenglin Jiang; Wei Shen; Xiao Zhou; Xiaonan He; | arxiv-cs.CL | 2024-08-15 |
410 | Transformers and Large Language Models for Efficient Intrusion Detection Systems: A Comprehensive Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This survey paper provides a comprehensive analysis of the utilization of Transformers and LLMs in cyber-threat detection systems. |
Hamza Kheddar; | arxiv-cs.CR | 2024-08-14 |
411 | MultiSurf-GPT: Facilitating Context-Aware Reasoning with Large-Scale Language Models for Multimodal Surface Sensing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose MultiSurf-GPT, which utilizes the advanced capabilities of GPT-4o to process and interpret diverse modalities (radar, microscope and multispectral data) uniformly based on prompting strategies (zero-shot and few-shot prompting). |
YONGQUAN HU et. al. | arxiv-cs.HC | 2024-08-14 |
412 | Evaluating Cultural Adaptability of A Large Language Model Via Simulation of Synthetic Personas Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our analysis shows that specifying a person’s country of residence improves GPT-3.5’s alignment with their responses. |
Louis Kwok; Michal Bravansky; Lewis D. Griffin; | arxiv-cs.CL | 2024-08-13 |
413 | Generative AI for Automatic Topic Labelling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes to assess the reliability of three LLMs, namely flan, GPT-4o, and GPT-4 mini for topic labelling. |
Diego Kozlowski; Carolina Pradier; Pierre Benz; | arxiv-cs.CL | 2024-08-13 |
414 | Pragmatic Inference of Scalar Implicature By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates how Large Language Models (LLMs), particularly BERT (Devlin et al., 2019) and GPT-2 (Radford et al., 2019), engage in pragmatic inference of scalar implicature, such as that of the scalar term ‘some’. |
Ye-eun Cho; Seong mook Kim; | arxiv-cs.CL | 2024-08-13 |
415 | Sumotosima: A Framework and Dataset for Classifying and Summarizing Otoscopic Images Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel resource efficient deep learning and transformer based framework, Sumotosima (Summarizer for otoscopic images), an end-to-end pipeline for classification followed by summarization. |
Eram Anwarul Khan; Anas Anwarul Haq Khan; | arxiv-cs.CV | 2024-08-13 |
416 | Large Language Models for Secure Code Assessment: A Multi-Language Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we evaluate the effectiveness of LLMs in detecting and classifying Common Weakness Enumerations (CWE) using different prompt and role strategies. |
Kohei Dozono; Tiago Espinha Gasiba; Andrea Stocco; | arxiv-cs.SE | 2024-08-12 |
417 | The Language of Trauma: Modeling Traumatic Event Descriptions Across Domains with Explainable AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, studies traditionally focus on a single aspect of trauma, often neglecting the transferability of findings across different scenarios. We address this gap by training language models with progressing complexity on trauma-related datasets, including genocide-related court data, a Reddit dataset on post-traumatic stress disorder (PTSD), counseling conversations, and Incel forum posts. |
Miriam Schirmer; Tobias Leemann; Gjergji Kasneci; Jürgen Pfeffer; David Jurgens; | arxiv-cs.CL | 2024-08-12 |
418 | Spacetime $E(n)$-Transformer: Equivariant Attention for Spatio-temporal Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an $E(n)$-equivariant Transformer architecture for spatio-temporal graph data. |
Sergio G. Charles; | arxiv-cs.LG | 2024-08-12 |
419 | Is It A Work or Leisure Travel? Applying Text Classification to Identify Work-related Travel on Social Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a model to predict whether a trip is leisure or work-related, utilizing state-of-the-art Automatic Text Classification (ATC) models such as BERT, RoBERTa, and BART to enhance the understanding of user travel purposes and improve recommendation accuracy in specific travel scenarios. |
Lucas Félix; Washington Cunha; Jussara Almeida; | arxiv-cs.SI | 2024-08-12 |
420 | Body Transformer: Leveraging Robot Embodiment for Policy Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite notable evidence of successful deployment of this architecture in the context of robot learning, we claim that vanilla transformers do not fully exploit the structure of the robot learning problem. Therefore, we propose Body Transformer (BoT), an architecture that leverages the robot embodiment by providing an inductive bias that guides the learning process. |
Carmelo Sferrazza; Dun-Ming Huang; Fangchen Liu; Jongmin Lee; Pieter Abbeel; | arxiv-cs.RO | 2024-08-12 |
421 | A Perspective on Large Language Models, Intelligent Machines, and Knowledge Acquisition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there is a huge gap between LLMs’ and human capabilities for understanding abstract concepts and reasoning. We discuss these issues in a larger philosophical context of human knowledge acquisition and the Turing test. |
Vladimir Cherkassky; Eng Hock Lee; | arxiv-cs.CL | 2024-08-12 |
422 | Cross-Lingual Conversational Speech Summarization with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We build a baseline cascade-based system using open-source speech recognition and machine translation models. |
Max Nelson; Shannon Wotherspoon; Francis Keith; William Hartmann; Matthew Snover; | arxiv-cs.CL | 2024-08-12 |
423 | A RAG-Based Question-Answering Solution for Cyber-Attack Investigation and Attribution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the constantly evolving field of cybersecurity, it is imperative for analysts to stay abreast of the latest attack trends and pertinent information that aids in the investigation and attribution of cyber-attacks. In this work, we introduce the first question-answering (QA) model and its application, which provides cybersecurity experts with information about cyber-attack investigation and attribution. |
Sampath Rajapaksha; Ruby Rani; Erisa Karafili; | arxiv-cs.CR | 2024-08-12 |
424 | Chain of Condition: Construct, Verify and Solve Conditions for Conditional Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing approaches struggle with CQA due to two challenges: (1) precisely identifying necessary conditions and the logical relationship, and (2) verifying conditions to detect any that are missing. In this paper, we propose a novel prompting approach, Chain of condition, by first identifying all conditions and constructing their logical relationships explicitly according to the document, then verifying whether these conditions are satisfied, finally solving the logical expression to indicate any missing conditions and generating the answer accordingly. |
Jiuheng Lin; Yuxuan Lai; Yansong Feng; | arxiv-cs.CL | 2024-08-10 |
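The described construct-verify-solve pipeline maps naturally onto a four-step prompt. A hypothetical skeleton following those steps (the wording is ours; the authors' actual prompt is not reproduced here):

```python
# Hypothetical prompt skeleton mirroring the chain-of-condition steps.
CHAIN_OF_CONDITION = """Document: {document}
Question: {question}

Step 1: List every condition the document attaches to the answer.
Step 2: Combine the conditions into a logical expression (AND/OR).
Step 3: For each condition, state whether the question satisfies it.
Step 4: Solve the expression; give the answer and any missing conditions."""

prompt = CHAIN_OF_CONDITION.format(
    document="Visitors enter free if they are under 18 or hold a student ID.",
    question="Can a 20-year-old student enter for free?",
)
print(prompt)
```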
425 | Evaluating The Capability of Large Language Models to Personalize Science Texts for Diverse Middle-school-age Learners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, GPT-4 was used to profile student learning preferences based on choices made during a training session. |
Michael Vaccaro Jr; Mikayla Friday; Arash Zaghi; | arxiv-cs.HC | 2024-08-09 |
426 | Retrieval-augmented Code Completion for Local Projects Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on using LLMs with around 160 million parameters that are suitable for local execution and augmentation with retrieval from local projects. |
Marko Hostnik; Marko Robnik-Šikonja; | arxiv-cs.SE | 2024-08-09 |
427 | From Text to Insight: Leveraging Large Language Models for Performance Evaluation in Management Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through comparative analyses across two studies, including various task performance outputs, we demonstrate that LLMs can serve as a reliable and even superior alternative to human raters in evaluating knowledge-based performance outputs, which are a key contribution of knowledge workers. |
Ning Li; Huaikang Zhou; Mingze Xu; | arxiv-cs.CL | 2024-08-09 |
428 | Multi-Class Intrusion Detection Based on Transformer for IoT Networks Using CIC-IoT-2023 Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study uses deep learning methods to explore the Internet of Things (IoT) network intrusion detection method based on the CIC-IoT-2023 dataset. This dataset contains extensive … |
Shu-Ming Tseng; Yan-Qi Wang; Yung-Chung Wang; | Future Internet | 2024-08-08 |
429 | Transformer Explainer: Interactive Learning of Text-Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Transformer Explainer, an interactive visualization tool designed for non-experts to learn about Transformers through the GPT-2 model. |
AEREE CHO et. al. | arxiv-cs.LG | 2024-08-08 |
430 | Enhancing Journalism with AI: A Study of Contextualized Image Captioning for News Articles Using LLMs and LMMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores how LLMs and LMMs can assist journalistic practice by generating contextualised captions for images accompanying news articles. |
Aliki Anagnostopoulou; Thiago Gouvea; Daniel Sonntag; | arxiv-cs.CL | 2024-08-08 |
431 | Towards Explainable Network Intrusion Detection Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current state-of-the-art NIDS rely on artificial benchmarking datasets, resulting in skewed performance when applied to real-world networking environments. Therefore, we compare the GPT-4 and LLama3 models against traditional architectures and transformer-based models to assess their ability to detect malicious NetFlows without depending on artificially skewed datasets, but solely on their vast pre-trained acquired knowledge. |
Paul R. B. Houssel; Priyanka Singh; Siamak Layeghy; Marius Portmann; | arxiv-cs.CR | 2024-08-08 |
432 | Is Child-Directed Speech Effective Training Data for Language Models? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: What are the features of the data they receive, and how do these features support language modeling objectives? To investigate this question, we train GPT-2 and RoBERTa models on 29M words of English child-directed speech and a new matched, synthetic dataset (TinyDialogues), comparing to OpenSubtitles, Wikipedia, and a heterogeneous blend of datasets from the BabyLM challenge. |
Steven Y. Feng; Noah D. Goodman; Michael C. Frank; | arxiv-cs.CL | 2024-08-07 |
433 | Bi-Level Spatial and Channel-aware Transformer for Learned Image Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these nonlinear approaches frequently overlook the frequency characteristics of images, which limits their compression efficiency. To address this issue, we propose a novel Transformer-based image compression method that enhances the transformation stage by considering frequency components within the feature map. |
Hamidreza Soltani; Erfan Ghasemi; | arxiv-cs.CV | 2024-08-07 |
434 | A Comparison of LLM Finetuning Methods & Evaluation Metrics with Travel Chatbot Use Case Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We used two pretrained LLMs for our fine-tuning research: LLaMa 2 7B and Mistral 7B. |
Sonia Meyer; Shreya Singh; Bertha Tam; Christopher Ton; Angel Ren; | arxiv-cs.CL | 2024-08-07 |
435 | Evaluating Source Code Quality with Large Language Models: A Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper aims to describe and analyze the results obtained using LLMs as a static analysis tool, evaluating the overall quality of code. |
Igor Regis da Silva Simões; Elaine Venson; | arxiv-cs.SE | 2024-08-07 |
436 | Image-to-LaTeX Converter for Mathematical Formulas and Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this project, we train a vision encoder-decoder model to generate LaTeX code from images of mathematical formulas and text. |
Daniil Gurgurov; Aleksey Morshnev; | arxiv-cs.CL | 2024-08-07 |
437 | Accuracy and Consistency of LLMs in The Registered Dietitian Exam: The Impact of Prompt Engineering and Knowledge Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose to employ the Registered Dietitian (RD) exam to conduct a standard and comprehensive evaluation of state-of-the-art LLMs, GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro, assessing both accuracy and consistency in nutrition queries. |
Iman Azimi; Mohan Qi; Li Wang; Amir M. Rahmani; Youlin Li; | arxiv-cs.CL | 2024-08-06 |
438 | Training LLMs to Recognize Hedges in Spontaneous Narratives Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: After an error analysis on the top performing approaches, we used an LLM-in-the-Loop approach to improve the gold standard coding, as well as to highlight cases in which hedges are ambiguous in linguistically interesting ways that will guide future research. |
Amie J. Paige; Adil Soubki; John Murzaku; Owen Rambow; Susan E. Brennan; | arxiv-cs.CL | 2024-08-06 |
439 | HeTraX: Energy Efficient 3D Heterogeneous Manycore Architecture for Transformer Acceleration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose the design of a three-dimensional heterogeneous architecture referred to as HeTraX specifically optimized to accelerate end-to-end transformer models. |
Pratyush Dhingra; Janardhan Rao Doppa; Partha Pratim Pande; | arxiv-cs.AR | 2024-08-06 |
440 | Evaluating The Translation Performance of Large Language Models Based on Euas-20 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the significant progress in translation performance achieved by large language models, machine translation still faces many challenges. Therefore, in this paper, we construct the Euas-20 dataset to evaluate, for researchers and developers, the translation performance of large language models, their translation ability across different languages, and the effect of pre-training data on that ability. |
Yan Huang; Wei Liu; | arxiv-cs.CL | 2024-08-06 |
441 | PTM4Tag+: Tag Recommendation of Stack Overflow Posts with Pre-trained Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the recent success of pre-trained models (PTMs) in natural language processing (NLP), we present PTM4Tag+, a tag recommendation framework for Stack Overflow posts that utilizes PTMs in language modeling. |
JUNDA HE et. al. | arxiv-cs.SE | 2024-08-05 |
442 | Context Conquers Parameters: Outperforming Proprietary LLM in Commit Message Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the use of proprietary LLMs like GPT-4 in coding tasks raises privacy and sustainability concerns, which may hinder their industrial adoption. Considering that open-source LLMs have achieved competitive performance in developer tasks such as compiler validation, this study investigates whether they can be used to generate commit messages that are comparable with OMG. |
Aaron Imani; Iftekhar Ahmed; Mohammad Moshirpour; | arxiv-cs.SE | 2024-08-05 |
443 | Evaluating The Performance of Large Language Models for SDG Mapping (Technical Report) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we compare the performance of various language models on the Sustainable Development Goal (SDG) mapping task, using the output of GPT-4o as the baseline. |
Hui Yin; Amir Aryani; Nakul Nambiar; | arxiv-cs.LG | 2024-08-04 |
444 | AVESFormer: Efficient Transformer Design for Real-Time Audio-Visual Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce AVESFormer, the first real-time Audio-Visual Efficient Segmentation transformer that is simultaneously fast, efficient, and lightweight. |
ZILI WANG et. al. | arxiv-cs.CV | 2024-08-03 |
445 | MiniCPM-V: A GPT-4V Level MLLM on Your Phone IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present MiniCPM-V, a series of efficient MLLMs deployable on end-side devices. |
YUAN YAO et. al. | arxiv-cs.CV | 2024-08-03 |
446 | QFormer: An Efficient Quaternion Transformer for Image Denoising Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Secondly, DCNN- or Transformer-based image denoising models usually have a large number of parameters, high computational complexity, and slow inference speed. To resolve these issues, this paper proposes a highly efficient Quaternion Transformer (QFormer) for image denoising. |
Bo Jiang; Yao Lu; Guangming Lu; Bob Zhang; | ijcai | 2024-08-03 |
447 | Class-consistent Contrastive Learning Driven Cross-dimensional Transformer for 3D Medical Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer emerges as an active research topic in medical image analysis. Yet, three substantial challenges limit the effectiveness of both 2D and 3D Transformers in 3D medical … |
Qikui Zhu; Chuan Fu; Shuo Li; | ijcai | 2024-08-03 |
448 | FreqFormer: Frequency-aware Transformer for Lightweight Image Super-resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the significant progress in SR, Transformer-based SR methods (e.g., SwinIR) still suffer from the problems of heavy computation cost and low-frequency preference, while ignoring the reconstruction of rich high-frequency information, hence hindering the representational power of Transformers. To address these issues, in this paper, we propose a novel Frequency-aware Transformer (FreqFormer) for lightweight image SR. |
TAO DAI et. al. | ijcai | 2024-08-03 |
449 | Cross-Problem Learning for Solving Vehicle Routing Problems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes the cross-problem learning to assist heuristics training for different downstream VRP variants. |
ZHUOYI LIN et. al. | ijcai | 2024-08-03 |
450 | X-Light: Cross-City Traffic Signal Control Using Transformer on Transformer As Meta Multi-Agent Reinforcement Learner Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, a persisting issue remains: how can we obtain a multi-agent traffic signal control algorithm with remarkable transferability across diverse cities? In this paper, we propose a Transformer on Transformer (TonT) model for cross-city meta multi-agent traffic signal control, named X-Light: we input full Markov Decision Process trajectories; the Lower Transformer aggregates the states, actions, and rewards among the target intersection and its neighbors within a city; and the Upper Transformer learns general decision trajectories across different cities. |
HAOYUAN JIANG et. al. | ijcai | 2024-08-03 |
451 | TCR-GPT: Integrating Autoregressive Model and Reinforcement Learning for T-Cell Receptor Repertoires Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce TCR-GPT, a probabilistic model built on a decoder-only transformer architecture, designed to uncover and replicate sequence patterns in TCR repertoires. |
Yicheng Lin; Dandan Zhang; Yun Liu; | arxiv-cs.LG | 2024-08-02 |
452 | Reconsidering Degeneration of Token Embeddings with Definitions for Encoder-based Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: On the basis of this analysis, we propose DefinitionEMB, a method that utilizes definitions to re-construct isotropically distributed and semantics-related token embeddings for encoder-based PLMs while maintaining original robustness during fine-tuning. |
Ying Zhang; Dongyuan Li; Manabu Okumura; | arxiv-cs.CL | 2024-08-02 |
453 | Advancing Mental Health Pre-Screening: A New Custom GPT for Psychological Distress Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces ‘Psycho Analyst’, a custom GPT model based on OpenAI’s GPT-4, optimized for pre-screening mental health disorders. |
Jinwen Tang; Yi Shang; | arxiv-cs.CY | 2024-08-02 |
454 | High-Throughput Phenotyping of Clinical Text Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct a performance comparison of GPT-4 and GPT-3.5-Turbo. |
Daniel B. Hier; S. Ilyas Munzir; Anne Stahlfeld; Tayo Obafemi-Ajayi; Michael D. Carrithers; | arxiv-cs.CL | 2024-08-02 |
455 | Efficacy of Large Language Models in Systematic Reviews Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the effectiveness of Large Language Models (LLMs) in interpreting existing literature through a systematic review of the relationship between Environmental, Social, and Governance (ESG) factors and financial performance. |
Aaditya Shah; Shridhar Mehendale; Siddha Kanthi; | arxiv-cs.CL | 2024-08-02 |
456 | Toward Automatic Relevance Judgment Using Vision–Language Models for Image–Text Retrieval Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Vision–Language Models (VLMs) have demonstrated success across diverse applications, yet their potential to assist in relevance judgments remains uncertain. |
Jheng-Hong Yang; Jimmy Lin; | arxiv-cs.IR | 2024-08-02 |
457 | Multilevel Intrusion Detection Based on Transformer and Wavelet Transform for IoT Data Security Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The Internet of Things (IoT) technology and systems have penetrated every aspect of our lives and generated enormous economic benefits. At the same time, research on the data … |
Peifeng Liang; Lina Yang; Z. Xiong; Xuemin Zhang; Gang Liu; | IEEE Internet of Things Journal | 2024-08-01 |
458 | Transformer-Based Reinforcement Learning for Scalable Multi-UAV Area Coverage Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Compared with terrestrial networks, unmanned aerial vehicles (UAVs) have the characteristics of flexible deployment and strong adaptability, which are an important supplement to … |
DEZHI CHEN et. al. | IEEE Transactions on Intelligent Transportation Systems | 2024-08-01 |
459 | MNAT-Net: Multi-Scale Neighborhood Aggregation Transformer Network for Point Cloud Classification and Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Accurate understanding of 3D objects in complex scenes plays essential roles in the fields of intelligent transportation and autonomous driving technology. Recent deep neural … |
Xuchu Wang; Yue Yuan; | IEEE Transactions on Intelligent Transportation Systems | 2024-08-01 |
460 | Leveraging Large Language Models (LLMs) for Traffic Management at Urban Intersections: The Case of Mixed Traffic Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the ability of a Large Language Model (LLM), specifically GPT-4o-mini, to improve traffic management at urban intersections. |
Sari Masri; Huthaifa I. Ashqar; Mohammed Elhenawy; | arxiv-cs.CL | 2024-08-01 |
461 | MAE-EEG-Transformer: A Transformer-based Approach Combining Masked Autoencoder and Cross-individual Data Augmentation Pre-training for EEG Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Miao Cai; Yu Zeng; | Biomed. Signal Process. Control. | 2024-08-01 |
462 | Granting GPT-4 License and Opportunity: Enhancing Accuracy and Confidence Estimation for Few-Shot Event Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The present effort explores methods for effective confidence estimation with GPT-4, using few-shot learning for event detection in the BETTER ontology as a vehicle. |
Steven Fincke; Adrien Bibal; Elizabeth Boschee; | arxiv-cs.AI | 2024-08-01 |
463 | OmniParser for Pure Vision Based GUI Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we argue that the power of multimodal models like GPT-4V as a general agent on multiple operating systems across different applications is largely underestimated due to the lack of a robust screen parsing technique capable of: 1) reliably identifying interactable icons within the user interface, and 2) understanding the semantics of various elements in a screenshot and accurately associating the intended action with the corresponding region on the screen. To fill these gaps, we introduce OmniParser, a comprehensive method for parsing user interface screenshots into structured elements, which significantly enhances the ability of GPT-4V to generate actions that can be accurately grounded in the corresponding regions of the interface. |
Yadong Lu; Jianwei Yang; Yelong Shen; Ahmed Awadallah; | arxiv-cs.CV | 2024-07-31 |
464 | The Llama 3 Herd of Models IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a new set of foundation models, called Llama 3. |
ABHIMANYU DUBEY et. al. | arxiv-cs.AI | 2024-07-31 |
465 | Generative Expressive Conversational Speech Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In addition, due to the limitations of small-scale datasets containing scripted recording styles, they often fail to simulate real natural conversational styles. To address the above issues, we propose a novel generative expressive CSS system, termed GPT-Talker. |
Rui Liu; Yifan Hu; Yi Ren; Xiang Yin; Haizhou Li; | arxiv-cs.CL | 2024-07-31 |
466 | Automated Software Vulnerability Static Code Analysis Using Generative Pre-Trained Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Ultimately, we find that the GPT models that we evaluated are not suitable for fully automated vulnerability scanning because the false positive and false negative rates are too high to likely be useful in practice. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CR | 2024-07-31 |
467 | Performance of Recent Large Language Models for A Low-Resourced Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) have shown significant advances in the past year. |
Ravindu Jayakody; Gihan Dias; | arxiv-cs.CL | 2024-07-31 |
468 | Robust Load Prediction of Power Network Clusters Based on Cloud-Model-Improved Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The Cloud Model Improved Transformer (CMIT) method presents an innovative approach, integrating the Transformer model with the cloud model via the particle swarm optimization algorithm, with the aim of achieving robust and precise power load predictions. |
Cheng Jiang; Gang Lu; Xue Ma; Di Wu; | arxiv-cs.LG | 2024-07-30 |
469 | Interpretable Pre-Trained Transformers for Heart Time-Series Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we employ this framework to the analysis of clinical heart time-series data, to create two pre-trained general purpose cardiac models, termed PPG-PT and ECG-PT. |
Harry J. Davies; James Monsen; Danilo P. Mandic; | arxiv-cs.LG | 2024-07-30 |
470 | Comparison of Large Language Models for Generating Contextually Relevant Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The contribution of this research is the analysis of the capacity of LLMs for Automatic Question Generation in education. |
IVO LODOVICO MOLINA et. al. | arxiv-cs.CL | 2024-07-30 |
471 | Enhancing Agricultural Machinery Management Through Advanced LLM Integration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel approach that leverages large language models (LLMs), particularly GPT-4, combined with multi-round prompt engineering to enhance decision-making processes in agricultural machinery management. |
Emily Johnson; Noah Wilson; | arxiv-cs.CL | 2024-07-30 |
472 | Legal Minds, Algorithmic Decisions: How LLMs Apply Constitutional Principles in Complex Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conduct an empirical analysis of how large language models (LLMs), specifically GPT-4, interpret constitutional principles in complex decision-making scenarios. |
Camilla Bignotti; Carolina Camassa; | arxiv-cs.CL | 2024-07-29 |
473 | Sentiment Analysis of Lithuanian Online Reviews Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address sentiment analysis of Lithuanian five-star-based online reviews from multiple domains that we collect and clean. |
Brigita Vileikytė; Mantas Lukoševičius; Lukas Stankevičius; | arxiv-cs.CL | 2024-07-29 |
474 | DuA: Dual Attentive Transformer in Long-Term Continuous EEG Emotion Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these methods encounter significant challenges in real-life scenarios where emotional states evolve over extended periods. To address this issue, we propose a Dual Attentive (DuA) transformer framework for long-term continuous EEG emotion analysis. |
YUE PAN et. al. | arxiv-cs.HC | 2024-07-29 |
475 | Survey and Taxonomy: The Role of Data-Centric AI in Transformer-Based Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there is a gap regarding the integration of transformer-based TSF and data-centric AI. This survey aims to close this gap via an extensive literature review based on the proposed taxonomy. |
Jingjing Xu; Caesar Wu; Yuan-Fang Li; Gregoire Danoy; Pascal Bouvry; | arxiv-cs.LG | 2024-07-29 |
476 | AutoScale: Automatic Prediction of Compute-optimal Data Composition for Training LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We derive an analytical model for the dependence of optimal weights on data scale and introduce *AutoScale*, a novel, practical approach for optimizing data compositions at potentially large training data scales. |
FEIYANG KANG et. al. | arxiv-cs.LG | 2024-07-29 |
477 | Motamot: A Dataset for Revealing The Supremacy of Large Language Models Over Transformer Models in Bengali Political Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we investigate political sentiment analysis during Bangladeshi elections, specifically examining how effectively Pre-trained Language Models (PLMs) and Large Language Models (LLMs) capture complex sentiment characteristics. |
FATEMA TUJ JOHORA FARIA et. al. | arxiv-cs.CL | 2024-07-28 |
478 | The Impact of LoRA Adapters for LLMs on Clinical NLP Classification Under Data Limitations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning Large Language Models (LLMs) for clinical Natural Language Processing (NLP) poses significant challenges due to the domain gap and limited data availability. |
Thanh-Dung Le; Ti Ti Nguyen; Vu Nguyen Ha; | arxiv-cs.CL | 2024-07-27 |
479 | FarSSiBERT: A Novel Transformer-based Model for Semantic Similarity Measurement of Persian Social Networks Informal Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a new transformer-based model to measure semantic similarity between Persian informal short texts from social networks. |
Seyed Mojtaba Sadjadi; Zeinab Rajabi; Leila Rabiei; Mohammad-Shahram Moin; | arxiv-cs.CL | 2024-07-27 |
480 | GPT Deciphering Fedspeak: Quantifying Dissent Among Hawks and Doves Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We use GPT-4 to quantify dissent among members on the topic of inflation. |
DENIS PESKOFF et. al. | arxiv-cs.AI | 2024-07-26 |
481 | QT-TDM: Planning with Transformer Dynamics Model and Autoregressive Q-Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the success of the Transformer architecture in natural language processing and computer vision, we investigate the use of Transformers in Reinforcement Learning (RL), specifically in modeling the environment’s dynamics using Transformer Dynamics Models (TDMs). |
Mostafa Kotb; Cornelius Weber; Muhammad Burhan Hafez; Stefan Wermter; | arxiv-cs.LG | 2024-07-26 |
482 | Is Larger Always Better? Evaluating and Prompting Large Language Models for Non-generative Medical Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study benchmarks various models, including GPT-based LLMs, BERT-based models, and traditional clinical predictive models, for non-generative medical tasks utilizing renowned datasets. |
YINGHAO ZHU et. al. | arxiv-cs.CL | 2024-07-26 |
483 | Using GPT-4 to Guide Causal Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we are interested in the ability of LLMs to identify causal relationships. |
Anthony C. Constantinou; Neville K. Kitson; Alessio Zanga; | arxiv-cs.AI | 2024-07-26 |
484 | Automatic Detection of Moral Values in Music Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moral values play a fundamental role in how we evaluate information, make decisions, and form judgements around important social issues. |
Vjosa Preniqi; Iacopo Ghinassi; Julia Ive; Kyriaki Kalimeri; Charalampos Saitis; | arxiv-cs.CY | 2024-07-26 |
485 | Gene Regulatory Network Inference from Pre-trained Single-Cell Transcriptomics Transformer with Joint Graph Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel joint graph learning approach that combines the rich contextual representations learned by pre-trained single-cell language models with the structured knowledge encoded in GRNs using graph neural networks (GNNs). |
Sindhura Kommu; Yizhi Wang; Yue Wang; Xuan Wang; | arxiv-cs.LG | 2024-07-25 |
486 | HDL-GPT: High-Quality HDL Is All You Need Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents Hardware Description Language Generative Pre-trained Transformers (HDL-GPT), a novel approach that leverages the vast repository of open-source Hardware Description Language (HDL) code to train superior-quality large code models. |
BHUVNESH KUMAR et. al. | arxiv-cs.LG | 2024-07-25 |
487 | The Power of Combining Data and Knowledge: GPT-4o Is An Effective Interpreter of Machine Learning Models in Predicting Lymph Node Metastasis of Lung Cancer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel ensemble method that combines the medical knowledge acquired by LLMs with the latent patterns identified by machine learning models to enhance LNM prediction performance. |
Danqing Hu; Bing Liu; Xiaofeng Zhu; Nan Wu; | arxiv-cs.CL | 2024-07-25 |
488 | My Ontologist: Evaluating BFO-Based AI for Definition Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through iterative development of a specialized GPT model named My Ontologist, we aimed to generate BFO-conformant ontologies. |
Carter Benson; Alec Sculley; Austin Liebers; John Beverley; | arxiv-cs.DB | 2024-07-24 |
489 | Testing Large Language Models on Driving Theory Knowledge and Skills for Connected Autonomous Vehicles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate a new approach of applying remote or edge LLMs to support autonomous driving. |
Zuoyin Tang; Jianhua He; Dashuai Pei; Kezhong Liu; Tao Gao; | arxiv-cs.AI | 2024-07-24 |
490 | Cost-effective Instruction Learning for Pathology Vision and Language Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here we propose a cost-effective instruction learning framework for conversational pathology named CLOVER. |
KAITAO CHEN et. al. | arxiv-cs.AI | 2024-07-24 |
491 | SDoH-GPT: Using Large Language Models to Extract Social Determinants of Health (SDoH) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study we introduced SDoH-GPT, a simple and effective few-shot Large Language Model (LLM) method leveraging contrastive examples and concise instructions to extract SDoH without relying on extensive medical annotations or costly human intervention. |
BERNARDO CONSOLI et. al. | arxiv-cs.CL | 2024-07-24 |
492 | Artificial Intelligence in Extracting Diagnostic Data from Dental Records Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research addresses the issue of missing structured data in dental records by extracting diagnostic information from unstructured text. |
YAO-SHUN CHUANG et. al. | arxiv-cs.CL | 2024-07-23 |
493 | Can Large Language Models Automatically Jailbreak GPT-4V? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our study, we introduce AutoJailbreak, an innovative automatic jailbreak technique inspired by prompt optimization. |
YUANWEI WU et. al. | arxiv-cs.CL | 2024-07-23 |
494 | OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While open-source LLMs offer solutions to these concerns, they typically underperform commercial models in RTL code generation tasks, primarily due to the scarcity of high-quality open-source RTL datasets. To address this challenge, we introduce OriGen, a fully open-source framework that incorporates self-reflection capabilities and a novel dataset augmentation methodology for generating high-quality, large-scale RTL code. |
FAN CUI et. al. | arxiv-cs.AR | 2024-07-23 |
495 | Inverted Activations: Reducing Memory Footprint in Neural Network Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a modification to the handling of activation tensors in pointwise nonlinearity layers. |
Georgii Novikov; Ivan Oseledets; | arxiv-cs.LG | 2024-07-22 |
496 | RadioRAG: Factual Large Language Models for Enhanced Diagnostics in Radiology Using Dynamic Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) have advanced the field of artificial intelligence (AI) in medicine. |
SOROOSH TAYEBI ARASTEH et. al. | arxiv-cs.CL | 2024-07-22 |
497 | Can GPT-4 Learn to Analyse Moves in Research Article Abstracts? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we employ the affordances of GPT-4 to automate the annotation process by using natural language prompts. |
Danni Yu; Marina Bondi; Ken Hyland; | arxiv-cs.CL | 2024-07-22 |
498 | Dissecting Multiplication in Transformers: Insights Into LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on observation and analysis, we infer that the deficiency of transformers in multiplication tasks stems from their difficulty in calculating successive carryovers and caching intermediate results, and we confirm this inference through experiments. Guided by these findings, we propose improvements to enhance transformer performance on multiplication tasks. |
Luyu Qiu; Jianing Li; Chi Su; Chen Jason Zhang; Lei Chen; | arxiv-cs.CL | 2024-07-22 |
499 | KWT-Tiny: RISC-V Accelerated, Embedded Keyword Spotting Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the adaptation of Transformer-based models for edge devices through the quantisation and hardware acceleration of the ARM Keyword Transformer (KWT) model on a RISC-V platform. |
Aness Al-Qawlaq; Ajay Kumar M; Deepu John; | arxiv-cs.AR | 2024-07-22 |
500 | Efficient Visual Transformer By Learnable Token Merging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel and compact transformer block, Transformer with Learnable Token Merging (LTM), or LTM-Transformer. |
Yancheng Wang; Yingzhen Yang; | arxiv-cs.CV | 2024-07-21 |
501 | Unipa-GPT: Large Language Models for University-oriented QA in Italian Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our experiments we adopted both the Retrieval Augmented Generation (RAG) approach and fine-tuning to develop the system. |
Irene Siragusa; Roberto Pirrone; | arxiv-cs.CL | 2024-07-19 |
502 | LLMs Left, Right, and Center: Assessing GPT’s Capabilities to Label Political Bias from Web Domains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given the subjective nature of political labels, third-party bias ratings like those from Ad Fontes Media, AllSides, and Media Bias/Fact Check (MBFC) are often used in research to analyze news source diversity. This study aims to determine if GPT-4 can replicate these human ratings on a seven-degree scale (far-left to far-right). |
Raphael Hernandes; Giulio Corsi; | arxiv-cs.CL | 2024-07-19 |
503 | Adaptive Foundation Models for Online Decisions: HyperAgent with Fast Incremental Uncertainty Estimation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce GPT-HyperAgent, an augmentation of GPT with HyperAgent for uncertainty-aware, scalable exploration in contextual bandits, a fundamental online decision problem involving natural language input. |
Yingru Li; Jiawei Xu; Zhi-Quan Luo; | arxiv-cs.LG | 2024-07-18 |
504 | Can Open-Source LLMs Compete with Commercial Models? Exploring The Few-Shot Performance of Current GPT Models in Biomedical Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We participated in the 12th BioASQ challenge, which is a retrieval augmented generation (RAG) setting, and explored the performance of the current models Claude 3 Opus, GPT-3.5-turbo, and Mixtral 8x7b with in-context learning (zero-shot, few-shot) and QLoRa fine-tuning. |
Samy Ateia; Udo Kruschwitz; | arxiv-cs.CL | 2024-07-18 |
505 | Evaluating Large Language Models for Anxiety and Depression Classification Using Counseling and Psychotherapy Transcripts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We aim to evaluate the efficacy of traditional machine learning and large language models (LLMs) in classifying anxiety and depression from long conversational transcripts. |
Junwei Sun; Siqi Ma; Yiran Fan; Peter Washington; | arxiv-cs.CL | 2024-07-18 |
506 | A Light-weight and Efficient Punctuation and Word Casing Prediction Model for On-device Streaming ASR Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a light-weight and efficient model that jointly predicts punctuation and word casing in real time. |
Jian You; Xiangfeng Li; | arxiv-cs.CL | 2024-07-18 |
507 | ARTEMIS: A Mixed Analog-Stochastic In-DRAM Accelerator for Transformer Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose ARTEMIS, a mixed analog-stochastic in-DRAM accelerator for transformer models. |
Salma Afifi; Ishan Thakkar; Sudeep Pasricha; | arxiv-cs.AR | 2024-07-17 |
508 | Sharif-STR at SemEval-2024 Task 1: Transformer As A Regression Model for Fine-Grained Scoring of Textual Semantic Relations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we delve into the investigation of sentence-level STR within Track A (Supervised) by leveraging fine-tuning techniques on the RoBERTa transformer. |
SEYEDEH FATEMEH EBRAHIMI et. al. | arxiv-cs.CL | 2024-07-17 |
509 | Frequency Guidance Matters: Skeletal Action Recognition By Frequency-Aware Mixed Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the existing transformer-based approaches heavily rely on the naive attention mechanism for capturing the spatiotemporal features, which falls short in learning discriminative representations for actions that exhibit similar motion patterns. To address this challenge, we introduce the Frequency-aware Mixed Transformer (FreqMixFormer), specifically designed for recognizing similar skeletal actions with subtle discriminative motions. |
WENHAN WU et. al. | arxiv-cs.CV | 2024-07-17 |
510 | LLMs-in-the-loop Part-1: Expert Small AI Models for Bio-Medical Text Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel LLMs-in-the-loop approach to develop supervised neural machine translation models optimized specifically for medical texts. |
Bunyamin Keles; Murat Gunay; Serdar I. Caglar; | arxiv-cs.CL | 2024-07-16 |
511 | Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we explore the effectiveness of fine-tuning a RoBERTa-base transformer, a powerful neural architecture, to address MGT detection as a binary classification task. |
SEYEDEH FATEMEH EBRAHIMI et. al. | arxiv-cs.CL | 2024-07-16 |
512 | Educational Personalized Learning Path Planning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite its potential, traditional PLPP systems often lack adaptability, interactivity, and transparency. This paper proposes a novel approach integrating Large Language Models (LLMs) with prompt engineering to address these challenges. |
Chee Ng; Yuen Fung; | arxiv-cs.CL | 2024-07-16 |
513 | Does Refusal Training in LLMs Generalize to The Past Tense? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We systematically evaluate this method on Llama-3 8B, Claude-3.5 Sonnet, GPT-3.5 Turbo, Gemma-2 9B, Phi-3-Mini, GPT-4o mini, GPT-4o, o1-mini, o1-preview, and R2D2 models using GPT-3.5 Turbo as a reformulation model. |
Maksym Andriushchenko; Nicolas Flammarion; | arxiv-cs.CL | 2024-07-16 |
514 | A Transformer-based Approach for Augmenting Software Engineering Chatbots Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, previous studies show that creating a high-quality training dataset for software engineering chatbots is expensive in terms of both resources and time. Therefore, in this paper, we present an automated transformer-based approach to augment software engineering chatbot datasets. |
Ahmad Abdellatif; Khaled Badran; Diego Elias Costa; Emad Shihab; | arxiv-cs.SE | 2024-07-16 |
515 | Rethinking Transformer-based Multi-document Summarization: An Empirical Investigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To thoroughly examine the behaviours of Transformer-based MDS models, this paper presents five empirical studies on (1) measuring the impact of document boundary separators quantitatively; (2) exploring the effectiveness of different mainstream Transformer structures; (3) examining the sensitivity of the encoder and decoder; (4) discussing different training strategies; and (5) discovering repetition in summary generation. |
Congbo Ma; Wei Emma Zhang; Dileepa Pitawela; Haojie Zhuang; Yanfeng Shu; | arxiv-cs.CL | 2024-07-16 |
516 | GPT Assisted Annotation of Rhetorical and Linguistic Features for Interpretable Propaganda Technique Detection in News Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our contribution is a set of features, their properties, definitions, and examples in a machine-readable format, along with the code for RhetAnn and the GPT prompts and fine-tuning procedures for advancing state-of-the-art interpretable propaganda technique detection. |
Kyle Hamilton; Luca Longo; Bojan Bozic; | arxiv-cs.CL | 2024-07-16 |
517 | ECoh: Turn-level Coherence Evaluation for Multilingual Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by the need for lightweight, open source, and multilingual dialogue evaluators, this paper introduces GenResCoh (Generated Responses targeting Coherence). |
John Mendonça; Isabel Trancoso; Alon Lavie; | arxiv-cs.CL | 2024-07-16 |
518 | R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, rigorous insights are provided into the influence of jamming LLM word embeddings in SFL by deriving an expression for the ML training loss divergence and showing that it is upper-bounded by the mean squared error (MSE). |
ALADIN DJUHERA et. al. | arxiv-cs.LG | 2024-07-16 |
519 | Review-Feedback-Reason (ReFeR): A Novel Framework for NLG Evaluation and Reasoning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Assessing the quality of Natural Language Generation (NLG) outputs, such as those produced by large language models (LLMs), poses significant challenges. Traditional approaches … |
Yaswanth Narsupalli; Abhranil Chandra; Sreevatsa Muppirala; Manish Gupta; Pawan Goyal; | ArXiv | 2024-07-16 |
520 | Scientific QA System with Verifiable Answers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce the VerifAI project, a pioneering open-source scientific question-answering system, designed to provide answers that are not only referenced but also automatically vetted and verifiable. |
ADELA LJAJIĆ et. al. | arxiv-cs.CL | 2024-07-16 |
521 | GPT-4V Cannot Generate Radiology Reports Yet Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we perform a systematic evaluation of GPT-4V in generating radiology reports on two chest X-ray report datasets: MIMIC-CXR and IU X-Ray. |
Yuyang Jiang; Chacha Chen; Dang Nguyen; Benjamin M. Mervak; Chenhao Tan; | arxiv-cs.CY | 2024-07-16 |
522 | Large Language Models As Misleading Assistants in Conversation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate the ability of LLMs to be deceptive in the context of providing assistance on a reading comprehension task, using LLMs as proxies for human users. |
BETTY LI HOU et. al. | arxiv-cs.CL | 2024-07-16 |
523 | GPT Sonography: Hand Gesture Decoding from Forearm Ultrasound Images Via VLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that GPT-4o can decode hand gestures from forearm ultrasound data even with no fine-tuning, and improves with few-shot, in-context learning. |
Keshav Bimbraw; Ye Wang; Jing Liu; Toshiaki Koike-Akino; | arxiv-cs.CV | 2024-07-15 |
524 | Beyond Binary: Multiclass Paraphasia Detection with Generative Pretrained Transformers and End-to-End Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present novel approaches that use a generative pretrained transformer (GPT) to identify paraphasias from transcripts as well as two end-to-end approaches that focus on modeling both automatic speech recognition (ASR) and paraphasia classification as multiple sequences vs. a single sequence. |
Matthew Perez; Aneesha Sampath; Minxue Niu; Emily Mower Provost; | arxiv-cs.CL | 2024-07-15 |
525 | Transformer-based Drum-level Prediction in A Boiler Plant with Delayed Relations Among Multivariates Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Leveraging the capabilities of Transformer architectures, this study aims to develop an accurate and robust predictive framework to anticipate water level fluctuations and facilitate proactive control strategies. |
Gang Su; Sun Yang; Zhishuai Li; | arxiv-cs.LG | 2024-07-15 |
526 | Leveraging LLM-Respondents for Item Evaluation: A Psychometric Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, item calibration is time-consuming and costly, requiring a sufficient number of respondents for the response process. We explore using six different LLMs (GPT-3.5, GPT-4, Llama 2, Llama 3, Gemini-Pro, and Cohere Command R Plus) and various combinations of them using sampling methods to produce responses with psychometric properties similar to human answers. |
Yunting Liu; Shreya Bhandari; Zachary A. Pardos; | arxiv-cs.CY | 2024-07-15 |
527 | Generalizable Tip-of-the-Tongue Retrieval with LLM Re-ranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper studies the generalization capabilities of existing retrieval methods with ToT queries in multiple domains. |
Lu\'{\i}s Borges; Rohan Jha; Jamie Callan; Bruno Martins; | sigir | 2024-07-14 |
528 | DistillSeq: A Framework for Safety Alignment Testing in Large Language Models Using Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Subsequently, we deploy two distinct strategies for generating malicious queries: one based on a syntax tree approach, and the other leveraging an LLM-based method. |
Mingke Yang; Yuqi Chen; Yi Liu; Ling Shi; | arxiv-cs.SE | 2024-07-14 |
529 | Beyond Accuracy: Investigating Error Types in GPT-4 Responses to USMLE Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, current works use GPT-4 to only predict the correct option without providing any explanation and thus do not provide any insight into the thinking process and reasoning used by GPT-4 or other LLMs. Therefore, we introduce a new domain-specific error taxonomy derived from collaboration with medical students. |
SOUMYADEEP ROY et. al. | sigir | 2024-07-14 |
530 | Legal Statute Identification: A Case Study Using State-of-the-Art Datasets and Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we reproduce several LSI models on two popular LSI datasets and study the effect of the above-mentioned challenges. |
Shounak Paul; Rajas Bhatt; Pawan Goyal; Saptarshi Ghosh; | sigir | 2024-07-14 |
531 | Enhancing Emotion Prediction in News Headlines: Insights from ChatGPT and Seq2Seq Models for Free-Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The free-text explanations also contain more sentimental context than the news headlines alone and can serve as a better input to emotion classification models. Therefore, in this work we explored generating emotion explanations from headlines by training a sequence-to-sequence transformer model and by using a pretrained large language model, ChatGPT (GPT-4). |
GE GAO et. al. | arxiv-cs.CL | 2024-07-14 |
532 | Reflections on The Coding Ability of LLMs for Analyzing Market Research Surveys Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present the first systematic study of applying large language models (in our case, GPT-3.5 and GPT-4) for the automatic coding (multi-class classification) problem in market research. |
Shi Zong; Santosh Kolagati; Amit Chaudhary; Josh Seltzer; Jimmy Lin; | sigir | 2024-07-14 |
533 | CodeV: Empowering LLMs for Verilog Generation Through Multi-Level Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: (2) LLMs like GPT-3.5 excel in summarizing Verilog code rather than generating it. Based on these observations, this paper introduces CodeV, a series of open-source instruction-tuned Verilog generation LLMs. |
YANG ZHAO et. al. | arxiv-cs.PL | 2024-07-14 |
534 | Surprising Efficacy of Fine-Tuned Transformers for Fact-Checking Over Larger Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the challenges associated with establishing an end-to-end fact-checking pipeline in a real-world context, covering over 90 languages. |
Vinay Setty; | sigir | 2024-07-14 |
535 | Drop Your Decoder: Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The usage of additional Transformer-based decoders also incurs significant computational costs. In this study, we aim to shed light on this issue by revealing that masked auto-encoder (MAE) pre-training with enhanced decoding significantly improves the term coverage of input tokens in dense representations, compared to vanilla BERT checkpoints. |
Guangyuan Ma; Xing Wu; Zijia Lin; Songlin Hu; | sigir | 2024-07-14 |
536 | Graph Transformers: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Beyond technical analysis, we discuss the applications of graph transformer models for node-level, edge-level, and graph-level tasks, exploring their potential in other application scenarios as well. |
AHSAN SHEHZAD et. al. | arxiv-cs.LG | 2024-07-13 |
537 | Document-level Clinical Entity and Relation Extraction Via Knowledge Base-Guided Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Generative pre-trained transformer (GPT) models have shown promise in clinical entity and relation extraction tasks because of their precise extraction and contextual understanding capability. |
Kriti Bhattarai; Inez Y. Oh; Zachary B. Abrams; Albert M. Lai; | arxiv-cs.CL | 2024-07-13 |
538 | Causality Extraction from Medical Text Using Large Language Models (LLMs) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the potential of natural language models, including large language models, to extract causal relations from medical texts, specifically from Clinical Practice Guidelines (CPGs). |
Seethalakshmi Gopalakrishnan; Luciana Garbayo; Wlodek Zadrozny; | arxiv-cs.CL | 2024-07-13 |
539 | Robustness of LLMs to Perturbations in Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To that end, we artificially introduce varying levels of noise into a diverse set of datasets and systematically evaluate LLMs’ robustness against the corrupt variations of the original text. |
Ayush Singh; Navpreet Singh; Shubham Vatsal; | arxiv-cs.CL | 2024-07-12 |
540 | ASTPrompter: Weakly Supervised Automated Language Model Red-Teaming to Identify Likely Toxic Prompts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we propose a reinforcement learning formulation of the LLM red-teaming task that allows us to discover prompts that both (1) trigger toxic outputs from a frozen defender and (2) have low perplexity as scored by that defender. |
Amelia F. Hardy; Houjun Liu; Bernard Lange; Mykel J. Kochenderfer; | arxiv-cs.CL | 2024-07-12 |
541 | EVOLVE: Predicting User Evolution and Network Dynamics in Social Media Using Fine-Tuned GPT-like Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we propose a predictive method to understand how a user evolves on social media throughout their life and to forecast the next stage of their evolution. |
Ismail Hossain; Md Jahangir Alam; Sai Puppala; Sajedul Talukder; | arxiv-cs.SI | 2024-07-12 |
542 | The Two Sides of The Coin: Hallucination Generation and Detection with LLMs As Evaluators for LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents our participation in the CLEF ELOQUENT HalluciGen shared task, where the goal is to develop evaluators for both generating and detecting hallucinated content. |
ANH THU MARIA BUI et. al. | arxiv-cs.AI | 2024-07-12 |
543 | Movie Recommendation with Poster Attention Via Multi-modal Transformer Feature Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a multi-modal movie recommendation system that extracts features from each movie's well-designed poster and its narrative text description. |
Linhan Xia; Yicheng Yang; Ziou Chen; Zheng Yang; Shengxin Zhu; | arxiv-cs.IR | 2024-07-12 |
544 | On Exact Bit-level Reversible Transformers Without Changing Architectures Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we present the BDIA-transformer, which is an exact bit-level reversible transformer that uses an unchanged standard architecture for inference. |
Guoqiang Zhang; J. P. Lewis; W. B. Kleijn; | arxiv-cs.LG | 2024-07-12 |
545 | Show, Don’t Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To evaluate the models’ ability to generalize beyond their training data, we introduce two additional games. |
Gonçalo Hora de Carvalho; Oscar Knap; Robert Pollice; | arxiv-cs.AI | 2024-07-12 |
546 | Detect Llama — Finding Vulnerabilities in Smart Contracts Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we test the hypothesis that although OpenAI’s GPT-4 performs well generally, we can fine-tune open-source models to outperform GPT-4 in smart contract vulnerability detection. |
Peter Ince; Xiapu Luo; Jiangshan Yu; Joseph K. Liu; Xiaoning Du; | arxiv-cs.CR | 2024-07-11 |
547 | LLMs’ Morphological Analyses of Complex FST-generated Finnish Words Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work aims to shed light on the issue by evaluating state-of-the-art LLMs in a task of morphological analysis of complex Finnish noun forms. |
Anssi Moisio; Mathias Creutz; Mikko Kurimo; | arxiv-cs.CL | 2024-07-11 |
548 | GPT-4 Is Judged More Human Than Humans in Displaced and Inverted Turing Tests Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We found that both AI and displaced human judges were less accurate than interactive interrogators, with below chance accuracy overall. |
Ishika Rathi; Sydney Taylor; Benjamin K. Bergen; Cameron R. Jones; | arxiv-cs.HC | 2024-07-11 |
549 | Teaching Transformers Causal Reasoning Through Axiomatic Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since interventional data is costly to generate, we study to what extent an agent can learn causal reasoning from passive data. |
Aniket Vashishtha; Abhinav Kumar; Abbavaram Gowtham Reddy; Vineeth N Balasubramanian; Amit Sharma; | arxiv-cs.LG | 2024-07-10 |
550 | FsPONER: Few-shot Prompt Optimization for Named Entity Recognition in Domain-specific Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, none of the previous approaches has investigated the efficiency of LLM-based few-shot learning in domain-specific scenarios. To address this gap, we introduce FsPONER, a novel approach for optimizing few-shot prompts, and evaluate its performance on domain-specific NER datasets, with a focus on industrial manufacturing and maintenance, while using multiple LLMs — GPT-4-32K, GPT-3.5-Turbo, LLaMA 2-chat, and Vicuna. |
Yongjian Tang; Rakebul Hasan; Thomas Runkler; | arxiv-cs.CL | 2024-07-10 |
551 | ROSA: Random Subspace Adaptation for Efficient Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we propose Random Subspace Adaptation (ROSA), a method that outperforms previous PEFT methods by a significant margin, while maintaining a zero latency overhead during inference time. |
Marawan Gamal Abdel Hameed; Aristides Milios; Siva Reddy; Guillaume Rabusseau; | arxiv-cs.LG | 2024-07-10 |
552 | PEER: Expertizing Domain-Specific Tasks with A Multi-Agent Framework and Tuning Methods Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: High performance requires sophisticated processing techniques, yet managing multiple agents within a complex workflow often proves costly and challenging. To address this, we introduce the PEER (Plan, Execute, Express, Review) multi-agent framework. |
YIYING WANG et. al. | arxiv-cs.AI | 2024-07-09 |
553 | Prompting Techniques for Secure Code Generation: A Systematic Investigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the impact of different prompting techniques on the security of code generated from natural language (NL) instructions by LLMs. |
Catherine Tony; Nicolás E. Díaz Ferreyra; Markus Mutas; Salem Dhiff; Riccardo Scandariato; | arxiv-cs.SE | 2024-07-09 |
554 | Mixture-of-Modules: Reinventing Transformers As Dynamic Assemblies of Modules Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that MoM provides not only a unified framework for Transformers and their numerous variants but also a flexible and learnable approach for reducing redundancy in Transformer parameterization. |
ZHUOCHENG GONG et. al. | arxiv-cs.CL | 2024-07-09 |
555 | Using Large Language Models for Generating Smart Contracts for Health Insurance from Textual Policies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To assess the LLM output, we propose completeness, soundness, clarity, syntax, and functioning code as metrics. |
Inwon Kang; William Van Woensel; Oshani Seneviratne; | arxiv-cs.CL | 2024-07-09 |
556 | A Comparison of Vulnerability Feature Extraction Methods from Textual Attack Patterns Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we examine five feature extraction methods (TF-IDF, LSI, BERT, MiniLM, RoBERTa) and find that Term Frequency-Inverse Document Frequency (TF-IDF) outperforms the other four methods with a precision of 75% and an F1 score of 64%. |
Refat Othman; Bruno Rossi; Russo Barbara; | arxiv-cs.CR | 2024-07-09 |
557 | Multilingual Blending: LLM Safety Alignment Evaluation with Language Mixture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce Multilingual Blending, a mixed-language query-response scheme designed to evaluate the safety alignment of various state-of-the-art LLMs (e.g., GPT-4o, GPT-3.5, Llama3) under sophisticated, multilingual conditions. |
Jiayang Song; Yuheng Huang; Zhehua Zhou; Lei Ma; | arxiv-cs.CL | 2024-07-09 |
558 | Short Answer Scoring with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View |
Lan Jiang; Nigel Bosch; | ACM Conference on Learning @ Scale | 2024-07-09 |
559 | Cross-domain Few-shot In-context Learning for Enhancing Traffic Sign Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a cross-domain few-shot in-context learning method based on the MLLM for enhancing traffic sign recognition (TSR). |
YAOZONG GAN et. al. | arxiv-cs.CV | 2024-07-08 |
560 | On The Power of Convolution Augmented Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, recent architectural recipes, such as state-space models, have bridged the performance gap. Motivated by this, we examine the benefits of Convolution-Augmented Transformer (CAT) for recall, copying, and length generalization tasks. |
Mingchen Li; Xuechen Zhang; Yixiao Huang; Samet Oymak; | arxiv-cs.LG | 2024-07-08 |
561 | Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work introduces the Multimodal Diffusion Transformer (MDT), a novel diffusion policy framework that excels at learning versatile behavior from multimodal goal specifications with few language annotations. |
Moritz Reuss; Ömer Erdinç Yağmurlu; Fabian Wenzel; Rudolf Lioutikov; | arxiv-cs.RO | 2024-07-08 |
562 | Surprising Gender Biases in GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present seven experiments exploring gender biases in GPT. |
Raluca Alexandra Fulgu; Valerio Capraro; | arxiv-cs.CY | 2024-07-08 |
563 | Enhancing Computer Programming Education with LLMs: A Study on Effective Prompt Engineering for Python Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study underscores the crucial role of prompt engineering in maximizing the educational benefits of LLMs. By systematically categorizing and testing these strategies, we provide a comprehensive framework for both educators and students to optimize LLM-based learning experiences. |
Tianyu Wang; Nianjun Zhou; Zhixiong Chen; | arxiv-cs.AI | 2024-07-07 |
564 | Image-Conditional Diffusion Transformer for Underwater Image Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by the recent advance in generative models, we propose a novel UIE method based on image-conditional diffusion transformer (ICDT). |
XINGYANG NIE et. al. | arxiv-cs.CV | 2024-07-07 |
565 | MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: The rapid advancement of Large Language Models (LLMs) and Large Multimodal Models (LMMs) has heightened the demand for AI-based scientific assistants capable of understanding … |
ZEKUN LI et. al. | ArXiv | 2024-07-06 |
566 | Are Large Language Models Strategic Decision Makers? A Study of Performance and Bias in Two-Player Non-Zero-Sum Games Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) have been increasingly used in real-world settings, yet their strategic decision-making abilities remain largely unexplored. |
Nathan Herr; Fernando Acero; Roberta Raileanu; María Pérez-Ortiz; Zhibin Li; | arxiv-cs.AI | 2024-07-05 |
567 | Using LLMs to Label Medical Papers According to The CIViC Evidence Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the sequence classification problem CIViC Evidence to the field of medical NLP. |
Markus Hisch; Xing David Wang; | arxiv-cs.CL | 2024-07-05 |
568 | Generalists Vs. Specialists: Evaluating Large Language Models for Urdu Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we compare general-purpose models, GPT-4-Turbo and Llama-3-8b, with special-purpose models (XLM-Roberta-large, mT5-large, and Llama-3-8b) that have been fine-tuned on specific tasks. |
Samee Arif; Abdul Hameed Azeemi; Agha Ali Raza; Awais Athar; | arxiv-cs.CL | 2024-07-05 |
569 | GPT Vs RETRO: Exploring The Intersection of Retrieval and Parameter-Efficient Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we apply PEFT methods (P-tuning, Adapters, and LoRA) to a modified Retrieval-Enhanced Transformer (RETRO) and a baseline GPT model across several sizes, ranging from 823 million to 48 billion parameters. |
Aleksander Ficek; Jiaqi Zeng; Oleksii Kuchaiev; | arxiv-cs.CL | 2024-07-05 |
570 | Associative Recurrent Memory Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper addresses the challenge of creating a neural architecture for very long sequences that requires constant time for processing new information at each time step. |
Ivan Rodkin; Yuri Kuratov; Aydar Bulatov; Mikhail Burtsev; | arxiv-cs.CL | 2024-07-05 |
571 | GPT-4 Vs. Human Translators: A Comprehensive Evaluation of Translation Quality Across Languages, Domains, and Expertise Levels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study comprehensively evaluates the translation quality of Large Language Models (LLMs), specifically GPT-4, against human translators of varying expertise levels across multiple language pairs and domains. |
JIANHAO YAN et. al. | arxiv-cs.CL | 2024-07-04 |
572 | TrackPGD: A White-box Attack Using Binary Masks Against Robust Transformer Trackers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel white-box attack named TrackPGD, which relies on the predicted object binary mask to attack robust transformer trackers. |
Fatemeh Nourilenjan Nokabadi; Yann Batiste Pequignot; Jean-Francois Lalonde; Christian Gagné; | arxiv-cs.CV | 2024-07-04 |
573 | HYBRINFOX at CheckThat! 2024 — Task 2: Enriching BERT Models with The Expert System VAGO for Subjectivity Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the HYBRINFOX method used to solve Task 2 (Subjectivity detection) of the CLEF 2024 CheckThat! |
MORGANE CASANOVA et. al. | arxiv-cs.CL | 2024-07-04 |
574 | From Data to Commonsense Reasoning: The Use of Large Language Models for Explainable AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the effectiveness of large language models (LLMs) on different QA tasks with a focus on their abilities in reasoning and explainability. |
Stefanie Krause; Frieder Stolzenburg; | arxiv-cs.AI | 2024-07-04 |
575 | Adaptive Step-size Perception Unfolding Network with Non-local Hybrid Attention for Hyperspectral Image Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To overcome the aforementioned drawbacks, we propose an adaptive step-size perception unfolding network (ASPUN), a deep unfolding network based on the FISTA algorithm, which uses an adaptive step-size perception module to estimate the update step-size of each spectral channel. |
Yanan Yang; Like Xin; | arxiv-cs.CV | 2024-07-04 |
576 | Towards Automating Text Annotation: A Case Study on Semantic Proximity Annotation Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main aim of this paper is to reuse human annotation guidelines along with some annotated data to design automatic prompts for LLMs, focusing on the semantic proximity annotation task. |
Sachin Yadav; Tejaswi Choppa; Dominik Schlechtweg; | arxiv-cs.CL | 2024-07-04 |
577 | CATT: Character-based Arabic Tashkeel Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces a new approach to training Arabic text diacritization (ATD) models. |
Faris Alasmary; Orjuwan Zaafarani; Ahmad Ghannam; | arxiv-cs.CL | 2024-07-03 |
578 | Regurgitative Training: The Value of Real Data in Training Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: What happens if we train a new Large Language Model (LLM) using data that are at least partially generated by other LLMs? |
Jinghui Zhang; Dandan Qiao; Mochen Yang; Qiang Wei; | arxiv-cs.CL | 2024-07-03 |
579 | Mast Kalandar at SemEval-2024 Task 8: On The Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: SemEval 2024 introduces the task of Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, aiming to develop automated systems for identifying machine-generated text and detecting potential misuse. In this paper, we (i) propose a RoBERTa-BiLSTM-based classifier designed to classify text into two categories, AI-generated or human, and (ii) conduct a comparative study of our model with baseline approaches to evaluate its effectiveness. |
Jainit Sushil Bafna; Hardik Mittal; Suyash Sethia; Manish Shrivastava; Radhika Mamidi; | arxiv-cs.CL | 2024-07-03 |
580 | Large Language Models As Evaluators for Scientific Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study explores how well the state-of-the-art Large Language Models (LLMs), like GPT-4 and Mistral, can assess the quality of scientific summaries or, more fittingly, scientific syntheses, comparing their evaluations to those of human annotators. |
Julia Evans; Jennifer D’Souza; Sören Auer; | arxiv-cs.CL | 2024-07-03 |
581 | InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present InternLM-XComposer-2.5 (IXC-2.5), a versatile large vision-language model that supports long-contextual input and output. |
PAN ZHANG et. al. | arxiv-cs.CV | 2024-07-03 |
582 | Assessing The Code Clone Detection Capability of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to assess the performance of two advanced Large Language Models (LLMs), GPT-3.5 and GPT-4, in the task of code clone detection. |
Zixian Zhang; Takfarinas Saber; | arxiv-cs.SE | 2024-07-02 |
583 | RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel instruction fine-tuning framework RankRAG, which instruction-tunes a single LLM for the dual purpose of context ranking and answer generation in RAG. |
YUE YU et. al. | arxiv-cs.CL | 2024-07-02 |
584 | Hypformer: Exploring Efficient Hyperbolic Transformer Fully in Hyperbolic Space Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This stems primarily from: (i) the absence of well-defined modules in hyperbolic space, including linear transformation layers, LayerNorm layers, activation functions, dropout operations, etc., and (ii) the quadratic time complexity of the existing hyperbolic self-attention module w.r.t. the number of input tokens, which hinders its scalability. To address these challenges, we propose Hypformer, a novel hyperbolic Transformer based on the Lorentz model of hyperbolic geometry. |
MENGLIN YANG et. al. | arxiv-cs.LG | 2024-07-01 |
585 | FATFusion: A Functional-anatomical Transformer for Medical Image Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View |
Wei Tang; Fazhi He; | Inf. Process. Manag. | 2024-07-01 |
586 | Beyond Numeric Awards: In-Context Dueling Bandits with LLM Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, LLMs struggle to converge even when explicitly prompted to do so and are sensitive to prompt variations. To overcome these issues, we introduce a hybrid algorithm: LLM-Enhanced Adaptive Dueling (LEAD), which takes advantage of both in-context decision-making capabilities of LLMs and theoretical guarantees inherited from classic DB algorithms. |
Fanzeng Xia; Hao Liu; Yisong Yue; Tongxin Li; | arxiv-cs.LG | 2024-07-01 |
587 | Transformer Autoencoder for K-means Efficient Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Wenhao Wu; Weiwei Wang; Xixi Jia; Xiangchu Feng; | Eng. Appl. Artif. Intell. | 2024-07-01 |
588 | MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents MMLongBench-Doc, a long-context, multi-modal benchmark comprising 1,062 expert-annotated questions. |
YUBO MA et. al. | arxiv-cs.CV | 2024-07-01 |
589 | Token-disentangling Mutual Transformer for Multimodal Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
GUANGHAO YIN et. al. | Eng. Appl. Artif. Intell. | 2024-07-01 |
590 | Analyzing Persuasive Strategies in Meme Texts: A Fusion of Language Models with Paraphrase Enrichment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes our approach to hierarchical multi-label detection of persuasion techniques in meme texts. |
Kota Shamanth Ramanath Nayak; Leila Kosseim; | arxiv-cs.CL | 2024-07-01 |
591 | Image-to-Text Logic Jailbreak: Your Imagination Can Help You Do Anything Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With the integration of visual and text inputs in VLMs, new security issues emerge, as malicious attackers can exploit multiple modalities to achieve their objectives. |
Xiaotian Zou; Ke Li; Yongkang Chen; | arxiv-cs.CR | 2024-07-01 |
592 | Parm: Efficient Training of Large Sparsely-Activated Models with Dedicated Schedules Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the wide adoption of hybrid parallel paradigms like model parallelism, expert parallelism, and expert-sharding parallelism (i.e., MP+EP+ESP) to support MoE model training on GPU clusters, the training efficiency is hindered by communication costs introduced by these parallel paradigms. To address this limitation, we propose Parm, a system that accelerates MP+EP+ESP training by designing two dedicated schedules for placing communication tasks. |
XINGLIN PAN et. al. | arxiv-cs.DC | 2024-06-30 |
593 | LegalTurk Optimized BERT for Multi-Label Text Classification and NER Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce our innovative modified pre-training approach by combining diverse masking strategies. |
Farnaz Zeidi; Mehmet Fatih Amasyali; Çiğdem Erol; | arxiv-cs.CL | 2024-06-30 |
594 | WallFacer: Harnessing Multi-dimensional Ring Parallelism for Efficient Long Sequence Model Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current methods are either constrained by the number of attention heads or excessive communication overheads. To address this problem, we propose WallFacer, a multi-dimensional distributed training system for long sequences, fostering an efficient communication paradigm and providing additional tuning flexibility for communication arrangements. |
ZIMING LIU et. al. | arxiv-cs.DC | 2024-06-30 |
595 | LLM-Generated Natural Language Meets Scaling Laws: New Explorations and Data Augmentation Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nonetheless, prior research raises two primary concerns: first, whether the natural language generated by LLMs (LLMNL) truly aligns with human natural language (HNL), a critical foundational question; second, that augmented data is randomly generated by the LLM, implying that not all data may possess equal training value, which could impede classifier performance. To address these challenges, we introduce scaling laws to intrinsically calculate LLMNL and HNL. |
Zhenhua Wang; Guang Xu; Ming Ren; | arxiv-cs.CL | 2024-06-29 |
596 | Machine Learning Predictors for Min-Entropy Estimation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Utilizing data from Generalized Binary Autoregressive Models, a subset of Markov processes, we demonstrate that machine learning models (including a hybrid of convolutional and recurrent Long Short-Term Memory layers and the transformer-based GPT-2 model) outperform traditional NIST SP 800-90B predictors in certain scenarios. |
Javier Blanco-Romero; Vicente Lorenzo; Florina Almenares Mendoza; Daniel Díaz-Sánchez; | arxiv-cs.LG | 2024-06-28 |
597 | Can GPT-4 Help Detect Quit Vaping Intentions? An Exploration of Automatic Data Annotation Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we extracted a sample dataset from one vaping sub-community on Reddit to analyze users’ quit-vaping intentions. |
SAI KRISHNA REVANTH VURUMA et. al. | arxiv-cs.CL | 2024-06-28 |
598 | FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose FRED, a wafer-scale interconnect that is tailored for the high-BW requirements of wafer-scale networks and can efficiently execute communication patterns of different parallelization strategies. |
Saeed Rashidi; William Won; Sudarshan Srinivasan; Puneet Gupta; Tushar Krishna; | arxiv-cs.AR | 2024-06-27 |
599 | Fine-tuned Network Relies on Generic Representation to Solve Unseen Cognitive Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning pretrained language models has shown promising results on a wide range of tasks, but when encountering a novel task, do they rely more on generic pretrained representation, or develop brand new task-specific solutions? |
Dongyan Lin; | arxiv-cs.LG | 2024-06-27 |
600 | NTFormer: A Composite Node Tokenized Graph Transformer for Node Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a new graph Transformer called NTFormer to address this issue. |
Jinsong Chen; Siyu Jiang; Kun He; | arxiv-cs.LG | 2024-06-27 |
601 | The Model Arena for Cross-lingual Sentiment Analysis: A Comparative Study in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Sentiment analysis serves as a pivotal component in Natural Language Processing (NLP). |
Xiliang Zhu; Shayna Gardiner; Tere Roldán; David Rossouw; | arxiv-cs.CL | 2024-06-27 |
602 | BADGE: BADminton Report Generation and Evaluation with LLM Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel framework named BADGE, designed for this purpose using LLM. |
Shang-Hsuan Chiang; Lin-Wei Chao; Kuang-Da Wang; Chih-Chuan Wang; Wen-Chih Peng; | arxiv-cs.CL | 2024-06-26 |
603 | Improving Entity Recognition Using Ensembles of Deep Learning and Fine-tuned Large Language Models: A Case Study on Adverse Event Extraction from Multiple Sources Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we utilized reports and posts from the VAERS (n=621), Twitter (n=9,133), and Reddit (n=131) as our corpora. |
YIMING LI et. al. | arxiv-cs.CL | 2024-06-25 |
604 | SetBERT: Enhancing Retrieval Performance for Boolean Logic and Set Operation Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce SetBERT, a fine-tuned BERT-based model designed to enhance query embeddings for set operations and Boolean logic queries, such as Intersection (AND), Difference (NOT), and Union (OR). |
Quan Mai; Susan Gauch; Douglas Adams; | arxiv-cs.CL | 2024-06-25 |
605 | This Paper Had The Smartest Reviewers — Flattery Detection Utilising An Audio-Textual Transformer-Based Approach Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Its automatic detection can thus enhance the naturalness of human-AI interactions. To meet this need, we present a novel audio-textual dataset comprising 20 hours of speech and train machine learning models for automatic flattery detection. |
LUKAS CHRIST et. al. | arxiv-cs.SD | 2024-06-25 |
606 | CTBench: A Comprehensive Benchmark for Evaluating Language Model Capabilities in Clinical Trial Design Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: CTBench is introduced as a benchmark to assess language models (LMs) in aiding clinical study design. |
NAFIS NEEHAL et. al. | arxiv-cs.CL | 2024-06-25 |
607 | Unambiguous Recognition Should Not Rely Solely on Natural Language Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This bias stems from the inherent characteristics of the dataset. To mitigate this bias, we propose a LaTeX printed text recognition model trained on a mixed dataset of pseudo-formulas and pseudo-text. |
Renqing Luo; Yuhan Xu; | arxiv-cs.CV | 2024-06-24 |
608 | The GPT-WritingPrompts Dataset: A Comparative Analysis of Character Portrayal in Short Stories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We quantify and compare the emotional and descriptive features of storytelling from both generative processes, human and machine, along a set of six dimensions. |
Xi Yu Huang; Krishnapriya Vishnubhotla; Frank Rudzicz; | arxiv-cs.CL | 2024-06-24 |
609 | Exploring The Capability of Mamba in Speech Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compared Mamba with state-of-the-art Transformer variants for various speech applications, including ASR, text-to-speech, spoken language understanding, and speech summarization. |
Koichi Miyazaki; Yoshiki Masuyama; Masato Murata; | arxiv-cs.SD | 2024-06-24 |
610 | GPT-4V Explorations: Mining Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the application of the GPT-4V(ision) large visual language model to autonomous driving in mining environments, where traditional systems often falter in understanding intentions and making accurate decisions during emergencies. |
Zixuan Li; | arxiv-cs.CV | 2024-06-24 |
611 | Exploring Factual Entailment with NLI: A News Media Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the relationship between factuality and Natural Language Inference (NLI) by introducing FactRel — a novel annotation scheme that models \textit{factual} rather than \textit{textual} entailment, and use it to annotate a dataset of naturally occurring sentences from news articles. |
Guy Mor-Lan; Effi Levi; | arxiv-cs.CL | 2024-06-24 |
612 | Using GPT-4 Turbo to Automatically Identify Defeaters in Assurance Cases Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Assurance cases (ACs) are convincing arguments, supported by a body of evidence and aiming at demonstrating that a system will function as intended. Producers of systems can rely … |
K. K. SHAHANDASHTI et. al. | 2024 IEEE 32nd International Requirements Engineering … | 2024-06-24 |
613 | DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present DreamBench++, a human-aligned benchmark automated by advanced multimodal GPT models. |
YUANG PENG et. al. | arxiv-cs.CV | 2024-06-24 |
614 | OlympicArena Medal Ranks: Who Is The Most Intelligent AI So Far? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this report, we pose the following question: Who is the most intelligent AI model to date, as measured by the OlympicArena (an Olympic-level, multi-discipline, multi-modal benchmark for superintelligent AI)? |
Zhen Huang; Zengzhi Wang; Shijie Xia; Pengfei Liu; | arxiv-cs.CL | 2024-06-24 |
615 | CausalFormer: An Interpretable Transformer for Temporal Causal Discovery Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To facilitate the utilization of the whole deep learning models in temporal causal discovery, we proposed an interpretable transformer-based causal discovery model termed CausalFormer, which consists of the causality-aware transformer and the decomposition-based causality detector. |
LINGBAI KONG et. al. | arxiv-cs.LG | 2024-06-24 |
616 | Multi-Scale Temporal Difference Transformer for Video-Text Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they commonly neglect the transformer’s inferior ability to model local temporal information. To tackle this problem, we propose a transformer variant named Multi-Scale Temporal Difference Transformer (MSTDT). |
Ni Wang; Dongliang Liao; Xing Xu; | arxiv-cs.CV | 2024-06-23 |
617 | GraphEval2000: Benchmarking and Improving Large Language Models on Graph Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, recent studies have identified limitations in LLMs’ ability to reason about graph-structured data. To address this gap, we introduce GraphEval2000, the first comprehensive graph dataset, comprising 40 graph data structure problems along with 2000 test cases. |
Qiming Wu; Zichen Chen; Will Corcoran; Misha Sra; Ambuj K. Singh; | arxiv-cs.AI | 2024-06-23 |
618 | Beyond Individual Facts: Investigating Categorical Knowledge Locality of Taxonomy and Meronomy Concepts in GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate a broader view of knowledge location, that of concepts or clusters of related information, instead of disparate individual facts. |
Christopher Burger; Yifan Hu; Thai Le; | arxiv-cs.LG | 2024-06-22 |
619 | How Effective Is GPT-4 Turbo in Generating School-Level Questions from Textbooks Based on Bloom’s Revised Taxonomy? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate the effectiveness of GPT-4 Turbo in generating educational questions from NCERT textbooks in zero-shot mode. |
Subhankar Maity; Aniket Deroy; Sudeshna Sarkar; | arxiv-cs.CL | 2024-06-21 |
620 | Toward Informal Language Processing: Knowledge of Slang in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Using movie subtitles, we construct a dataset that supports evaluation on a diverse set of tasks pertaining to automatic processing of slang. |
Zhewei Sun; Qian Hu; Rahul Gupta; Richard Zemel; Yang Xu; | naacl | 2024-06-20 |
621 | VertAttack: Taking Advantage of Text Classifiers’ Horizontal Vision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Vertically written words will not be recognized by a classifier. In contrast, humans are easily able to recognize and read words written both horizontally and vertically. Hence, a human adversary could write problematic words vertically and the meaning would still be preserved to other humans. We simulate such an attack, VertAttack. |
Jonathan Rusert; | naacl | 2024-06-20 |
622 | MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to perform a thorough evaluation of the non-English capabilities of SoTA LLMs (GPT-3. |
SANCHIT AHUJA et. al. | naacl | 2024-06-20 |
623 | Unveiling Divergent Inductive Biases of LLMs on Temporal Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the adeptness of Large Language Models (LLMs) in discerning patterns and relationships from data, their inherent comprehension of temporal dynamics remains a formidable challenge. This research meticulously explores these intrinsic challenges within LLMs, with a specific emphasis on evaluating the performance of GPT-3. |
Sindhu Kishore; Hangfeng He; | naacl | 2024-06-20 |
624 | Transformers Can Represent N-gram Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the relationship between transformer LMs and n-gram LMs, a simple and historically relevant class of language models. |
Anej Svete; Ryan Cotterell; | naacl | 2024-06-20 |
625 | Communication-Efficient Byzantine-Resilient Federated Zero-Order Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CYBER-0, the first zero-order optimization algorithm for memory-and-communication efficient Federated Learning, resilient to Byzantine faults. |
Afonso de Sá Delgado Neto; Maximilian Egger; Mayank Bakshi; Rawad Bitar; | arxiv-cs.LG | 2024-06-20 |
626 | A Symbolic Framework for Evaluating Mathematical Reasoning and Generalisation with Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a methodology for generating and perturbing detailed derivations of equations at scale, aided by a symbolic engine, to evaluate the generalisability of Transformers to out-of-distribution mathematical reasoning problems. |
Jordan Meadows; Marco Valentino; Damien Teney; Andre Freitas; | naacl | 2024-06-20 |
627 | Does GPT Really Get It? A Hierarchical Scale to Quantify Human Vs AI’s Understanding of Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use the hierarchy to design and conduct a study with human subjects (undergraduate and graduate students) as well as large language models (generations of GPT), revealing interesting similarities and differences. |
Mirabel Reid; Santosh S. Vempala; | arxiv-cs.AI | 2024-06-20 |
628 | Revisiting Zero-Shot Abstractive Summarization in The Era of Large Language Models from The Perspective of Position Bias Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We characterize and study zero-shot abstractive summarization in Large Language Models (LLMs) by measuring position bias, which we propose as a general formulation of the more restrictive lead bias phenomenon studied previously in the literature. |
Anshuman Chhabra; Hadi Askari; Prasant Mohapatra; | naacl | 2024-06-20 |
629 | SemRoDe: Macro Adversarial Training to Learn Representations That Are Robust to Word-Level Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial Training strategy to enhance the robustness of LMs. |
Brian Formento; Wenjie Feng; Chuan-Sheng Foo; Anh Tuan Luu; See-Kiong Ng; | naacl | 2024-06-20 |
630 | Does GPT-4 Pass The Turing Test? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: AI models with the ability to masquerade as humans could have widespread societal consequences, and we analyse the effectiveness of different strategies and criteria for judging humanlikeness. |
Cameron Jones; Ben Bergen; | naacl | 2024-06-20 |
631 | Removing RLHF Protections in GPT-4 Via Fine-Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show the contrary: fine-tuning allows attackers to remove RLHF protections with as few as 340 examples and a 95% success rate. |
QIUSI ZHAN et. al. | naacl | 2024-06-20 |
632 | Struc-Bench: Are Large Language Models Good at Generating Complex Structured Tabular Data? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study assesses LLMs’ proficiency in structuring tables and introduces a novel fine-tuning method, cognizant of data structures, to bolster their performance. |
XIANGRU TANG et. al. | naacl | 2024-06-20 |
633 | A Continued Pretrained LLM Approach for Automatic Medical Note Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce HEAL, the first continuously trained 13B LLaMA2-based LLM that is purpose-built for medical conversations and measured on automated scribing. |
DONG YUAN et. al. | naacl | 2024-06-20 |
634 | On Retrieval Augmentation and The Limitations of Language Model Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we rule out one previously posited possibility: the “softmax bottleneck.” |
TING-RUI CHIANG et. al. | naacl | 2024-06-20 |
635 | CPopQA: Ranking Cultural Concept Popularity By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the extent to which an LLM effectively captures corpus-level statistical trends of concepts for reasoning, especially long-tail ones, is largely underexplored. In this study, we introduce a novel few-shot question-answering task (CPopQA) that examines LLMs’ statistical ranking abilities for long-tail cultural concepts (e.g., holidays), particularly focusing on these concepts’ popularity in the United States and the United Kingdom, respectively. |
Ming Jiang; Mansi Joshi; | naacl | 2024-06-20 |
636 | ChatGPT As Research Scientist: Probing GPT’s Capabilities As A Research Librarian, Research Ethicist, Data Generator and Data Predictor Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: How good a research scientist is ChatGPT? We systematically probed the capabilities of GPT-3.5 and GPT-4 across four central components of the scientific process: as a Research … |
Steven A. Lehr; Aylin Caliskan; Suneragiri Liyanage; Mahzarin R. Banaji; | arxiv-cs.AI | 2024-06-20 |
637 | SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using Training Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these models are extremely memory intensive during their fine-tuning process, making them difficult to deploy on GPUs with limited memory resources. To address this issue, we introduce a new tool called SlimFit that reduces the memory requirements of these models by dynamically analyzing their training dynamics and freezing less-contributory layers during fine-tuning. |
ARASH ARDAKANI et. al. | naacl | 2024-06-20 |
638 | CryptoGPT: A 7B Model Rivaling GPT-4 in The Task of Analyzing and Classifying Real-time Financial News Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: CryptoGPT: a 7B model competing with GPT-4 in a specific task — The Impact of Automatic Annotation and Strategic Fine-Tuning via QLoRA. In this article, we present a method aimed at refining a dedicated LLM of reasonable quality with limited resources in an industrial setting via CryptoGPT. |
Ying Zhang; Matthieu Petit Guillaume; Aurélien Krauth; Manel Labidi; | arxiv-cs.AI | 2024-06-20 |
639 | Branch-Solve-Merge Improves Large Language Model Evaluation and Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their performance can fall short, due to the model’s lack of coherence and inability to plan and decompose the problem. We propose Branch-Solve-Merge (BSM), a Large Language Model program (Schlag et al., 2023) for tackling such challenging natural language tasks. |
SWARNADEEP SAHA et. al. | naacl | 2024-06-20 |
640 | Metacognitive Prompting Improves Understanding in Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce Metacognitive Prompting (MP), a strategy inspired by human introspective reasoning processes. |
Yuqing Wang; Yun Zhao; | naacl | 2024-06-20 |
641 | Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates how LLMs, specifically GPT-3.5 and GPT-4, can develop tailored questions for Grade 9 math, aligning with active learning principles. |
Hamdireza Rouzegar; Masoud Makrehchi; | arxiv-cs.CL | 2024-06-19 |
642 | A Decision-Making GPT Model Augmented with Entropy Regularization for Autonomous Vehicles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, the decision-making challenges associated with autonomous vehicles are conceptualized through the framework of the Constrained Markov Decision Process (CMDP) and approached as a sequence modeling problem. |
JIAQI LIU et. al. | arxiv-cs.RO | 2024-06-19 |
643 | Fine-Tuning BERTs for Definition Extraction from Mathematical Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we fine-tuned three pre-trained BERT models on the task of definition extraction from mathematical English written in LaTeX. |
Lucy Horowitz; Ryan Hathaway; | arxiv-cs.CL | 2024-06-19 |
644 | Putting GPT-4o to The Sword: A Comprehensive Evaluation of Language, Vision, Speech, and Multimodal Proficiency Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: As large language models (LLMs) continue to advance, evaluating their comprehensive capabilities becomes significant for their application in various fields. This research study … |
SAKIB SHAHRIAR et. al. | ArXiv | 2024-06-19 |
645 | SwinStyleformer Is A Favorable Choice for Image Inversion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the first pure Transformer structure inversion network called SwinStyleformer, which can compensate for the shortcomings of the CNNs inversion framework by handling long-range dependencies and learning the global structure of objects. |
Jiawei Mao; Guangyi Zhao; Xuesong Yin; Yuanqi Chang; | arxiv-cs.CV | 2024-06-18 |
646 | Generating Educational Materials with Different Levels of Readability Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the leveled-text generation task, aiming to rewrite educational materials to specific readability levels while preserving meaning. |
Chieh-Yang Huang; Jing Wei; Ting-Hao ‘Kenneth’ Huang; | arxiv-cs.CL | 2024-06-18 |
647 | ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce ChatGLM, an evolving family of large language models that we have been developing over time. |
TEAM GLM et. al. | arxiv-cs.CL | 2024-06-18 |
648 | Adversarial Attacks on Multimodal Agents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that multimodal agents raise new safety risks, even though attacking agents is more challenging than prior attacks due to limited access to and knowledge about the environment. |
Chen Henry Wu; Jing Yu Koh; Ruslan Salakhutdinov; Daniel Fried; Aditi Raghunathan; | arxiv-cs.LG | 2024-06-18 |
649 | What Makes Two Language Models Think Alike? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Do architectural differences significantly affect the way models represent and process language? We propose a new approach, based on metric-learning encoding models (MLEMs), as a first step to answer this question. |
Jeanne Salle; Louis Jalouzot; Nur Lan; Emmanuel Chemla; Yair Lakretz; | arxiv-cs.CL | 2024-06-18 |
650 | Vernacular? I Barely Know Her: Challenges with Style Control and Stereotyping Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a thorough analysis and discussion of the results. |
ANKIT AICH et. al. | arxiv-cs.CL | 2024-06-18 |
651 | A Two-dimensional Zero-shot Dialogue State Tracking Evaluation Method Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a two-dimensional zero-shot evaluation method for DST using GPT-4, which divides the evaluation into two dimensions: accuracy and completeness. |
Ming Gu; Yan Yang; | arxiv-cs.CL | 2024-06-17 |
652 | Minimal Self in Humanoid Robot Alter3 Driven By Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Alter3, a humanoid robot that demonstrates spontaneous motion generation through the integration of GPT-4, a Large Language Model (LLM). |
Takahide Yoshida; Suzune Baba; Atsushi Masumori; Takashi Ikegami; | arxiv-cs.RO | 2024-06-17 |
653 | Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we identify a pitfall of vanilla iterative DPO – improved response quality can lead to increased verbosity. |
JIE LIU et. al. | arxiv-cs.CL | 2024-06-17 |
654 | Cultural Conditioning or Placebo? On The Effectiveness of Socio-Demographic Prompting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we systematically probe four LLMs (Llama 3, Mistral v0.2, GPT-3.5 Turbo and GPT-4) with prompts that are conditioned on culturally sensitive and non-sensitive cues, on datasets that are supposed to be culturally sensitive (EtiCor and CALI) or neutral (MMLU and ETHICS). |
SAGNIK MUKHERJEE et. al. | arxiv-cs.CL | 2024-06-17 |
655 | DB-GPT-Hub: Towards Open Benchmarking Text-to-SQL Empowered By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present DB-GPT-Hub, an open benchmark suite for LLM-empowered text-to-SQL, which primarily focuses on tuning LLMs at large scales. |
FAN ZHOU et. al. | arxiv-cs.DB | 2024-06-17 |
656 | GPT-Powered Elicitation Interview Script Generator for Requirements Engineering Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We employ a prompt chaining approach to mitigate the output length constraint of GPT to be able to generate thorough and detailed interview scripts. |
Binnur Görer; Fatma Başak Aydemir; | arxiv-cs.SE | 2024-06-17 |
657 | Large Language Model Tokenizer Bias: A Case Study and Solution on GPT-4o Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This misrepresentation results in the propagation of ‘under-trained’ or ‘untrained’ tokens, which perpetuate biases and pose serious concerns related to data security and ethical standards. We aim to dissect the tokenization mechanics of GPT-4o, illustrating how its simplified token-handling methods amplify these risks and offer strategic solutions to mitigate associated security and ethical issues. |
Jin Yang; Zhiqiang Wang; Yanbin Lin; Zunduo Zhao; | arxiv-cs.CL | 2024-06-17 |
658 | Look Further Ahead: Testing The Limits of GPT-4 in Path Planning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, they still face challenges with long-horizon planning. To study this, we propose path planning tasks as a platform to evaluate LLMs’ ability to navigate long trajectories under geometric constraints. |
Mohamed Aghzal; Erion Plaku; Ziyu Yao; | arxiv-cs.AI | 2024-06-17 |
659 | WellDunn: On The Robustness and Explainability of Language Models and Large Language Models in Identifying Wellness Dimensions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Language Models (LMs) are being proposed for mental health applications where the heightened risk of adverse outcomes means predictive performance may not be a sufficient litmus test of a model’s utility in clinical practice. |
SEYEDALI MOHAMMADI et. al. | arxiv-cs.AI | 2024-06-17 |
660 | Promises, Outlooks and Challenges of Diffusion Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For example, autoregressive token generation is notably slow and can be prone to \textit{exposure bias}. Diffusion-based language models were proposed as an alternative to autoregressive generation to address some of these limitations. |
Justin Deschenaux; Caglar Gulcehre; | arxiv-cs.CL | 2024-06-17 |
661 | Large Language Models for Automatic Milestone Detection in Group Discussions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate an LLM’s performance on recordings of a group oral communication task in which utterances are often truncated or not well-formed. |
ZHUOXU DUAN et. al. | arxiv-cs.CL | 2024-06-16 |
662 | Distilling Opinions at Scale: Incremental Opinion Summarization Using XL-OPSUMM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To mitigate this, we propose a scalable framework called Xl-OpSumm that generates summaries incrementally. |
SRI RAGHAVA MUDDU et. al. | arxiv-cs.CL | 2024-06-16 |
663 | Generating Tables from The Parametric Knowledge of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For evaluation, we introduce a novel benchmark, WikiTabGen which contains 100 curated Wikipedia tables. |
Yevgeni Berkovitch; Oren Glickman; Amit Somech; Tomer Wolfson; | arxiv-cs.CL | 2024-06-16 |
664 | KGPA: Robustness Evaluation for Large Language Models Via Cross-Domain Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a framework that systematically evaluates the robustness of LLMs under adversarial attack scenarios by leveraging knowledge graphs (KGs). |
AIHUA PEI et. al. | arxiv-cs.CL | 2024-06-16 |
665 | ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, we present Video Diffusion GPT (ViD-GPT). |
Kaifeng Gao; Jiaxin Shi; Hanwang Zhang; Chunping Wang; Jun Xiao; | arxiv-cs.CV | 2024-06-16 |
666 | Breaking Boundaries: Investigating The Effects of Model Editing on Cross-linguistic Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper strategically identifies the need for linguistic equity by examining several knowledge editing techniques in multilingual contexts. |
SOMNATH BANERJEE et. al. | arxiv-cs.CL | 2024-06-16 |
667 | Exposing The Achilles’ Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel dataset MWP-MISTAKE, incorporating MWPs with both correct and incorrect reasoning steps generated through rule-based methods and smaller language models. |
Joykirat Singh; Akshay Nambi; Vibhav Vineet; | arxiv-cs.CL | 2024-06-16 |
668 | ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning Via Shared Low-Rank Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces an approach to optimize Parameter Efficient Fine Tuning (PEFT) for Pretrained Language Models (PLMs) by implementing a Shared Low Rank Adaptation (ShareLoRA). |
Yurun Song; Junchen Zhao; Ian G. Harris; Sangeetha Abdu Jyothi; | arxiv-cs.CL | 2024-06-15 |
669 | Multilingual Large Language Models and Curse of Multilinguality Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Multilingual Large Language Models (LLMs) have gained considerable popularity among Natural Language Processing (NLP) researchers and practitioners. These models, trained on huge datasets, show proficiency across various languages and demonstrate effectiveness in numerous downstream tasks. |
Daniil Gurgurov; Tanja Bäumel; Tatiana Anikina; | arxiv-cs.CL | 2024-06-15 |
670 | The Devil Is in The Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we try to unveil the mystery of social bias inside language models by introducing the concept of {\sc Social Bias Neurons}. |
YAN LIU et. al. | arxiv-cs.CL | 2024-06-14 |
671 | Optimizing Layer-Fused Scheduling of Transformer Networks on Multi-accelerator Platforms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work enables extensive hardware/mapping exploration by extending the DSE framework Stream towards support for transformers across a wide variety of hardware architectures and different execution schedules. |
Steven Colleman; Arne Symons; Victor J. B. Jung; Marian Verhelst; | arxiv-cs.AR | 2024-06-14 |
672 | GPT-4o: Visual Perception Performance of Multimodal Large Language Models in Piglet Activity Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The initial evaluation experiments in this study validate the potential of multimodal large language models in livestock scene video understanding and provide new directions and references for future research on animal behavior video understanding. |
Yiqi Wu; Xiaodan Hu; Ziming Fu; Siling Zhou; Jiangong Li; | arxiv-cs.CV | 2024-06-14 |
673 | Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, bidirectional feature propagation is highly sensitive to inaccurate optical flow in blurry frames, leading to error accumulation during the propagation process. To address these issues, we propose BSSTNet, a Blur-aware Spatio-temporal Sparse Transformer Network. |
Huicong Zhang; Haozhe Xie; Hongxun Yao; | cvpr | 2024-06-13 |
674 | Talking Heads: Understanding Inter-layer Communication in Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We analyze a mechanism used in two LMs to selectively inhibit items in a context in one task, and find that it underlies a commonly used abstraction across many context-retrieval behaviors. |
Jack Merullo; Carsten Eickhoff; Ellie Pavlick; | arxiv-cs.CL | 2024-06-13 |
675 | MoMask: Generative Masked Modeling of 3D Human Motions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce MoMask, a novel masked modeling framework for text-driven 3D human motion generation. |
Chuan Guo; Yuxuan Mu; Muhammad Gohar Javed; Sen Wang; Li Cheng; | cvpr | 2024-06-13 |
676 | GROD: Enhancing Generalization of Transformer with Out-of-Distribution Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel approach based on OOD detection, termed the Generate Rounded OOD Data (GROD) algorithm, which significantly bolsters the generalization performance of transformer networks across various tasks. |
Yijin Zhou; Yuguang Wang; | arxiv-cs.LG | 2024-06-13 |
677 | OmniMotionGPT: Animal Motion Generation with Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our paper aims to generate diverse and realistic animal motion sequences from textual descriptions without a large-scale animal text-motion dataset. |
ZHANGSIHAO YANG et. al. | cvpr | 2024-06-13 |
678 | SDPose: Tokenized Pose Estimation Via Circulation-Guide Self-Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Those transformer-based models that require fewer resources are prone to under-fitting due to their smaller scale and thus perform notably worse than their larger counterparts. Given this conundrum, we introduce SDPose, a new self-distillation method for improving the performance of small transformer-based models. |
SICHEN CHEN et. al. | cvpr | 2024-06-13 |
679 | MoST: Motion Style Transformer Between Diverse Action Contents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This challenge lies in the lack of clear separation between content and style of a motion. To tackle this challenge, we propose a novel motion style transformer that effectively disentangles style from content and generates a plausible motion with transferred style from a source motion. |
Boeun Kim; Jungho Kim; Hyung Jin Chang; Jin Young Choi; | cvpr | 2024-06-13 |
680 | ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Most existing studies are devoted to designing vision-specific transformers to solve the above problems, which introduce additional pre-training costs. Therefore, we present a plain, pre-training-free and feature-enhanced ViT backbone with Convolutional Multi-scale feature interaction, named ViT-CoMer, which facilitates bidirectional interaction between CNN and transformer. |
Chunlong Xia; Xinliang Wang; Feng Lv; Xin Hao; Yifeng Shi; | cvpr | 2024-06-13 |
681 | Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose VisualFactChecker (VFC), a flexible, training-free pipeline that generates high-fidelity and detailed captions for both 2D images and 3D objects. |
YUNHAO GE et. al. | cvpr | 2024-06-13 |
682 | Condition-Aware Neural Network for Controlled Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. |
Han Cai; Muyang Li; Qinsheng Zhang; Ming-Yu Liu; Song Han; | cvpr | 2024-06-13 |
683 | Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a large-scale study of linguistic bias exhibited by ChatGPT covering ten dialects of English (Standard American English, Standard British English, and eight widely spoken non-standard varieties from around the world). |
EVE FLEISIG et. al. | arxiv-cs.CL | 2024-06-13 |
684 | Permutation Equivariance of Transformers and Its Applications Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose our definition of permutation equivariance, a broader concept covering both inter- and intra-token permutation in the forward and backward propagation of neural networks. |
HENGYUAN XU et. al. | cvpr | 2024-06-13 |
685 | GPT-Fabric: Smoothing and Folding Fabric By Leveraging Pre-Trained Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose GPT-Fabric for the canonical tasks of fabric smoothing and folding, where GPT directly outputs an action informing a robot where to grasp and pull a fabric. |
Vedant Raval; Enyu Zhao; Hejia Zhang; Stefanos Nikolaidis; Daniel Seita; | arxiv-cs.RO | 2024-06-13 |
686 | GPT-4V(ision) Is A Human-Aligned Evaluator for Text-to-3D Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents an automatic, versatile, and human-aligned evaluation metric for text-to-3D generative models. |
TONG WU et. al. | cvpr | 2024-06-13 |
687 | General Point Model Pretraining with Autoencoding and Autoregressive Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the General Language Model, we propose a General Point Model (GPM) that seamlessly integrates autoencoding and autoregressive tasks in a point cloud transformer. |
ZHE LI et. al. | cvpr | 2024-06-13 |
688 | Complex Image-Generative Diffusion Transformer for Audio Denoising Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to enhance audio denoising performance, this paper introduces a complex image-generative diffusion transformer that captures more information from the complex Fourier domain. |
Junhui Li; Pu Wang; Jialu Li; Youshan Zhang; | arxiv-cs.SD | 2024-06-13 |
689 | Mean-Shift Feature Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Transformer models developed in NLP make a great impact on computer vision fields, producing promising performance on various tasks. |
Takumi Kobayashi; | cvpr | 2024-06-13 |
690 | AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, achieving faithful attributions for the entirety of a black-box transformer model and maintaining computational efficiency is an unsolved challenge. By extending the Layer-wise Relevance Propagation attribution method to handle attention layers, we address these challenges effectively. |
REDUAN ACHTIBAT et. al. | icml | 2024-06-12 |
691 | Long Is More for Alignment: A Simple But Tough-to-Beat Baseline for Instruction Fine-Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: LIMA (NeurIPS 2023) and AlpaGasus (ICLR 2024) are state-of-the-art methods for selecting such high-quality examples, either via manual curation or using GPT-3.5-Turbo as a quality scorer. We show that the extremely simple baseline of selecting the 1,000 instructions with longest responses—that intuitively contain more learnable information and are harder to overfit—from standard datasets can consistently outperform these sophisticated methods according to GPT-4 and PaLM-2 as judges, while remaining competitive on the Open LLM benchmarks that test factual knowledge. |
Hao Zhao; Maksym Andriushchenko; Francesco Croce; Nicolas Flammarion; | icml | 2024-06-12 |
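The baseline in this highlight is concrete enough to sketch directly. Below is a minimal Python illustration of the longest-response selection rule; the toy dataset and its "instruction"/"response" field names are assumptions for illustration, not the paper's data format.

```python
# Minimal sketch of the "keep the k longest responses" selection baseline.
dataset = [
    {"instruction": "Define entropy.", "response": "Entropy measures uncertainty."},
    {"instruction": "Explain transformers.",
     "response": "Transformers are neural networks built around self-attention, "
                 "which lets every token attend to every other token."},
]

def select_longest(examples, k=1000):
    # Rank by response length (whitespace tokens as a cheap proxy) and keep the top k.
    return sorted(examples, key=lambda ex: len(ex["response"].split()), reverse=True)[:k]

print(select_longest(dataset, k=1)[0]["instruction"])  # -> "Explain transformers."
```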
692 | Accelerating Transformer Pre-training with 2:4 Sparsity Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: First, we define a “flip rate” to monitor the stability of a 2:4 training process. Utilizing this metric, we propose three techniques to preserve accuracy: to modify the sparse-refined straight-through estimator by applying the masked decay term on gradients, to determine a feasible decay factor in warm-up stage, and to enhance the model’s quality by a dense fine-tuning procedure near the end of pre-training. |
Yuezhou Hu; Kang Zhao; Weiyu Huang; Jianfei Chen; Jun Zhu; | icml | 2024-06-12 |
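For context, 2:4 sparsity keeps the two largest-magnitude weights in every contiguous group of four, a pattern modern GPUs can accelerate. A hedged PyTorch sketch of building such a mask follows; the paper's training-time machinery (flip-rate monitoring, masked decay, dense fine-tuning) is omitted.

```python
import torch

def two_four_mask(w: torch.Tensor) -> torch.Tensor:
    # For every contiguous group of 4 weights, keep the 2 largest-magnitude
    # entries and zero the rest (the N:M = 2:4 pattern).
    groups = w.reshape(-1, 4)
    topk = groups.abs().topk(2, dim=-1).indices
    mask = torch.zeros_like(groups, dtype=torch.bool).scatter_(-1, topk, True)
    return mask.reshape(w.shape)

w = torch.randn(2, 8)
print(w * two_four_mask(w))  # two entries in each group of four are zeroed
```

The paper's "flip rate" would then track how often this mask changes between consecutive training steps.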
693 | In-context Learning on Function Classes Unveiled for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given some training examples, a pre-trained model can make accurate predictions on an unseen input. |
Zhijie Wang; Bo Jiang; Shuai Li; | icml | 2024-06-12 |
694 | Asymmetry in Low-Rank Adapters of Foundation Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by an effort to investigate the different roles of LoRA matrices during fine-tuning, this paper characterizes and leverages unexpected asymmetry in the importance of low-rank adapter matrices. |
JIACHENG ZHU et. al. | icml | 2024-06-12 |
695 | Label-aware Hard Negative Sampling Strategies with Momentum Contrastive Learning for Implicit Hate Speech Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Label-aware Hard Negative sampling strategies (LAHN) that encourage the model to learn detailed features from hard negative samples, instead of naive negative samples in random batch, using momentum-integrated contrastive learning. |
Jaehoon Kim; Seungwan Jin; Sohyun Park; Someen Park; Kyungsik Han; | arxiv-cs.CL | 2024-06-12 |
696 | An Empirical Study of Mamba-based Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In a controlled setting (e.g., same data), however, studies so far have only presented small scale experiments comparing SSMs to Transformers. To understand the strengths and weaknesses of these architectures at larger scales, we present a direct comparison between 8B-parameter Mamba, Mamba-2, and Transformer models trained on the same datasets of up to 3.5T tokens. |
ROGER WALEFFE et. al. | arxiv-cs.LG | 2024-06-12 |
697 | Stealing Part of A Production Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI’s ChatGPT or Google’s PaLM-2. |
NICHOLAS CARLINI et. al. | icml | 2024-06-12 |
698 | Trainable Transformer in Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a new efficient construction, Transformer in Transformer (in short, TINT), that allows a transformer to simulate and fine-tune more complex models during inference (e.g., pre-trained language models). |
Abhishek Panigrahi; Sadhika Malladi; Mengzhou Xia; Sanjeev Arora; | icml | 2024-06-12 |
699 | Entropy-Reinforced Planning with Large Language Models for Drug Discovery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we propose ERP, Entropy-Reinforced Planning for Transformer Decoding, which employs an entropy-reinforced planning algorithm to enhance the Transformer decoding process and strike a balance between exploitation and exploration. |
Xuefeng Liu; Chih-chan Tien; Peng Ding; Songhao Jiang; Rick L. Stevens; | icml | 2024-06-12 |
700 | Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by previous theoretical study of static version of the attention multiplication problem [Zandieh, Han, Daliri, and Karbasi ICML 2023, Alman and Song NeurIPS 2023], we formally define a dynamic version of attention matrix multiplication problem. |
Jan van den Brand; Zhao Song; Tianyi Zhou; | icml | 2024-06-12 |
701 | Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the slow speed of current trackers limits their applicability on devices with constrained computational resources. To address this challenge, we introduce ABTrack, an adaptive computation framework that adaptively bypassing transformer blocks for efficient visual tracking. |
XIANGYANG YANG et. al. | arxiv-cs.CV | 2024-06-12 |
702 | GPT-4V(ision) Is A Generalist Web Agent, If Grounded IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we explore the potential of LMMs like GPT-4V as a generalist web agent that can follow natural language instructions to complete tasks on any given website. |
Boyuan Zheng; Boyu Gou; Jihyung Kil; Huan Sun; Yu Su; | icml | 2024-06-12 |
703 | Timer: Generative Pre-trained Transformers Are Large Time Series Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To change the status quo of training scenario-specific small models from scratch, this paper aims at the early development of large time series models (LTSM). |
YONG LIU et. al. | icml | 2024-06-12 |
704 | Enhancing Vision Transformer: Amplifying Non-Linearity in Feedforward Network Module Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on improving the FFN module within the vision transformer. |
YIXING XU et. al. | icml | 2024-06-12 |
705 | PolySketchFormer: Fast Transformers Via Sketching Polynomial Kernels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent theoretical results indicate the intractability of sub-quadratic softmax attention approximation under reasonable complexity assumptions. This paper addresses this challenge by first demonstrating that polynomial attention with high degree can effectively replace softmax without sacrificing model quality. |
Praneeth Kacham; Vahab Mirrokni; Peilin Zhong; | icml | 2024-06-12 |
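As a rough illustration of the idea, the sketch below swaps softmax for an even-degree polynomial score with row normalization. It is an assumption-laden toy: the sketching step that gives PolySketchFormer its speedup, and any causal masking, are omitted.

```python
import torch

def polynomial_attention(q, k, v, degree=4, eps=1e-6):
    # An even degree keeps scores non-negative, so row normalization plays
    # the role of softmax; this is the conceptual swap, not the fast version.
    scores = (q @ k.transpose(-2, -1)) ** degree
    weights = scores / (scores.sum(dim=-1, keepdim=True) + eps)
    return weights @ v

q = k = v = torch.randn(1, 5, 8)
print(polynomial_attention(q, k, v).shape)  # torch.Size([1, 5, 8])
```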
706 | Ditto: Quantization-aware Secure Inference of Transformers Upon MPC Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the integration of quantization widely used in plaintext inference into the MPC domain remains unclear. To bridge this gap, we propose the framework named Ditto to enable more efficient quantization-aware secure Transformer inference. |
HAOQI WU et. al. | icml | 2024-06-12 |
707 | Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these challenges, we first introduce LoCoV1, a 12-task benchmark constructed to measure long-context retrieval where chunking is not possible or not effective. We next present the M2-BERT retrieval encoder, an 80M parameter state-space encoder model built from the Monarch Mixer architecture, capable of scaling to documents up to 32K tokens long. |
Jon Saad-Falcon; Daniel Y Fu; Simran Arora; Neel Guha; Christopher Re; | icml | 2024-06-12 |
708 | Do Efficient Transformers Really Save Computation? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to understand the capabilities and limitations of efficient Transformers, specifically the Sparse Transformer and the Linear Transformer. |
KAI YANG et. al. | icml | 2024-06-12 |
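For reference, the Linear Transformer studied here computes attention through a feature map phi so the key-value summary can be built once, costing O(n d^2) rather than O(n^2 d). A minimal non-causal sketch, assuming the common phi(x) = elu(x) + 1 choice:

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v):
    # phi(x) = elu(x) + 1 keeps features positive, so the normalizer is nonzero.
    phi_q, phi_k = F.elu(q) + 1, F.elu(k) + 1
    kv = phi_k.transpose(-2, -1) @ v          # (d, d) key-value summary
    z = phi_k.sum(dim=-2)                     # sum of key features
    return (phi_q @ kv) / (phi_q @ z.unsqueeze(-1))

q = k = v = torch.randn(1, 6, 8)
print(linear_attention(q, k, v).shape)  # torch.Size([1, 6, 8])
```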
709 | InstructRetro: Instruction Tuning Post Retrieval-Augmented Pretraining IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Retro 48B, the largest LLM pretrained with retrieval. |
BOXIN WANG et. al. | icml | 2024-06-12 |
710 | In-Context Principle Learning from Mistakes IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Nonetheless, all ICL-based approaches only learn from correct input-output pairs. In this paper, we revisit this paradigm, by learning more from the few given input-output examples. |
TIANJUN ZHANG et. al. | icml | 2024-06-12 |
711 | Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, future superhuman models will behave in complex ways too difficult for humans to reliably evaluate; humans will only be able to *weakly supervise* superhuman models. We study an analogy to this problem: can weak model supervision elicit the full capabilities of a much stronger model? |
COLLIN BURNS et. al. | icml | 2024-06-12 |
712 | Discrete Diffusion Modeling By Estimating The Ratios of The Data Distribution IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Crucially, standard diffusion models rely on the well-established theory of score matching, but efforts to generalize this to discrete structures have not yielded the same empirical gains. In this work, we bridge this gap by proposing score entropy, a novel loss that naturally extends score matching to discrete spaces, integrates seamlessly to build discrete diffusion models, and significantly boosts performance. |
Aaron Lou; Chenlin Meng; Stefano Ermon; | icml | 2024-06-12 |
713 | Fine-Tuned ‘Small’ LLMs (Still) Significantly Outperform Zero-Shot Generative AI Models in Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that smaller, fine-tuned LLMs (still) consistently and significantly outperform larger, zero-shot prompted models in text classification. |
Martin Juan José Bucher; Marco Martini; | arxiv-cs.CL | 2024-06-12 |
714 | How Smooth Is Attention? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a detailed study of the Lipschitz constant of self-attention in several practical scenarios, discussing the impact of the sequence length $n$ and layer normalization on the local Lipschitz constant of both unmasked and masked self-attention. |
Valérie Castin; Pierre Ablin; Gabriel Peyré; | icml | 2024-06-12 |
715 | Outlier-Efficient Hopfield Layers for Large Transformer-Based Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an Outlier-Efficient Modern Hopfield Model (termed `OutEffHop`) and use it to address the outlier inefficiency problem of training gigantic transformer-based models. |
JERRY YAO-CHIEH HU et. al. | icml | 2024-06-12 |
716 | Rethinking Generative Large Language Model Evaluation for Semantic Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we introduce an RWQ-Elo rating system, engaging 24 LLMs such as GPT-4, GPT-3.5, Google-Gemini-Pro and LLaMA-1/-2, in a two-player competitive format, with GPT-4 serving as the judge. |
Fangyun Wei; Xi Chen; Lin Luo; | icml | 2024-06-12 |
717 | Prodigy: An Expeditiously Adaptive Parameter-Free Learner IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Prodigy, an algorithm that provably estimates the distance to the solution $D$, which is needed to set the learning rate optimally. |
Konstantin Mishchenko; Aaron Defazio; | icml | 2024-06-12 |
718 | Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that, for linearized transformer networks, ICL can be made explicit and permanent through the inclusion of bias terms. |
Brian K Chen; Tianyang Hu; Hui Jin; Hwee Kuan Lee; Kenji Kawaguchi; | icml | 2024-06-12 |
719 | What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the capabilities of the transformer architecture with varying depth. |
Xingwu Chen; Difan Zou; | icml | 2024-06-12 |
720 | Position: On The Possibilities of AI-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce guidelines on the required text data quantity, either through sample size or sequence length, for reliable AI text detection, through derivations of sample complexity bounds. |
SOURADIP CHAKRABORTY et. al. | icml | 2024-06-12 |
721 | How Language Model Hallucinations Can Snowball IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: A major risk of using language models in practical applications is their tendency to hallucinate incorrect statements. To study this, we construct three question-answering datasets where LMs often state an incorrect answer which is followed by an explanation with at least one incorrect claim. |
Muru Zhang; Ofir Press; William Merrill; Alisa Liu; Noah A. Smith; | icml | 2024-06-12 |
722 | Prototypical Transformer As Unified Motion Learners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Prototypical Transformer (ProtoFormer), a general and unified framework that approaches various motion tasks from a prototype perspective. |
CHENG HAN et. al. | icml | 2024-06-12 |
723 | Privacy-Preserving Embedding Via Look-up Table Evaluation with Fully Homomorphic Encryption Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, our study proposes an efficient algorithm for privacy-preserving embedding via look-up table evaluation with HE (HELUT) by developing an encrypted indicator function (EIF) that assures high precision with the use of the approximate HE scheme (CKKS). |
Jae-yun Kim; Saerom Park; Joohee Lee; Jung Hee Cheon; | icml | 2024-06-12 |
724 | Visual Transformer with Differentiable Channel Selection: An Information Bottleneck Inspired Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel and compact transformer block, Transformer with Differentiable Channel Selection, or DCS-Transformer. |
Yancheng Wang; Ping Li; Yingzhen Yang; | icml | 2024-06-12 |
725 | SpikeZIP-TF: Conversion Is All You Need for Transformer-based SNN Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce a novel ANN-to-SNN conversion method called SpikeZIP-TF, where ANN and SNN are exactly equivalent, thus incurring no accuracy degradation. |
KANG YOU et. al. | icml | 2024-06-12 |
726 | Improving Autoformalization Using Type Checking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our analysis shows that the performance of these models is largely limited by their inability to generate formal statements that successfully type-check (i.e., are syntactically correct and consistent with types) – with a whopping 86.6% of GPT-4o errors starting from a type-check failure. In this work, we propose a method to fix this issue through decoding with type-check filtering, where we initially sample a diverse set of candidate formalizations for an informal statement, then use the Lean proof assistant to filter out candidates that do not type-check. |
Auguste Poiroux; Gail Weiss; Viktor Kunčak; Antoine Bosselut; | arxiv-cs.CL | 2024-06-11 |
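The decoding-with-filtering loop described here is straightforward to outline. In the sketch below, `formalize` and `lean_type_checks` are hypothetical stand-ins (stubbed so the example runs) for an LLM sampling call and an invocation of the Lean proof assistant.

```python
import random

def formalize(stmt: str) -> str:
    # Hypothetical LLM call, stubbed with canned candidates for illustration.
    return random.choice(["theorem t : 1 + 1 = 2 := rfl", "theorem t : 1 + 1 ="])

def lean_type_checks(candidate: str) -> bool:
    # Hypothetical wrapper around Lean; here a toy syntactic check.
    return candidate.endswith(":= rfl")

def autoformalize(stmt: str, n_samples: int = 16):
    # Sample diverse candidates, then keep only those that type-check.
    candidates = [formalize(stmt) for _ in range(n_samples)]
    well_typed = [c for c in candidates if lean_type_checks(c)]
    return well_typed[0] if well_typed else None

print(autoformalize("one plus one equals two"))
```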
727 | Anomaly Detection on Unstable Logs with GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we report on an experimental comparison of a fine-tuned LLM and alternative models for anomaly detection on unstable logs. |
Fatemeh Hadadi; Qinghua Xu; Domenico Bianculli; Lionel Briand; | arxiv-cs.SE | 2024-06-11 |
728 | Towards Generalized Hydrological Forecasting Using Transformer Models for 120-Hour Streamflow Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Utilizing data from the preceding 72 hours, including precipitation, evapotranspiration, and discharge values, we developed a generalized model to predict future streamflow. |
Bekir Z. Demiray; Ibrahim Demir; | arxiv-cs.LG | 2024-06-11 |
729 | LT4SG@SMM4H24: Tweets Classification for Digital Epidemiology of Childhood Health Outcomes Using Pre-Trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents our approaches for the SMM4H24 Shared Task 5 on the binary classification of English tweets reporting children’s medical disorders. |
Dasun Athukoralage; Thushari Atapattu; Menasha Thilakaratne; Katrina Falkner; | arxiv-cs.CL | 2024-06-11 |
730 | Bilingual Sexism Classification: Fine-Tuned XLM-RoBERTa and GPT-3.5 Few-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to improve sexism identification in bilingual contexts (English and Spanish) by leveraging natural language processing models. |
AmirMohammad Azadi; Baktash Ansari; Sina Zamani; | arxiv-cs.CL | 2024-06-11 |
731 | Unveiling The Safety of GPT-4o: An Empirical Study Using Jailbreak Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, this paper adopts a series of multi-modal and uni-modal jailbreak attacks on 4 commonly used benchmarks encompassing three modalities (i.e., text, speech, and image), which involves the optimization of over 4,000 initial text queries and the analysis and statistical evaluation of nearly 8,000 responses from GPT-4o. |
Zonghao Ying; Aishan Liu; Xianglong Liu; Dacheng Tao; | arxiv-cs.CR | 2024-06-10 |
732 | LLM-dCache: Improving Tool-Augmented LLMs with GPT-Driven Localized Data Caching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce LLM-dCache to optimize data accesses by treating cache operations as callable API functions exposed to the tool-augmented agent. |
SIMRANJIT SINGH et. al. | arxiv-cs.DC | 2024-06-10 |
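Entry 732's core idea, treating cache operations as callable API functions exposed to the tool-augmented agent, can be sketched as two tool functions over a TTL-backed in-memory store. Everything below (names, schema, TTL policy) is an illustrative assumption, not the paper's interface.

```python
# Hedged sketch: cache reads/writes exposed as agent tools (illustrative).
import time

_CACHE: dict = {}
TTL_SECONDS = 300.0  # assumed expiry policy

def cache_get(key: str):
    """Tool: return a cached value if present and not expired, else None."""
    entry = _CACHE.get(key)
    if entry is None:
        return None
    stored_at, value = entry
    if time.time() - stored_at > TTL_SECONDS:
        del _CACHE[key]  # evict stale entry
        return None
    return value

def cache_put(key: str, value) -> None:
    """Tool: store a value under a key with the current timestamp."""
    _CACHE[key] = (time.time(), value)

# A tool schema like this would be handed to the agent so the LLM itself
# decides when to reuse cached data instead of re-issuing an API call.
TOOLS = [
    {"name": "cache_get", "description": "Look up previously fetched data."},
    {"name": "cache_put", "description": "Store fetched data for reuse."},
]
```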
733 | LLM-Powered Multimodal AI Conversations for Diabetes Prevention Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The global prevalence of diabetes remains high despite rising life expectancy with improved quality and access to healthcare services. The significant burden that diabetes imposes … |
Dung Dao; Jun Yi Claire Teo; Wenru Wang; Hoang D. Nguyen; | Proceedings of the 1st ACM Workshop on AI-Powered Q&A … | 2024-06-10 |
734 | In-Context Learning and Fine-Tuning GPT for Argument Mining Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an ICL strategy for ATC combining kNN-based examples selection and majority vote ensembling. |
Jérémie Cabessa; Hugo Hernault; Umer Mushtaq; | arxiv-cs.CL | 2024-06-10 |
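The ICL strategy highlighted in entry 734, kNN-based example selection combined with majority-vote ensembling, reduces to a short generic recipe. In this sketch, embed and ask_llm are hypothetical stand-ins for an embedding model and a GPT call; nothing here is the authors' code.

```python
# Hedged sketch of kNN demonstration selection + majority-vote ensembling.
import collections
import numpy as np

def knn_examples(query_vec, train_vecs, train_items, k=8):
    """Return the k labeled training examples closest in cosine similarity."""
    sims = train_vecs @ query_vec / (
        np.linalg.norm(train_vecs, axis=1) * np.linalg.norm(query_vec))
    return [train_items[i] for i in np.argsort(-sims)[:k]]

def classify(query, train_vecs, train_items, embed, ask_llm, votes=5):
    demos = knn_examples(embed(query), train_vecs, train_items)
    prompt = "\n".join(f"Text: {t}\nLabel: {y}" for t, y in demos)
    prompt += f"\nText: {query}\nLabel:"
    # Sample several completions and keep the majority label.
    answers = [ask_llm(prompt) for _ in range(votes)]
    return collections.Counter(answers).most_common(1)[0][0]
```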
735 | Validating LLM-Generated Programs with Metamorphic Prompt Testing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Research is required to comprehensively explore these critical concerns surrounding LLM-generated code. In this paper, we propose a novel solution called metamorphic prompt testing to address these challenges. |
Xiaoyin Wang; Dakai Zhu; | arxiv-cs.SE | 2024-06-10 |
736 | Annotation Alignment: Comparing LLM and Human Annotations of Conversational Safety Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: GPT-4 achieves a Pearson correlation of $r = 0.59$ with the average annotator rating, higher than the median annotator’s correlation with the average ($r=0.51$). We show that larger datasets are needed to resolve whether LLMs exhibit disparities in how well they correlate with different demographic groups. |
Rajiv Movva; Pang Wei Koh; Emma Pierson; | arxiv-cs.CL | 2024-06-10 |
737 | Symmetric Dot-Product Attention for Efficient Training of BERT Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose an alternative compatibility function for the self-attention mechanism introduced by the Transformer architecture. |
Martin Courtois; Malte Ostendorff; Leonhard Hennig; Georg Rehm; | arxiv-cs.CL | 2024-06-10 |
738 | Hidden Holes: Topological Aspects of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The methods developed in this paper are novel in the field and based on mathematical apparatus that might be unfamiliar to the target audience. |
Stephen Fitz; Peter Romero; Jiyan Jonas Schneider; | arxiv-cs.CL | 2024-06-09 |
739 | Multi-attribute Auction-based Resource Allocation for Twins Migration in Vehicular Metaverses: A GPT-based DRL Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, resource-intensive VT updating and the high mobility of vehicles require intensive computation, communication, and storage resources, especially for their migration among RSUs with limited coverage. To address these issues, we propose an attribute-aware auction-based mechanism to optimize resource allocation during VT migration by considering both price and non-monetary attributes, e.g., location and reputation. |
YONGJU TONG et. al. | arxiv-cs.AI | 2024-06-08 |
740 | MaTableGPT: GPT-based Table Data Extractor from Materials Science Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the tables reported in materials science papers exist in highly diverse forms; thus, rule-based extractions are an ineffective approach. To overcome this challenge, we present MaTableGPT, which is a GPT-based table data extractor from the materials science literature. |
GYEONG HOON YI et. al. | arxiv-cs.CL | 2024-06-08 |
741 | G-Transformer: Counterfactual Outcome Prediction Under Dynamic and Time-varying Treatment Regimes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present G-Transformer for counterfactual outcome prediction under dynamic and time-varying treatment strategies. |
Hong Xiong; Feng Wu; Leon Deng; Megan Su; Li-wei H Lehman; | arxiv-cs.LG | 2024-06-08 |
742 | Do LLMs Recognize Me, When I Is Not Me: Assessment of LLMs Understanding of Turkish Indexical Pronouns in Indexical Shift Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first study examining indexical shift in any language, releasing a Turkish dataset specifically designed for this purpose. |
Metehan Oğuz; Yusuf Umut Ciftci; Yavuz Faruk Bakman; | arxiv-cs.CL | 2024-06-08 |
743 | Automata Extraction from Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an automata extraction algorithm specifically designed for Transformer models. |
Yihao Zhang; Zeming Wei; Meng Sun; | arxiv-cs.LG | 2024-06-08 |
744 | Are Large Language Models More Empathetic Than Humans? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive study exploring the empathetic responding capabilities of four state-of-the-art LLMs: GPT-4, LLaMA-2-70B-Chat, Gemini-1.0-Pro, and Mixtral-8x7B-Instruct in comparison to a human baseline. |
Anuradha Welivita; Pearl Pu; | arxiv-cs.CL | 2024-06-07 |
745 | VTrans: Accelerating Transformer Compression with Variational Information Bottleneck Based Pruning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, they require extensive compression time with large datasets to maintain performance in pruned models. To address these challenges, we propose VTrans, an iterative pruning framework guided by the Variational Information Bottleneck (VIB) principle. |
Oshin Dutta; Ritvik Gupta; Sumeet Agarwal; | arxiv-cs.LG | 2024-06-07 |
746 | Concept Formation and Alignment in Language Models: Bridging Statistical Patterns in Latent Space to Concept Taxonomy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a mechanism for identifying concepts and their hierarchical organization within the semantic representations learned by various LMs, encompassing a spectrum from early models like Glove to the transformer-based language models like ALBERT and T5. |
Mehrdad Khatir; Chandan K. Reddy; | arxiv-cs.CL | 2024-06-07 |
747 | Advancing Semantic Textual Similarity Modeling: A Regression Framework with Translated ReLU and Smooth K2 Loss Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite its efficiency, Sentence-BERT tackles STS tasks from a classification perspective, overlooking the progressive nature of semantic relationships, which results in suboptimal performance. To bridge this gap, this paper presents an innovative regression framework and proposes two simple yet effective loss functions: Translated ReLU and Smooth K2 Loss. |
Bowen Zhang; Chunping Li; | arxiv-cs.CL | 2024-06-07 |
748 | BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper outlines our approach to SemEval 2024 Task 9, BRAINTEASER: A Novel Task Defying Common Sense. |
Baktash Ansari; Mohammadmostafa Rostamkhani; Sauleh Eetemadi; | arxiv-cs.CL | 2024-06-07 |
749 | Low-Resource Cross-Lingual Summarization Through Few-Shot Learning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the few-shot XLS performance of various models, including Mistral-7B-Instruct-v0.2, GPT-3.5, and GPT-4. |
Gyutae Park; Seojin Hwang; Hwanhee Lee; | arxiv-cs.CL | 2024-06-07 |
750 | Transformer Conformal Prediction for Time Series Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a conformal prediction method for time series using the Transformer architecture to capture long-memory and long-range dependencies. |
Junghwan Lee; Chen Xu; Yao Xie; | arxiv-cs.LG | 2024-06-07 |
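Entry 750 adapts conformal prediction to Transformer forecasters; for readers new to the technique, the generic split-conformal recipe it builds on is short enough to show in full. The Transformer-specific scoring is the paper's contribution and is not reproduced here.

```python
# Textbook split-conformal interval; `predict` is any point forecaster.
import numpy as np

def conformal_interval(predict, X_cal, y_cal, x_new, alpha=0.1):
    """Return a (lo, hi) interval with ~(1 - alpha) coverage, assuming
    exchangeable calibration residuals."""
    residuals = np.abs(y_cal - predict(X_cal))   # calibration errors
    n = len(residuals)
    # Finite-sample-corrected quantile of the residuals.
    q = np.quantile(residuals, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
    point = predict(x_new)
    return point - q, point + q
```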
751 | Mixture-of-Agents Enhances Large Language Model Capabilities IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: With the growing number of LLMs, how to harness the collective expertise of multiple LLMs is an exciting open direction. Toward this goal, we propose a new approach that leverages the collective strengths of multiple LLMs through a Mixture-of-Agents (MoA) methodology. |
Junlin Wang; Jue Wang; Ben Athiwaratkun; Ce Zhang; James Zou; | arxiv-cs.CL | 2024-06-07 |
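The Mixture-of-Agents pattern in entry 751 can be sketched in a few lines: proposer models answer independently, each refines its draft with the others' drafts in context, and an aggregator synthesizes the final answer. call_model is a hypothetical chat-completion wrapper; the layer count and prompts are assumptions.

```python
# Hedged sketch of a Mixture-of-Agents pipeline (all prompts illustrative).
def mixture_of_agents(question, proposers, aggregator, call_model, layers=2):
    answers = [call_model(m, question) for m in proposers]
    for _ in range(layers - 1):
        # Each proposer refines its answer given the others' drafts.
        context = "\n\n".join(f"Reference answer {i + 1}:\n{a}"
                              for i, a in enumerate(answers))
        answers = [call_model(m, f"{context}\n\nQuestion: {question}")
                   for m in proposers]
    drafts = "\n\n".join(answers)
    return call_model(aggregator,
                      "Synthesize the single best answer from the drafts "
                      f"below.\n{drafts}\n\nQuestion: {question}")
```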
752 | Logic Synthesis with Generative Deep Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a logic synthesis rewriting operator based on the Circuit Transformer model, named ctrw (Circuit Transformer Rewriting), which incorporates the following techniques: (1) a two-stage training scheme for the Circuit Transformer tailored for logic synthesis, with iterative improvement of optimality through self-improvement training; (2) integration of the Circuit Transformer with state-of-the-art rewriting techniques to address scalability issues, allowing for guided DAG-aware rewriting. |
XIHAN LI et. al. | arxiv-cs.LO | 2024-06-07 |
753 | Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Interestingly, our study presents conflicting evidence for the role of the quality of KG tuples in generating implicit explanations. |
NEEMESH YADAV et. al. | arxiv-cs.CL | 2024-06-06 |
754 | Exploring The Latest LLMs for Leaderboard Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore three types of contextual inputs to the models: DocTAET (Document Title, Abstract, Experimental Setup, and Tabular Information), DocREC (Results, Experiments, and Conclusions), and DocFULL (entire document). |
Salomon Kabongo; Jennifer D’Souza; Sören Auer; | arxiv-cs.CL | 2024-06-06 |
755 | GameBench: Evaluating Strategic Reasoning Abilities of LLM Agents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite several demonstrations of using large language models in complex, strategic scenarios, a comprehensive framework for evaluating agents’ performance across the various types of reasoning found in games is still lacking. To address this gap, we introduce GameBench, a cross-domain benchmark for evaluating strategic reasoning abilities of LLM agents. |
ANTHONY COSTARELLI et. al. | arxiv-cs.CL | 2024-06-06 |
756 | MUSE: Flexible Voiceprint Receptive Fields and Multi-Path Fusion Enhanced Taylor Transformer for U-Net-based Speech Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Multi-path Enhanced Taylor (MET) Transformer based U-net for Speech Enhancement (MUSE), a lightweight speech enhancement network built upon the Unet architecture. |
Zizhen Lin; Xiaoting Chen; Junyu Wang; | arxiv-cs.SD | 2024-06-06 |
757 | Global Clipper: Enhancing Safety and Reliability of Transformer-based Object Detection Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the Global Clipper and Global Hybrid Clipper, effective mitigation strategies specifically designed for transformer-based models. |
QUTUB SYED SHA et. al. | arxiv-cs.CV | 2024-06-05 |
758 | CSI-GPT: Integrating Generative Pre-Trained Transformer with Federated-Tuning to Acquire Downlink Massive MIMO Channels Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, by integrating the generative pre-trained Transformer (GPT) with federated-tuning, we propose a CSI-GPT approach to realize efficient downlink CSI acquisition. |
YE ZENG et. al. | arxiv-cs.IT | 2024-06-05 |
759 | From Tarzan to Tolkien: Controlling The Language Proficiency Level of LLMs for Content Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the problem of controlling the difficulty level of text generated by Large Language Models (LLMs) for contexts where end-users are not fully proficient, such as language learners. |
Ali Malik; Stephen Mayhew; Chris Piech; Klinton Bicknell; | arxiv-cs.CL | 2024-06-05 |
760 | The Good, The Bad, and The Hulk-like GPT: Analyzing Emotional Decisions of Large Language Models in Cooperation and Bargaining Games Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel methodology and the framework to study both, the decision-making of LLMs and their alignment with human behavior under emotional states. |
MIKHAIL MOZIKOV et. al. | arxiv-cs.AI | 2024-06-05 |
761 | Learning to Grok: Emergence of In-context Learning and Skill Composition in Modular Arithmetic Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we study the emergence of in-context learning and skill composition in a collection of modular arithmetic tasks. |
Tianyu He; Darshil Doshi; Aritra Das; Andrey Gromov; | arxiv-cs.LG | 2024-06-04 |
762 | Multi-layer Learnable Attention Mask for Multimodal Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Comprehensive experimental validation on various datasets, such as MADv2, QVHighlights, ImageNet 1K, and MSRVTT, demonstrates the efficacy of the LAM, exemplifying its ability to enhance model performance while mitigating redundant computations. This pioneering approach represents a significant advance in understanding complex scenarios, such as movie understanding. |
Wayner Barrios; SouYoung Jin; | arxiv-cs.CV | 2024-06-04 |
763 | Probing The Category of Verbal Aspect in Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate how pretrained language models (PLMs) encode the grammatical category of verbal aspect in Russian. |
Anisia Katinskaia; Roman Yangarber; | arxiv-cs.CL | 2024-06-04 |
764 | Randomized Geometric Algebra Methods for Convex Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce randomized algorithms to Clifford’s Geometric Algebra, generalizing randomized linear algebra to hypercomplex vector spaces. |
Yifei Wang; Sungyoon Kim; Paul Chu; Indu Subramaniam; Mert Pilanci; | arxiv-cs.LG | 2024-06-04 |
765 | Too Big to Fail: Larger Language Models Are Disproportionately Resilient to Induction of Dementia-Related Linguistic Anomalies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Previous findings show that changes in PPL when masking attention layers in pre-trained transformer-based NLMs reflect linguistic anomalies associated with Alzheimer’s disease dementia. Building upon this, we explore a novel bidirectional attention head ablation method that exhibits properties attributed to the concepts of cognitive and brain reserve in human brain studies, which postulate that people with more neurons in the brain and more efficient processing are more resilient to neurodegeneration. |
Changye Li; Zhecheng Sheng; Trevor Cohen; Serguei Pakhomov; | arxiv-cs.CL | 2024-06-04 |
766 | A Temporal Kolmogorov-Arnold Transformer for Time Series Forecasting IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Capturing complex temporal patterns and relationships within multivariate data streams is a difficult task. We propose the Temporal Kolmogorov-Arnold Transformer (TKAT), a novel attention-based architecture designed to address this task using Temporal Kolmogorov-Arnold Networks (TKANs). |
Remi Genet; Hugo Inzirillo; | arxiv-cs.LG | 2024-06-04 |
767 | Eliciting The Priors of Large Language Models Using Iterated In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop a prompt-based workflow for eliciting prior distributions from LLMs. |
Jian-Qiao Zhu; Thomas L. Griffiths; | arxiv-cs.CL | 2024-06-03 |
768 | Reproducibility Study on Adversarial Attacks Against Robust Transformer Trackers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our empirical study focuses on evaluating adversarial robustness of object trackers based on bounding box versus binary mask predictions, and attack methods at different levels of perturbations. |
Fatemeh Nourilenjan Nokabadi; Jean-François Lalonde; Christian Gagné; | arxiv-cs.CV | 2024-06-03 |
769 | Superhuman Performance in Urology Board Questions By An Explainable Large Language Model Enabled for Context Integration of The European Association of Urology Guidelines: The UroBot Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: UroBot was developed using OpenAI’s GPT-3.5, GPT-4, and GPT-4o models, employing retrieval-augmented generation (RAG) and the latest 2023 guidelines from the European Association of Urology (EAU). |
MARTIN J. HETZ et. al. | arxiv-cs.CL | 2024-06-03 |
770 | SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to bridge the gap between Code LLMs’ reliance on static text data and the need for semantic understanding for complex tasks like debugging and program repair. |
YANGRUIBO DING et. al. | arxiv-cs.CL | 2024-06-03 |
771 | Prototypical Transformer As Unified Motion Learners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Prototypical Transformer (ProtoFormer), a general and unified framework that approaches various motion tasks from a prototype perspective. |
CHENG HAN et. al. | arxiv-cs.CV | 2024-06-03 |
772 | In-Context Learning of Physical Properties: Few-Shot Adaptation to Out-of-Distribution Molecular Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address the question: can we leverage in-context learning to predict out-of-distribution materials properties? |
GRZEGORZ KASZUBA et. al. | arxiv-cs.LG | 2024-06-03 |
773 | Drive As Veteran: Fine-tuning of An Onboard Large Language Model for Highway Autonomous Driving Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Due to the limitations of network communication conditions for calling GPT online, onboard deployment of Large Language Models for autonomous driving is needed. In this … |
YUJIN WANG et. al. | 2024 IEEE Intelligent Vehicles Symposium (IV) | 2024-06-02 |
774 | Annotation Guidelines-Based Knowledge Augmentation: Towards Enhancing Large Language Models for Educational Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose the Annotation Guidelines-based Knowledge Augmentation (AGKA) approach to improve LLMs. |
SHIQI LIU et. al. | arxiv-cs.CL | 2024-06-02 |
775 | RoBERTa-BiLSTM: A Context-Aware Hybrid Model for Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a novel hybrid deep learning model, RoBERTa-BiLSTM, which combines the Robustly Optimized BERT Pretraining Approach (RoBERTa) with Bidirectional Long Short-Term Memory (BiLSTM) networks. |
Md. Mostafizer Rahman; Ariful Islam Shiplu; Yutaka Watanobe; Md. Ashad Alam; | arxiv-cs.CL | 2024-06-01 |
776 | SwinFG: A Fine-grained Recognition Scheme Based on Swin Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhipeng Ma; Xiaoyu Wu; Anzhuo Chu; Lei Huang; Zhiqiang Wei; | Expert Syst. Appl. | 2024-06-01 |
777 | FuzzyTP-BERT: Enhancing Extractive Text Summarization with Fuzzy Topic Modeling and Transformer Networks Related Papers Related Patents Related Grants Related Venues Related Experts View |
Aytuğ Onan; Hesham A. Alhumyani; | J. King Saud Univ. Comput. Inf. Sci. | 2024-06-01 |
778 | Multi-granularity Cross Transformer Network for Person Re-identification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yanping Li; Duoqian Miao; Hongyun Zhang; Jie Zhou; Cairong Zhao; | Pattern Recognit. | 2024-06-01 |
779 | Mobile Transformer Accelerator Exploiting Various Line Sparsity and Tile-Based Dynamic Quantization Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer models are difficult to employ in mobile devices due to their memory- and computation-intensive properties. Accordingly, there is ongoing research on various methods … |
Eunji Kwon; Jongho Yoon; Seokhyeong Kang; | IEEE Transactions on Computer-Aided Design of Integrated … | 2024-06-01 |
780 | EdgeTran: Device-Aware Co-Search of Transformers for Efficient Inference on Mobile Edge Platforms Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Automated design of efficient transformer models has recently attracted significant attention from industry and academia. However, most works only focus on certain metrics while … |
Shikhar Tuli; N. Jha; | IEEE Transactions on Mobile Computing | 2024-06-01 |
781 | Bidirectional Interaction of CNN and Transformer for Image Inpainting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jialu Liu; Maoguo Gong; Yuan Gao; Yihe Lu; Hao Li; | Knowl. Based Syst. | 2024-06-01 |
782 | Multimodal Metadata Assignment for Cultural Heritage Artifacts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We develop a multimodal classifier for the cultural heritage domain using a late fusion approach and introduce a novel dataset. |
LUIS REI et. al. | arxiv-cs.CV | 2024-06-01 |
783 | Beyond Metrics: Evaluating LLMs’ Effectiveness in Culturally Nuanced, Low-Resource Real-World Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our evaluation includes both quantitative analysis using metrics like F1 score and qualitative assessment of LLMs’ explanations for their predictions. We find that, while Mistral-7b and Mixtral-8x7b achieved high F1 scores, they and other LLMs such as GPT-3.5-Turbo, Llama-2-70b, and Gemma-7b struggled to understand linguistic and contextual nuances and lacked transparency in their decision-making, as observed from their explanations. |
MILLICENT OCHIENG et. al. | arxiv-cs.CL | 2024-06-01 |
784 | LiteFormer: A Lightweight and Efficient Transformer for Rotating Machine Fault Diagnosis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer has shown impressive performance on global feature modeling in many applications. However, two drawbacks induced by its intrinsic architecture limit its application, … |
WENJUN SUN et. al. | IEEE Transactions on Reliability | 2024-06-01 |
785 | Transformer-based Fall Detection in Videos Related Papers Related Patents Related Grants Related Venues Related Experts View |
Adrián Núñez-Marcos; I. Arganda-Carreras; | Eng. Appl. Artif. Intell. | 2024-06-01 |
786 | A Comparison of Correspondence Analysis with PMI-based Word Embedding Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we link correspondence analysis (CA) to the factorization of the PMI matrix. |
Qianqian Qi; Ayoub Bagheri; David J. Hessen; Peter G. M. van der Heijden; | arxiv-cs.CL | 2024-05-31 |
787 | QClusformer: A Quantum Transformer-based Framework for Unsupervised Visual Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce QClusformer, a pioneering Transformer-based framework leveraging quantum machines to tackle unsupervised vision clustering challenges. |
XUAN-BAC NGUYEN et. al. | arxiv-cs.CV | 2024-05-30 |
788 | Bi-Directional Transformers Vs. Word2vec: Discovering Vulnerabilities in Lifted Compiled Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Detecting vulnerabilities within compiled binaries is challenging due to lost high-level code structures and other factors such as architectural dependencies, compilers, and optimization options. To address these obstacles, this research explores vulnerability detection using natural language processing (NLP) embedding techniques with word2vec, BERT, and RoBERTa to learn semantics from intermediate representation (LLVM IR) code. |
Gary A. McCully; John D. Hastings; Shengjie Xu; Adam Fortier; | arxiv-cs.CR | 2024-05-30 |
789 | The Point of View of A Sentiment: Towards Clinician Bias Detection in Psychiatric Notes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By leveraging large language models, this work aims to identify the sentiment expressed in psychiatric clinical notes based on the reader’s point of view. |
Alissa A. Valentine; Lauren A. Lepow; Alexander W. Charney; Isotta Landi; | arxiv-cs.CL | 2024-05-30 |
790 | Automatic Graph Topology-Aware Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes an evolutionary graph Transformer architecture search framework (EGTAS) to automate the construction of strong graph Transformers. |
CHAO WANG et. al. | arxiv-cs.NE | 2024-05-30 |
791 | Hyper-Transformer for Amodal Completion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Learning shape priors is crucial for effective amodal completion, but traditional methods often rely on two-stage processes or additional information, leading to inefficiencies and potential error accumulation. To address these shortcomings, we introduce a novel framework named the Hyper-Transformer Amodal Network (H-TAN). |
Jianxiong Gao; Xuelin Qian; Longfei Liang; Junwei Han; Yanwei Fu; | arxiv-cs.CV | 2024-05-30 |
792 | DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the knowledge gap, we propose a new benchmark named DevEval, which has three advances. |
JIA LI et. al. | arxiv-cs.CL | 2024-05-30 |
793 | Divide-and-Conquer Meets Consensus: Unleashing The Power of Functions in Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose FunCoder, a code generation framework incorporating the divide-and-conquer strategy with functional consensus. |
JINGCHANG CHEN et. al. | arxiv-cs.CL | 2024-05-30 |
794 | Towards Next-Generation Urban Decision Support Systems Through AI-Powered Generation of Scientific Ontology Using Large Language Models – A Case in Optimizing Intermodal Freight Transportation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The incorporation of Artificial Intelligence (AI) models into various optimization systems is on the rise. However, addressing complex urban and environmental management … |
JOSE TUPAYACHI et. al. | ArXiv | 2024-05-29 |
795 | Reverse Image Retrieval Cues Parametric Memory in Multimodal LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: RIR robustly improves knowledge-intensive visual question answering (VQA) of GPT-4V by 37-43%, GPT-4 Turbo by 25-27%, and GPT-4o by 18-20% in terms of open-ended VQA evaluation metrics. To our surprise, we discover that RIR helps the model to better access its own world knowledge. |
Jialiang Xu; Michael Moor; Jure Leskovec; | arxiv-cs.CL | 2024-05-29 |
796 | Evaluating Zero-Shot GPT-4V Performance on 3D Visual Question Answering Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As interest in reformulating the 3D Visual Question Answering (VQA) problem in the context of foundation models grows, it is imperative to assess how these new paradigms influence existing closed-vocabulary datasets. In this case study, we evaluate the zero-shot performance of foundational models (GPT-4 Vision and GPT-4) on well-established 3D VQA benchmarks, namely 3D-VQA and ScanQA. |
Simranjit Singh; Georgios Pavlakos; Dimitrios Stamoulis; | arxiv-cs.CV | 2024-05-29 |
797 | Multi-objective Cross-task Learning Via Goal-conditioned GPT-based Decision Transformers for Surgical Robot Task Automation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new learning-based framework by leveraging the strong reasoning capability of the GPT-based architecture to automate surgical robotic tasks. |
Jiawei Fu; Yonghao Long; Kai Chen; Wang Wei; Qi Dou; | arxiv-cs.RO | 2024-05-29 |
798 | MDS-ViTNet: Improving Saliency Prediction for Eye-Tracking with Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel methodology we call MDS-ViTNet (Multi Decoder Saliency by Vision Transformer Network) for enhancing visual saliency prediction or eye-tracking. |
Polezhaev Ignat; Goncharenko Igor; Iurina Natalya; | arxiv-cs.CV | 2024-05-29 |
799 | A Multi-Source Retrieval Question Answering Framework Based on RAG Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing RAG paradigms are inevitably influenced by erroneous retrieval information, thereby reducing the reliability and correctness of generated results. Therefore, to improve the relevance of retrieval information, this study proposes a method that replaces traditional retrievers with GPT-3.5, leveraging its vast corpus knowledge to generate retrieval information. |
RIDONG WU et. al. | arxiv-cs.IR | 2024-05-29 |
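Entry 799's replacement of the retriever with GPT-3.5 amounts to a two-step prompt chain: generate a background passage, then answer conditioned on it. ask_llm below is a hypothetical wrapper around a chat call, not the paper's code.

```python
# Hedged sketch of generate-then-answer in place of document retrieval.
def generate_then_answer(question, ask_llm):
    # Step 1: elicit background knowledge instead of retrieving documents.
    context = ask_llm(
        "Write a short, factual background passage that would help answer "
        f"this question: {question}")
    # Step 2: answer conditioned only on the generated context.
    return ask_llm(
        f"Context:\n{context}\n\nUsing only the context above, "
        f"answer: {question}")
```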
800 | Table-GPT: Table Fine-tuned GPT for Diverse Table Tasks IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Language models, such as GPT-3 and ChatGPT, demonstrate remarkable abilities to follow diverse human instructions and perform a wide range of tasks, using instruction fine-tuning. … |
PENG LI et. al. | Proc. ACM Manag. Data | 2024-05-29 |
801 | Are You Sure? Rank Them Again: Repeated Ranking For Better Preference Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose the Repeat Ranking method, in which we evaluate the same responses multiple times and train only on those that are consistently ranked. |
Peter Devine; | arxiv-cs.CL | 2024-05-29 |
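A minimal sketch of the Repeat Ranking filter from entry 801, assuming a hypothetical rank_once LLM-judge call that returns an ordering of the candidate responses:

```python
# Hedged sketch: keep a preference example only if repeated rankings agree.
def consistently_ranked(responses, rank_once, repeats=3):
    rankings = [tuple(rank_once(responses)) for _ in range(repeats)]
    # Train on this example only when every pass produced the same order.
    if all(r == rankings[0] for r in rankings):
        return list(rankings[0])
    return None  # inconsistent rankings: discard from the dataset
```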
802 | AutoBreach: Universal and Adaptive Jailbreaking with Efficient Wordplay-Guided Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we rethink the approach to jailbreaking LLMs and formally define three essential properties from the attacker’s perspective, which help guide the design of jailbreak methods. |
JIAWEI CHEN et. al. | arxiv-cs.CV | 2024-05-29 |
803 | LMO-DP: Optimizing The Randomization Mechanism for Differentially Private Fine-Tuning (Large) Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they rely heavily on the Gaussian mechanism, which may overly perturb the gradients and degrade the accuracy, especially in stronger privacy regimes (e.g., the privacy budget $\epsilon < 3$). To address such limitations, we propose a novel Language Model-based Optimal Differential Privacy (LMO-DP) mechanism, which takes the first step to enable the tight composition of accurately fine-tuning (large) language models with a sub-optimal DP mechanism, even in strong privacy regimes (e.g., $0.1\leq \epsilon<3$). |
QIN YANG et. al. | arxiv-cs.CR | 2024-05-29 |
804 | Beyond Agreement: Diagnosing The Rationale Alignment of Automated Essay Scoring Methods Based on Linguistically-informed Counterfactuals Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our proposed method, using counterfactual intervention assisted by Large Language Models (LLMs), reveals that BERT-like models primarily focus on sentence-level features, whereas LLMs such as GPT-3.5, GPT-4 and Llama-3 are sensitive to conventions & accuracy, language complexity, and organization, indicating a more comprehensive rationale alignment with scoring rubrics. |
Yupei Wang; Renfen Hu; Zhe Zhao; | arxiv-cs.CL | 2024-05-29 |
805 | Voice Jailbreak Attacks Against GPT-4o Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present the first systematic measurement of jailbreak attacks against the voice mode of GPT-4o. |
Xinyue Shen; Yixin Wu; Michael Backes; Yang Zhang; | arxiv-cs.CR | 2024-05-29 |
806 | Data-Efficient Approach to Humanoid Control Via Fine-Tuning A Pre-Trained GPT on Action Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we train a GPT on a large dataset of noisy expert policy rollout observations from a humanoid motion dataset as a pre-trained model and fine tune that model on a smaller dataset of noisy expert policy rollout observations and actions to autoregressively generate physically plausible motion trajectories. |
Siddharth Padmanabhan; Kazuki Miyazawa; Takato Horii; Takayuki Nagai; | arxiv-cs.RO | 2024-05-28 |
807 | Can GPT Redefine Medical Understanding? Evaluating GPT on Biomedical Machine Reading Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate GPT on four closed-book biomedical MRC benchmarks. |
Shubham Vatsal; Ayush Singh; | arxiv-cs.CL | 2024-05-28 |
808 | Notes on Applicability of GPT-4 to Document Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We perform a missing, reproducible evaluation of all publicly available GPT-4 family models concerning the Document Understanding field, where it is frequently required to comprehend the spatial arrangement of text and visual clues in addition to textual semantics. |
Łukasz Borchmann; | arxiv-cs.CL | 2024-05-28 |
809 | Delving Into Differentially Private Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such ‘reduction’ is done by first identifying the hardness unique to DP Transformer training: the attention distraction phenomenon and a lack of compatibility with existing techniques for efficient gradient clipping. To deal with these two issues, we propose the Re-Attention Mechanism and Phantom Clipping, respectively. |
YOULONG DING et. al. | arxiv-cs.LG | 2024-05-28 |
810 | I See You: Teacher Analytics with GPT-4 Vision-Powered Observational Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our approach aims to revolutionize teachers’ assessment of students’ practices by leveraging Generative Artificial Intelligence (GenAI) to offer detailed insights into classroom dynamics. |
UNGGI LEE et. al. | arxiv-cs.HC | 2024-05-28 |
811 | How Ready Are Generative Pre-trained Large Language Models for Explaining Bengali Grammatical Errors? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Grammatical error correction (GEC) tools, powered by advanced generative artificial intelligence (AI), competently correct linguistic inaccuracies in user input. However, they … |
Subhankar Maity; Aniket Deroy; Sudeshna Sarkar; | ArXiv | 2024-05-27 |
812 | PivotMesh: Generic 3D Mesh Generation Via Pivot Vertices Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a generic and scalable mesh generation framework PivotMesh, which makes an initial attempt to extend the native mesh generation to large-scale datasets. |
Haohan Weng; Yikai Wang; Tong Zhang; C. L. Philip Chen; Jun Zhu; | arxiv-cs.CV | 2024-05-27 |
813 | Multi-objective Representation for Numbers in Clinical Narratives Using CamemBERT-bio Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research aims to classify numerical values extracted from medical documents across seven distinct physiological categories, employing CamemBERT-bio. |
Boammani Aser Lompo; Thanh-Dung Le; | arxiv-cs.CL | 2024-05-27 |
814 | Are Self-Attentions Effective for Time Series Forecasting? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we shift the focus from evaluating the overall Transformer architecture to specifically examining the effectiveness of self-attention for time series forecasting. |
Dongbin Kim; Jinseong Park; Jaewook Lee; Hoki Kim; | arxiv-cs.LG | 2024-05-27 |
815 | Vision-and-Language Navigation Generative Pretrained Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our proposal, the Vision-and-Language Navigation Generative Pretrained Transformer (VLN-GPT), adopts a transformer decoder model (GPT2) to model trajectory sequence dependencies, bypassing the need for historical encoding modules. |
Wen Hanlin; | arxiv-cs.AI | 2024-05-27 |
816 | Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While previous approaches to 3D human motion generation have achieved notable success, they often rely on extensive training and are limited to specific tasks. To address these challenges, we introduce Motion-Agent, an efficient conversational framework designed for general human motion generation, editing, and understanding. |
QI WU et. al. | arxiv-cs.CV | 2024-05-27 |
817 | RLAIF-V: Aligning MLLMs Through Open-Source AI Feedback for Super GPT-4V Trustworthiness IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce RLAIF-V, a novel framework that aligns MLLMs in a fully open-source paradigm for super GPT-4V trustworthiness. |
TIANYU YU et. al. | arxiv-cs.CL | 2024-05-27 |
818 | InversionView: A General-Purpose Method for Reading Information from Neural Activations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The inner workings of neural networks can be better understood if we can fully decipher the information encoded in neural activations. In this paper, we argue that this information is embodied by the subset of inputs that give rise to similar activations. |
Xinting Huang; Madhur Panwar; Navin Goyal; Michael Hahn; | arxiv-cs.LG | 2024-05-27 |
819 | LoReTrack: Efficient and Accurate Low-Resolution Transformer Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Albeit faster, this considerably hurts tracking accuracy due to information loss in low-resolution tracking. In this paper, we aim to mitigate such information loss to boost the performance of low-resolution Transformer tracking via dual knowledge distillation from a frozen high-resolution (but not larger) Transformer tracker. |
Shaohua Dong; Yunhe Feng; Qing Yang; Yuewei Lin; Heng Fan; | arxiv-cs.CV | 2024-05-27 |
820 | Deployment of Large Language Models to Control Mobile Robots at The Edge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the possibility of intuitive human-robot interaction through the application of Natural Language Processing (NLP) and Large Language Models (LLMs) in mobile robotics. This work aims to explore the feasibility of using these technologies for edge-based deployment, where traditional cloud dependencies are eliminated. |
PASCAL SIKORSKI et. al. | arxiv-cs.RO | 2024-05-27 |
821 | Assessing LLMs Suitability for Knowledge Graph Completion Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Recent work has shown the capability of Large Language Models (LLMs) to solve tasks related to Knowledge Graphs, such as Knowledge Graph Completion, even in Zero- or Few-Shot … |
Vasile Ionut Remus Iga; Gheorghe Cosmin Silaghi; | arxiv-cs.CL | 2024-05-27 |
822 | Performance Evaluation of Reddit Comments Using Machine Learning and Natural Language Processing Methods in Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the efficacy of sentiment analysis models is hindered by the lack of expansive and fine-grained emotion datasets. To address this gap, our study leverages the GoEmotions dataset, comprising a diverse range of emotions, to evaluate sentiment analysis methods across a substantial corpus of 58,000 comments. |
Xiaoxia Zhang; Xiuyuan Qi; Zixin Teng; | arxiv-cs.CL | 2024-05-26 |
823 | Disentangling and Integrating Relational and Sensory Information in Transformer Architectures Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we distinguish between two types of information: sensory information about the properties of individual objects, and relational information about the relationships between objects. |
Awni Altabaa; John Lafferty; | arxiv-cs.LG | 2024-05-26 |
824 | M$^3$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents M$^3$GPT, an advanced Multimodal, Multitask framework for Motion comprehension and generation. |
MINGSHUANG LUO et. al. | arxiv-cs.CV | 2024-05-25 |
825 | Accelerating Transformers with Spectrum-Preserving Token Merging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Prior works have proposed algorithms based on Bipartite Soft Matching (BSM), which divides tokens into distinct sets and merges the top k similar tokens. |
HOAI-CHAU TRAN et. al. | arxiv-cs.LG | 2024-05-25 |
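For context on entry 825, the Bipartite Soft Matching baseline it improves on fits in a short sketch: split tokens alternately into two sets, link each token in one set to its most similar partner in the other, and merge the top-r pairs by averaging. This simplified reading (sequential averaging, cosine similarity) is an assumption, not the paper's spectrum-preserving method.

```python
# Hedged sketch of Bipartite Soft Matching (BSM) token merging.
import numpy as np

def bsm_merge(tokens: np.ndarray, r: int) -> np.ndarray:
    """tokens: (n, d) array; returns roughly (n - r, d) merged tokens."""
    a, b = tokens[0::2], tokens[1::2]                   # alternate split
    unit = lambda x: x / np.linalg.norm(x, axis=1, keepdims=True)
    sims = unit(a) @ unit(b).T                          # cosine similarities
    partner = sims.argmax(axis=1)                       # best match in B
    best = sims[np.arange(len(a)), partner]
    merged_idx = np.argsort(-best)[:r]                  # top-r A tokens
    b = b.copy()
    for i in merged_idx:                                # average into partner
        b[partner[i]] = (b[partner[i]] + a[i]) / 2.0
    keep = np.setdiff1d(np.arange(len(a)), merged_idx)  # unmerged A tokens
    return np.concatenate([a[keep], b], axis=0)
```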
826 | Activator: GLU Activation Function As The Core Component of A Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Experimental assessments show that the proposed modifications and reductions offer performance competitive with baseline architectures, supporting this work’s aim of establishing a more efficient yet capable alternative to the traditional attention mechanism as the core component of transformer architectures. |
Abdullah Nazhat Abdullah; Tarkan Aydin; | arxiv-cs.CV | 2024-05-24 |
827 | Incremental Comprehension of Garden-Path Sentences By Large Language Models: Semantic Interpretation, Syntactic Re-Analysis, and Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we investigate the processing of garden-path sentences and the fate of lingering misinterpretations using four large language models (LLMs): GPT-2, LLaMA-2, Flan-T5, and RoBERTa. |
ANDREW LI et. al. | arxiv-cs.CL | 2024-05-24 |
828 | Transformer-XL for Long Sequence Tasks in Robotic Learning from Demonstration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents an innovative application of Transformer-XL for long sequence tasks in robotic learning from demonstrations (LfD). |
Gao Tianci; | arxiv-cs.RO | 2024-05-24 |
829 | The Buffer Mechanism for Multi-Step Information Reasoning in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we constructed a symbolic dataset to investigate the mechanisms by which Transformer models employ a vertical thinking strategy, based on their inherent structure, and a horizontal thinking strategy, based on Chain of Thought, to achieve multi-step reasoning. |
ZHIWEI WANG et. al. | arxiv-cs.AI | 2024-05-24 |
830 | GPTZoo: A Large-scale Dataset of GPTs for The Research Community Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To support academic research on GPTs, we introduce GPTZoo, a large-scale dataset comprising 730,420 GPT instances. |
Xinyi Hou; Yanjie Zhao; Shenao Wang; Haoyu Wang; | arxiv-cs.SE | 2024-05-24 |
831 | GPT Is Not An Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores whether an LLM (specifically, GPT-3.5-Turbo) can assist with the task of developing a bias benchmark dataset from responses to an open-ended community survey. |
Virginia K. Felkner; Jennifer A. Thompson; Jonathan May; | arxiv-cs.CL | 2024-05-24 |
832 | A Comparative Analysis of Distributed Training Strategies for GPT-2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The rapid advancement in Large Language Models has been met with significant challenges in their training processes, primarily due to their considerable computational and memory demands. This research examines parallelization techniques developed to address these challenges, enabling the efficient and scalable training of Large Language Models. |
Ishan Patwardhan; Shubham Gandhi; Om Khare; Amit Joshi; Suraj Sawant; | arxiv-cs.DC | 2024-05-24 |
833 | SMART: Scalable Multi-agent Real-time Motion Generation Via Next-token Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Data-driven autonomous driving motion generation tasks are frequently impacted by the limitations of dataset size and the domain gap between datasets, which precludes their extensive application in real-world scenarios. To address this issue, we introduce SMART, a novel autonomous driving motion generation paradigm that models vectorized map and agent trajectory data into discrete sequence tokens. |
Wei Wu; Xiaoxin Feng; Ziyan Gao; Yuheng Kan; | arxiv-cs.RO | 2024-05-24 |
834 | Transformer-based Federated Learning for Multi-Label Remote Sensing Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the capability of state-of-the-art transformer architectures (which are MLP-Mixer, ConvMixer, PoolFormer) to address the challenges related to non-IID training data across various clients in the context of FL for multi-label classification (MLC) problems in remote sensing (RS). |
Barış Büyüktaş; Kenneth Weitzel; Sebastian Völkers; Felix Zailskas; Begüm Demir; | arxiv-cs.CV | 2024-05-24 |
835 | Comet: A Communication-efficient and Performant Approximation for Private Transformer Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel plug-in method Comet to effectively reduce the communication cost without compromising the inference performance. |
Xiangrui Xu; Qiao Zhang; Rui Ning; Chunsheng Xin; Hongyi Wu; | arxiv-cs.LG | 2024-05-24 |
836 | SG-Adapter: Enhancing Text-to-Image Generation with Scene Graph Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce the Scene Graph Adapter(SG-Adapter), leveraging the structured representation of scene graphs to rectify inaccuracies in the original text embeddings. |
GUIBAO SHEN et. al. | arxiv-cs.CV | 2024-05-24 |
837 | PoinTramba: A Hybrid Transformer-Mamba Framework for Point Cloud Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce PoinTramba, a pioneering hybrid framework that synergizes the analytical power of the Transformer with the remarkable computational efficiency of Mamba for enhanced point cloud analysis. |
ZICHENG WANG et. al. | arxiv-cs.CV | 2024-05-24 |
838 | An Evaluation of Estimative Uncertainty in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares estimative uncertainty in commonly used large language models (LLMs) like GPT-4 and ERNIE-4 to that of humans, and to each other. |
Zhisheng Tang; Ke Shen; Mayank Kejriwal; | arxiv-cs.CL | 2024-05-23 |
839 | Grokked Transformers Are Implicit Reasoners: A Mechanistic Journey to The Edge of Generalization IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We study whether transformers can learn to implicitly reason over parametric knowledge, a skill that even the most capable language models struggle with. |
Boshi Wang; Xiang Yue; Yu Su; Huan Sun; | arxiv-cs.CL | 2024-05-23 |
840 | CulturePark: Boosting Cross-cultural Understanding in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by cognitive theories on social communication, this paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection. |
CHENG LI et. al. | arxiv-cs.AI | 2024-05-23 |
841 | Understanding The Training and Generalization of Pretrained Transformer for Sequential Decision Making Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider the supervised pre-trained transformer for a class of sequential decision-making problems. |
HANZHAO WANG et. al. | arxiv-cs.LG | 2024-05-23 |
842 | JiuZhang3.0: Efficiently Improving Mathematical Reasoning By Training Small Data Synthesis Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To reduce the cost, based on open-source available texts, we propose an efficient way that trains a small LLM for math problem synthesis, to efficiently generate sufficient high-quality pre-training data. |
KUN ZHOU et. al. | arxiv-cs.CL | 2024-05-23 |
843 | CEEBERT: Cross-Domain Inference in Early Exit BERT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose an online learning algorithm named Cross-Domain Inference in Early Exit BERT (CeeBERT) that dynamically determines early exits of samples based on the level of confidence at each exit point. |
Divya Jyoti Bajpai; Manjesh Kumar Hanawal; | arxiv-cs.CL | 2024-05-23 |
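The confidence-based early exiting that CeeBERT (entry 843) tunes online can be sketched as follows for a single example; the per-exit thresholds are exactly what the paper learns, so the fixed values here are placeholders.

```python
# Hedged sketch of confidence-thresholded early exit (single example).
import torch.nn.functional as F

def early_exit_forward(layers, exit_heads, x, thresholds):
    """layers/exit_heads/thresholds: parallel lists; x: (seq_len, hidden)."""
    pred = conf = None
    for layer, head, tau in zip(layers, exit_heads, thresholds):
        x = layer(x)                                    # run one encoder layer
        probs = F.softmax(head(x).mean(dim=0), dim=-1)  # pool tokens, classify
        conf, pred = probs.max(dim=-1)
        if conf.item() >= tau:                          # confident enough: stop
            break
    return pred.item(), conf.item()
```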
844 | AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We introduce AutoCoder, the first Large Language Model to surpass GPT-4 Turbo (April 2024) and GPT-4o in pass@1 on the Human Eval benchmark test ($\mathbf{90.9\%}$ vs. … |
Bin Lei; Yuchen Li; Qiuwu Chen; | ArXiv | 2024-05-23 |
845 | ViHateT5: Enhancing Hate Speech Detection in Vietnamese With A Unified Text-to-Text Transformer Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this research, we introduce ViHateT5, a T5-based model pre-trained on our proposed large-scale domain-specific dataset named VOZ-HSD. |
Luan Thanh Nguyen; | arxiv-cs.CL | 2024-05-22 |
846 | PuTR: A Pure Transformer for Decoupled and Online Multi-Object Tracking Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we demonstrate that the trajectory graph is a directed acyclic graph, which can be represented by an object sequence arranged by frame and a binary adjacency matrix. |
Chongwei Liu; Haojie Li; Zhihui Wang; Rui Xu; | arxiv-cs.CV | 2024-05-22 |
847 | Generative AI and Large Language Models for Cyber Security: All Insights You Need Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper provides a comprehensive review of the future of cybersecurity through Generative AI and Large Language Models (LLMs). |
MOHAMED AMINE FERRAG et. al. | arxiv-cs.CR | 2024-05-21 |
848 | How Reliable AI Chatbots Are for Disease Prediction from Patient Complaints? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Artificial Intelligence (AI) chatbots leveraging Large Language Models (LLMs) are gaining traction in healthcare for their potential to automate patient interactions and aid clinical decision-making. |
Ayesha Siddika Nipu; K M Sajjadul Islam; Praveen Madiraju; | arxiv-cs.AI | 2024-05-21 |
849 | Quantifying Emergence in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a quantifiable solution for estimating emergence. |
Hang Chen; Xinyu Yang; Jiaying Zhu; Wenya Wang; | arxiv-cs.CL | 2024-05-21 |
850 | Transformer in Touch: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This review aims to comprehensively outline the application and development of Transformers in tactile technology. |
Jing Gao; Ning Cheng; Bin Fang; Wenjuan Han; | arxiv-cs.LG | 2024-05-21 |
851 | Automated Hardware Logic Obfuscation Framework Using GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Obfus-chat, a novel framework leveraging Generative Pre-trained Transformer (GPT) models to automate the obfuscation process. |
BANAFSHEH SABER LATIBARI et. al. | arxiv-cs.CR | 2024-05-20 |
852 | GPT-4 Jailbreaks Itself with Near-Perfect Success Using Self-Explanation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Iterative Refinement Induced Self-Jailbreak (IRIS), a novel approach that leverages the reflective capabilities of LLMs for jailbreaking with only black-box access. |
Govind Ramesh; Yao Dou; Wei Xu; | arxiv-cs.CR | 2024-05-20 |
853 | From Chatbots to Phishbots?: Phishing Scam Generation in Commercial Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The advanced capabilities of Large Language Models (LLMs) have made them invaluable across various applications, from conversational agents and content creation to data analysis, … |
PRIYANKA NANAYAKKARA et. al. | 2024 IEEE Symposium on Security and Privacy (SP) | 2024-05-19 |
854 | Zero-Shot Stance Detection Using Contextual Data Generation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this approach, we aim to fine-tune an existing model at test time. |
Ghazaleh Mahmoudi; Babak Behkamkia; Sauleh Eetemadi; | arxiv-cs.CL | 2024-05-19 |
855 | Enhancing User Experience in Large Language Models Through Human-centered Design: Integrating Theoretical Insights with An Experimental Study to Meet Diverse Software Learning Needs with A Single Document Knowledge Base Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The experimental results demonstrate the effect of different elements’ forms and organizational methods in the document, as well as GPT’s relevant configurations, on the interaction effectiveness between GPT and software learners. |
Yuchen Wang; Yin-Shan Lin; Ruixin Huang; Jinyin Wang; Sensen Liu; | arxiv-cs.HC | 2024-05-19 |
856 | DaVinci at SemEval-2024 Task 9: Few-shot Prompting GPT-3.5 for Unconventional Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we tackle both types of questions using few-shot prompting on GPT-3.5 and gain insights regarding the difference in the nature of the two types. |
Suyash Vardhan Mathur; Akshett Rai Jindal; Manish Shrivastava; | arxiv-cs.CL | 2024-05-19 |
857 | Few-Shot API Attack Detection: Overcoming Data Scarcity with GAN-Inspired Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel few-shot detection approach motivated by Natural Language Processing (NLP) and advanced Generative Adversarial Network (GAN)-inspired techniques. |
Udi Aharon; Revital Marbel; Ran Dubin; Amit Dvir; Chen Hajaj; | arxiv-cs.CR | 2024-05-18 |
858 | GPTs Window Shopping: An Analysis of The Landscape of Custom ChatGPT Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Customization comes in the form of prompt-tuning, analysis of reference resources, browsing, and external API interactions, alongside a promise of revenue sharing for created custom GPTs. In this work, we peer into the window of the GPT Store and measure its impact. |
Benjamin Zi Hao Zhao; Muhammad Ikram; Mohamed Ali Kaafar; | arxiv-cs.SI | 2024-05-17 |
859 | Benchmarking Large Language Models on CFLUE — A Chinese Financial Language Understanding Evaluation Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose CFLUE, the Chinese Financial Language Understanding Evaluation benchmark, designed to assess the capability of LLMs across various dimensions. |
Jie Zhu; Junhui Li; Yalong Wen; Lifan Guo; | arxiv-cs.CL | 2024-05-17 |
860 | Benchmarking Large Language Models on CFLUE – A Chinese Financial Language Understanding Evaluation Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In light of recent breakthroughs in large language models (LLMs) that have revolutionized natural language processing (NLP), there is an urgent need for new benchmarks to keep … |
Jie Zhu; Junhui Li; Yalong Wen; Lifan Guo; | Annual Meeting of the Association for Computational … | 2024-05-17 |
861 | HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Identifying optimal model configurations under specific hardware constraints is becoming essential but remains challenging due to the computational load of exhaustive training and evaluation on multiple devices. To address this, we introduce HW-GPT-Bench, a hardware-aware benchmark that utilizes surrogate predictions to approximate various hardware metrics across 13 devices for architectures in the GPT-2 family, with architectures containing up to 1.55B parameters. |
RHEA SANJAY SUKTHANKER et al. | arxiv-cs.LG | 2024-05-16 |
862 | Leveraging Large Language Models for Automated Web-Form-Test Generation: An Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To the best of our knowledge, no comparative study examining different LLMs has yet been reported for web-form-test generation. |
TAO LI et al. | arxiv-cs.SE | 2024-05-16 |
863 | GPT Store Mining and Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our findings aim to enhance understanding of the GPT ecosystem, providing valuable insights for future research, development, and policy-making in generative AI. |
Dongxun Su; Yanjie Zhao; Xinyi Hou; Shenao Wang; Haoyu Wang; | arxiv-cs.LG | 2024-05-16 |
864 | Comparing The Efficacy of GPT-4 and Chat-GPT in Mental Health Care: A Blind Assessment of Large Language Models for Psychological Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: This study aims to compare the performance of two large language models, GPT-4 and Chat-GPT, in responding to a set of 18 psychological prompts, to assess their potential applicability in mental health care settings. |
Birger Moell; | arxiv-cs.CL | 2024-05-15 |
865 | ALPINE: Unveiling The Planning Capability of Autoregressive Learning in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present the findings of our Project ALPINE, which stands for “Autoregressive Learning for Planning In NEtworks”. |
SIWEI WANG et al. | arxiv-cs.LG | 2024-05-15 |
866 | Optimization Techniques for Sentiment Analysis Based on LLM (GPT-3) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to explore sentiment analysis optimization techniques based on large pre-trained language models such as GPT-3 to improve model performance and effect and further promote the development of natural language processing (NLP). |
Tong Zhan; Chenxi Shi; Yadong Shi; Huixiang Li; Yiyu Lin; | arxiv-cs.CL | 2024-05-15 |
867 | GPT-3.5 for Grammatical Error Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the application of GPT-3.5 for Grammatical Error Correction (GEC) in multiple languages in several settings: zero-shot GEC, fine-tuning for GEC, and using GPT-3.5 to re-rank correction hypotheses generated by other GEC models. |
Anisia Katinskaia; Roman Yangarber; | arxiv-cs.CL | 2024-05-14 |
868 | Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a theoretical framework that sheds light on the memorization process and performance dynamics of transformer-based language models. |
Xueyan Niu; Bo Bai; Lei Deng; Wei Han; | arxiv-cs.LG | 2024-05-14 |
869 | Towards Robust Audio Deepfake Detection: An Evolving Benchmark for Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Continual learning, which acts as an effective tool for detecting newly emerged deepfake audio while maintaining performance on older types, lacks a well-constructed and user-friendly evaluation framework. To address this gap, we introduce EVDA, a benchmark for evaluating continual learning methods in deepfake audio detection. |
Xiaohui Zhang; Jiangyan Yi; Jianhua Tao; | arxiv-cs.SD | 2024-05-14 |
870 | Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work describes a concurrent programming framework for quantitatively analyzing the efficiency challenges in serving multiple long-context requests under a limited GPU high-bandwidth memory (HBM) regime. |
Yao Fu; | arxiv-cs.LG | 2024-05-14 |
871 | Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their capabilities in turning visual figures into executable code have not been evaluated thoroughly. To address this, we introduce Plot2Code, a comprehensive visual coding benchmark designed for a fair and in-depth assessment of MLLMs. |
CHENGYUE WU et al. | arxiv-cs.CL | 2024-05-13 |
872 | PRECYSE: Predicting Cybersickness Using Transformer for Multimodal Time-Series Sensor Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Cybersickness, a factor that hinders user immersion in VR, has been the subject of ongoing attempts at AI-based prediction. Previous studies have used CNN and LSTM for prediction … |
Dayoung Jeong; Kyungsik Han; | Proceedings of the ACM on Interactive, Mobile, Wearable and … | 2024-05-13 |
873 | COLA: Cross-city Mobility Transformer for Human Trajectory Simulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we are motivated to explore the intriguing problem of mobility transfer across cities, grasping the universal patterns of human trajectories to augment the powerful Transformer with external mobility data. |
Yu Wang; Tongya Zheng; Yuxuan Liang; Shunyu Liu; Mingli Song; | www | 2024-05-13 |
874 | Decision Mamba Architectures Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce two novel methods, Decision Mamba (DM) and Hierarchical Decision Mamba (HDM), aimed at enhancing the performance of the Transformer models. |
André Correia; Luís A. Alexandre; | arxiv-cs.LG | 2024-05-13 |
875 | Coding Historical Causes of Death Data with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates the feasibility of using pre-trained generative Large Language Models (LLMs) to automate the assignment of ICD-10 codes to historical causes of death. |
BJØRN PEDERSEN et al. | arxiv-cs.LG | 2024-05-13 |
876 | Can GNN Be Good Adapter for LLMs? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, they ignore the zero-shot inference capabilities of LLMs. Therefore, we propose GraphAdapter, which uses a graph neural network (GNN) as an efficient adapter in collaboration with LLMs to tackle TAGs. |
XUANWEN HUANG et al. | www | 2024-05-13 |
877 | Open-vocabulary Auditory Neural Decoding Using FMRI-prompted LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel method, the Brain Prompt GPT (BP-GPT). |
Xiaoyu Chen; Changde Du; Che Liu; Yizhe Wang; Huiguang He; | arxiv-cs.HC | 2024-05-13 |
878 | L(u)PIN: LLM-based Political Ideology Nowcasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a method to analyze ideological positions of individual parliamentary representatives by leveraging the latent knowledge of LLMs. |
Ken Kato; Annabelle Purnomo; Christopher Cochrane; Raeid Saqur; | arxiv-cs.CL | 2024-05-12 |
879 | Limited Ability of LLMs to Simulate Human Psychological Behaviours: A Psychometric Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we prompt OpenAI’s flagship models, GPT-3.5 and GPT-4, to assume different personas and respond to a range of standardized measures of personality constructs. |
Nikolay B Petrov; Gregory Serapio-García; Jason Rentfrow; | arxiv-cs.CL | 2024-05-12 |
880 | Can Language Models Explain Their Own Classification Behavior? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates whether LLMs can give faithful high-level explanations of their own internal processes. To explore this, we introduce a dataset, ArticulateRules, of few-shot text-based classification tasks generated by simple rules. |
Dane Sherburn; Bilal Chughtai; Owain Evans; | arxiv-cs.LG | 2024-05-12 |
881 | Integrating Expertise in LLMs: Crafting A Customized Nutrition Assistant with Refined Template Instructions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) have the potential to contribute to the fields of nutrition and dietetics in generating food product explanations that facilitate informed food … |
Annalisa Szymanski; Brianna L Wimer; Oghenemaro Anuyah; H. Eicher-Miller; Ronald A Metoyer; | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
882 | ClassMeta: Designing Interactive Virtual Classmate to Promote VR Classroom Participation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Peer influence plays a crucial role in promoting classroom participation, where behaviors from active students can contribute to a collective classroom learning experience. … |
ZIYI LIU et al. | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
883 | Retrieval Enhanced Zero-Shot Video Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to take advantage of existing pre-trained large-scale vision and language models to directly generate captions with test time adaptation. |
YUNCHUAN MA et al. | arxiv-cs.CV | 2024-05-11 |
884 | TacoERE: Cluster-aware Compression for Event Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing work mainly focuses on directly modeling the entire document, which cannot effectively handle long-range dependencies and information redundancy. To address these issues, we propose a cluster-aware compression method for improving event relation extraction (TacoERE), which explores a compression-then-extraction paradigm. |
YONG GUAN et al. | arxiv-cs.CL | 2024-05-10 |
885 | Residual-based Attention Physics-informed Neural Networks for Spatio-Temporal Ageing Assessment of Transformers Operated in Renewable Power Plants Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article introduces a spatio-temporal model for transformer winding temperature and ageing estimation, which leverages physics-based partial differential equations (PDEs) with data-driven Neural Networks (NN) in a Physics Informed Neural Networks (PINNs) configuration to improve prediction accuracy and acquire spatio-temporal resolution. |
IBAI RAMIREZ et al. | arxiv-cs.LG | 2024-05-10 |
886 | Multimodal LLMs Struggle with Basic Visual Network Analysis: A VNA Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that while GPT-4 consistently outperforms LLaVa, both models struggle with every visual network analysis task we propose. |
Evan M. Williams; Kathleen M. Carley; | arxiv-cs.CV | 2024-05-10 |
887 | A Lightweight Sparse Focus Transformer for Remote Sensing Image Change Captioning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing transformer-based RSICC methods face challenges, e.g., high parameter counts and high computational complexity caused by the self-attention operation in the transformer encoder component. To alleviate these issues, this paper proposes a Sparse Focus Transformer (SFT) for the RSICC task. |
Dongwei Sun; Yajie Bao; Junmin Liu; Xiangyong Cao; | arxiv-cs.CV | 2024-05-10 |
888 | ChatGPTest: Opportunities and Cautionary Tales of Utilizing AI for Questionnaire Pretesting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article explores the use of GPT models as a tool for pretesting survey questionnaires, particularly in the early stages of survey design. |
Francisco Olivos; Minhui Liu; | arxiv-cs.CY | 2024-05-10 |
889 | Reddit-Impacts: A Named Entity Recognition Dataset for Analyzing Clinical and Social Effects of Substance Use Derived from Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Reddit-Impacts, a challenging Named Entity Recognition (NER) dataset curated from subreddits dedicated to discussions on prescription and illicit opioids, as well as medications for opioid use disorder. |
YAO GE et al. | arxiv-cs.CL | 2024-05-09 |
890 | People Cannot Distinguish GPT-4 from A Human in A Turing Test Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We evaluated 3 systems (ELIZA, GPT-3.5 and GPT-4) in a randomized, controlled, and preregistered Turing test. Human participants had a 5-minute conversation with either a human or … |
Cameron R. Jones; Benjamin K. Bergen; | ArXiv | 2024-05-09 |
891 | Spectral Superresolution Using Transformer with Convolutional Spectral Self-Attention Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Hyperspectral images (HSI) find extensive application across numerous domains of study. Spectral superresolution (SSR) refers to reconstructing HSIs from readily available RGB … |
Xiaomei Liao; Lirong He; Jiayou Mao; Meng Xu; | Remote. Sens. | 2024-05-09 |
892 | Ditto: Quantization-aware Secure Inference of Transformers Upon MPC Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the integration of quantization, widely used in plaintext inference, into the MPC domain remains unclear. To bridge this gap, we propose the framework named Ditto to enable more efficient quantization-aware secure Transformer inference. |
HAOQI WU et al. | arxiv-cs.CR | 2024-05-08 |
893 | Optimizing Software Vulnerability Detection Using RoBERTa and Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View |
Cho Xuan Do; Nguyen Trong Luu; Phuong Thi Lan Nguyen; | Autom. Softw. Eng. | 2024-05-08 |
894 | Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we leverage a decision transformer architecture for topic recommendation in counseling conversations between patients and mental health professionals. |
Aylin Gunal; Baihan Lin; Djallel Bouneffouf; | arxiv-cs.CL | 2024-05-08 |
895 | Open Source Language Models Can Provide Feedback: Evaluating LLMs’ Ability to Help Students Using GPT-4-As-A-Judge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by recent work that has utilised very powerful LLMs, such as GPT-4, to evaluate the outputs produced by less powerful models, we conduct an automated analysis of the quality of the feedback produced by several open source models using a dataset from an introductory programming course. |
CHARLES KOUTCHEME et al. | arxiv-cs.CL | 2024-05-08 |
896 | Evaluating Text Summaries Generated By Large Language Models Using OpenAI’s GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research examines the effectiveness of OpenAI’s GPT models as independent evaluators of text summaries generated by six transformer-based models from Hugging Face: DistilBART, BERT, ProphetNet, T5, BART, and PEGASUS. |
Hassan Shakil; Atqiya Munawara Mahi; Phuoc Nguyen; Zeydy Ortiz; Mamoun T. Mardini; | arxiv-cs.CL | 2024-05-07 |
897 | Utilizing GPT to Enhance Text Summarization: A Strategy to Minimize Hallucinations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we use the DistilBERT model to generate extractive summaries and the T5 model to generate abstractive summaries. |
Hassan Shakil; Zeydy Ortiz; Grant C. Forbes; | arxiv-cs.CL | 2024-05-07 |
898 | A Transformer with Stack Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite underpinning remarkably capable large language models, transformers are unable to model many context-free language tasks. In an attempt to address this limitation in the modeling power of transformer-based language models, we propose augmenting them with a differentiable, stack-based attention mechanism. |
Jiaoda Li; Jennifer C. White; Mrinmaya Sachan; Ryan Cotterell; | arxiv-cs.CL | 2024-05-07 |
899 | The Silicon Ceiling: Auditing GPT’s Race and Gender Biases in Hiring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) are increasingly being introduced in workplace settings, with the goals of improving efficiency and fairness. |
Lena Armstrong; Abbey Liu; Stephen MacNeil; Danaë Metaxa; | arxiv-cs.CY | 2024-05-07 |
900 | GPT-Enabled Cybersecurity Training: A Tailored Approach for Effective Awareness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the limitations of traditional Cybersecurity Awareness and Training (CSAT) programs and proposes an innovative solution using Generative Pre-Trained Transformers (GPT) to address these shortcomings. |
Nabil Al-Dhamari; Nathan Clarke; | arxiv-cs.CR | 2024-05-07 |
901 | How Does GPT-2 Predict Acronyms? Extracting and Understanding A Circuit Via Mechanistic Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on understanding how GPT-2 Small performs the task of predicting three-letter acronyms. |
Jorge García-Carrasco; Alejandro Maté; Juan Trujillo; | arxiv-cs.LG | 2024-05-07 |
902 | Structured Click Control in Transformer-based Interactive Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To improve the robustness of the response, we propose a structured click intent model based on graph neural networks, which adaptively obtains graph nodes via the global similarity of user-clicked Transformer tokens. |
Long Xu; Yongquan Chen; Rui Huang; Feng Wu; Shiwu Lai; | arxiv-cs.CV | 2024-05-07 |
903 | Addressing Data Scarcity in The Medical Domain: A GPT-Based Approach for Synthetic Data Generation and Feature Extraction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This research confronts the persistent challenge of data scarcity in medical machine learning by introducing a pioneering methodology that harnesses the capabilities of Generative … |
F. Sufi; | Inf. | 2024-05-06 |
904 | Towards A Human-in-the-Loop LLM Approach to Collaborative Discourse Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, LLMs have not yet been used to characterize synergistic learning in students’ collaborative discourse. In this exploratory work, we take a first step towards adopting a human-in-the-loop prompt engineering approach with GPT-4-Turbo to summarize and categorize students’ synergistic learning during collaborative discourse. |
Clayton Cohn; Caitlin Snyder; Justin Montenegro; Gautam Biswas; | arxiv-cs.CL | 2024-05-06 |
905 | Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite their widespread occurrence and potential impacts, our understanding of influence campaigns is limited by manual analysis of messages and subjective interpretation of their observable behavior. In this paper, we explore whether these limitations can be mitigated with large language models (LLMs), using GPT-3.5 as a case-study for coordinated campaign annotation. |
Keith Burghardt; Kai Chen; Kristina Lerman; | arxiv-cs.CL | 2024-05-06 |
906 | Detecting Anti-Semitic Hate Speech Using Transformer-based Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we developed a new data labeling technique and established a proof of concept targeting anti-Semitic hate speech, utilizing a variety of transformer models such as BERT (arXiv:1810.04805), DistilBERT (arXiv:1910.01108), RoBERTa (arXiv:1907.11692), and LLaMA-2 (arXiv:2307.09288), complemented by the LoRA fine-tuning approach (arXiv:2106.09685). |
Dengyi Liu; Minghao Wang; Andrew G. Catlin; | arxiv-cs.CL | 2024-05-06 |
907 | Anchored Answers: Unravelling Positional Bias in GPT-2’s Multiple-Choice Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This anchored bias challenges the integrity of GPT-2’s decision-making process, as it skews performance based on the position rather than the content of the choices in MCQs. In this study, we utilise the mechanistic interpretability approach to identify the internal modules within GPT-2 models responsible for this bias. |
Ruizhe Li; Yanjun Gao; | arxiv-cs.CL | 2024-05-06 |
908 | Traffic Performance GPT (TP-GPT): Real-Time Data Informed Intelligent ChatBot for Transportation Surveillance and Management Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, real-time traffic data access is typically limited due to privacy concerns. To bridge this gap, the integration of Large Language Models (LLMs) into the domain of traffic management presents a transformative approach to addressing the complexities and challenges inherent in modern transportation systems. |
Bingzhang Wang; Muhammad Monjurul Karim; Chenxi Liu; Yinhai Wang; | arxiv-cs.MA | 2024-05-05 |
909 | Can Large Language Models Make The Grade? An Empirical Study Evaluating LLMs Ability to Mark Short Answer Questions in K-12 Education Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper reports on a series of experiments with a novel dataset evaluating how well Large Language Models (LLMs) can mark (i.e., grade) open-text responses to short-answer questions. Specifically, we explore how well different combinations of GPT version and prompt engineering strategies performed at marking real student answers to short-answer questions across different domain areas (Science and History) and grade levels (spanning ages 5-16), using a new, never-used-before dataset from Carousel, a quizzing platform. |
Owen Henkel; Adam Boxer; Libby Hills; Bill Roberts; | arxiv-cs.CL | 2024-05-05 |
910 | Unraveling The Dominance of Large Language Models Over Transformer Models for Bangla Natural Language Inference: A Comprehensive Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study addresses the underexplored area of evaluating LLMs in low-resourced languages such as Bengali. |
Fatema Tuj Johora Faria; Mukaffi Bin Moin; Asif Iftekher Fahim; Pronay Debnath; Faisal Muhammad Shah; | arxiv-cs.CL | 2024-05-05 |
911 | A Combination of BERT and Transformer for Vietnamese Spelling Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, to our knowledge, there is no implementation in Vietnamese yet. Therefore, in this study, a combination of the Transformer architecture (the state of the art for encoder-decoder models) and BERT was proposed to deal with Vietnamese spelling correction. |
Hieu Ngo Trung; Duong Tran Ham; Tin Huynh; Kiem Hoang; | arxiv-cs.CL | 2024-05-04 |
912 | U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on self-attention with downsampled tokens, we propose a series of U-shaped DiTs (U-DiTs) in the paper and conduct extensive experiments to demonstrate the extraordinary performance of U-DiT models. |
YUCHUAN TIAN et al. | arxiv-cs.CV | 2024-05-04 |
913 | Exploring Combinatorial Problem Solving with Large Language Models: A Case Study on The Travelling Salesman Problem Using GPT-3.5 Turbo Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we investigate the potential of LLMs to solve the Travelling Salesman Problem (TSP). |
Mahmoud Masoud; Ahmed Abdelhay; Mohammed Elhenawy; | arxiv-cs.CL | 2024-05-03 |
914 | Analyzing Narrative Processing in Large Language Models (LLMs): Using GPT4 to Test BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the present study, we have taken the first steps toward using LLMs as a model to understand fundamental mechanisms of language processing in neural networks, in order to make predictions and generate hypotheses on how the human brain does language processing. |
PATRICK KRAUSS et al. | arxiv-cs.CL | 2024-05-03 |
915 | Structural Pruning of Pre-trained Language Models Via Neural Architecture Search Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Unlike traditional pruning methods with fixed thresholds, we propose to adopt a multi-objective approach that identifies the Pareto optimal set of sub-networks, allowing for a more flexible and automated compression process. |
Aaron Klein; Jacek Golebiowski; Xingchen Ma; Valerio Perrone; Cedric Archambeau; | arxiv-cs.LG | 2024-05-03 |
916 | REASONS: A Benchmark for REtrieval and Automated CitationS Of ScieNtific Sentences Using Public and Proprietary LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we investigate whether large language models (LLMs) are capable of generating references based on two forms of sentence queries: (a) Direct Queries, LLMs are asked to provide author names of the given research article, and (b) Indirect Queries, LLMs are asked to provide the title of a mentioned article when given a sentence from a different article. |
DEEPA TILWANI et al. | arxiv-cs.CL | 2024-05-03 |
917 | GroupedMixer: An Entropy Model with Group-wise Token-Mixers for Learned Image Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel transformer-based entropy model called GroupedMixer, which enjoys both faster coding speed and better compression performance than previous transformer-based methods. |
DAXIN LI et al. | arxiv-cs.CV | 2024-05-02 |
918 | The Effectiveness of LLMs As Annotators: A Comparative Overview and Empirical Analysis of Direct Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper provides a comparative overview of twelve studies investigating the potential of LLMs in labelling data. |
Maja Pavlovic; Massimo Poesio; | arxiv-cs.CL | 2024-05-02 |
919 | UQA: Corpus for Urdu Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces UQA, a novel dataset for question answering and text comprehension in Urdu, a low-resource language with over 70 million native speakers. |
Samee Arif; Sualeha Farid; Awais Athar; Agha Ali Raza; | arxiv-cs.CL | 2024-05-02 |
920 | Investigating Wit, Creativity, and Detectability of Large Language Models in Domain-Specific Writing Style Adaptation of Reddit’s Showerthoughts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recent Large Language Models (LLMs) have shown the ability to generate content that is difficult or impossible to distinguish from human writing. |
TOLGA BUZ et al. | arxiv-cs.CL | 2024-05-02 |
921 | Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, they do not possess the ability to evaluate based on custom evaluation criteria, focusing instead on general attributes like helpfulness and harmlessness. To address these issues, we introduce Prometheus 2, a more powerful evaluator LM than its predecessor that closely mirrors human and GPT-4 judgements. |
SEUNGONE KIM et al. | arxiv-cs.CL | 2024-05-02 |
922 | Investigating Wit, Creativity, and Detectability of Large Language Models in Domain-Specific Writing Style Adaptation of Reddit’s Showerthoughts Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent Large Language Models (LLMs) have shown the ability to generate content that is difficult or impossible to distinguish from human writing. We investigate the ability of … |
TOLGA BUZ et al. | STARSEM | 2024-05-02 |
923 | Memory-Augmented Autoencoder Based Continuous Authentication on Smartphones With Conditional Transformer GANs Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Over the last years, sensor-based continuous authentication on mobile devices has achieved great success on personal information protection. These proposed mechanisms, however, … |
YANTAO LI et al. | IEEE Transactions on Mobile Computing | 2024-05-01 |
924 | Vision Transformer: To Discover The Four Secrets of Image Patches Related Papers Related Patents Related Grants Related Venues Related Experts View |
TAO ZHOU et al. | Inf. Fusion | 2024-05-01 |
925 | A Named Entity Recognition and Topic Modeling-based Solution for Locating and Better Assessment of Natural Disasters in Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fully explore the potential of social media content in disaster informatics, access to relevant content and the correct geo-location information is critical. In this paper, we propose a three-step solution for tackling these challenges. |
Ayaz Mehmood; Muhammad Tayyab Zamir; Muhammad Asif Ayub; Nasir Ahmad; Kashif Ahmad; | arxiv-cs.CL | 2024-05-01 |
926 | Chat-GPT; Validating Technology Acceptance Model (TAM) in Education Sector Via Ubiquitous Learning Mechanism IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
N. SAIF et al. | Comput. Hum. Behav. | 2024-05-01 |
927 | Semantic Perceptive Infrared and Visible Image Fusion Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
XIN YANG et al. | Pattern Recognit. | 2024-05-01 |
928 | FedViT: Federated Continual Learning of Vision Transformer at Edge Related Papers Related Patents Related Grants Related Venues Related Experts View |
XIAOJIANG ZUO et al. | Future Gener. Comput. Syst. | 2024-05-01 |
929 | Collaborative Compensative Transformer Network for Salient Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jun Chen; Heye Zhang; Mingming Gong; Zhifan Gao; | Pattern Recognit. | 2024-05-01 |
930 | How Can I Get It Right? Using GPT to Rephrase Incorrect Trainee Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the recent advancements of large language models (LLMs), our study employed the GPT-4 model to build an explanatory feedback system. |
JIONGHAO LIN et al. | arxiv-cs.CL | 2024-05-01 |
931 | Joint Pixel and Frequency Feature Learning and Fusion Via Channel-Wise Transformer for High-Efficiency Learned In-Loop Filter in VVC Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Block-based video codecs such as Versatile Video Coding (VVC)/H.266, High Efficiency Video Coding (HEVC)/H.265, Advanced Video Coding (AVC)/H.264, etc., inherently introduce … |
B. Kathariya; Zhu Li; G. V. D. Auwera; | IEEE Transactions on Circuits and Systems for Video … | 2024-05-01 |
932 | Energy-informed Graph Transformer Model for Solid Mechanical Analyses Related Papers Related Patents Related Grants Related Venues Related Experts View |
Bo Feng; Xiaoping Zhou; | Commun. Nonlinear Sci. Numer. Simul. | 2024-05-01 |
933 | Transformer Dense Center Network for Liver Tumor Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
JINLIN MA et al. | Biomed. Signal Process. Control. | 2024-05-01 |
934 | Harmonic LLMs Are Trustworthy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an intuitive method to test the robustness (stability and explainability) of any black-box LLM in real time via its local deviation from harmoniticity, denoted as γ. |
Nicholas S. Kersting; Mohammad Rahman; Suchismitha Vedala; Yang Wang; | arxiv-cs.LG | 2024-04-30 |
935 | Do Large Language Models Understand Conversational Implicature — A Case Study with A Chinese Sitcom Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce SwordsmanImp, the first Chinese multi-turn-dialogue-based dataset aimed at conversational implicature, sourced from dialogues in the Chinese sitcom My Own Swordsman. |
Shisen Yue; Siyuan Song; Xinyuan Cheng; Hai Hu; | arxiv-cs.CL | 2024-04-30 |
936 | Do Large Language Models Understand Conversational Implicature – A Case Study with A Chinese Sitcom Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Understanding the non-literal meaning of an utterance is critical for large language models (LLMs) to become human-like social communicators. In this work, we introduce … |
Shisen Yue; Siyuan Song; Xinyuan Cheng; Hai Hu; | ArXiv | 2024-04-30 |
937 | How Can I Improve? Using GPT to Highlight The Desired and Undesired Parts of Open-ended Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our aim is to equip tutors with actionable, explanatory feedback during online training lessons. |
JIONGHAO LIN et al. | arxiv-cs.CL | 2024-04-30 |
938 | RSCaMa: Remote Sensing Image Change Captioning with State Space Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite previous methods progressing in the spatial change perception, there are still weaknesses in joint spatial-temporal modeling. To address this, in this paper, we propose a novel RSCaMa model, which achieves efficient joint spatial-temporal modeling through multiple CaMa layers, enabling iterative refinement of bi-temporal features. |
CHENYANG LIU et al. | arxiv-cs.CV | 2024-04-29 |
939 | Ethical Reasoning and Moral Value Alignment of LLMs Depend on The Language We Prompt Them in Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, moral values are not universal, but rather influenced by language and culture. This paper explores how three prominent LLMs — GPT-4, ChatGPT, and Llama2-70B-Chat — perform ethical reasoning in different languages and whether their moral judgement depends on the language in which they are prompted. |
Utkarsh Agarwal; Kumar Tanmay; Aditi Khandelwal; Monojit Choudhury; | arxiv-cs.CL | 2024-04-29 |
940 | GPT-4 Passes Most of The 297 Written Polish Board Certification Examinations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Methods: We developed a software program to download and process PES exams and tested the performance of GPT models using the OpenAI Application Programming Interface (API). |
Jakub Pokrywka; Jeremi Kaczmarek; Edward Gorzelańczyk; | arxiv-cs.CL | 2024-04-29 |
941 | Time Machine GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new approach: a series of point-in-time LLMs called Time Machine GPT (TiMaGPT), specifically designed to be nonprognosticative. |
Felix Drinkall; Eghbal Rahimikia; Janet B. Pierrehumbert; Stefan Zohren; | arxiv-cs.CL | 2024-04-29 |
942 | Can GPT-4 Do L2 Analytic Assessment? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we perform a series of experiments using GPT-4 in a zero-shot fashion on a publicly available dataset annotated with holistic scores based on the Common European Framework of Reference and aim to extract detailed information about their underlying analytic components. |
Stefano Bannò; Hari Krishna Vydana; Kate M. Knill; Mark J. F. Gales; | arxiv-cs.CL | 2024-04-29 |
943 | PatentGPT: A Large Language Model for Intellectual Property Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the application of large language models in the Intellectual Property (IP) domain is challenging due to the strong need for specialized knowledge, privacy protection, and processing of extremely long text in this field. In this technical report, we present for the first time a low-cost, standardized procedure for training IP-oriented LLMs, meeting the unique requirements of the IP domain. |
ZILONG BAI et. al. | arxiv-cs.CL | 2024-04-28 |
944 | Can Perplexity Predict Fine-Tuning Performance? An Investigation of Tokenization Effects on Sequential Language Models for Nepali Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, studies of how subwording affects the understanding capacity of language models have been few and limited to a handful of languages. To reduce this gap, we used 6 different tokenization schemes to pretrain relatively small language models in Nepali and used the learned representations to finetune on several downstream tasks. |
Nishant Luitel; Nirajan Bekoju; Anand Kumar Sah; Subarna Shakya; | arxiv-cs.CL | 2024-04-28 |
945 | CLFT: Camera-LiDAR Fusion Transformer for Semantic Segmentation in Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, the vision transformer was the breakthrough that successfully brought the multi-head attention mechanism to computer vision applications. Therefore, we propose a vision-transformer-based network to carry out camera-LiDAR fusion for semantic segmentation applied to autonomous driving. |
Junyi Gu; Mauro Bellone; Tomáš Pivoňka; Raivo Sell; | arxiv-cs.CV | 2024-04-27 |
946 | GPT for Games: A Scoping Review (2020-2023) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a scoping review of 55 articles to explore GPT’s potential for games, offering researchers a comprehensive understanding of the current applications and identifying both emerging trends and unexplored areas. |
Daijin Yang; Erica Kleinman; Casper Harteveld; | arxiv-cs.HC | 2024-04-27 |
947 | Detection of Conspiracy Theories Beyond Keyword Bias in German-Language Telegram Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our work addresses the task of detecting conspiracy theories in German Telegram messages. |
Milena Pustet; Elisabeth Steffen; Helena Mihaljević; | arxiv-cs.CL | 2024-04-27 |
948 | MRScore: Evaluating Radiology Report Generation with LLM-based Reward System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces MRScore, an automatic evaluation metric tailored for radiology report generation by leveraging Large Language Models (LLMs). |
YUNYI LIU et al. | arxiv-cs.CL | 2024-04-27 |
949 | CRISPR-GPT: An LLM Agent for Automated Design of Gene-Editing Experiments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce CRISPR-GPT, an LLM agent augmented with domain knowledge and external tools to automate and enhance the design process of CRISPR-based gene-editing experiments. |
KAIXUAN HUANG et al. | arxiv-cs.AI | 2024-04-27 |
950 | ChatGPT Is Here to Help, Not to Replace Anybody — An Evaluation of Students’ Opinions On Integrating ChatGPT In CS Courses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, 52 first-year CS students were surveyed in order to assess their views on technologies with code-generation capabilities, both from academic and professional perspectives. |
Bruno Pereira Cipriano; Pedro Alves; | arxiv-cs.ET | 2024-04-26 |
951 | Enhancing Legal Compliance and Regulation Analysis with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With Industry 4.0 revolutionizing the food industry and with the General Data Protection Regulation (GDPR) reshaping privacy policies and data processing agreements, there is a growing gap between regulatory analysis and recent technological advancements. This study aims to bridge this gap by leveraging LLMs, namely BERT and GPT models, to accurately classify legal provisions and automate compliance checks. |
Shabnam Hassani; | arxiv-cs.SE | 2024-04-26 |
952 | UMass-BioNLP at MEDIQA-M3G 2024: DermPrompt — A Systematic Exploration of Prompt Engineering with GPT-4V for Dermatological Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a novel approach to diagnosing clinical dermatology cases by integrating large multimodal models, specifically leveraging the capabilities of GPT-4V under a retriever and a re-ranker framework. |
PARTH VASHISHT et al. | arxiv-cs.AI | 2024-04-26 |
953 | ChatGPT Is Here to Help, Not to Replace Anybody – An Evaluation of Students’ Opinions On Integrating ChatGPT In CS Courses Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) like GPT and Bard are capable of producing code based on textual descriptions, with remarkable efficacy. Such technology will have profound … |
Bruno Pereira Cipriano; P. Alves; | ArXiv | 2024-04-26 |
954 | Prompting Towards Alleviating Code-Switched Data Scarcity in Under-Resourced Languages with GPT As A Pivot Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: There is therefore a notable opportunity to refine prompting guidelines to yield sentences suitable for the fine-tuning of language models. We propose a framework for augmenting the diversity of synthetically generated code-switched data using GPT and propose leveraging this technology to mitigate data scarcity in low-resourced languages, underscoring the essential role of native speakers in this process. |
Michelle Terblanche; Kayode Olaleye; Vukosi Marivate; | arxiv-cs.CL | 2024-04-26 |
955 | TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present TinyChart, an efficient MLLM for chart understanding with only 3B parameters. |
LIANG ZHANG et al. | arxiv-cs.CV | 2024-04-25 |
956 | Player-Driven Emergence in LLM-Driven Game Narrative Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore how interaction with large language models (LLMs) can give rise to emergent behaviors, empowering players to participate in the evolution of game narratives. |
XIANGYU PENG et al. | arxiv-cs.CL | 2024-04-25 |
957 | Influence of Solution Efficiency and Valence of Instruction on Additive and Subtractive Solution Strategies in Humans and GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative artificial intelligences, especially large language models (LLMs), are increasingly being used, necessitating transparency about their capabilities. While prior studies … |
Lydia Uhler; Verena Jordan; Jürgen Buder; Markus Huff; Frank Papenmeier; | arxiv-cs.CL | 2024-04-25 |
958 | Exploring Internal Numeracy in Language Models: A Case Study on ALBERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It has been found that Transformer-based language models have the ability to perform basic quantitative reasoning. In this paper, we propose a method for studying how these models internally represent numerical data, and use our proposal to analyze the ALBERT family of language models. |
Ulme Wennberg; Gustav Eje Henter; | arxiv-cs.CL | 2024-04-25 |
959 | Promoting CNNs with Cross-Architecture Knowledge Distillation for Efficient Monocular Depth Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To effectively distill valuable information from the transformer teacher and bridge the gap between convolution and transformer features, we introduce a method to acclimate the teacher with a ghost decoder. |
Zhimeng Zheng; Tao Huang; Gongsheng Li; Zuyi Wang; | arxiv-cs.CV | 2024-04-25 |
960 | IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We evaluate a wide range of proprietary and open-source LLMs including GPT-3.5, GPT-4, PaLM-2, mT5, Gemma, BLOOM and LLaMA on IndicGenBench in a variety of settings. |
Harman Singh; Nitish Gupta; Shikhar Bharadwaj; Dinesh Tewari; Partha Talukdar; | arxiv-cs.CL | 2024-04-25 |
961 | Towards Efficient Patient Recruitment for Clinical Trials: Application of A Prompt-Based Learning Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we aimed to evaluate the performance of a prompt-based large language model for the cohort selection task from unstructured medical notes collected in the EHR. |
Mojdeh Rahmanian; Seyed Mostafa Fakhrahmad; Seyedeh Zahra Mousavi; | arxiv-cs.CL | 2024-04-24 |
962 | An Automated Learning Model for Twitter Sentiment Analysis Using Ranger AdaBelief Optimizer Based Bidirectional Long Short Term Memory Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Sentiment analysis is an automated approach used to analyse textual data and describe public opinion. Sentiment analysis has a major role in creating … |
Sasirekha Natarajan; Smitha Kurian; P. Divakarachari; Przemysław Falkowski‐Gilski; | Expert Syst. J. Knowl. Eng. | 2024-04-24 |
963 | The Promise and Challenges of Using LLMs to Accelerate The Screening Process of Systematic Reviews Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our objective is to investigate if Large Language Models (LLMs) can accelerate title-abstract screening by simplifying abstracts for human screeners, and automating title-abstract screening. |
Aleksi Huotala; Miikka Kuutila; Paul Ralph; Mika Mäntylä; | arxiv-cs.CL | 2024-04-24 |
964 | Automated Creation of Source Code Variants of A Cryptographic Hash Function Implementation Using Generative Pre-Trained Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study examines the ability of GPT models to generate novel and correct versions, and notably very insecure versions, of implementations of the cryptographic hash function SHA-1. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CR | 2024-04-24 |
965 | GeckOpt: LLM System Efficiency Via Intent-Based Tool Selection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this preliminary study, we investigate a GPT-driven intent-based reasoning approach to streamline tool selection for large language models (LLMs) aimed at system efficiency. By … |
Michael Fore; Simranjit Singh; Dimitrios Stamoulis; | Proceedings of the Great Lakes Symposium on VLSI 2024 | 2024-04-24 |
966 | A Comprehensive Survey on Evaluating Large Language Model Applications in The Medical Industry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since the inception of the Transformer architecture in 2017, Large Language Models (LLMs) such as GPT and BERT have evolved significantly, impacting various industries with their advanced capabilities in language understanding and generation. These models have shown potential to transform the medical field, highlighting the necessity for specialized evaluation frameworks to ensure their effective and ethical deployment. |
Yining Huang; Keke Tang; Meilian Chen; Boyuan Wang; | arxiv-cs.CL | 2024-04-24 |
967 | Transformers Can Represent n-gram Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the relationship between transformer LMs and n-gram LMs, a simple and historically relevant class of language models. |
Anej Svete; Ryan Cotterell; | arxiv-cs.CL | 2024-04-23 |
968 | Pyramid Hierarchical Transformer for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The traditional Transformer model encounters challenges with variable-length input sequences, particularly in Hyperspectral Image Classification (HSIC), leading to efficiency and scalability concerns. To overcome this, we propose a pyramid-based hierarchical transformer (PyFormer). |
Muhammad Ahmad; Muhammad Hassaan Farooq Butt; Manuel Mazzara; Salvatore Distifano; | arxiv-cs.CV | 2024-04-23 |
969 | Combining Retrieval and Classification: Balancing Efficiency and Accuracy in Duplicate Bug Report Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this context, complex models, despite their high accuracy potential, can be time-consuming, while more efficient models may compromise on accuracy. To address this issue, we propose a transformer-based system designed to strike a balance between time efficiency and accuracy performance. |
Qianru Meng; Xiao Zhang; Guus Ramackers; Joost Visser; | arxiv-cs.SE | 2024-04-23 |
970 | Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on a specific use case, pharmaceutical manufacturing investigations, and propose that leveraging historical records of manufacturing incidents and deviations in an organization can be beneficial for addressing and closing new cases, or de-risking new manufacturing campaigns. |
Hossein Salami; Brandye Smith-Goettler; Vijay Yadav; | arxiv-cs.CL | 2024-04-23 |
971 | PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present the first, end-to-end large-scale empirical evaluation of clinical trial matching using real-world EHRs. |
SHASHI KANT GUPTA et al. | arxiv-cs.CL | 2024-04-23 |
972 | From Complexity to Clarity: How AI Enhances Perceptions of Scientists and The Public’s Understanding of Science Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper evaluated the effectiveness of using generative AI to simplify science communication and enhance the public’s understanding of science. |
David M. Markowitz; | arxiv-cs.CL | 2024-04-23 |
973 | Pixels and Predictions: Potential of GPT-4V in Meteorological Imagery Analysis and Forecast Communication Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates GPT-4V’s ability to interpret meteorological charts and communicate weather hazards appropriately to the user, despite challenges of hallucinations, where generative AI delivers coherent, confident, but incorrect responses. We assess GPT-4V’s competence via its web interface ChatGPT in two tasks: (1) generating a severe-weather outlook from weather-chart analysis and conducting self-evaluation, revealing an outlook that corresponds well with a Storm Prediction Center human-issued forecast; and (2) producing hazard summaries in Spanish and English from weather charts. |
JOHN R. LAWSON et al. | arxiv-cs.CL | 2024-04-22 |
974 | Marking: Visual Grading with Highlighting Errors and Annotating Missing Bits Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Marking, a novel grading task that enhances automated grading systems by performing an in-depth analysis of student responses and providing students with visual highlights. |
Shashank Sonkar; Naiming Liu; Debshila B. Mallick; Richard G. Baraniuk; | arxiv-cs.CL | 2024-04-22 |
975 | A Multimodal Feature Distillation with CNN-Transformer Network for Brain Tumor Segmentation with Incomplete Modalities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a Multimodal feature distillation with Convolutional Neural Network (CNN)-Transformer hybrid network (MCTSeg) for accurate brain tumor segmentation with missing modalities. |
Ming Kang; Fung Fung Ting; Raphaël C. -W. Phan; Zongyuan Ge; Chee-Ming Ting; | arxiv-cs.CV | 2024-04-22 |
976 | Automated Multi-Language to English Machine Translation Using Generative Pre-Trained Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we examine using local Generative Pretrained Transformer (GPT) models to perform automated zero-shot, black-box, sentence-wise, multi-natural-language translation into English text. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CL | 2024-04-22 |
977 | How Well Can LLMs Echo Us? Evaluating AI Chatbots’ Role-Play Ability with ECHO Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Such an oversight limits the potential for advancements in digital human clones and non-player characters in video games. To bridge this gap, we introduce ECHO, an evaluative framework inspired by the Turing test. |
MAN TIK NG et. al. | arxiv-cs.CL | 2024-04-22 |
978 | Pre-Calc: Learning to Use The Calculator Improves Numeracy in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Pre-Calc, a simple pre-finetuning objective of learning to use the calculator for both encoder-only and encoder-decoder architectures, formulated as a discriminative and generative task respectively. |
Vishruth Veerendranath; Vishwa Shah; Kshitish Ghate; | arxiv-cs.CL | 2024-04-22 |
979 | Assessing GPT-4-Vision’s Capabilities in UML-Based Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The emergence of advanced neural networks has opened up new ways in automated code generation from conceptual models, promising to enhance software development processes. This paper presents a preliminary evaluation of GPT-4-Vision, a state-of-the-art deep learning model, and its capabilities in transforming Unified Modeling Language (UML) class diagrams into fully operating Java class files. |
Gábor Antal; Richárd Vozár; Rudolf Ferenc; | arxiv-cs.SE | 2024-04-22 |
980 | What Do Transformers Know About Government? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates what insights about linguistic features and what knowledge about the structure of natural language can be obtained from the encodings in transformer language models. In particular, we explore how BERT encodes the government relation between constituents in a sentence. |
JUE HOU et. al. | arxiv-cs.CL | 2024-04-22 |
981 | SVGEditBench: A Benchmark Dataset for Quantitative Assessment of LLM’s SVG Editing Capabilities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For quantitative evaluation of LLMs’ ability to edit SVG, we propose SVGEditBench. |
Kunato Nishina; Yusuke Matsui; | arxiv-cs.CV | 2024-04-21 |
982 | Automated Text Mining of Experimental Methodologies from Biomedical Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes the fine-tuned DistilBERT, a methodology-specific, pre-trained generative classification language model for mining biomedical texts. |
Ziqing Guo; | arxiv-cs.CL | 2024-04-21 |
983 | Evaluating Subword Tokenization: Alien Subword Composition and OOV Generalization Challenge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As a solution, we propose a combined intrinsic-extrinsic evaluation framework for subword tokenization. |
KHUYAGBAATAR BATSUREN et. al. | arxiv-cs.CL | 2024-04-20 |
984 | Do English Named Entity Recognizers Work Well on Global Englishes? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As such, it is unclear whether they generalize for analyzing use of English globally. To test this, we build a newswire dataset, the Worldwide English NER Dataset, to analyze NER model performance on low-resource English variants from around the world. |
Alexander Shan; John Bauer; Riley Carlson; Christopher Manning; | arxiv-cs.CL | 2024-04-20 |
985 | Transformer-Based Classification Outcome Prediction for Multimodal Stroke Treatment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a multi-modal fusion framework Multitrans based on the Transformer architecture and self-attention mechanism. |
Danqing Ma; Meng Wang; Ao Xiang; Zongqing Qi; Qin Yang; | arxiv-cs.CV | 2024-04-19 |
986 | Crowdsourcing Public Attitudes Toward Local Services Through The Lens of Google Maps Reviews: An Urban Density-based Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel data source and methodological framework that can be easily adapted to different regions, offering useful insights into public sentiment toward the built environment and shedding light on how planning policies can be designed to handle related challenges. |
Lingyao Li; Songhua Hu; Atiyya Shaw; Libby Hemphill; | arxiv-cs.SI | 2024-04-19 |
987 | Enabling Natural Zero-Shot Prompting on Encoder Models Via Statement-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we propose Statement-Tuning, a technique that models discriminative tasks as a set of finite statements and trains an encoder model to discriminate between the potential statements to determine the label. |
Ahmed Elshabrawy; Yongxin Huang; Iryna Gurevych; Alham Fikri Aji; | arxiv-cs.CL | 2024-04-19 |
988 | TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present our submission to the unconstrained subtask of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages for morphological annotation, POS-tagging, lemmatization, character- and word-level gap-filling. |
Aleksei Dorkin; Kairit Sirts; | arxiv-cs.CL | 2024-04-19 |
989 | Linearly-evolved Transformer for Pan-sharpening Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The standard modeling rules within these promising pan-sharpening methods are to roughly stack the transformer variants in a cascaded manner. Despite the remarkable advancement, their success may come at a huge cost in model parameters and FLOPs, preventing their application on low-resource satellites. To address this trade-off between favorable performance and expensive computation, we tailor an efficient linearly-evolved transformer variant and employ it to construct a lightweight pan-sharpening framework. |
JUNMING HOU et. al. | arxiv-cs.CV | 2024-04-19 |
990 | Augmenting Emotion Features in Irony Detection with Large Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel method for irony detection, applying Large Language Models (LLMs) with prompt-based learning to facilitate emotion-centric text augmentation. |
Yucheng Lin; Yuhan Xia; Yunfei Long; | arxiv-cs.CL | 2024-04-18 |
991 | Large Language Models in Targeted Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we investigate the use of decoder-based generative transformers for extracting sentiment towards the named entities in Russian news articles. |
Nicolay Rusnachenko; Anton Golubev; Natalia Loukachevitch; | arxiv-cs.CL | 2024-04-18 |
992 | MIDGET: Music Conditioned 3D Dance Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a MusIc conditioned 3D Dance GEneraTion model, named MIDGET, based on a Dance motion Vector Quantised Variational AutoEncoder (VQ-VAE) model and a Motion Generative Pre-Training (GPT) model, to generate vibrant and high-quality dances that match the music rhythm. |
Jinwu Wang; Wei Mao; Miaomiao Liu; | arxiv-cs.SD | 2024-04-18 |
993 | EmrQA-msquad: A Medical Dataset Structured with The SQuAD V2.0 Framework, Enriched with EmrQA Medical Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: One key solution involves integrating specialized medical datasets and creating dedicated ones. This strategic approach enhances the accuracy of question answering systems (QAS), contributing to advancements in clinical decision-making and medical research. |
Jimenez Eladio; Hao Wu; | arxiv-cs.CL | 2024-04-18 |
994 | Dubo-SQL: Diverse Retrieval-Augmented Generation and Fine Tuning for Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce two new methods, Dubo-SQL v1 and v2. |
Dayton G. Thorpe; Andrew J. Duberstein; Ian A. Kinsey; | arxiv-cs.CL | 2024-04-18 |
995 | Transformer Tricks: Removing Weights for Skipless Transformers Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: He and Hofmann (arXiv:2311.01906) detailed a skipless transformer without the V and P (post-attention projection) linear layers, which reduces the total number of weights. … |
Nils Graef; | arxiv-cs.LG | 2024-04-18 |
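As a rough illustration of the idea summarized above (removing the V and P projections so that attention weights mix the input vectors directly), here is a minimal single-head PyTorch sketch; it is an assumption-laden reading of the abstract, not the paper's implementation.

```python
# Minimal sketch: attention with no value (V) or post-attention projection (P)
# matrices. The only learned weights are W_q and W_k; attention scores re-weight
# the raw inputs. Single head, no skip connection, for brevity.
import math
import torch
import torch.nn as nn

class SkiplessAttentionNoVP(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.w_q = nn.Linear(d_model, d_model, bias=False)
        self.w_k = nn.Linear(d_model, d_model, bias=False)
        self.d = d_model

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, d)
        q, k = self.w_q(x), self.w_k(x)
        attn = torch.softmax(q @ k.transpose(-2, -1) / math.sqrt(self.d), dim=-1)
        return attn @ x  # values are the inputs themselves: no V, no P

x = torch.randn(2, 5, 64)
print(SkiplessAttentionNoVP(64)(x).shape)  # torch.Size([2, 5, 64])
```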
996 | Octopus V3: Technical Report for On-device Sub-billion Multimodal AI Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a multimodal model that incorporates the concept of a functional token, specifically designed for AI agent applications. |
Wei Chen; Zhiyuan Li; | arxiv-cs.CL | 2024-04-17 |
997 | CAUS: A Dataset for Question Generation Based on Human Cognition Leveraging Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Curious About Uncertain Scene (CAUS) dataset, designed to enable Large Language Models, specifically GPT-4, to emulate human cognitive processes for resolving uncertainties. |
Minjung Shin; Donghyun Kim; Jeh-Kwang Ryu; | arxiv-cs.AI | 2024-04-17 |
998 | MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we show how to build small fact-checking models that have GPT-4-level performance but for 400x lower cost. |
Liyan Tang; Philippe Laban; Greg Durrett; | arxiv-cs.CL | 2024-04-16 |
999 | CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an attribution-oriented Chain-of-Thought reasoning method to enhance the accuracy of attributions. |
Moshe Berchansky; Daniel Fleischer; Moshe Wasserblat; Peter Izsak; | arxiv-cs.CL | 2024-04-16 |
1000 | Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present COME-robot, the first closed-loop framework utilizing the GPT-4V vision-language foundation model for open-ended reasoning and adaptive planning in real-world scenarios. |
PEIYUAN ZHI et. al. | arxiv-cs.RO | 2024-04-15 |
1001 | AIGeN: An Adversarial Approach for Instruction Generation in VLN Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose AIGeN, a novel architecture inspired by Generative Adversarial Networks (GANs) that produces meaningful and well-formed synthetic instructions to improve navigation agents’ performance. |
Niyati Rawal; Roberto Bigazzi; Lorenzo Baraldi; Rita Cucchiara; | arxiv-cs.CV | 2024-04-15 |
1002 | Transformers, Contextualism, and Polysemy Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, I argue that we can extract from the way the transformer architecture works a theory of the relationship between context and meaning. |
Jumbly Grindrod; | arxiv-cs.CL | 2024-04-15 |
1003 | Leveraging GPT-like LLMs to Automate Issue Labeling Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Issue labeling is a crucial task for the effective management of software projects. To date, several approaches have been put forth for the automatic assignment of labels to issue … |
Giuseppe Colavito; F. Lanubile; Nicole Novielli; L. Quaranta; | 2024 IEEE/ACM 21st International Conference on Mining … | 2024-04-15 |
1004 | Zero-shot Building Age Classification from Facade Image Using GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A building’s age of construction is crucial for supporting many geospatial applications. Much current research focuses on estimating building age from facade images … |
ZICHAO ZENG et. al. | ArXiv | 2024-04-15 |
1005 | Benchmarking Llama2, Mistral, Gemma and GPT for Factuality, Toxicity, Bias and Propensity for Hallucinations Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: This paper introduces fourteen novel datasets for the evaluation of Large Language Models’ safety in the context of enterprise tasks. A method was devised to evaluate a model’s … |
David Nadeau; Mike Kroutikov; Karen McNeil; Simon Baribeau; | ArXiv | 2024-04-15 |
1006 | Demonstration of DB-GPT: Next Generation Data Interaction System Empowered By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present DB-GPT, a revolutionary and product-ready Python library that integrates LLMs into traditional data interaction tasks to enhance user experience and accessibility. |
SIQIAO XUE et. al. | arxiv-cs.AI | 2024-04-15 |
1007 | Harnessing GPT-4V(ision) for Insurance: A Preliminary Exploration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore GPT-4V’s capabilities in the insurance domain. |
Chenwei Lin; Hanjia Lyu; Jiebo Luo; Xian Xu; | arxiv-cs.CV | 2024-04-15 |
1008 | Few-shot Name Entity Recognition on StackOverflow Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: StackOverflow, with its vast question repository and limited labeled examples, raises an annotation challenge for us. We address this gap by proposing RoBERTa+MAML, a few-shot named entity recognition (NER) method leveraging meta-learning. |
Xinwei Chen; Kun Li; Tianyou Song; Jiangjian Guo; | arxiv-cs.CL | 2024-04-14 |
1009 | Assessing The Impact of GPT-4 Turbo in Generating Defeaters for Assurance Cases Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Assurance cases (ACs) are structured arguments that allow verifying the correct implementation of the created systems’ non-functional requirements (e.g., safety, security). This … |
KIMYA KHAKZAD SHAHANDASHTI et. al. | 2024 IEEE/ACM First International Conference on AI … | 2024-04-14 |
1010 | Fine Tuning Large Language Model for Secure Code Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: AI pair programmers, such as GitHub’s Copilot, have shown great success in automatic code generation. However, such large language model-based code generation techniques face the … |
Junjie Li; Aseem Sangalay; Cheng Cheng; Yuan Tian; Jinqiu Yang; | 2024 IEEE/ACM First International Conference on AI … | 2024-04-14 |
1011 | Improving Domain Generalization in Speech Emotion Recognition with Whisper Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformers have been used successfully in a variety of settings, including Speech Emotion Recognition (SER). However, use of the latest transformer base models in domain … |
Erik Goron; Lena Asai; Elias Rut; Martin Dinov; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1012 | LLET: Lightweight Lexicon-Enhanced Transformer for Chinese NER Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The Flat-LAttice Transformer (FLAT) has achieved notable success in Chinese named entity recognition (NER) by integrating lexical information into the widely-used Transformer … |
Zongcheng Ji; Yinlong Xiao; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1013 | A Lightweight Transformer-based Neural Network for Large-scale Masonry Arch Bridge Point Cloud Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer architecture based on the attention mechanism achieves impressive results in natural language processing (NLP) tasks. This paper transfers the successful experience to … |
Yixiong Jing; Brian Sheil; S. Acikgoz; | Comput. Aided Civ. Infrastructure Eng. | 2024-04-14 |
1014 | Inheritune: Training Smaller Yet More Attentive Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Layers in this state are unable to learn anything meaningful and are mostly redundant; we refer to these as lazy layers. The goal of this paper is to train smaller models by eliminating this structural inefficiency without compromising performance. |
Sunny Sanyal; Ravid Shwartz-Ziv; Alexandros G. Dimakis; Sujay Sanghavi; | arxiv-cs.CL | 2024-04-12 |
1015 | Constrained C-Test Generation Via Mixed-Integer Programming Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work proposes a novel method to generate C-Tests, a deviated form of cloze tests (a gap-filling exercise) in which only the last part of a word is turned into a gap. |
Ji-Ung Lee; Marc E. Pfetsch; Iryna Gurevych; | arxiv-cs.CL | 2024-04-12 |
1016 | CreativEval: Evaluating Creativity of LLM-Based Hardware Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this research gap, we present CreativEval, a framework for evaluating the creativity of LLMs within the context of generating hardware designs. |
Matthew DeLorenzo; Vasudev Gohil; Jeyavijayan Rajendran; | arxiv-cs.CL | 2024-04-12 |
1017 | Is Next Token Prediction Sufficient for GPT? Exploration on Code Logic Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to prove it, we introduce a new task, Logically Equivalent Code Selection, which necessitates the selection of logically equivalent code from a candidate set, given a query code. |
MENGNAN QI et. al. | arxiv-cs.PL | 2024-04-12 |
1018 | Small Models Are (Still) Effective Cross-Domain Argument Extractors Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, detailed explorations of these techniques’ ability to actually enable this transfer are lacking. In this work, we provide such a study, exploring zero-shot transfer using both techniques on six major EAE datasets at both the sentence and document levels. |
William Gantt; Aaron Steven White; | arxiv-cs.CL | 2024-04-12 |
1019 | Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore how large language models (LLMs) can be used to generate and evaluate multiple-choice reading comprehension items. |
Andreas Säuberli; Simon Clematide; | arxiv-cs.CL | 2024-04-11 |
1020 | Measuring Geographic Diversity of Foundation Models with A Natural Language–based Geo-guessing Experiment on GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: If we consider the resulting models as knowledge bases in their own right, this may open up new avenues for understanding places through the lens of machines. In this work, we adopt this thinking and select GPT-4, a state-of-the-art representative in the family of multimodal large language models, to study its geographic diversity regarding how well geographic features are represented. |
Zilong Liu; Krzysztof Janowicz; Kitty Currier; Meilin Shi; | arxiv-cs.CY | 2024-04-11 |
1021 | Remembering Transformer for Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing data fine-tuning and regularization methods necessitate task identity information during inference and cannot eliminate interference among different tasks, while soft parameter sharing approaches encounter the problem of an increasing model parameter size. To tackle these challenges, we propose the Remembering Transformer, inspired by the brain’s Complementary Learning Systems (CLS). |
Yuwei Sun; Ippei Fujisawa; Arthur Juliani; Jun Sakuma; Ryota Kanai; | arxiv-cs.LG | 2024-04-11 |
1022 | From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We analyze how well pre-trained large language models (e.g., Llama2, GPT-4, Claude 3, etc.) can do linear and non-linear regression when given in-context examples, without any additional training or gradient updates. |
Robert Vacareanu; Vlad-Andrei Negru; Vasile Suciu; Mihai Surdeanu; | arxiv-cs.CL | 2024-04-11 |
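The in-context regression setup above is easy to reproduce: training pairs are serialized into the prompt and the model completes the output for a held-out input. A minimal sketch follows, with illustrative formatting that may differ from the paper's:

```python
# Build an in-context regression prompt: serialize (x, y) pairs as text,
# append the query x, and let the model complete the output value.
def regression_prompt(train_pairs, query_x):
    lines = [f"Input: {x:.2f}\nOutput: {y:.2f}" for x, y in train_pairs]
    lines.append(f"Input: {query_x:.2f}\nOutput:")
    return "\n".join(lines)

# Noisy samples from y = 3x + 1; the model must infer the mapping in context.
pairs = [(0.5, 2.6), (1.0, 4.1), (2.0, 6.9), (3.0, 10.2)]
print(regression_prompt(pairs, 4.0))
```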
1023 | On Training Data Influence of GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents GPTfluence, a novel approach that leverages a featurized simulation to assess the impact of training examples on the training dynamics of GPT models. |
YEKUN CHAI et. al. | arxiv-cs.CL | 2024-04-11 |
1024 | Map Reading and Analysis with GPT-4V(ision) Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In late 2023, the image-reading capability added to a Generative Pre-trained Transformer (GPT) framework provided the opportunity to potentially revolutionize the way we view and … |
Jinwen Xu; Ran Tao; | ISPRS Int. J. Geo Inf. | 2024-04-11 |
1025 | LLM Agents Can Autonomously Exploit One-day Vulnerabilities IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that LLM agents can autonomously exploit one-day vulnerabilities in real-world systems. |
Richard Fang; Rohan Bindu; Akul Gupta; Daniel Kang; | arxiv-cs.CR | 2024-04-11 |
1026 | Reflectance Estimation for Proximity Sensing By Vision-Language Models: Utilizing Distributional Semantics for Low-Level Cognition in Robotics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we verify that 1) LLMs such as GPT-3.5 and GPT-4 can estimate an object’s reflectance using only text as input; and 2) VLMs such as CLIP can increase their generalization capabilities in reflectance estimation from images. |
Masashi Osada; Gustavo A. Garcia Ricardez; Yosuke Suzuki; Tadahiro Taniguchi; | arxiv-cs.RO | 2024-04-11 |
1027 | Measuring Geographic Diversity of Foundation Models with A Natural Language-based Geo-guessing Experiment on GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative AI based on foundation models provides a first glimpse into the world represented by machines trained on vast amounts of multimodal data ingested by these … |
Zilong Liu; Krzysztof Janowicz; Kitty Currier; Meilin Shi; | ArXiv | 2024-04-11 |
1028 | Simpler Becomes Harder: Do LLMs Exhibit A Coherent Behavior on Simplified Corpora? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Text simplification seeks to improve readability while retaining the original content and meaning. Our study investigates whether pre-trained classifiers also maintain such coherence by comparing their predictions on both original and simplified inputs. |
Miriam Anschütz; Edoardo Mosca; Georg Groh; | arxiv-cs.CL | 2024-04-10 |
1029 | Characterizing Multimodal Long-form Summarization: A Case Study on Financial Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a computational framework for characterizing multimodal long-form summarization and investigate the behavior of Claude 2.0/2.1, GPT-4/3.5, and Cohere. |
Tianyu Cao; Natraj Raman; Danial Dervovic; Chenhao Tan; | arxiv-cs.CL | 2024-04-09 |
1030 | Generative Pre-Trained Transformer for Symbolic Regression Base In-Context Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose FormulaGPT, which trains a GPT using massive sparse reward learning histories of reinforcement learning-based SR algorithms as training data. |
YANJIE LI et. al. | arxiv-cs.LG | 2024-04-09 |
1031 | InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, this research advances the patch division paradigm by introducing a novel extension: dynamic resolution with automatic patch configuration. |
XIAOYI DONG et. al. | arxiv-cs.CV | 2024-04-09 |
1032 | PetKaz at SemEval-2024 Task 8: Can Linguistics Capture The Specifics of LLM-generated Text? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present our submission to the SemEval-2024 Task 8 Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, focusing on the detection of machine-generated texts (MGTs) in English. |
Kseniia Petukhova; Roman Kazakov; Ekaterina Kochmar; | arxiv-cs.CL | 2024-04-08 |
1033 | OPSD: An Offensive Persian Social Media Dataset and Its Baseline Evaluations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While numerous datasets in the English language exist in this domain, few equivalent resources are available for the Persian language. To address this gap, this paper introduces two offensive datasets. |
MEHRAN SAFAYANI et. al. | arxiv-cs.CL | 2024-04-08 |
1034 | Use of A Structured Knowledge Base Enhances Metadata Curation By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates the potential of large language models (LLMs), specifically GPT-4, to improve adherence to metadata standards. |
SOWMYA S. SUNDARAM et. al. | arxiv-cs.AI | 2024-04-08 |
1035 | VulnHunt-GPT: A Smart Contract Vulnerabilities Detector Based on OpenAI ChatGPT Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Smart contracts are self-executing programs that can run on a blockchain. Due to the fact of being immutable after their deployment on blockchain, it is crucial to ensure their … |
Biagio Boi; Christian Esposito; Sokjoon Lee; | Proceedings of the 39th ACM/SIGAPP Symposium on Applied … | 2024-04-08 |
1036 | Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While these models demonstrate remarkable performance on general datasets, they can struggle in specialized domains such as medicine, where unique domain-specific terminologies, domain-specific abbreviations, and varying document structures are common. This paper explores strategies for adapting these models to domain-specific requirements, primarily through continuous pre-training on domain-specific data. |
AHMAD IDRISSI-YAGHIR et. al. | arxiv-cs.CL | 2024-04-08 |
1037 | Relation Extraction Using Large Language Models: A Case Study on Acupuncture Point Locations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to compare the performance of GPT with traditional deep learning models (Long Short-Term Memory (LSTM) and Bidirectional Encoder Representations from Transformers for Biomedical Text Mining (BioBERT)) in extracting acupoint-related location relations and assess the impact of pretraining and fine-tuning on GPT’s performance. |
YIMING LI et. al. | arxiv-cs.CL | 2024-04-08 |
1038 | PagPassGPT: Pattern Guided Password Guessing Via Generative Pretrained Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Amidst the surge in deep learning-based password guessing models, challenges of generating high-quality passwords and reducing duplicate passwords persist. To address these challenges, we present PagPassGPT, a password guessing model constructed on Generative Pretrained Transformer (GPT). |
XINGYU SU et. al. | arxiv-cs.CR | 2024-04-07 |
1039 | Clinical Trials Protocol Authoring Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This report embarks on a mission to revolutionize clinical trial protocol development through the integration of advanced AI technologies. |
Morteza Maleki; SeyedAli Ghahari; | arxiv-cs.CE | 2024-04-07 |
1040 | RecGPT: Generative Personalized Prompts for Sequential Recommendation Via ChatGPT Training Paradigm Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: ChatGPT has achieved remarkable success in natural language understanding. Considering that recommendation is indeed a conversation between users and the system with items as … |
YABIN ZHANG et. al. | ArXiv | 2024-04-06 |
1041 | Joint Visual and Text Prompting for Improved Object-Centric Perception with Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel approach, Joint Visual and Text Prompting (VTPrompt), that employs fine-grained visual information to enhance the capability of MLLMs in VQA, especially for object-oriented perception. |
SONGTAO JIANG et. al. | arxiv-cs.CL | 2024-04-06 |
1042 | Scope Ambiguities in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite this, there has been little research into how modern large language models treat them. In this paper, we investigate how different versions of certain autoregressive language models — GPT-2, GPT-3/3.5, Llama 2 and GPT-4 — treat scope ambiguous sentences, and compare this with human judgments. |
Gaurav Kamath; Sebastian Schuster; Sowmya Vajjala; Siva Reddy; | arxiv-cs.CL | 2024-04-05 |
1043 | Evaluating LLMs at Detecting Errors in LLM Responses Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work introduces ReaLMistake, the first error detection benchmark consisting of objective, realistic, and diverse errors made by LLMs. |
RYO KAMOI et. al. | arxiv-cs.CL | 2024-04-04 |
1044 | Outlier-Efficient Hopfield Layers for Large Transformer-Based Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an Outlier-Efficient Modern Hopfield Model (termed $\mathrm{OutEffHop}$) and use it to address the outlier inefficiency problem of training gigantic transformer-based models. |
JERRY YAO-CHIEH HU et. al. | arxiv-cs.LG | 2024-04-04 |
1045 | Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Various jailbreak attacks have been proposed to red-team Large Language Models (LLMs) and revealed the vulnerable safeguards of LLMs. Besides, some methods are not limited to the … |
SHUO CHEN et. al. | arxiv-cs.LG | 2024-04-04 |
1046 | FGeo-TP: A Language Model-Enhanced Solver for Euclidean Geometry Problems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The application of contemporary artificial intelligence techniques to address geometric problems and automated deductive proofs has always been a grand challenge to the … |
Yiming He; Jia Zou; Xiaokai Zhang; Na Zhu; Tuo Leng; | Symmetry | 2024-04-03 |
1047 | BCAmirs at SemEval-2024 Task 4: Beyond Words: A Multimodal and Multilingual Exploration of Persuasion in Memes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by this, our team engaged in SemEval-2024 Task 4, a hierarchical multi-label classification task designed to identify rhetorical and psychological persuasion techniques embedded within memes. To tackle this problem, we introduced a caption generation step to assess the modality gap and the impact of additional semantic information from images, which improved our result. |
Amirhossein Abaskohi; Amirhossein Dabiriaghdam; Lele Wang; Giuseppe Carenini; | arxiv-cs.CL | 2024-04-03 |
1048 | GPT-DETOX: An In-Context Learning-Based Paraphraser for Text Detoxification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose GPT-DETOX as a framework for prompt-based in-context learning for text detoxification using GPT-3.5 Turbo. |
Ali Pesaranghader; Nikhil Verma; Manasa Bharadwaj; | arxiv-cs.CL | 2024-04-03 |
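To make the prompt-based in-context detoxification setup above concrete, here is a minimal sketch against the OpenAI chat API; the exemplar pairs and prompt wording are invented for illustration and are not the paper's actual prompts.

```python
# Few-shot detoxification sketch: (toxic, neutral) exemplar pairs go into the
# prompt, and the model paraphrases a new toxic input. Exemplars are made up.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

EXEMPLARS = [
    ("This code is garbage and you are clueless.",
     "This code has problems, and I think you may be missing some context."),
    ("Shut up, nobody asked for your opinion.",
     "I would prefer to hear other perspectives right now."),
]

def detoxify(text: str) -> str:
    shots = "\n".join(f"Toxic: {t}\nNeutral: {n}" for t, n in EXEMPLARS)
    prompt = (f"Rewrite the toxic sentence as a neutral paraphrase.\n\n"
              f"{shots}\nToxic: {text}\nNeutral:")
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content.strip()
```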
1049 | Task Agnostic Architecture for Algorithm Induction Via Implicit Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Taking this trend of generalization to the extreme suggests the possibility of a single deep network architecture capable of solving all tasks. This position paper aims to explore developing such a unified architecture and proposes a theoretical framework of how it could be constructed. |
Sahil J. Sindhi; Ignas Budvytis; | arxiv-cs.LG | 2024-04-03 |
1050 | NLP at UC Santa Cruz at SemEval-2024 Task 5: Legal Answer Validation Using Few-Shot Multi-Choice QA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present two approaches to solving the task of legal answer validation, given an introduction to the case, a question and an answer candidate. |
Anish Pahilajani; Samyak Rajesh Jain; Devasha Trivedi; | arxiv-cs.CL | 2024-04-03 |
1051 | UTeBC-NLP at SemEval-2024 Task 9: Can LLMs Be Lateral Thinkers? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Through participating in SemEval-2024, task 9, Sentence Puzzle sub-task, we explore prompt engineering methods: chain-of-thought (CoT) and direct prompting, enhancing with informative descriptions, and employing contextualizing prompts using a retrieval-augmented generation (RAG) pipeline. |
Pouya Sadeghi; Amirhossein Abaskohi; Yadollah Yaghoobzadeh; | arxiv-cs.CL | 2024-04-03 |
1052 | Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this way, we achieve 100% attack success rate — according to GPT-4 as a judge — on Vicuna-13B, Mistral-7B, Phi-3-Mini, Nemotron-4-340B, Llama-2-Chat-7B/13B/70B, Llama-3-Instruct-8B, Gemma-7B, GPT-3.5, GPT-4o, and R2D2 from HarmBench that was adversarially trained against the GCG attack. |
Maksym Andriushchenko; Francesco Croce; Nicolas Flammarion; | arxiv-cs.CR | 2024-04-02 |
1053 | Accelerating Transformer Pre-Training with 2:4 Sparsity Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Training large transformers is slow, but recent innovations on GPU architecture give us an advantage. NVIDIA Ampere GPUs can execute a fine-grained 2:4 sparse matrix … |
Yuezhou Hu; Kang Zhao; Wei Huang; Jianfei Chen; Jun Zhu; | ArXiv | 2024-04-02 |
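The 2:4 pattern referenced above means that in every contiguous group of four weights, at most two are nonzero; this is the structure Ampere tensor cores accelerate. A minimal sketch of projecting a dense matrix onto that pattern (the paper's contribution is the training recipe, which is not reproduced here):

```python
# Project a weight tensor onto 2:4 structured sparsity: in each group of four
# consecutive weights, keep the two largest magnitudes and zero the rest.
import torch

def two_four_mask(w: torch.Tensor) -> torch.Tensor:
    groups = w.reshape(-1, 4)                     # assumes numel % 4 == 0
    idx = groups.abs().topk(2, dim=1).indices     # top-2 magnitudes per group
    mask = torch.zeros_like(groups).scatter_(1, idx, 1.0)
    return mask.reshape(w.shape)

w = torch.randn(8, 8)
w_sparse = w * two_four_mask(w)
print((w_sparse != 0).float().mean())  # tensor(0.5000): exactly 2 of every 4 kept
```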
1054 | SGSH: Stimulate Large Language Models with Skeleton Heuristics for Knowledge Base Question Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose SGSH, a simple and effective framework to Stimulate GPT-3.5 with Skeleton Heuristics to enhance KBQG. |
SHASHA GUO et. al. | arxiv-cs.CL | 2024-04-02 |
1055 | Release of Pre-Trained Models for The Japanese Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: AI democratization aims to create a world in which the average person can utilize AI techniques. |
KEI SAWADA et. al. | arxiv-cs.CL | 2024-04-02 |
1056 | Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first comprehensive benchmarking study of LLMs across diverse Persian language tasks. |
AMIRHOSSEIN ABASKOHI et. al. | arxiv-cs.CL | 2024-04-02 |
1057 | METAL: Towards Multilingual Meta-Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a framework for an end-to-end assessment of LLMs as evaluators in multilingual scenarios. |
Rishav Hada; Varun Gumma; Mohamed Ahmed; Kalika Bali; Sunayana Sitaram; | arxiv-cs.CL | 2024-04-02 |
1058 | RDTN: Residual Densely Transformer Network for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yan Li; Xiaofei Yang; Dong Tang; Zheng-yang Zhou; | Expert Syst. Appl. | 2024-04-01 |
1059 | GPT-COPE: A Graph-Guided Point Transformer for Category-Level Object Pose Estimation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Category-level object pose estimation aims to predict the 6D pose and 3D metric size of objects from given categories. Due to significant intra-class shape variations among … |
Lu Zou; Zhangjin Huang; Naijie Gu; Guoping Wang; | IEEE Transactions on Circuits and Systems for Video … | 2024-04-01 |
1060 | BERT-Enhanced Retrieval Tool for Homework Plagiarism Detection System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In existing research, detection of high-level plagiarism is still a challenge due to the lack of high-quality datasets. In this paper, we propose a plagiarized-text data generation method based on GPT-3.5, which produces a text plagiarism detection dataset of 32,927 pairs covering a wide range of plagiarism methods, bridging the gap in this part of research. |
Jiarong Xian; Jibao Yuan; Peiwei Zheng; Dexian Chen; Nie Yuntao; | arxiv-cs.CL | 2024-04-01 |
1061 | Leveraging YOLO-World and GPT-4V LMMs for Zero-Shot Person Detection and Action Recognition in Drone Imagery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we explore the potential of zero-shot Large Multimodal Models (LMMs) in the domain of drone perception. |
Christian Limberg; Artur Gonçalves; Bastien Rigault; Helmut Prendinger; | arxiv-cs.CV | 2024-04-01 |
1062 | An Innovative GPT-based Open-source Intelligence Using Historical Cyber Incident Reports Related Papers Related Patents Related Grants Related Venues Related Experts View |
F. Sufi; | Nat. Lang. Process. J. | 2024-04-01 |
1063 | ScopeViT: Scale-Aware Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
XUESONG NIE et. al. | Pattern Recognit. | 2024-04-01 |
1064 | LLM-RadJudge: Achieving Radiologist-Level Evaluation for X-Ray Report Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a novel evaluation framework using large language models (LLMs) to compare radiology reports for assessment. |
Zilong Wang; Xufang Luo; Xinyang Jiang; Dongsheng Li; Lili Qiu; | arxiv-cs.CL | 2024-04-01 |
1065 | Large Language Model Evaluation Via Multi AI Agents: Preliminary Results Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite extensive efforts to examine LLMs from various perspectives, there is a noticeable lack of multi-agent AI models specifically designed to evaluate the performance of different LLMs. To address this gap, we introduce a novel multi-agent AI model that aims to assess and compare the performance of various LLMs. |
Zeeshan Rasheed; Muhammad Waseem; Kari Systä; Pekka Abrahamsson; | arxiv-cs.SE | 2024-04-01 |
1066 | Time Domain Speech Enhancement with CNN and Time-attention Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
N. Saleem; T. S. Gunawan; Sami Dhahbi; Sami Bourouis; | Digit. Signal Process. | 2024-04-01 |
1067 | Advancing AI with Integrity: Ethical Challenges and Solutions in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We identify and address ethical issues through empirical studies. |
Richard Kimera; Yun-Seon Kim; Heeyoul Choi; | arxiv-cs.CL | 2024-04-01 |
1068 | TWIN-GPT: Digital Twins for Clinical Trials Via Large Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a large language model-based digital twin creation approach, called TWIN-GPT. |
YUE WANG et. al. | arxiv-cs.LG | 2024-04-01 |
1069 | Vision Transformer Models for Mobile/edge Devices: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View |
SEUNG IL LEE et. al. | Multim. Syst. | 2024-04-01 |
1070 | Syntactic Robustness for LLM-based Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on prompts that ask for code that solves an equation for its variables, given the equation’s coefficients as input. |
Laboni Sarker; Mara Downing; Achintya Desai; Tevfik Bultan; | arxiv-cs.SE | 2024-04-01 |
1071 | Condition-Aware Neural Network for Controlled Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. |
HAN CAI et. al. | arxiv-cs.CV | 2024-04-01 |
1072 | CHOPS: CHat with CustOmer Profile Systems for Customer Service with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a practical dataset, the CPHOS-dataset, which includes a database, guiding files, and QA pairs collected from CPHOS, an online platform that facilitates the organization of simulated Physics Olympiads for high school teachers and students. |
JINGZHE SHI et. al. | arxiv-cs.CL | 2024-03-31 |
1073 | EvoCodeBench: An Evolving Code Generation Benchmark Aligned with Real-World Code Repositories IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a new benchmark, EvoCodeBench, which addresses the preceding problems and has three primary advances. |
Jia Li; Ge Li; Xuanming Zhang; Yihong Dong; Zhi Jin; | arxiv-cs.CL | 2024-03-31 |
1074 | Jetsons at FinNLP 2024: Towards Understanding The ESG Impact of A News Article Using Transformer-based Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we describe the different approaches explored by the Jetsons team for the Multi-Lingual ESG Impact Duration Inference (ML-ESG-3) shared task. |
PARAG PRAVIN DAKLE et. al. | arxiv-cs.CL | 2024-03-30 |
1075 | Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite efforts to create open-source models, their limited parameters often result in insufficient multi-step reasoning capabilities required for solving complex medical problems. To address this, we introduce Meerkat, a new family of medical AI systems ranging from 7 to 70 billion parameters. |
HYUNJAE KIM et. al. | arxiv-cs.CL | 2024-03-30 |
1076 | A Hybrid Transformer and Attention Based Recurrent Neural Network for Robust and Interpretable Sentiment Analysis of Tweets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose TRABSA, a hybrid framework integrating transformer-based architectures, attention mechanisms, and BiLSTM networks to address this. |
Md Abrar Jahin; Md Sakib Hossain Shovon; M. F. Mridha; Md Rashedul Islam; Yutaka Watanobe; | arxiv-cs.CL | 2024-03-30 |
1077 | Edinburgh Clinical NLP at SemEval-2024 Task 2: Fine-tune Your Model Unless You Have Access to GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a PEFT method to improve the consistency of LLMs by merging adapters that were fine-tuned separately using triplet and language modelling objectives. |
Aryo Pradipta Gema; Giwon Hong; Pasquale Minervini; Luke Daines; Beatrice Alex; | arxiv-cs.CL | 2024-03-30 |
1078 | Cross-lingual Named Entity Corpus for Slavic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a corpus manually annotated with named entities for six Slavic languages – Bulgarian, Czech, Polish, Slovenian, Russian, and Ukrainian. |
Jakub Piskorski; Michał Marcińczuk; Roman Yangarber; | arxiv-cs.CL | 2024-03-30 |
1079 | A Comprehensive Study on NLP Data Augmentation for Hate Speech Detection: Legacy Methods, BERT, and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In pursuit of suitable data augmentation methods, this study explores both established legacy approaches and contemporary practices such as Large Language Models (LLMs), including GPT, for hate speech detection. |
Md Saroar Jahan; Mourad Oussalah; Djamila Romaissa Beddia; Jhuma kabir Mim; Nabil Arhab; | arxiv-cs.CL | 2024-03-30 |
1080 | Transformer Based Pluralistic Image Completion with Reduced Information Loss Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The indices of quantized pixels are used as tokens for the inputs and prediction targets of the transformer. To mitigate the associated information loss, we propose a new transformer-based framework called PUT. |
QIANKUN LIU et. al. | arxiv-cs.CV | 2024-03-30 |
1081 | Spread Your Wings: A Radial Strip Transformer for Image Deblurring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Radial Strip Transformer (RST), which is a transformer-based architecture that restores the blur images in a polar coordinate system instead of a Cartesian one. |
DUOSHENG CHEN et. al. | arxiv-cs.CV | 2024-03-30 |
1082 | ChatGPT V.s. Media Bias: A Comparative Study of GPT-3.5 and Fine-tuned Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Can ChatGPT detect media bias? This study seeks to answer this question by leveraging the Media Bias Identification Benchmark (MBIB) to assess ChatGPT’s competency in distinguishing six categories of media bias, juxtaposed against fine-tuned models such as BART, ConvBERT, and GPT-2. |
Zehao Wen; Rabih Younes; | arxiv-cs.CL | 2024-03-29 |
1083 | Classifying Conspiratorial Narratives At Scale: False Alarms and Erroneous Connections Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first large-scale classification study using posts from the most active conspiracy-related Reddit forums and find that only one-third of the posts are classified as positive. |
Ahmad Diab; Rr. Nefriana; Yu-Ru Lin; | arxiv-cs.CL | 2024-03-29 |
1084 | ReALM: Reference Resolution As Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper demonstrates how LLMs can be used to create an extremely effective system to resolve references of various types, by showing how reference resolution can be converted into a language modeling problem, despite involving forms of entities like those on screen that are not traditionally conducive to being reduced to a text-only modality. |
JOEL RUBEN ANTONY MONIZ et. al. | arxiv-cs.CL | 2024-03-29 |
1085 | Shallow Cross-Encoders for Low-Latency Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, keeping search latencies low is important for user satisfaction and energy usage. In this paper, we show that weaker shallow transformer models (i.e., transformers with a limited number of layers) actually perform better than full-scale models when constrained to these practical low-latency settings since they can estimate the relevance of more documents in the same time budget. |
Aleksandr V. Petrov; Sean MacAvaney; Craig Macdonald; | arxiv-cs.IR | 2024-03-29 |
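As a back-of-the-envelope illustration of the shallow-cross-encoder trade-off above, the sketch below truncates a public 6-layer cross-encoder to its first two layers; the paper trains shallow models properly rather than slicing a deep one, so this conveys only the latency intuition, not the method.

```python
# Truncate a BERT-style cross-encoder to a few layers so it can score many
# more query-document pairs within a fixed latency budget. Illustration only:
# a sliced (untrained-at-depth-2) model will lose accuracy.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "cross-encoder/ms-marco-MiniLM-L-6-v2"   # a public 6-layer cross-encoder
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)
model.bert.encoder.layer = model.bert.encoder.layer[:2]  # keep 2 of 6 layers

inputs = tok(["what is a transformer"], ["A transformer is a neural network."],
             return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    print(model(**inputs).logits)  # relevance score from the truncated model
```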
1086 | Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that off-the-shelf text-based Transformers, with no additional training, can perform few-shot in-context visual imitation learning, mapping visual observations to action sequences that emulate the demonstrator’s behaviour. We achieve this by transforming visual observations (inputs) and trajectories of actions (outputs) into sequences of tokens that a text-pretrained Transformer (GPT-4 Turbo) can ingest and generate, via a framework we call Keypoint Action Tokens (KAT). |
Norman Di Palo; Edward Johns; | arxiv-cs.RO | 2024-03-28 |
1087 | TOMGPT: Reliable Text-Only Training Approach for Cost-Effective Multi-modal Large Language Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Multi-modal large language models (MLLMs), such as GPT-4, exhibit great comprehension capabilities on human instruction, as well as zero-shot ability on new downstream multi-modal … |
YUNKAI CHEN et. al. | ACM Transactions on Knowledge Discovery from Data | 2024-03-28 |
1088 | A Review of Multi-Modal Large Language and Vision Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) have recently emerged as a focal point of research and application, driven by their unprecedented ability to understand and generate text with … |
Kilian Carolan; Laura Fennelly; A. Smeaton; | ArXiv | 2024-03-28 |
1089 | Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we delve into several mechanisms employed by Transformer-based language models (LLMs) for factual recall tasks. |
ANG LV et. al. | arxiv-cs.CL | 2024-03-28 |
1090 | Decision Mamba: Reinforcement Learning Via Sequence Modeling with Selective State Spaces Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work contributes to the advancement of sequential decision-making models, suggesting that the architecture and training methodology of neural networks can significantly impact their performance in complex tasks, and highlighting the potential of Mamba as a valuable tool for improving the efficacy of Transformer-based models in reinforcement learning scenarios. |
Toshihiro Ota; | arxiv-cs.LG | 2024-03-28 |
1091 | Reshaping Free-Text Radiology Notes Into Structured Reports With Generative Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a pipeline to extract information from free-text radiology reports that fits the items of the reference structured report (SR) registry proposed by a national society of interventional and medical radiology, focusing on CT staging of patients with lymphoma. |
LAURA BERGOMI et. al. | arxiv-cs.CL | 2024-03-27 |
1092 | PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a multimodal interactive robot (PhysicsAssistant) built on YOLOv8 object detection, cameras, speech recognition, and an LLM-based chatbot to assist students in physics lab investigations. |
Ehsan Latif; Ramviyas Parasuraman; Xiaoming Zhai; | arxiv-cs.RO | 2024-03-27 |
1093 | The Topos of Transformer Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a theoretical analysis of the expressivity of the transformer architecture through the lens of topos theory. |
Mattia Jacopo Villani; Peter McBurney; | arxiv-cs.LG | 2024-03-27 |
1094 | SemRoDe: Macro Adversarial Training to Learn Representations That Are Robust to Word-Level Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial Training strategy to enhance the robustness of LMs. |
Brian Formento; Wenjie Feng; Chuan Sheng Foo; Luu Anh Tuan; See-Kiong Ng; | arxiv-cs.CL | 2024-03-27 |
1095 | 3P-LLM: Probabilistic Path Planning Using Large Language Model for Autonomous Robot Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research assesses the feasibility of using an LLM (OpenAI’s GPT-3.5-turbo chatbot) for robotic path planning. |
Ehsan Latif; | arxiv-cs.RO | 2024-03-27 |
1096 | RankMamba: Benchmarking Mamba’s Document Ranking Performance in The Era of Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we examine Mamba’s efficacy through the lens of a classical IR task — document ranking. |
Zhichao Xu; | arxiv-cs.IR | 2024-03-27 |
1097 | A Survey on Large Language Models from Concept to Implementation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent advancements in Large Language Models (LLMs), particularly those built on Transformer architectures, have significantly broadened the scope of natural language processing (NLP) applications, transcending their initial use in chatbot technology. This paper investigates the multifaceted applications of these models, with an emphasis on the GPT series. |
Chen Wang; Jin Zhao; Jiaqi Gong; | arxiv-cs.CL | 2024-03-27 |
1098 | Evaluating Large Language Models for Health-Related Text Classification Tasks with Public Social Media Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We developed three approaches for leveraging LLMs for text classification: employing LLMs as zero-shot classifiers, using LLMs as annotators to annotate training data for supervised classifiers, and utilizing LLMs with few-shot examples for augmentation of manually annotated data. |
Yuting Guo; Anthony Ovadje; Mohammed Ali Al-Garadi; Abeed Sarker; | arxiv-cs.CL | 2024-03-27 |
1099 | AcTED: Automatic Acquisition of Typical Event Duration for Semi-supervised Temporal Commonsense QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a voting-driven semi-supervised approach to automatically acquire the typical duration of an event and use it as pseudo-labeled data. |
FELIX VIRGO et. al. | arxiv-cs.CL | 2024-03-27 |
1100 | From Traditional Recommender Systems to GPT-Based Chatbots: A Survey of Recent Developments and Future Directions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recommender systems are a key technology for many applications, such as e-commerce, streaming media, and social media. Traditional recommender systems rely on collaborative … |
TAMIM M. AL-HASAN et. al. | Big Data Cogn. Comput. | 2024-03-27 |
1101 | Evaluating The Efficacy of Prompt-Engineered Large Multimodal Models Versus Fine-Tuned Vision Transformers in Image-Based Security Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The success of Large Language Models (LLMs) has led to a parallel rise in the development of Large Multimodal Models (LMMs), which have begun to transform a variety of applications. These sophisticated multimodal models are designed to interpret and analyze complex data by integrating multiple modalities such as text and images, thereby opening new avenues for a range of applications. |
Fouad Trad; Ali Chehab; | arxiv-cs.AI | 2024-03-26 |
1102 | Enhancing Legal Document Retrieval: A Multi-Phase Approach with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research focuses on maximizing the potential of prompting by placing it as the final phase of the retrieval system, preceded by the support of two phases: BM25 Pre-ranking and BERT-based Re-ranking. |
HAI-LONG NGUYEN et. al. | arxiv-cs.CL | 2024-03-26 |
1103 | Verbing Weirds Language (Models): Evaluation of English Zero-Derivation in Five LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We design a task for testing lexical-syntactic flexibility — the degree to which models can generalize over words in a construction with a non-prototypical part of speech. |
David R. Mortensen; Valentina Izrailevitch; Yunze Xiao; Hinrich Schütze; Leonie Weissweiler; | arxiv-cs.CL | 2024-03-26 |
1104 | State Space Models As Foundation Models: A Control Theoretic Overview Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: In recent years, there has been a growing interest in integrating linear state-space models (SSM) in deep neural network architectures of foundation models. This is exemplified by … |
Carmen Amo Alonso; Jerome Sieber; M. Zeilinger; | ArXiv | 2024-03-25 |
1105 | Communication-Efficient Model Parallelism for Distributed In-Situ Transformer Inference Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer models have shown significant success in a wide range of tasks. Meanwhile, massive resources required by its inference prevent scenarios with resource-constrained … |
YUANXIN WEI et. al. | 2024 Design, Automation & Test in Europe Conference & … | 2024-03-25 |
1106 | Decoding Probing: Revealing Internal Linguistic Structures in Neural Language Models Using Minimal Pairs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by cognitive neuroscience studies, we introduce a novel ‘decoding probing’ method that uses the minimal pairs benchmark (BLiMP) to probe internal linguistic characteristics in neural language models layer by layer. |
Linyang He; Peili Chen; Ercong Nie; Yuanning Li; Jonathan R. Brennan; | arxiv-cs.CL | 2024-03-25 |
1107 | CYGENT: A Cybersecurity Conversational Agent with Log Summarization Powered By GPT-3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response to the escalating cyber-attacks in the modern IT and IoT landscape, we developed CYGENT, a conversational agent framework powered by GPT-3.5 turbo model, designed to aid system administrators in ensuring optimal performance and uninterrupted resource availability. |
Prasasthy Balasubramanian; Justin Seby; Panos Kostakos; | arxiv-cs.CR | 2024-03-25 |
1108 | LLM-Guided Formal Verification Coupled with Mutation Testing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The increasing complexity of modern hardware designs poses significant challenges for design verification, particularly defining and verifying properties and invariants manually. … |
Muhammad Hassan; Sallar Ahmadi-Pour; Khushboo Qayyum; C. Jha; Rolf Drechsler; | 2024 Design, Automation & Test in Europe Conference & … | 2024-03-25 |
1109 | GPT-4 Understands Discourse at Least As Well As Humans Do Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We test whether a leading AI system GPT-4 understands discourse as well as humans do, using a standardized test of discourse comprehension. Participants are presented with brief … |
Thomas Shultz; Jamie Wise; Ardavan Salehi Nobandegani; | arxiv-cs.CL | 2024-03-25 |
1110 | Grammatical Vs Spelling Error Correction: An Investigation Into The Responsiveness of Transformer-based Language Models Using BART and MarianMT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This project aims to analyze the different kinds of errors that occur in text documents. |
Rohit Raju; Peeta Basa Pati; SA Gandheesh; Gayatri Sanjana Sannala; Suriya KS; | arxiv-cs.CL | 2024-03-25 |
1111 | TextGT: A Double-View Graph Transformer on Text for Aspect-Based Sentiment Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Aspect-based sentiment analysis (ABSA) is aimed at predicting the sentiment polarities of the aspects included in a sentence instead of the whole sentence itself, and is a … |
Shuo Yin; Guoqiang Zhong; | AAAI Conference on Artificial Intelligence | 2024-03-24 |
1112 | Automatic Short Answer Grading for Finnish with ChatGPT Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Automatic short answer grading (ASAG) seeks to mitigate the burden on teachers by leveraging computational methods to evaluate student-constructed text responses. Large language … |
Li-Hsin Chang; Filip Ginter; | AAAI Conference on Artificial Intelligence | 2024-03-24 |
1113 | Anomaly Detection and Localization in Optical Networks Using Vision Transformer and SOP Monitoring Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We introduce an innovative vision transformer approach to identify and precisely locate high-risk events, including fiber cut precursors, in state-of-polarization derived … |
K. ABDELLI et. al. | 2024 Optical Fiber Communications Conference and Exhibition … | 2024-03-24 |
1114 | Can Language Models Pretend Solvers? Logic Code Simulation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Subsequently, we introduce a pioneering LLM-based code simulation technique, Dual Chains of Logic (DCoL). |
MINYU CHEN et. al. | arxiv-cs.AI | 2024-03-24 |
1115 | GPT-Enabled Digital Twin Assistant for Multi-task Cooperative Management in Autonomous Optical Network Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A GPT-enabled digital twin (DT) assistant is implemented with the capabilities of intention understanding, analysis, reasoning, and complex multi-task collaboration, which … |
YAO ZHANG et. al. | 2024 Optical Fiber Communications Conference and Exhibition … | 2024-03-24 |
1116 | WaveFormer: Wavelet Transformer for Noise-Robust Video Inpainting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Video inpainting aims to fill in the missing regions of the video frames with plausible content. Benefiting from the outstanding long-range modeling capacity, the … |
Zhiliang Wu; Changchang Sun; Hanyu Xuan; Gaowen Liu; Yan Yan; | AAAI Conference on Artificial Intelligence | 2024-03-24 |
1117 | A Transformer Approach for Electricity Price Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a novel approach to electricity price forecasting (EPF) using a pure Transformer model. |
Oscar Llorente; Jose Portela; | arxiv-cs.LG | 2024-03-24 |
1118 | Using Large Language Models for OntoClean-based Ontology Refinement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the integration of Large Language Models (LLMs) such as GPT-3.5 and GPT-4 into the ontology refinement process, specifically focusing on the OntoClean methodology. |
Yihang Zhao; Neil Vetter; Kaveh Aryan; | arxiv-cs.AI | 2024-03-23 |
1119 | VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To support future research, CafeBERT is made publicly available. |
Phong Nguyen-Thuan Do; Son Quoc Tran; Phu Gia Hoang; Kiet Van Nguyen; Ngan Luu-Thuy Nguyen; | arxiv-cs.CL | 2024-03-23 |
1120 | LlamBERT: Large-scale Low-cost Data Annotation in NLP Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present LlamBERT, a hybrid approach that leverages LLMs to annotate a small subset of large, unlabeled databases and uses the results for fine-tuning transformer encoders like BERT and RoBERTa. |
Bálint Csanády; Lajos Muzsai; Péter Vedres; Zoltán Nádasdy; András Lukács; | arxiv-cs.CL | 2024-03-23 |
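A condensed sketch of this two-step recipe, with a toy `llm_label` stub standing in for the LLM annotation call and Hugging Face transformers for the fine-tuning step; the data and hyperparameters are illustrative only:

```python
# Sketch of a LlamBERT-style recipe: (1) pseudo-label a small sample with an LLM,
# (2) fine-tune a BERT encoder on those pseudo-labels.
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

def llm_label(text: str) -> int:
    # Placeholder for an LLM annotation call returning a 0/1 label.
    return int("good" in text.lower())

texts = ["good product", "bad service", "good value", "bad quality"]
labels = [llm_label(t) for t in texts]            # step 1: pseudo-labels

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = tok(texts, truncation=True, padding=True, return_tensors="pt")

class DS(torch.utils.data.Dataset):
    def __len__(self): return len(labels)
    def __getitem__(self, i):
        return {k: v[i] for k, v in enc.items()} | {"labels": torch.tensor(labels[i])}

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased",
                                                           num_labels=2)
Trainer(model=model,
        args=TrainingArguments(output_dir="out", num_train_epochs=1,
                               per_device_train_batch_size=2),
        train_dataset=DS()).train()               # step 2: fine-tune BERT
```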
1121 | Technical Report: Masked Skeleton Sequence Modeling for Learning Larval Zebrafish Behavior Latent Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this report, we introduce a novel self-supervised learning method for extracting latent embeddings from behaviors of larval zebrafish. |
Lanxin Xu; Shuo Wang; | arxiv-cs.CV | 2024-03-22 |
1122 | Can Large Language Models Explore In-context? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making. |
Akshay Krishnamurthy; Keegan Harris; Dylan J. Foster; Cyril Zhang; Aleksandrs Slivkins; | arxiv-cs.LG | 2024-03-22 |
1123 | MasonTigers at SemEval-2024 Task 8: Performance Analysis of Transformer-based Models on Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the MasonTigers entry to the SemEval-2024 Task 8 – Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection. |
SADIYA SAYARA CHOWDHURY PUSPO et. al. | arxiv-cs.CL | 2024-03-22 |
1124 | Comprehensive Evaluation and Insights Into The Use of Large Language Models in The Automation of Behavior-Driven Development Acceptance Test Formulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this manuscript, we propose a novel approach to enhance BDD practices using large language models (LLMs) to automate acceptance test generation. |
SHANTHI KARPURAPU et. al. | arxiv-cs.SE | 2024-03-22 |
1125 | GPT-Connect: Interaction Between Text-Driven Human Motion Generator and 3D Scenes in A Training-free Manner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, training a separate scene-aware motion generator in a supervised way would require a large number of motion samples to be laboriously collected and annotated across many different 3D scenes. To handle this task in a more convenient manner, in this paper we propose the novel GPT-Connect framework. |
Haoxuan Qu; Ziyan Guo; Jun Liu; | arxiv-cs.CV | 2024-03-22 |
1126 | On Zero-Shot Counterspeech Generation By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Counterspeech generation is one such key task where efforts are made to develop generative models by fine-tuning LLMs with hatespeech-counterspeech pairs, but none of these attempts explores the intrinsic properties of large language models in zero-shot settings. In this work, we present the first comprehensive analysis of the performance of four LLMs, namely GPT-2, DialoGPT, ChatGPT and FlanT5, in zero-shot settings for counterspeech generation. |
Punyajoy Saha; Aalok Agrawal; Abhik Jana; Chris Biemann; Animesh Mukherjee; | arxiv-cs.CL | 2024-03-22 |
1127 | K-Act2Emo: Korean Commonsense Knowledge Graph for Indirect Emotional Expression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In many literary texts, emotions are indirectly conveyed through descriptions of actions, facial expressions, and appearances, necessitating emotion inference for narrative understanding. In this paper, we introduce K-Act2Emo, a Korean commonsense knowledge graph (CSKG) comprising 1,900 indirect emotional expressions and the emotions inferable from them. |
Kyuhee Kim; Surin Lee; Sangah Lee; | arxiv-cs.CL | 2024-03-21 |
1128 | LLM-based Extraction of Contradictions from Patents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper goes one step further, as it presents a method to extract TRIZ contradictions from patent texts based on Prompt Engineering using a generative Large Language Model (LLM), namely OpenAI’s GPT-4. |
Stefan Trapp; Joachim Warschat; | arxiv-cs.CL | 2024-03-21 |
1129 | Ax-to-Grind Urdu: Benchmark Dataset for Urdu Fake News Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we curate and contribute the first largest publicly available dataset for Urdu FND, Ax-to-Grind Urdu, to bridge the identified gaps and limitations of existing Urdu datasets in the literature. |
Sheetal Harris; Jinshuo Liu; Hassan Jalil Hadi; Yue Cao; | arxiv-cs.CL | 2024-03-20 |
1130 | Extracting Emotion Phrases from Tweets Using BART Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we applied an approach to sentiment analysis based on a question-answering framework. |
Mahdi Rezapour; | arxiv-cs.CL | 2024-03-20 |
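The question-answering framing can be illustrated with the Hugging Face `pipeline` API; the default extractive QA model below stands in for the fine-tuned BART used in the paper:

```python
# Sketch of the QA framing for emotion-phrase extraction: ask an extractive QA
# model which span of the tweet expresses the emotion.
from transformers import pipeline

qa = pipeline("question-answering")   # default model stands in for the paper's BART
tweet = "Missed my flight and spent six hours at the gate, absolutely furious."
out = qa(question="What phrase expresses the author's anger?", context=tweet)
print(out["answer"])                  # an extracted span such as "absolutely furious"
```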
1131 | Open Access NAO (OAN): A ROS2-based Software Framework for HRI Applications with The NAO Robot Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new software framework for HRI experimentation with the sixth version of the common NAO robot produced by the United Robotics Group. |
Antonio Bono; Kenji Brameld; Luigi D’Alfonso; Giuseppe Fedele; | arxiv-cs.RO | 2024-03-20 |
1132 | AUD-TGN: Advancing Action Unit Detection with Temporal Convolution and GPT-2 in Wild Audiovisual Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional methods for integrating such multimodal information often stumble, leading to less-than-ideal outcomes in the task of facial action unit detection. To overcome these shortcomings, we propose a novel approach utilizing audio-visual multimodal data. |
JUN YU et. al. | arxiv-cs.CV | 2024-03-20 |
1133 | Retina Vision Transformer (RetinaViT): Introducing Scaled Patches Into Vision Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Humans see low and high spatial frequency components at the same time, and combine the information from both to form a visual scene. Drawing on this neuroscientific inspiration, we propose an altered Vision Transformer architecture where patches from scaled down versions of the input image are added to the input of the first Transformer Encoder layer. |
Yuyang Shu; Michael E. Bain; | arxiv-cs.CV | 2024-03-20 |
1134 | Generating Automatic Feedback on UI Mockups with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the potential of using large language models for automatic feedback. |
Peitong Duan; Jeremy Warner; Yang Li; Bjoern Hartmann; | arxiv-cs.HC | 2024-03-19 |
1135 | Automated Data Curation for Robust Language Model Fine-Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CLEAR (Confidence-based LLM Evaluation And Rectification), an automated data curation pipeline for instruction tuning datasets that can be used with any LLM and fine-tuning procedure. |
Jiuhai Chen; Jonas Mueller; | arxiv-cs.CL | 2024-03-19 |
1136 | A Hyperspectral Unmixing Model Using Convolutional Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Sreejam Muraleedhara Bhakthan; L. Agilandeeswari; | Earth Sci. Informatics | 2024-03-19 |
1137 | Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a new configuration for encoder-decoder models that improves efficiency on structured output and decomposable tasks where multiple outputs are required for a single shared input. |
BO-RU LU et. al. | arxiv-cs.CL | 2024-03-19 |
1138 | TT-BLIP: Enhancing Fake News Detection Using BLIP and Tri-Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an end-to-end model called TT-BLIP that applies the bootstrapping language-image pretraining for unified vision-language understanding and generation (BLIP) for three types of information: BERT and BLIP-Txt for text, ResNet and BLIP-Img for images, and bidirectional BLIP encoders for multimodal information. |
Eunjee Choi; Jong-Kook Kim; | arxiv-cs.LG | 2024-03-19 |
1139 | LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The challenge is that information entropy may be a suboptimal compression metric: (i) it only leverages unidirectional context and may fail to capture all essential information needed for prompt compression; (ii) it is not aligned with the prompt compression objective. To address these issues, we propose a data distillation procedure to derive knowledge from an LLM to compress prompts without losing crucial information, and meantime, introduce an extractive text compression dataset. |
ZHUOSHI PAN et. al. | arxiv-cs.CL | 2024-03-19 |
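Assuming the authors' published `llmlingua` package keeps the interface documented in its README, usage looks roughly like this (the model name and keyword arguments may have changed since writing):

```python
# Hedged usage sketch of LLMLingua-2 via the llmlingua package.
from llmlingua import PromptCompressor

compressor = PromptCompressor(
    model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank",
    use_llmlingua2=True,   # select the LLMLingua-2 token-classification compressor
)
long_prompt = "You are a helpful assistant. " * 40   # stand-in for a long prompt
result = compressor.compress_prompt(long_prompt, rate=0.33)
print(result["compressed_prompt"])
```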
1140 | Navigating Compiler Errors with AI Assistance — A Study of GPT Hints in An Introductory Programming Course Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We examined the efficacy of AI-assisted learning in an introductory programming course at the university level by using a GPT-4 model to generate personalized hints for compiler errors within a platform for automated assessment of programming assignments. |
Maciej Pankiewicz; Ryan S. Baker; | arxiv-cs.SE | 2024-03-19 |
1141 | Navigating Compiler Errors with AI Assistance – A Study of GPT Hints in An Introductory Programming Course Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We examined the efficacy of AI-assisted learning in an introductory programming course at the university level by using a GPT-4 model to generate personalized hints for compiler … |
Maciej Pankiewicz; Ryan S. Baker; | Proceedings of the 2024 on Innovation and Technology in … | 2024-03-19 |
1142 | End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a neuro-symbolic framework for jointly learning structured states and symbolic policies, whose key idea is to distill the vision foundation model into an efficient perception module and refine it during policy learning. |
LIRUI LUO et. al. | arxiv-cs.AI | 2024-03-19 |
1143 | CICLe: Conformal In-Context Learning for Largescale Multi-Class Food Risk Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We describe the dataset and benchmark naive, traditional, and Transformer models. |
Korbinian Randl; John Pavlopoulos; Aron Henriksson; Tony Lindgren; | arxiv-cs.CL | 2024-03-18 |
1144 | GPT-4 As Evaluator: Evaluating Large Language Models on Pest Management in Agriculture Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the rapidly evolving field of artificial intelligence (AI), the application of large language models (LLMs) in agriculture, particularly in pest management, remains nascent. We … |
SHANGLONG YANG et. al. | ArXiv | 2024-03-18 |
1145 | Shifting The Lens: Detecting Malicious Npm Packages Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The goal of this study is to assist security analysts in detecting malicious packages through the empirical study of using Large Language Models (LLMs) to detect malicious code in the npm ecosystem. |
Nusrat Zahan; Philipp Burckhardt; Mikola Lysenko; Feross Aboukhadijeh; Laurie Williams; | arxiv-cs.CR | 2024-03-18 |
1146 | Collage Prompting: Budget-Friendly Visual Recognition with GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite GPT-4V’s impressive capabilities, the financial cost associated with its inference presents a substantial barrier to its wide use. To address this challenge, our work introduces Collage Prompting, a budget-friendly prompting approach that concatenates multiple images into a single visual input. |
Siyu Xu; Yunke Wang; Daochang Liu; Chang Xu; | arxiv-cs.CV | 2024-03-18 |
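The core collage operation is simple image tiling; here is a minimal PIL sketch that packs several images onto one canvas for a single GPT-4V call (the paper additionally optimizes the arrangement, which this sketch does not attempt):

```python
# Minimal sketch of the collage idea: tile several images onto one canvas so a
# single multimodal call can cover all of them. Grid layout is illustrative.
from PIL import Image

def collage(paths: list[str], cols: int = 2, cell: int = 224) -> Image.Image:
    rows = -(-len(paths) // cols)                     # ceiling division
    canvas = Image.new("RGB", (cols * cell, rows * cell), "white")
    for i, p in enumerate(paths):
        img = Image.open(p).convert("RGB").resize((cell, cell))
        canvas.paste(img, ((i % cols) * cell, (i // cols) * cell))
    return canvas

collage(["a.jpg", "b.jpg", "c.jpg", "d.jpg"]).save("collage.jpg")
```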
1147 | Prompt-based and Fine-tuned GPT Models for Context-Dependent and -Independent Deductive Coding in Social Annotation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: GPT has demonstrated impressive capabilities in executing various natural language processing (NLP) and reasoning tasks, showcasing its potential for deductive coding in social … |
CHENYU HOU et. al. | Proceedings of the 14th Learning Analytics and Knowledge … | 2024-03-18 |
1148 | How Far Are We on The Decision-Making of LLMs? Evaluating LLMs’ Gaming Ability in Multi-Agent Environments IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce GAMA(γ)-Bench, a new framework for evaluating LLMs’ Gaming Ability in Multi-Agent environments. |
JEN-TSE HUANG et. al. | arxiv-cs.AI | 2024-03-18 |
1149 | Evaluating Named Entity Recognition: A Comparative Analysis of Mono- and Multilingual Transformer Models on A Novel Brazilian Corporate Earnings Call Transcripts Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our study aimed to evaluate their performance on a financial Named Entity Recognition (NER) task and determine the computational requirements for fine-tuning and inference. |
Ramon Abilio; Guilherme Palermo Coelho; Ana Estela Antunes da Silva; | arxiv-cs.CL | 2024-03-18 |
1150 | AI-Generated Text Detector for Arabic Language Using Encoder-Based Transformer Architecture Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The effectiveness of existing AI detectors is notably hampered when processing Arabic texts. This study introduces a novel AI text classifier designed specifically for Arabic, … |
Hamed Alshammari; Ahmed El-Sayed; Khaled Elleithy; | Big Data Cogn. Comput. | 2024-03-18 |
1151 | An Empirical Study on JIT Defect Prediction Based on BERT-style Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we perform a systematic empirical study to understand the impact of the settings of the fine-tuning process on BERT-style pre-trained model for JIT defect prediction. |
Yuxiang Guo; Xiaopeng Gao; Bo Jiang; | arxiv-cs.SE | 2024-03-17 |
1152 | Embracing The Generative AI Revolution: Advancing Tertiary Education in Cybersecurity with GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigated the impact of GPTs, specifically ChatGPT, on tertiary education in cybersecurity, and provided recommendations for universities to adapt their curricula to meet the evolving needs of the industry. |
Raza Nowrozy; David Jam; | arxiv-cs.CY | 2024-03-17 |
1153 | Reasoning in Transformers – Mitigating Spurious Correlations and Reasoning Shortcuts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we investigate to what extent transformers can be trained to a) approximate reasoning in propositional logic while b) avoiding known reasoning shortcuts via spurious correlations in the training data. |
Daniel Enström; Viktor Kjellberg; Moa Johansson; | arxiv-cs.LG | 2024-03-17 |
1154 | Using An LLM to Turn Sign Spottings Into Spoken Language Sentences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a hybrid SLT approach, Spotter+GPT, that utilizes a sign spotter and a powerful Large Language Model (LLM) to improve SLT performance. |
Ozge Mercanoglu Sincan; Necati Cihan Camgoz; Richard Bowden; | arxiv-cs.CV | 2024-03-15 |
1155 | ATOM: Asynchronous Training of Massive Models for Deep Learning in A Decentralized Environment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce ATOM, a resilient distributed training framework designed for asynchronous training of vast models in a decentralized setting using cost-effective hardware, including consumer-grade GPUs and Ethernet. |
Xiaofeng Wu; Jia Rao; Wei Chen; | arxiv-cs.DC | 2024-03-15 |
1156 | Evaluating LLMs for Gender Disparities in Notable Persons Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study examines the use of Large Language Models (LLMs) for retrieving factual information, addressing concerns over their propensity to produce factually incorrect, hallucinated responses or to decline to answer a prompt at all. |
Lauren Rhue; Sofie Goethals; Arun Sundararajan; | arxiv-cs.CL | 2024-03-14 |
1157 | AI on AI: Exploring The Utility of GPT As An Expert Annotator of AI Publications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our results indicate that with effective prompt engineering, chatbots can be used as reliable data annotators even where subject-area expertise is required. To evaluate the utility of chatbot-annotated datasets on downstream classification tasks, we train a new classifier on GPT-labeled data and compare its performance to the arXiv-trained model. |
Autumn Toney-Wails; Christian Schoeberl; James Dunham; | arxiv-cs.CL | 2024-03-14 |
1158 | Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Targeting VL PEFT tasks, we propose a family of operations, called routing functions, to enhance VL alignment in the low-rank bottlenecks. |
Tingyu Qu; Tinne Tuytelaars; Marie-Francine Moens; | arxiv-cs.CV | 2024-03-14 |
1159 | Sabiá-2: A New Generation of Portuguese Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Sabiá-2, a family of large language models trained on Portuguese texts. |
Thales Sales Almeida; Hugo Abonizio; Rodrigo Nogueira; Ramon Pires; | arxiv-cs.CL | 2024-03-14 |
1160 | ViTCN: Vision Transformer Contrastive Network For Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Zhang et al. proposed a dataset called RAVEN that can be used to test the abstract reasoning ability of machine learning models. In this paper, we propose the Vision Transformer Contrastive Network (ViTCN), which builds on the Contrastive Perceptual Inference network (CoPiNet), a benchmark for permutation-invariant models on Raven Progressive Matrices that incorporates contrast effects from psychology, cognition, and education, and extends this foundation by leveraging the cutting-edge Vision Transformer architecture. |
Bo Song; Yuanhao Xu; Yichao Wu; | arxiv-cs.CV | 2024-03-14 |
1161 | FBPT: A Fully Binary Point Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel Fully Binary Point Cloud Transformer (FBPT) model which has the potential to be widely applied and expanded in the fields of robotics and mobile devices. |
Zhixing Hou; Yuzhang Shang; Yan Yan; | arxiv-cs.CV | 2024-03-14 |
1162 | Evaluating The Application of Large Language Models to Generate Feedback in Programming Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study investigates the application of large language models, specifically GPT-4, to enhance programming education. The research outlines the design of a web application that … |
Sven Jacobs; Steffen Jaschke; | 2024 IEEE Global Engineering Education Conference (EDUCON) | 2024-03-13 |
1163 | A Continued Pretrained LLM Approach for Automatic Medical Note Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce HEAL, the first continuously trained 13B LLaMA2-based LLM that is purpose-built for medical conversations and measured on automated scribing. |
DONG YUAN et. al. | arxiv-cs.CL | 2024-03-13 |
1164 | GPT, Ontology, and CAABAC: A Tripartite Personalized Access Control Model Anchored By Compliance, Context and Attribute Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents the GPT-Onto-CAABAC framework, integrating Generative Pretrained Transformer (GPT), medical-legal ontologies and Context-Aware Attribute-Based Access Control (CAABAC) to enhance EHR access security. |
Raza Nowrozy; Khandakar Ahmed; Hua Wang; | arxiv-cs.CY | 2024-03-13 |
1165 | Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare four of the currently most relevant large, web-crawled corpora (CC100, MaCoCu, mC4 and OSCAR) across eleven lower-resourced European languages. |
RIK VAN NOORD et. al. | arxiv-cs.CL | 2024-03-13 |
1166 | Pre-trained Low-light Image Enhancement Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Low‐light image enhancement is a longstanding challenge in low‐level vision, as images captured in low‐light conditions often suffer from significant aesthetic quality flaws. … |
Jingyao Zhang; Shijie Hao; Yuan Rao; | IET Image Process. | 2024-03-12 |
1167 | Pose Pattern Mining Using Transformer for Motion Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Seo-El Lee; Hyun Yoo; Kyungyong Chung; | Appl. Intell. | 2024-03-12 |
1168 | The Future of Document Indexing: GPT and Donut Revolutionize Table of Content Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Industrial projects rely heavily on lengthy, complex specification documents, making tedious manual extraction of structured information a major bottleneck. This paper introduces an innovative approach to automate this process, leveraging the capabilities of two cutting-edge AI models: Donut, a model that extracts information directly from scanned documents without OCR, and OpenAI GPT-3.5 Turbo, a robust large language model. |
Degaga Wolde Feyisa; Haylemicheal Berihun; Amanuel Zewdu; Mahsa Najimoghadam; Marzieh Zare; | arxiv-cs.IR | 2024-03-12 |
1169 | SIFiD: Reassess Summary Factual Inconsistency Detection with LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we reassess summary inconsistency detection with LLMs, comparing the performances of GPT-3.5 and GPT-4. |
JIUDING YANG et. al. | arxiv-cs.CL | 2024-03-12 |
1170 | In-context Learning Enables Multimodal Large Language Models to Classify Cancer Pathology Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In language processing, in-context learning provides an alternative, where models learn from within prompts, bypassing the need for parameter updates. |
DYKE FERBER et. al. | arxiv-cs.CV | 2024-03-12 |
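A hedged sketch of image-based in-context learning with a multimodal chat API: labelled example images precede the query image in one prompt, so the model picks up the classification rule without any parameter updates. The model name and message format follow the OpenAI chat completions API and are illustrative, not the paper's exact setup:

```python
# Few-shot image classification in context: example images plus labels, then the
# query image, all in a single multimodal prompt.
from openai import OpenAI

client = OpenAI()

def fewshot_classify(example_urls: list[str], example_labels: list[str],
                     query_url: str) -> str:
    content = []
    for url, label in zip(example_urls, example_labels):
        content += [{"type": "image_url", "image_url": {"url": url}},
                    {"type": "text", "text": f"Label: {label}"}]
    content += [{"type": "image_url", "image_url": {"url": query_url}},
                {"type": "text", "text": "Label:"}]
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": content}])
    return resp.choices[0].message.content
```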
1171 | GPT-generated Text Detection: Benchmark Dataset and Tensor-based Detection Method Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present GPT Reddit Dataset (GRiD), a novel Generative Pretrained Transformer (GPT)-generated text detection dataset designed to assess the performance of detection models in identifying generated responses from ChatGPT. |
Zubair Qazi; William Shiao; Evangelos E. Papalexakis; | arxiv-cs.CL | 2024-03-12 |
1172 | Rethinking ASTE: A Minimalist Tagging Scheme Alongside Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing approaches to ASTE often complicate the task with additional structures or external data. In this research, we propose a novel tagging scheme and employ a contrastive learning approach to mitigate these challenges. |
Qiao Sun; Liujia Yang; Minghao Ma; Nanyang Ye; Qinying Gu; | arxiv-cs.CL | 2024-03-12 |
1173 | Language, Camera, Autonomy! Prompt-engineered Robot Control for Rapidly Evolving Deployment Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The Context-observant LLM-Enabled Autonomous Robots (CLEAR) platform offers a general solution for large language model (LLM)-enabled robot autonomy. CLEAR-controlled robots use … |
JACOB P. MACDONALD et. al. | Companion of the 2024 ACM/IEEE International Conference on … | 2024-03-11 |
1174 | Stealing Part of A Production Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI’s ChatGPT or Google’s PaLM-2. |
NICHOLAS CARLINI et. al. | arxiv-cs.CR | 2024-03-11 |
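One core observation behind such attacks is that logit vectors from a model with hidden size h live in an h-dimensional subspace of the vocabulary space, so their singular values collapse after index h. The toy numpy simulation below illustrates this; a real attack would query the production API instead of a simulated projection:

```python
# Toy illustration: the rank of a matrix of logit vectors reveals the hidden size.
import numpy as np

rng = np.random.default_rng(0)
h, V, n = 64, 1000, 200                  # hidden dim, vocab size, num queries
W = rng.normal(size=(V, h))              # unknown output projection
hidden_states = rng.normal(size=(n, h))  # one hidden state per "query"
logits = hidden_states @ W.T             # what the API would return

s = np.linalg.svd(logits, compute_uv=False)
est_h = int((s > 1e-6 * s[0]).sum())     # count non-negligible singular values
print(f"estimated hidden size: {est_h}") # -> 64
```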
1175 | Development of A Reliable and Accessible Caregiving Language Model (CaLM) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we focused on caregivers of individuals with Alzheimer’s Disease Related Dementias. |
BAMBANG PARMANTO et. al. | arxiv-cs.CL | 2024-03-11 |
1176 | Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: These are then fed into another set of transformer encoder layers to learn inter-chunk representations. We analyze the adaptability of Large Language Models (LLMs) with multi-billion parameters (GPT-Neo and GPT-J) with the hierarchical framework of MESc and compare them with their standalone performance on legal texts. |
Nishchal Prasad; Mohand Boughanem; Taoufiq Dkaki; | arxiv-cs.CL | 2024-03-11 |
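A sketch of the hierarchical encoding pattern the highlight alludes to: encode each chunk of a long document with BERT, then run a small transformer encoder over the sequence of chunk embeddings to capture inter-chunk structure. Sizes and layer counts here are illustrative, not MESc's actual configuration:

```python
# Hierarchical encoding sketch: per-chunk BERT embeddings, then an inter-chunk
# transformer encoder over the chunk sequence.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def encode_chunks(document: str, chunk_words: int = 100) -> torch.Tensor:
    words = document.split()
    chunks = [" ".join(words[i:i + chunk_words])
              for i in range(0, len(words), chunk_words)]
    enc = tok(chunks, truncation=True, padding=True, return_tensors="pt")
    with torch.no_grad():
        cls = bert(**enc).last_hidden_state[:, 0]   # one [CLS] vector per chunk
    return cls.unsqueeze(0)                          # (1, num_chunks, 768)

inter_chunk = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=768, nhead=8, batch_first=True),
    num_layers=2)
doc_repr = inter_chunk(encode_chunks("long legal text " * 400)).mean(dim=1)
```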
1177 | QuantTune: Optimizing Model Quantization with Adaptive Outlier-Driven Fine Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, significant challenges arise during post-training linear quantization, leading to noticeable reductions in inference accuracy. Our study focuses on uncovering the underlying causes of these accuracy drops and proposing a quantization-friendly fine-tuning method, QuantTune. |
JIUN-MAN CHEN et. al. | arxiv-cs.CV | 2024-03-11 |
1178 | JayBot — Aiding University Students and Admission with An LLM-based Chatbot Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This demo paper presents JayBot, an LLM-based chatbot system aimed at enhancing the user experience of prospective and current students, faculty, and staff at a UK university. The … |
Julius Odede; Ingo Frommholz; | Proceedings of the 2024 Conference on Human Information … | 2024-03-10 |
1179 | LLMs Still Can’t Avoid Instanceof: An Investigation Into GPT-3.5, GPT-4 and Bard’s Capacity to Handle Object-Oriented Programming Assignments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we experimented with three prominent LLMs – GPT-3.5, GPT-4, and Bard – to solve real-world OOP exercises used in educational settings, subsequently validating their solutions using an Automatic Assessment Tool (AAT). |
Bruno Pereira Cipriano; Pedro Alves; | arxiv-cs.SE | 2024-03-10 |
1180 | S3L: Spectrum Transformer for Self-Supervised Learning in Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the realm of Earth observation and remote sensing data analysis, the advancement of hyperspectral imaging (HSI) classification technology is of paramount importance. … |
Hufeng Guo; Wenyi Liu; | Remote. Sens. | 2024-03-10 |
1181 | Enhancing Human Annotation: Leveraging Large Language Models and Efficient Batch Processing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) are capable of assessing document and query characteristics, including relevance, and are now being used for a variety of different classification … |
Oleg Zendel; J. Culpepper; Falk Scholer; Paul Thomas; | Proceedings of the 2024 Conference on Human Information … | 2024-03-10 |
1182 | GPT As Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Multimodal large language models (MLLMs) are designed to process and integrate information from multiple sources, such as text, speech, images, and videos. |
HAO LU et. al. | arxiv-cs.CV | 2024-03-09 |
1183 | A Dataset and Benchmark for Hospital Course Summarization with Adapted Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel pre-processed dataset, the MIMIC-IV-BHC, encapsulating clinical note and brief hospital course (BHC) pairs to adapt LLMs for BHC synthesis. |
ASAD AALI et. al. | arxiv-cs.CL | 2024-03-08 |
1184 | How Well Do Multi-modal LLMs Interpret CT Scans? An Auto-Evaluation Framework for Analyses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this is challenging mainly due to the scarcity of adequate datasets and reference standards for evaluation. This study aims to bridge this gap by introducing a novel evaluation framework, named “GPTRadScore”. |
QINGQING ZHU et. al. | arxiv-cs.AI | 2024-03-08 |
1185 | To Err Is Human, But Llamas Can Learn It Too Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study explores enhancing grammatical error correction (GEC) through artificial error generation (AEG) using language models (LMs). |
Agnes Luhtaru; Taido Purason; Martin Vainikko; Maksym Del; Mark Fishel; | arxiv-cs.CL | 2024-03-08 |
1186 | Electron Density-based GPT for Optimization and Suggestion of Host–guest Binders Related Papers Related Patents Related Grants Related Venues Related Experts View |
JUAN MANUEL PARRILLA GUTIERREZ et. al. | Nature Computational Science | 2024-03-08 |
1187 | Will GPT-4 Run DOOM? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that GPT-4’s reasoning and planning capabilities extend to the 1993 first-person shooter Doom. |
Adrian de Wynter; | arxiv-cs.CL | 2024-03-08 |
1188 | RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We explore how iteratively revising a chain of thoughts with the help of information retrieval significantly improves large language models’ reasoning and generation ability in long-horizon generation tasks, while hugely mitigating hallucination. |
ZIHAO WANG et. al. | arxiv-cs.CL | 2024-03-08 |
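Schematically, retrieval-augmented thought revision alternates drafting and evidence-conditioned rewriting. In the stub-based sketch below, `llm` and `retrieve` are placeholders for any chat API and search index, and the prompting protocol is illustrative rather than the paper's:

```python
# Schematic sketch of retrieval-augmented thought revision in the spirit of RAT.
def llm(prompt: str) -> str:
    return f"[model output for: {prompt[:40]}...]"   # stub for a chat API

def retrieve(query: str, k: int = 3) -> list[str]:
    return [f"[doc about {query[:30]}]"] * k         # stub for a search index

def rat(task: str, steps: int = 3) -> str:
    draft = llm(f"Think step by step and draft a plan for: {task}")
    for _ in range(steps):                           # revise the draft iteratively
        evidence = "\n".join(retrieve(draft))
        draft = llm(f"Task: {task}\nDraft: {draft}\n"
                    f"Evidence:\n{evidence}\nRevise the draft using the evidence.")
    return draft

print(rat("write a tutorial on binary search"))
```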
1189 | The Impact of Quantization on The Robustness of Transformer-based Text Classifiers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the effect of quantization on the robustness of Transformer-based models. |
Seyed Parsa Neshaei; Yasaman Boreshban; Gholamreza Ghassem-Sani; Seyed Abolghasem Mirroshandel; | arxiv-cs.CL | 2024-03-08 |
1190 | An In-depth Evaluation of GPT-4 in Sentence Simplification with Error-based Human Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We design an error-based human annotation framework to assess the GPT-4’s simplification capabilities. |
Xuanxin Wu; Yuki Arase; | arxiv-cs.CL | 2024-03-07 |
1191 | Using GPT-4 to Provide Tiered, Formative Code Feedback Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) have shown promise in generating sensible code explanation and feedback in programming exercises. In this experience report, we discuss the process of … |
Ha Nguyen; Vicki Allan; | Proceedings of the 55th ACM Technical Symposium on Computer … | 2024-03-07 |
1192 | A Large Scale RCT on Effective Error Messages in CS1 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, we evaluate the most effective error message types through a large-scale randomized controlled trial conducted in an open-access, online introductory computer … |
Sierra Wang; John C. Mitchell; C. Piech; | Proceedings of the 55th ACM Technical Symposium on Computer … | 2024-03-07 |
1193 | Feedback-Generation for Programming Exercises With GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Ever since Large Language Models (LLMs) and related applications have become broadly available, several studies investigated their potential for assisting educators and supporting students in higher education. |
Imen Azaiz; Natalie Kiesler; Sven Strickroth; | arxiv-cs.AI | 2024-03-07 |
1194 | Federated Recommendation Via Hybrid Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose GPT-FedRec, a federated recommendation framework leveraging ChatGPT and a novel hybrid Retrieval Augmented Generation (RAG) mechanism. |
Huimin Zeng; Zhenrui Yue; Qian Jiang; Dong Wang; | arxiv-cs.IR | 2024-03-07 |
1195 | Rapidly Developing High-quality Instruction Data and Evaluation Benchmark for Large Language Models with Minimal Human Effort: A Case Study on Japanese Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Instead of following the popular practice of directly translating existing English resources into Japanese (e.g., Japanese-Alpaca), we propose an efficient self-instruct method based on GPT-4. |
YIKUN SUN et. al. | arxiv-cs.CL | 2024-03-06 |
1196 | Assessing The Aesthetic Evaluation Capabilities of GPT-4 with Vision: Insights from Group and Individual Assessments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, it has been recognized that large language models demonstrate high performance on various intellectual tasks. |
Yoshia Abe; Tatsuya Daikoku; Yasuo Kuniyoshi; | arxiv-cs.AI | 2024-03-06 |
1197 | Whodunit: Classifying Code As Human Authored or GPT-4 Generated — A Case Study on CodeChef Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study shows that code stylometry is a promising approach for distinguishing between GPT-4 generated code and human-authored code. |
Oseremen Joy Idialu; Noble Saji Mathews; Rungroj Maipradit; Joanne M. Atlee; Mei Nagappan; | arxiv-cs.SE | 2024-03-06 |
1198 | Probabilistic Topic Modelling with Transformer Representations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose the Transformer-Representation Neural Topic Model (TNTM), which combines the benefits of topic representations in transformer-based embedding spaces and probabilistic modelling. |
Arik Reuter; Anton Thielmann; Christoph Weisser; Benjamin Säfken; Thomas Kneib; | arxiv-cs.LG | 2024-03-06 |
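A deliberately simplified analogue of topic modelling in a transformer embedding space: embed documents with a sentence encoder and fit a Gaussian mixture so that each component plays the role of a topic. TNTM itself uses a more elaborate probabilistic model; this sketch only conveys the embedding-space intuition:

```python
# Simplified analogue of embedding-space topic modelling (not TNTM's exact model).
from sentence_transformers import SentenceTransformer
from sklearn.mixture import GaussianMixture

docs = ["the striker scored twice", "parliament passed the budget",
        "the goalkeeper saved a penalty", "the senate debated the tax bill",
        "the coach praised the defence", "the minister proposed new spending"]
emb = SentenceTransformer("all-MiniLM-L6-v2").encode(docs)

# Each mixture component acts as a soft "topic" in embedding space.
gmm = GaussianMixture(n_components=2, covariance_type="diag",
                      random_state=0).fit(emb)
for doc, topic in zip(docs, gmm.predict(emb)):
    print(topic, doc)   # documents split into sports vs. politics clusters
```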
1199 | Designing Informative Metrics for Few-Shot Example Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a complexity-based prompt selection approach for sequence tagging tasks. |
Rishabh Adiga; Lakshminarayanan Subramanian; Varun Chandrasekaran; | arxiv-cs.CL | 2024-03-06 |
1200 | Can Large Language Models Do Analytical Reasoning? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the analytical reasoning abilities of cutting-edge Large Language Models on sports data. |
YEBOWEN HU et. al. | arxiv-cs.CL | 2024-03-06 |
1201 | Japanese-English Sentence Translation Exercises Dataset for Automatic Grading Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the task of automatic assessment of Sentence Translation Exercises (STEs), which have been used in the early stage of L2 language learning. |
NAOKI MIURA et. al. | arxiv-cs.CL | 2024-03-05 |
1202 | AI Insights: A Case Study on Utilizing ChatGPT Intelligence for Research Paper Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper discusses the effectiveness of leveraging Chatbot: Generative Pre-trained Transformer (ChatGPT) versions 3.5 and 4 for analyzing research papers for effective writing of scientific literature surveys. |
Anjalee De Silva; Janaka L. Wijekoon; Rashini Liyanarachchi; Rrubaa Panchendrarajan; Weranga Rajapaksha; | arxiv-cs.AI | 2024-03-05 |
1203 | An Empirical Study of LLM-as-a-Judge for LLM Evaluation: Fine-tuned Judge Model Is Not A General Substitute for GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, there has been a growing trend of utilizing Large Language Model (LLM) to evaluate the quality of other LLMs. |
HUI HUANG et. al. | arxiv-cs.CL | 2024-03-05 |
1204 | Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled By GPT-4 for Enhanced Interpretability and Public Engagement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, inquiring into and understanding socio-cultural and institutional factors requires complex techniques, which often hinders the public’s understanding of flood risks. To overcome these challenges, our study introduces an innovative solution: a customized AI Assistant powered by the GPT-4 Large Language Model. |
Rafaela Martelo; Ruo-Qian Wang; | arxiv-cs.AI | 2024-03-05 |
1205 | Design2Code: How Far Are We From Automating Front-End Engineering? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This can enable a new paradigm of front-end development, in which multimodal LLMs might directly convert visual designs into code implementations. In this work, we formalize this as a Design2Code task and conduct comprehensive benchmarking. |
Chenglei Si; Yanzhe Zhang; Zhengyuan Yang; Ruibo Liu; Diyi Yang; | arxiv-cs.CL | 2024-03-05 |
1206 | InjectTST: A Transformer Method of Injecting Global Information Into Independent Channels for Long Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this problem, this paper proposes InjectTST, a method for injecting global information into channel-independent Transformers for long time series forecasting. |
CE CHI et. al. | arxiv-cs.LG | 2024-03-05 |
1207 | JMI at SemEval 2024 Task 3: Two-step Approach for Multimodal ECAC Using In-context Learning with GPT and Instruction-tuned Llama Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents our system development for SemEval-2024 Task 3: The Competition of Multimodal Emotion Cause Analysis in Conversations. |
Mohammed Abbas Ansari; Chandni Saxena; Tanvir Ahmad; | arxiv-cs.CL | 2024-03-05 |
1208 | Evolution Transformer: In-Context Evolutionary Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: An alternative promising approach is to leverage data and directly discover powerful optimization principles via meta-optimization. In this work, we follow such a paradigm and introduce Evolution Transformer, a causal Transformer architecture, which can flexibly characterize a family of Evolution Strategies. |
Robert Tjarko Lange; Yingtao Tian; Yujin Tang; | arxiv-cs.AI | 2024-03-05 |
1209 | Predicting Learning Performance with Large Language Models: A Study in Adult Literacy Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Intelligent Tutoring Systems (ITSs) have significantly enhanced adult literacy training, a key factor for societal participation, employment opportunities, and lifelong learning. … |
LIANG ZHANG et. al. | ArXiv | 2024-03-04 |
1210 | Using LLMs for The Extraction and Normalization of Product Attribute Values Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Web Data Commons – Product Attribute Value Extraction (WDC-PAVE) benchmark dataset for our experiments. |
Alexander Brinkmann; Nick Baumann; Christian Bizer; | arxiv-cs.CL | 2024-03-04 |
1211 | What Is Missing in Multilingual Visual Reasoning and How to Fix It Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: NLP models today strive for supporting multiple languages and modalities, improving accessibility for diverse users. In this paper, we evaluate their multilingual, multimodal … |
Yueqi Song; Simran Khanuja; Graham Neubig; | ArXiv | 2024-03-03 |
1212 | An LPDDR-based CXL-PNM Platform for TCO-efficient Inference of Transformer-based Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer-based large language models (LLMs) such as Generative Pre-trained Transformer (GPT) have become popular due to their remarkable performance across diverse … |
SANGSOO PARK et. al. | 2024 IEEE International Symposium on High-Performance … | 2024-03-02 |
1213 | Analysis of Privacy Leakage in Federated Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: With the rapid adoption of Federated Learning (FL) as the training and tuning protocol for applications utilizing Large Language Models (LLMs), recent research highlights the need … |
Minh N. Vu; Truc D. T. Nguyen; Tre’ R. Jeter; My T. Thai; | International Conference on Artificial Intelligence and … | 2024-03-02 |
1214 | LM4OPT: Unveiling The Potential of Large Language Models in Formulating Mathematical Optimization Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the rapidly evolving field of natural language processing, the translation of linguistic descriptions into mathematical formulations of optimization problems presents a formidable challenge, demanding intricate understanding and processing capabilities from Large Language Models (LLMs). This study compares prominent LLMs, including GPT-3.5, GPT-4, and Llama-2-7b, in zero-shot and one-shot settings for this task. |
Tasnim Ahmed; Salimur Choudhury; | arxiv-cs.CL | 2024-03-02 |
1215 | LAB: Large-Scale Alignment for ChatBots IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces LAB (Large-scale Alignment for chatBots), a novel methodology designed to overcome the scalability challenges in the instruction-tuning phase of large language model (LLM) training. |
SHIVCHANDER SUDALAIRAJ et. al. | arxiv-cs.CL | 2024-03-01 |
1216 | WaterFormer: A Global–Local Transformer for Underwater Image Enhancement With Environment Adaptor Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Underwater image enhancement (UIE) is crucial for high-level vision in underwater robotics. While convolutional neural networks (CNNs) have made significant achievements in UIE, … |
JUNJIE WEN et. al. | IEEE Robotics & Automation Magazine | 2024-03-01 |
1217 | Spikeformer: Training High-performance Spiking Neural Network with Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yudong Li; Yunlin Lei; Xu Yang; | Neurocomputing | 2024-03-01 |
1218 | Multi-modal Person Re-identification Based on Transformer Relational Regularization Related Papers Related Patents Related Grants Related Venues Related Experts View |
XIANGTIAN ZHENG et. al. | Inf. Fusion | 2024-03-01 |
1219 | Driver Distraction Detection Using Semi-supervised Lightweight Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Adam A.Q. Mohammed; Xin Geng; Jing Wang; Zafar Ali; | Eng. Appl. Artif. Intell. | 2024-03-01 |
1220 | Transformer Based on The Prediction of Psoriasis Severity Treatment Response Related Papers Related Patents Related Grants Related Venues Related Experts View |
Cho-I Moon; Eun Bin Kim; Yoosang Baek; Onesok Lee; | Biomed. Signal Process. Control. | 2024-03-01 |
1221 | MGCoT: Multi-Grained Contextual Transformer for Table-based Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xianjie Mo; Yang Xiang; Youcheng Pan; Yongshuai Hou; Ping Luo; | Expert Syst. Appl. | 2024-03-01 |
1222 | DGFormer: Dynamic Graph Transformer for 3D Human Pose Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhan-Heng Chen; Ju Dai; Junxuan Bai; Junjun Pan; | Pattern Recognit. | 2024-03-01 |
1223 | LCDFormer: Long-term Correlations Dual-graph Transformer for Traffic Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jiongbiao Cai; Chia-Hung Wang; Kun Hu; | Expert Syst. Appl. | 2024-03-01 |
1224 | An Efficient Hybrid CNN-Transformer Approach for Remote Sensing Super-Resolution Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer models have great potential in the field of remote sensing super-resolution (SR) due to their excellent self-attention mechanisms. However, transformer models are … |
WENJIAN ZHANG et. al. | Remote. Sens. | 2024-03-01 |
1225 | Surveying The Dead Minds: Historical-Psychological Text Analysis with Contextualized Construct Representation (CCR) for Classical Chinese Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we develop a pipeline for historical-psychological text analysis in classical Chinese. |
Yuqi Chen; Sixuan Li; Ying Li; Mohammad Atari; | arxiv-cs.CL | 2024-03-01 |
1226 | PWDformer: Deformable Transformer for Long-term Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zheng Wang; Haowei Ran; Jinchang Ren; Meijun Sun; | Pattern Recognit. | 2024-03-01 |
1227 | T3SRS: Tensor Train Transformer for Compressing Sequential Recommender Systems Related Papers Related Patents Related Grants Related Venues Related Experts View |
HAO LI et. al. | Expert Syst. Appl. | 2024-03-01 |
1228 | Comparing Large Language Models and Human Programmers for Generating Programming Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In most LeetCode and GeeksforGeeks coding contests evaluated in this study, GPT-4 employing the optimal prompt strategy outperforms 85 percent of human participants. |
Wenpin Hou; Zhicheng Ji; | arxiv-cs.SE | 2024-03-01 |
1229 | A Systematic Evaluation of Large Language Models for Generating Programming Code Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We systematically evaluated the performance of seven large language models in generating programming code using various prompt strategies, programming languages, and task … |
Wenpin Hou; Zhicheng Ji; | ArXiv | 2024-03-01 |
1230 | K-NN Attention-based Video Vision Transformer for Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
Weirong Sun; Yujun Ma; Ruili Wang; | Neurocomputing | 2024-03-01 |
1231 | A Novel Full-convolution UNet-transformer for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Tianyou Zhu; Derui Ding; Feng Wang; Wei Liang; Bo Wang; | Biomed. Signal Process. Control. | 2024-03-01 |
1232 | Transformer Based Multiple Instance Learning for WSI Breast Cancer Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
CHENGYANG GAO et. al. | Biomed. Signal Process. Control. | 2024-03-01 |
1233 | Attention Combined Pyramid Vision Transformer for Polyp Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xiaogang Liu; Shuang Song; | Biomed. Signal Process. Control. | 2024-03-01 |
1234 | Here’s A Free Lunch: Sanitizing Backdoored Models with Model Merge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Compared to multiple advanced defensive approaches, our method offers an effective and efficient inference-stage defense against backdoor attacks on classification and instruction-tuned tasks without additional resources or specific knowledge. |
ANSH ARORA et. al. | arxiv-cs.CL | 2024-02-29 |
1235 | PeLLE: Encoder-based Language Models for Brazilian Portuguese Based on Open Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we present PeLLE, a family of large language models based on the RoBERTa architecture, for Brazilian Portuguese, trained on curated, open data from the Carolina corpus. |
GUILHERME LAMARTINE DE MELLO et. al. | arxiv-cs.CL | 2024-02-29 |
1236 | PROC2PDDL: Open-Domain Planning Representations from Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent approaches have used language models to predict a planning domain definition (e.g., PDDL) but have only been evaluated in closed-domain simulated environments. To address this, we present Proc2PDDL, the first dataset containing open-domain procedural texts paired with expert-annotated PDDL representations. |
TIANYI ZHANG et. al. | arxiv-cs.CL | 2024-02-29 |
1237 | Can GPT Improve The State of Prior Authorization Via Guideline Based Automated Question Answering? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate whether GPT can validate numerous key factors, in turn helping health plans reach a decision drastically faster. |
Shubham Vatsal; Ayush Singh; Shabnam Tafreshi; | arxiv-cs.CL | 2024-02-28 |
1238 | H3D-Transformer: A Heterogeneous 3D (H3D) Computing Platform for Transformer Model Acceleration on Edge Devices Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Prior hardware accelerator designs primarily focused on single-chip solutions for 10 MB-class computer vision models. The GB-class transformer models for natural language … |
Yandong Luo; Shimeng Yu; | ACM Transactions on Design Automation of Electronic Systems | 2024-02-28 |
1239 | A Language Model Based Framework for New Concept Placement in Ontologies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In all steps, we propose to leverage neural methods: we apply embedding-based methods and contrastive learning with Pre-trained Language Models (PLMs) such as BERT for edge search, and adapt a fine-tuned BERT-based multi-label Edge-Cross-encoder as well as Large Language Models (LLMs) such as the GPT series, FLAN-T5, and Llama 2 for edge selection. |
Hang Dong; Jiaoyan Chen; Yuan He; Yongsheng Gao; Ian Horrocks; | arxiv-cs.CL | 2024-02-27 |
1240 | Benchmarking GPT-4 on Algorithmic Problems: A Systematic Evaluation of Prompting Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we offer a systematic benchmarking of GPT-4, one of the most advanced LLMs available, on three algorithmic tasks characterized by the possibility to control the problem difficulty with two parameters. |
Flavio Petruzzellis; Alberto Testolin; Alessandro Sperduti; | arxiv-cs.CL | 2024-02-27 |
1241 | Can GPT-4 Identify Propaganda? Annotation and Detection of Propaganda Spans in News Articles IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The majority of the recent initiatives targeting medium to low-resource languages produced relatively small annotated datasets, with a skewed distribution, posing challenges for the development of sophisticated propaganda detection models. To address this challenge, we carefully develop the largest propaganda dataset to date, ArPro, comprised of 8K paragraphs from newspaper articles, labeled at the text span level following a taxonomy of 23 propagandistic techniques. |
Maram Hasanain; Fatema Ahmed; Firoj Alam; | arxiv-cs.CL | 2024-02-27 |
1242 | CAPT: Category-level Articulation Estimation from A Single Point Cloud Using Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose CAPT: category-level articulation estimation from a point cloud using Transformer. |
Lian Fu; Ryoichi Ishikawa; Yoshihiro Sato; Takeshi Oishi; | arxiv-cs.CV | 2024-02-27 |
1243 | If in A Crowdsourced Data Annotation Pipeline, A GPT-4 IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies indicated GPT-4 outperforms online crowd workers in data labeling accuracy, notably workers from Amazon Mechanical Turk (MTurk). However, these studies were … |
Zeyu He; Chieh-Yang Huang; Chien-Kuang Cornelia Ding; Shaurya Rohatgi; Ting-Hao ‘Kenneth’ Huang; | ArXiv | 2024-02-26 |
1244 | MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large Language Models (LLMs) and Large Multimodal Models (LMMs) exhibit impressive problem-solving skills in many tasks and domains, but their ability in mathematical reasoning in visual contexts has not been systematically studied. To bridge this gap, we present MathVista, a benchmark designed to combine challenges from diverse mathematical and visual tasks. |
PAN LU et. al. | iclr | 2024-02-26 |
1245 | Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Generative pre-trained models have demonstrated remarkable effectiveness in language and vision domains by learning useful representations. In this paper, we extend the scope of this effectiveness by showing that visual robot manipulation can significantly benefit from large-scale video generative pre-training. |
HONGTAO WU et. al. | iclr | 2024-02-26 |
1246 | Looped Transformers Are Better at Learning Learning Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the absence of an inherent iterative structure in the transformer architecture presents a challenge in emulating the iterative algorithms, which are commonly employed in traditional machine learning methods. To address this, we propose the utilization of looped transformer architecture and its associated training methodology, with the aim of incorporating iterative characteristics into the transformer architectures. |
Liu Yang; Kangwook Lee; Robert D Nowak; Dimitris Papailiopoulos; | iclr | 2024-02-26 |
1247 | AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Given that vector graphics are typically encoded using low-level graphics primitives, generating them directly is difficult. To address this, we propose the use of TikZ, a well-known abstract graphics language that can be compiled to vector graphics, as an intermediate representation of scientific figures. |
Jonas Belouadi; Anne Lauscher; Steffen Eger; | iclr | 2024-02-26 |
1248 | Masked Distillation Advances Self-Supervised Transformer Architecture Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this paper proposes a masked image modelling (MIM) based self-supervised neural architecture search method specifically designed for vision transformers, termed as MaskTAS, which completely avoids the expensive costs of data labeling inherited from supervised learning. |
CAIXIA YAN et. al. | iclr | 2024-02-26 |
1249 | Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the effect of code on enhancing LLMs’ reasoning capability by introducing different constraints on the Code Usage Frequency of GPT-4 Code Interpreter. |
AOJUN ZHOU et. al. | iclr | 2024-02-26 |
1250 | Transformer-VQ: Linear-Time Transformers Via Vector Quantization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Transformer-VQ, a decoder-only transformer computing softmax-based dense self-attention in linear time. |
Lucas Dax Lingle; | iclr | 2024-02-26 |
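The core idea here, replacing the T distinct keys with a small shared codebook so attention cost stops growing with sequence length, can be illustrated in isolation. Below is a minimal PyTorch sketch of just the key-quantization step; the paper's full linear-time factorization is omitted, and the function name is our own.

```python
import torch

def quantize_keys(keys, codebook):
    # keys: (T, d) per-token key vectors; codebook: (C, d) with C << T.
    # Each key is snapped to its nearest codebook entry, so attention
    # can be computed against C shared codes instead of T distinct keys.
    dists = torch.cdist(keys, codebook)   # (T, C) Euclidean distances
    codes = dists.argmin(dim=-1)          # nearest-code index per token
    return codebook[codes], codes
```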
1251 | Graph Transformers on EHRs: Better Representation Improves Downstream Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose GT-BEHRT, a new approach that leverages temporal visit embeddings extracted from a graph transformer and uses a BERT-based model to obtain more robust patient representations, especially on longer EHR sequences. |
Raphael Poulain; Rahmatollah Beheshti; | iclr | 2024-02-26 |
1252 | The Devil Is in The Neurons: Interpreting and Mitigating Social Biases in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we try to unveil the mystery of social bias inside language models by introducing the concept of {\sc Social Bias Neurons}. |
YAN LIU et. al. | iclr | 2024-02-26 |
1253 | The Hedgehog & The Porcupine: Expressive Linear Attentions with Softmax Mimicry IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We thus propose Hedgehog, a learnable linear attention that retains the spiky and monotonic properties of softmax attention while maintaining linear complexity. |
Michael Zhang; Kush Bhatia; Hermann Kumbong; Christopher Re; | iclr | 2024-02-26 |
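For context, a generic (non-causal) linear attention of the kind Hedgehog builds on looks like the sketch below: a feature map phi replaces the softmax so the key-value product can be aggregated once in O(T). Hedgehog's actual contribution is learning phi to mimic softmax; the common elu(x)+1 map below is only a stand-in assumption.

```python
import torch

def linear_attention(q, k, v, phi=lambda t: torch.nn.functional.elu(t) + 1):
    # q, k: (B, T, d); v: (B, T, e). softmax(QK^T)V is replaced by
    # phi(Q) (phi(K)^T V), normalized per query, in O(T) time and memory.
    q, k = phi(q), phi(k)
    kv = torch.einsum("btd,bte->bde", k, v)              # sum_t phi(k_t) v_t^T
    norm = torch.einsum("btd,bd->bt", q, k.sum(dim=1))   # per-query normalizer
    return torch.einsum("btd,bde->bte", q, kv) / (norm.unsqueeze(-1) + 1e-6)
```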
1254 | Neuroformer: Multimodal and Multitask Generative Pretraining for Brain Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We used an ablation study to show that joint training on neuronal responses and behavior boosted performance, highlighting the model’s ability to associate behavioral and neural representations in an unsupervised manner. |
Antonis Antoniades; Yiyi Yu; Joe S Canzano; William Yang Wang; Spencer Smith; | iclr | 2024-02-26 |
1255 | NOLA: Compressing LoRA Using Linear Combination of Random Basis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce NOLA, which overcomes the rank one lower bound present in LoRA. |
Soroush Abbasi Koohpayegani; Navaneet K L; Parsa Nooralinejad; Soheil Kolouri; Hamed Pirsiavash; | iclr | 2024-02-26 |
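A hedged reading of the mechanism: instead of training LoRA's low-rank factors directly, the update trains only the coefficients of a linear combination of frozen random basis matrices, decoupling the trainable-parameter count from rank and layer shape. The sketch below is our own minimal rendering of that idea, not the authors' code; the names and defaults (rank=4, k=64) are illustrative.

```python
import torch
import torch.nn as nn

class NOLADelta(nn.Module):
    # Low-rank weight update whose factors are linear combinations of
    # frozen random bases; only the 2*k mixing coefficients are trained.
    def __init__(self, d_in, d_out, rank=4, k=64):
        super().__init__()
        self.register_buffer("A_basis", torch.randn(k, d_out, rank))
        self.register_buffer("B_basis", torch.randn(k, rank, d_in))
        self.alpha = nn.Parameter(torch.zeros(k))  # trainable
        self.beta = nn.Parameter(torch.zeros(k))   # trainable

    def forward(self, x):
        A = torch.einsum("k,kor->or", self.alpha, self.A_basis)  # (d_out, rank)
        B = torch.einsum("k,kri->ri", self.beta, self.B_basis)   # (rank, d_in)
        return x @ (A @ B).T  # apply the delta-W update to the input
```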
1256 | Xformer: Hybrid X-Shaped Transformer for Image Denoising IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a hybrid X-shaped vision Transformer, named Xformer, which performs notably on image denoising tasks. |
JIALE ZHANG et. al. | iclr | 2024-02-26 |
1257 | DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Conventional training-based methods have limitations in flexibility, particularly when adapting to new domains, and they often lack explanatory power. To address this gap, we propose a novel training-free detection strategy called Divergent N-Gram Analysis (DNA-GPT). |
XIANJUN YANG et. al. | iclr | 2024-02-26 |
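As a rough illustration of the divergent n-gram idea: truncate a candidate text, have the model regenerate the continuation several times, and measure n-gram overlap between the original tail and the regenerations, where high overlap suggests the original was itself model-generated. The scorer below is a simplified stand-in that assumes whitespace tokenization; the paper's actual statistic and thresholds differ.

```python
from collections import Counter

def ngrams(text, n):
    toks = text.split()
    return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

def divergence_score(original_tail, regenerated_tails, n=4):
    # Mean fraction of each regeneration's n-grams that also occur in the
    # original continuation; higher values hint at machine-generated text.
    ref = ngrams(original_tail, n)
    scores = []
    for tail in regenerated_tails:
        cand = ngrams(tail, n)
        overlap = sum((ref & cand).values())
        scores.append(overlap / max(1, sum(cand.values())))
    return sum(scores) / max(1, len(scores))
```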
1258 | The Reversal Curse: LLMs Trained on “A Is B” Fail to Learn “B Is A” Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It is worth noting, however, that if “_A_ is _B_” appears _in-context_, models can deduce the reverse relationship. We provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on fictitious statements such as “Uriah Hawthorne is the composer of _Abyssal Melodies_” and showing that they fail to correctly answer “Who composed _Abyssal Melodies_?” |
LUKAS BERGLUND et. al. | iclr | 2024-02-26 |
1259 | A Multi-Level Framework for Accelerating Training Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by a set of key observations of inter- and intra-layer similarities among feature maps and attentions that can be identified from typical training processes, we propose a multi-level framework for training acceleration. |
Longwei Zou; Han Zhang; Yangdong Deng; | iclr | 2024-02-26 |
1260 | Quantum Linear Algebra Is All You Need for Transformer Architectures Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative machine learning methods such as large-language models are revolutionizing the creation of text and images. While these models are powerful they also harness a large … |
Naixu Guo; Zhan Yu; Aman Agrawal; P. Rebentrost; | ArXiv | 2024-02-26 |
1261 | Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring The Design of Next-generation Neuromorphic Chips IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a general Transformer-based SNN architecture, termed as “Meta-SpikeFormer”, whose goals are: (1) *Lower-power*, supports the spike-driven paradigm that there is only sparse addition in the network; (2) *Versatility*, handles various vision tasks; (3) *High-performance*, shows overwhelming performance advantages over CNN-based SNNs; (4) *Meta-architecture*, provides inspiration for future next-generation Transformer-based neuromorphic chip designs. |
MAN YAO et. al. | iclr | 2024-02-26 |
1262 | Is Self-Repair A Silver Bullet for Code Generation? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we analyze Code Llama, GPT-3.5 and GPT-4’s ability to perform self-repair on problems taken from HumanEval and APPS. |
Theo X. Olausson; Jeevana Priya Inala; Chenglong Wang; Jianfeng Gao; Armando Solar-Lezama; | iclr | 2024-02-26 |
1263 | Large Language Model Cascades with Mixture of Thought Representations for Cost-Efficient Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we are motivated to study building an LLM cascade to save the cost of using LLMs, particularly for performing (e.g., mathematical, causal) reasoning tasks. |
Murong Yue; Jie Zhao; Min Zhang; Liang Du; Ziyu Yao; | iclr | 2024-02-26 |
1264 | MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We believe that the enhanced multi-modal generation capabilities of GPT-4 stem from the utilization of sophisticated large language models (LLM). To examine this phenomenon, we present MiniGPT-4, which aligns a frozen visual encoder with a frozen advanced LLM, Vicuna, using one projection layer. |
Deyao Zhu; Jun Chen; Xiaoqian Shen; Xiang Li; Mohamed Elhoseiny; | iclr | 2024-02-26 |
1265 | If in A Crowdsourced Data Annotation Pipeline, A GPT-4 IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies indicated GPT-4 outperforms online crowd workers in data labeling accuracy, notably workers from Amazon Mechanical Turk (MTurk). However, these studies were … |
Zeyu He; Chieh-Yang Huang; Chien-Kuang Cornelia Ding; Shaurya Rohatgi; Ting-Hao ‘Kenneth’ Huang; | arxiv-cs.HC | 2024-02-26 |
1266 | Massive Editing for Large Language Models Via Meta Learning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To mitigate the problem, we propose the MAssive Language Model Editing Network (MALMEN), which formulates the parameter shift aggregation as the least square problem, subsequently updating the LM parameter using the normal equation. |
Chenmien Tan; Ge Zhang; Jie Fu; | iclr | 2024-02-26 |
1267 | CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we design a special Transformer, i.e., **C**hannel **A**ligned **R**obust Blen**d** Transformer (CARD for short), that addresses key shortcomings of CI type Transformer in time series forecasting. |
XUE WANG et. al. | iclr | 2024-02-26 |
1268 | GeoLLM: Extracting Geospatial Knowledge from Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here we explore the question of whether the vast amounts of knowledge found in Internet language corpora, now compressed within large language models (LLMs), can be leveraged for geospatial prediction tasks. |
ROHIN MANVI et. al. | iclr | 2024-02-26 |
1269 | VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Video temporal grounding (VTG) aims to locate specific temporal segments from an untrimmed video based on a linguistic query. Most existing VTG models are trained on extensive … |
Yifang Xu; Yunzhuo Sun; Zien Xie; Benxiang Zhai; Sidan Du; | ArXiv | 2024-02-25 |
1270 | HPE Transformer: Learning to Optimize Multi-Group Multicast Beamforming Under Nonconvex QoS Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To facilitate real-time implementations, this paper proposes a deep learning-based approach, which consists of a beamforming structure assisted problem transformation and a customized neural network architecture named hierarchical permutation equivariance (HPE) transformer. |
Yang Li; Ya-Feng Liu; | arxiv-cs.IT | 2024-02-25 |
1271 | From Text to Transformation: A Comprehensive Review of Large Language Models’ Versatility IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This groundbreaking study explores the expanse of Large Language Models (LLMs), such as Generative Pre-Trained Transformer (GPT) and Bidirectional Encoder Representations from Transformers (BERT) across varied domains ranging from technology, finance, healthcare to education. |
PRAVNEET KAUR et. al. | arxiv-cs.CL | 2024-02-25 |
1272 | Deep Learning Approaches for Improving Question Answering Systems in Hepatocellular Carcinoma Research IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Models such as BERT and GPT-3, trained on vast amounts of data, have revolutionized language understanding and generation. These pre-trained models serve as robust bases for various tasks including semantic understanding, intelligent writing, and reasoning, paving the way for a more generalized form of artificial intelligence. |
Shuning Huo; Yafei Xiang; Hanyi Yu; Mengran Zhu; Yulu Gong; | arxiv-cs.CL | 2024-02-25 |
1273 | SemEval-2024 Task 8: Weighted Layer Averaging RoBERTa for Black-Box Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we lay out how using weighted averages of RoBERTa layers lets us capture information about text that is relevant to machine-generated text detection. |
Ayan Datta; Aryan Chandramania; Radhika Mamidi; | arxiv-cs.CL | 2024-02-24 |
1274 | TV-SAM: Increasing Zero-Shot Segmentation Performance on Multimodal Medical Images Using GPT-4 Generated Descriptive Prompts Without Human Annotation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study presents a novel multimodal medical image zero-shot segmentation algorithm named the text-visual-prompt segment anything model (TV-SAM) without any manual annotations. |
ZEKUN JIANG et. al. | arxiv-cs.CV | 2024-02-24 |
1275 | Weighted Layer Averaging RoBERTa for Black-Box Machine-Generated Text Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We propose a novel approach for machine-generated text detection using a RoBERTa model with weighted layer averaging and AdaLoRA for parameter-efficient fine-tuning. Our method … |
Ayan Datta; Aryan Chandramania; Radhika Mamidi; | ArXiv | 2024-02-24 |
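The weighted-layer-averaging component in the two entries above (1273 and 1275) is simple enough to sketch. Assuming hidden states from every encoder layer are available (e.g., a Hugging Face model called with output_hidden_states=True), a softmax over learnable per-layer logits gives the mixing weights; the AdaLoRA fine-tuning and classifier head from the papers are omitted.

```python
import torch
import torch.nn as nn

class WeightedLayerPooling(nn.Module):
    # Softmax-weighted average over the per-layer hidden states of an
    # encoder such as RoBERTa; the weights are learned with the task.
    def __init__(self, num_layers):
        super().__init__()
        self.layer_logits = nn.Parameter(torch.zeros(num_layers))

    def forward(self, hidden_states):
        # hidden_states: sequence of num_layers tensors, each (B, T, H)
        stacked = torch.stack(tuple(hidden_states), dim=0)  # (L, B, T, H)
        weights = torch.softmax(self.layer_logits, dim=0)
        return (weights.view(-1, 1, 1, 1) * stacked).sum(dim=0)
```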
1276 | ArabianGPT: Native Arabic GPT-based Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, there is a theoretical and practical imperative for developing LLMs predominantly focused on Arabic linguistic elements. To address this gap, this paper proposes ArabianGPT, a series of transformer-based models within the ArabianLLM suite designed explicitly for Arabic. |
Anis Koubaa; Adel Ammar; Lahouari Ghouti; Omar Najar; Serry Sibaee; | arxiv-cs.CL | 2024-02-23 |
1277 | Advancing Parameter Efficiency in Fine-tuning Via Representation Editing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing PEFT methods pose challenges in hyperparameter selection, such as choosing the rank for LoRA or Adapter, or specifying the length of soft prompts. To address these challenges, we propose a novel fine-tuning approach for neural models, named Representation EDiting (RED), which modifies the representations generated at some layers through the application of scaling and biasing operations. |
MULING WU et. al. | arxiv-cs.LG | 2024-02-23 |
1278 | Towards Efficient Active Learning in NLP Via Pretrained Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning Large Language Models (LLMs) is now a common approach for text classification in a wide range of applications. |
Artem Vysogorets; Achintya Gopal; | arxiv-cs.LG | 2024-02-23 |
1279 | Multimodal Transformer With A Low-Computational-Cost Guarantee Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, multimodal Transformers significantly suffer from a quadratic complexity of the multi-head attention with the input sequence length, especially as the number of modalities increases. To address this, we introduce Low-Cost Multimodal Transformer (LoCoMT), a novel multimodal attention mechanism that aims to reduce computational cost during training and inference with minimal performance loss. |
Sungjin Park; Edward Choi; | arxiv-cs.LG | 2024-02-23 |
1280 | A First Look at GPT Apps: Landscape and Vulnerability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, given its debut, there is a lack of sufficient understanding of this new ecosystem. To fill this gap, this paper presents a first comprehensive longitudinal (5-month) study of the evolution, landscape, and vulnerability of the emerging LLM app ecosystem, focusing on two GPT app stores: _GPTStore.AI_ and the official _OpenAI GPT Store_. |
ZEJUN ZHANG et. al. | arxiv-cs.CR | 2024-02-23 |
1281 | Self-Supervised Pre-Training for Table Structure Recognition Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we resolve the issue by proposing a self-supervised pre-training (SSP) method for TSR transformers. |
ShengYun Peng; Seongmin Lee; Xiaojing Wang; Rajarajeswari Balasubramaniyan; Duen Horng Chau; | arxiv-cs.CV | 2024-02-23 |
1282 | Whose LLM Is It Anyway? Linguistic Comparison and LLM Attribution for GPT-3.5, GPT-4 and Bard Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through a comprehensive linguistic analysis, we compare the vocabulary, Part-Of-Speech (POS) distribution, dependency distribution, and sentiment of texts generated by three of the most popular LLMs today (GPT-3.5, GPT-4, and Bard) in response to diverse inputs. |
Ariel Rosenfeld; Teddy Lazebnik; | arxiv-cs.CL | 2024-02-22 |
1283 | Tokenization Counts: The Impact of Tokenization on Arithmetic in Frontier LLMs IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Tokenization, the division of input text into input tokens, is an often overlooked aspect of the large language model (LLM) pipeline and could be the source of useful or harmful … |
Aaditya K. Singh; DJ Strouse; | ArXiv | 2024-02-22 |
1284 | OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, open-source models often lack the execution capabilities and iterative refinement of advanced systems like the GPT-4 Code Interpreter. To address this, we introduce OpenCodeInterpreter, a family of open-source code systems designed for generating, executing, and iteratively refining code. |
TIANYU ZHENG et. al. | arxiv-cs.SE | 2024-02-22 |
1285 | Hallucinations or Attention Misdirection? The Path to Strategic Value Extraction in Business Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper highlights the best practices of the PGI (Persona, Grouping, and Intelligence) method, a strategic framework that achieved a remarkable error rate of only 3.15 percent across 4,000 responses generated by GPT in response to a real business challenge. |
Aline Ioste; | arxiv-cs.CL | 2024-02-21 |
1286 | Knowledge Graph Enhanced Large Language Model Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing editing methods struggle to track and incorporate changes in knowledge associated with edits, which limits the generalization ability of post-edit LLMs in processing edited knowledge. To tackle these problems, we propose a novel model editing method that leverages knowledge graphs for enhancing LLM editing, namely GLAME. |
MENGQI ZHANG et. al. | arxiv-cs.CL | 2024-02-21 |
1287 | TransGOP: Transformer-Based Gaze Object Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, this paper introduces Transformer into the fields of gaze object prediction and proposes an end-to-end Transformer-based gaze object prediction method named TransGOP. |
Binglu Wang; Chenxi Guo; Yang Jin; Haisheng Xia; Nian Liu; | arxiv-cs.CV | 2024-02-21 |
1288 | On The Expressive Power of A Variant of The Looped Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide theoretical evidence of the expressive power of the AlgoFormer in solving some challenging problems, mirroring human-designed algorithms. |
YIHANG GAO et. al. | arxiv-cs.LG | 2024-02-21 |
1289 | Towards Understanding Counseling Conversations: Domain Knowledge and Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a systematic approach to examine the efficacy of domain knowledge and large language models (LLMs) in better representing conversations between a crisis counselor and a help seeker. |
Younghun Lee; Dan Goldwasser; Laura Schwab Reese; | arxiv-cs.CL | 2024-02-21 |
1290 | An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research paper, we present an optimized, fine-tuned transformer-based DistilBERT model designed for the detection of phishing emails. |
Mohammad Amaz Uddin; Iqbal H. Sarker; | arxiv-cs.LG | 2024-02-21 |
1291 | SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we leverage 100B+ GPT variants to act as synthetic feedback experts offering expert-level edit feedback, which is used to reduce hallucinations and align weaker (<10B parameter) LLMs with medical facts using two distinct alignment algorithms (DPO & SALT), endeavoring to narrow the divide between AI-generated content and factual accuracy. |
PRAKAMYA MISHRA et. al. | arxiv-cs.CL | 2024-02-21 |
1292 | Towards Equipping Transformer with The Ability of Systematic Compositionality Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We tentatively provide a successful implementation of a multi-layer CAT on the basis of the especially popular BERT. |
Chen Huang; Peixin Qin; Wenqiang Lei; Jiancheng Lv; | aaai | 2024-02-20 |
1293 | DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency Via Efficient Data Sampling and Routing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we present DeepSpeed Data Efficiency, a framework that makes better use of data, increases training efficiency, and improves model quality. |
CONGLONG LI et. al. | aaai | 2024-02-20 |
1294 | Translate Meanings, Not Just Words: IdiomKB’s Role in Optimizing Idiomatic Translation with Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a multilingual idiom KB (IdiomKB) developed using large LMs to address this. |
SHUANG LI et. al. | aaai | 2024-02-20 |
1295 | Are ELECTRA’s Sentence Embeddings Beyond Repair? The Case of Semantic Textual Similarity Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We notice a significant drop in performance for the ELECTRA discriminator’s last layer in comparison to prior layers. We explore this drop and propose a way to repair the embeddings using a novel truncated model fine-tuning (TMFT) method. |
Ivan Rep; David Dukić; Jan Šnajder; | arxiv-cs.CL | 2024-02-20 |
1296 | SentinelLMs: Encrypted Input Adaptation and Fine-Tuning of Language Models for Private and Secure Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, this introduces two fundamental risks: (a) the transmission of user inputs to the server via the network gives rise to interception vulnerabilities, and (b) privacy concerns emerge as organizations that deploy such models store user data with restricted context. To address this, we propose a novel method to adapt and fine-tune transformer-based language models on passkey-encrypted user-specific text. |
Abhijit Mishra; Mingda Li; Soham Deo; | aaai | 2024-02-20 |
1297 | CAR-Transformer: Cross-Attention Reinforcement Transformer for Cross-Lingual Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Cross-Attention Reinforcement (CAR) module and incorporate the module into the transformer backbone to formulate the CAR-Transformer. |
Yuang Cai; Yuyu Yuan; | aaai | 2024-02-20 |
1298 | SCTNet: Single-Branch CNN with Transformer Semantic Information for Real-Time Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the additional branch incurs undesirable computational overhead and slows inference speed. To eliminate this dilemma, we propose SCTNet, a single branch CNN with transformer semantic information for real-time segmentation. |
ZHENGZE XU et. al. | aaai | 2024-02-20 |
1299 | Span Graph Transformer for Document-Level Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, due to the length limit for input text, these models typically consider text at the sentence-level and cannot capture the long-range contextual dependency within a document. To address this issue, we propose a novel Span Graph Transformer (SGT) method for document-level NER, which constructs long-range contextual dependencies at both the token and span levels. |
Hongli Mao; Xian-Ling Mao; Hanlin Tang; Yu-Ming Shang; Heyan Huang; | aaai | 2024-02-20 |
1300 | S2WAT: Image Style Transfer Via Hierarchical Vision Transformer Using Strips Window Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces Strips Window Attention Transformer (S2WAT), a novel hierarchical vision transformer designed for style transfer. |
Chiyu Zhang; Xiaogang Xu; Lei Wang; Zaiyan Dai; Jun Yang; | aaai | 2024-02-20 |
1301 | Dual-Window Multiscale Transformer for Hyperspectral Snapshot Compressive Imaging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Transformer-based HSI reconstruction method called dual-window multiscale Transformer (DWMT), which is a coarse-to-fine process, reconstructing the global properties of HSI with the long-range dependencies. |
Fulin Luo; Xi Chen; Xiuwen Gong; Weiwen Wu; Tan Guo; | aaai | 2024-02-20 |
1302 | Transformer Tricks: Precomputing The First Layer Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: This micro-paper describes a trick to speed up inference of transformers with RoPE (such as LLaMA, Mistral, PaLM, and Gemma). For these models, a large portion of the first … |
Nils Graef; | arxiv-cs.LG | 2024-02-20 |
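The trick, as we read the abstract: in RoPE models nothing position-dependent is added to the input embeddings, so the first layer's Q/K/V projections depend only on the token's vocabulary id and can be precomputed as lookup tables, with RoPE's rotation applied to the looked-up Q/K per position at inference. A minimal sketch follows, ignoring the pre-attention normalization (also token-only) that would fold in the same way.

```python
import torch

def precompute_first_layer_qkv(embed, w_q, w_k, w_v):
    # embed: (V, d) token-embedding table; w_*: (d, d) projection weights.
    # Because RoPE adds no positional term to the input, these rows can be
    # looked up by token id at inference instead of recomputed per token.
    return embed @ w_q, embed @ w_k, embed @ w_v
```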
1303 | Equity-Transformer: Solving NP-Hard Min-Max Routing Problems As Sequential Generation with Equity Context Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes Equity-Transformer to solve large-scale min-max routing problems. |
Jiwoo Son; Minsu Kim; Sanghyeok Choi; Hyeonah Kim; Jinkyoo Park; | aaai | 2024-02-20 |
1304 | Referred By Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose MUTR, a Multi-modal Unified Temporal transformer for Referring video object segmentation. |
SHILIN YAN et. al. | aaai | 2024-02-20 |
1305 | Can Large Language Models Be Used to Provide Psychological Counselling? An Analysis of GPT-4-Generated Responses Using Role-play Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For this study, we collected counseling dialogue data via role-playing scenarios involving expert counselors, and the utterances were annotated with the intentions of the counselors. |
Michimasa Inaba; Mariko Ukiyo; Keiko Takamizo; | arxiv-cs.CL | 2024-02-20 |
1306 | How Easy Is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The remarkable advancements in Multimodal Large Language Models (MLLMs) have not rendered them immune to challenges, particularly in the context of handling deceptive information in prompts, thus producing hallucinated responses under such conditions. To quantitatively assess this vulnerability, we present MAD-Bench, a carefully curated benchmark that contains 1000 test samples divided into 5 categories, such as non-existent objects, count of objects, and spatial relationship. |
Yusu Qian; Haotian Zhang; Yinfei Yang; Zhe Gan; | arxiv-cs.CV | 2024-02-20 |
1307 | Fairness-Aware Structured Pruning in Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: WARNING: This work uses language that is offensive in nature. |
Abdelrahman Zayed; Gonçalo Mordido; Samira Shabanian; Ioana Baldini; Sarath Chandar; | aaai | 2024-02-20 |
1308 | Proxyformer: Nyström-Based Linear Transformer with Trainable Proxy Tokens Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel Nyström method-based transformer, called Proxyformer. |
Sangho Lee; Hayun Lee; Dongkun Shin; | aaai | 2024-02-20 |
1309 | Generalized Planning in PDDL Domains with Pretrained Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We evaluate this approach in seven PDDL domains and compare it to four ablations and four baselines. |
TOM SILVER et. al. | aaai | 2024-02-20 |
1310 | RhythmFormer: Extracting RPPG Signals Based on Hierarchical Temporal Periodic Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose RhythmFormer, a fully end-to-end transformer-based method for extracting rPPG signals by explicitly leveraging the quasi-periodic nature of rPPG. |
Bochao Zou; Zizheng Guo; Jiansheng Chen; Huimin Ma; | arxiv-cs.CV | 2024-02-20 |
1311 | Advancing GenAI Assisted Programming–A Comparative Study on Prompt Efficiency and Code Quality Between GPT-4 and GLM-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to explore the best practices for utilizing GenAI as a programming tool, through a comparative analysis between GPT-4 and GLM-4. |
Angus Yang; Zehan Li; Jie Li; | arxiv-cs.SE | 2024-02-20 |
1312 | Your Large Language Model Is Secretly A Fairness Proponent and You Should Prompt It Like One Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response to this, we validate that prompting LLMs with specific roles can allow LLMs to express diverse viewpoints. Building on this insight and observation, we develop FairThinking, a pipeline designed to automatically generate roles that enable LLMs to articulate diverse perspectives for fair expressions. |
TIANLIN LI et. al. | arxiv-cs.CL | 2024-02-19 |
1313 | Enabling Weak LLMs to Judge Response Reliability Via Meta Ranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, to enable weak LLMs to effectively assess the reliability of LLM responses, we propose a novel cross-query-comparison-based method called _Meta Ranking_ (MR). |
ZIJUN LIU et. al. | arxiv-cs.CL | 2024-02-19 |
1314 | Query-Based Adversarial Prompt Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We improve on prior work with a query-based attack that leverages API access to a remote language model to construct adversarial examples that cause the model to emit harmful strings with (much) higher probability than with transfer-only attacks. |
Jonathan Hayase; Ema Borevkovic; Nicholas Carlini; Florian Tramèr; Milad Nasr; | arxiv-cs.CL | 2024-02-19 |
1315 | Evaluation of ChatGPT’s Smart Contract Auditing Capabilities Based on Chain of Thought Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the potential of enhancing smart contract security audits using the GPT-4 model. |
Yuying Du; Xueyan Tang; | arxiv-cs.CR | 2024-02-19 |
1316 | Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a circuit discovery framework alternative to activation patching. |
ZHENGFU HE et. al. | arxiv-cs.LG | 2024-02-19 |
1317 | Shallow Synthesis of Knowledge in GPT-Generated Texts: A Case Study in Automatic Related Work Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an analysis of AI-assisted scholarly writing generated with ScholaCite, a tool we built that is designed for organizing literature and composing Related Work sections for academic papers. |
Anna Martin-Boyle; Aahan Tyagi; Marti A. Hearst; Dongyeop Kang; | arxiv-cs.CL | 2024-02-19 |
1318 | Tree-Planted Transformers: Unidirectional Transformer Language Models with Implicit Syntactic Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new method dubbed tree-planting: instead of explicitly generating syntactic structures, we plant trees into attention weights of unidirectional Transformer LMs to implicitly reflect syntactic structures of natural language. |
Ryo Yoshida; Taiga Someya; Yohei Oseki; | arxiv-cs.CL | 2024-02-19 |
1319 | A Critical Evaluation of AI Feedback for Aligning Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that the improvements of the RL step are virtually entirely due to the widespread practice of using a weaker teacher model (e.g. GPT-3.5) for SFT data collection than the critic (e.g., GPT-4) used for AI feedback generation. |
ARCHIT SHARMA et. al. | arxiv-cs.LG | 2024-02-19 |
1320 | FinBen: A Holistic Financial Benchmark for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce FinBen, the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks, covering seven critical aspects: information extraction (IE), textual analysis, question answering (QA), text generation, risk management, forecasting, and decision-making. |
QIANQIAN XIE et. al. | arxiv-cs.CL | 2024-02-19 |
1321 | Creating A Fine Grained Entity Type Taxonomy Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the potential of GPT-4 and its advanced iteration, GPT-4 Turbo, in autonomously developing a detailed entity type taxonomy. |
Michael Gunn; Dohyun Park; Nidhish Kamath; | arxiv-cs.CL | 2024-02-19 |
1322 | Enhancing Large Language Models for Text-to-Testcase Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: In this paper, we introduce a text-to-testcase generation approach based on a large language model (GPT-3.5) that is fine-tuned on our curated dataset with an effective prompt design. |
Saranya Alagarsamy; Chakkrit Tantithamthavorn; Chetan Arora; Aldeida Aleti; | arxiv-cs.SE | 2024-02-19 |
1323 | Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Conclusion: In this paper, we show that while GPT-4 is superior to open-source models in zero-shot report labeling, the implementation of few-shot prompting can bring open-source models on par with GPT-4. |
FELIX J. DORFNER et. al. | arxiv-cs.CL | 2024-02-19 |
1324 | LongAgent: Scaling Language Models to 128k Context Through Multi-Agent Collaboration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose LongAgent, a method based on multi-agent collaboration, which scales LLMs (e.g., LLaMA) to a context of 128K and demonstrates potential superiority in long-text processing compared to GPT-4. |
JUN ZHAO et. al. | arxiv-cs.CL | 2024-02-18 |
1325 | Decoding News Narratives: A Critical Analysis of Large Language Models in Framing Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive analysis of GPT-4, GPT-3.5 Turbo, and FLAN-T5 models in detecting framing in news headlines. |
Valeria Pastorino; Jasivan A. Sivakumar; Nafise Sadat Moosavi; | arxiv-cs.CL | 2024-02-18 |
1326 | A Curious Case of Searching for The Correlation Between Training Data and Adversarial Robustness of Transformer Textual Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we want to prove that there is also a strong correlation between training data and model robustness. |
Cuong Dang; Dung D. Le; Thai Le; | arxiv-cs.LG | 2024-02-18 |
1327 | Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, we propose a two-stage instruction tuning framework, in which VLMs are firstly finetuned on Vision-Flan and further tuned on GPT-4 synthesized data. We find this two-stage tuning framework significantly outperforms the traditional single-stage visual instruction tuning framework and achieves the state-of-the-art performance across a wide range of multi-modal evaluation benchmarks. |
ZHIYANG XU et. al. | arxiv-cs.CL | 2024-02-18 |
1328 | Exploring ChatGPT for Next-generation Information Retrieval: Opportunities and Challenges Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike the traditional supervised learning approach in IR tasks, ChatGPT challenges existing paradigms, bringing forth new challenges and opportunities regarding text quality assurance, model bias, and efficiency. This paper seeks to examine the impact of ChatGPT on IR tasks and offer insights into its potential future developments. |
Yizheng Huang; Jimmy Huang; | arxiv-cs.IR | 2024-02-17 |
1329 | Reasoning Before Comparison: LLM-Enhanced Semantic Similarity Metrics for Domain Specialized Text Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we leverage LLM to enhance the semantic analysis and develop similarity metrics for texts, addressing the limitations of traditional unsupervised NLP metrics like ROUGE and BLEU. |
SHAOCHEN XU et. al. | arxiv-cs.CL | 2024-02-17 |
1330 | Human-object Interaction Detection Based on Cascade Multi-scale Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Limin Xia; Xiaoyue Ding; | Appl. Intell. | 2024-02-16 |
1331 | Can Separators Improve Chain-of-Thought Prompting? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by human cognition, we introduce COT-SEP, a method that strategically employs separators at the end of each exemplar in CoT prompting. |
Yoonjeong Park; Hyunjin Kim; Chanyeol Choi; Junseong Kim; Jy-yong Sohn; | arxiv-cs.CL | 2024-02-16 |
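Mechanically, the method amounts to inserting an explicit separator after each chain-of-thought exemplar when assembling the prompt. A minimal sketch follows; the "###" separator is purely an illustrative choice, since the paper's point is that the separator's form and placement matter.

```python
def build_cot_prompt(exemplars, question, sep="\n###\n"):
    # Join few-shot CoT exemplars, terminating each with a separator,
    # then append the target question.
    parts = [e.strip() for e in exemplars] + [question.strip()]
    return sep.join(parts)
```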
1332 | In Search of Needles in A 11M Haystack: Recurrent Memory Finds What LLMs Miss IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To evaluate different approaches, we introduce BABILong, a new benchmark designed to assess model capabilities in extracting and processing distributed facts within extensive texts. |
YURI KURATOV et. al. | arxiv-cs.CL | 2024-02-16 |
1333 | Enhancing ESG Impact Type Identification Through Early Fusion and Multilingual Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the evolving landscape of Environmental, Social, and Corporate Governance (ESG) impact assessment, the ML-ESG-2 shared task proposes identifying ESG impact types. To address this challenge, we present a comprehensive system leveraging ensemble learning techniques, capitalizing on early and late fusion approaches. |
Hariram Veeramani; Surendrabikram Thapa; Usman Naseem; | arxiv-cs.CL | 2024-02-16 |
1334 | WilKE: Wise-Layer Knowledge Editor for Lifelong Knowledge Editing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a knowledge editing approach named Wise-Layer Knowledge Editor (WilKE), which selects editing layer based on the pattern matching degree of editing knowledge across different layers in language models. |
Chenhui Hu; Pengfei Cao; Yubo Chen; Kang Liu; Jun Zhao; | arxiv-cs.CL | 2024-02-16 |
1335 | Large Language Models Fall Short: Understanding Complex Relationships in Detective Narratives Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing datasets for narrative understanding often fail to represent the complexity and uncertainty of relationships in real-life social scenarios. To address this gap, we introduce a new benchmark, Conan, designed for extracting and analysing intricate character relation graphs from detective narratives. |
RUNCONG ZHAO et. al. | arxiv-cs.CL | 2024-02-16 |
1336 | Fine-tuning Large Language Model (LLM) Artificial Intelligence Chatbots in Ophthalmology and LLM-based Evaluation Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Notably, qualitative analysis and the glaucoma sub-analysis revealed clinical inaccuracies in the LLM-generated responses, which were appropriately identified by the GPT-4 evaluation. |
TING FANG TAN et. al. | arxiv-cs.AI | 2024-02-15 |
1337 | GPT-4’s Assessment of Its Performance in A USMLE-based Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates GPT-4’s assessment of its performance in healthcare applications. |
UTTAM DHAKAL et. al. | arxiv-cs.AI | 2024-02-14 |
1338 | Utilizing BERT for Information Retrieval: Survey, Applications, Resources, and Challenges Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent years have witnessed a substantial increase in the use of deep learning to solve various natural language processing (NLP) problems. Early deep learning models were … |
JIAJIA WANG et. al. | ACM Computing Surveys | 2024-02-14 |
1339 | Research and Application of Transformer Based Anomaly Detection Model: A Literature Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To inspire research on Transformer-based anomaly detection, this review offers a fresh perspective on the concept of anomaly detection. |
Mingrui Ma; Lansheng Han; Chunjie Zhou; | arxiv-cs.LG | 2024-02-14 |
1340 | An Analysis of Language Frequency and Error Correction for Esperanto Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current Grammar Error Correction (GEC) initiatives tend to focus on major languages, with less attention given to low-resource languages like Esperanto. In this article, we begin to bridge this gap by first conducting a comprehensive frequency analysis using the Eo-GP dataset, created explicitly for this purpose. |
Junhong Liang; | arxiv-cs.CL | 2024-02-14 |
1341 | Changes By Butterflies: Farsighted Forecasting with Group Reservoir Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Group Reservoir Transformer to predict long-term events more accurately and robustly by overcoming two challenges in Chaos: (1) the extensive historical sequences and (2) the sensitivity to initial conditions. |
Md Kowsher; Abdul Rafae Khan; Jia Xu; | arxiv-cs.LG | 2024-02-14 |
1342 | Predictive Analytics in Mental Health Leveraging LLM Embeddings and Machine Learning Models for Social Media Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The prevalence of stress-related disorders has increased significantly in recent years, necessitating scalable methods to identify affected individuals. This paper proposes a … |
AHMAD RADWAN et. al. | Int. J. Web Serv. Res. | 2024-02-14 |
1343 | API Pack: A Massive Multi-Programming Language Dataset for API Call Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce API Pack, a massive multi-programming language dataset containing more than 1 million instruction-API call pairs to improve the API call generation capabilities of large language models. |
Zhen Guo; Adriana Meza Soria; Wei Sun; Yikang Shen; Rameswar Panda; | arxiv-cs.CL | 2024-02-14 |
1344 | L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a language agent with chain-of-3D-thoughts (L3GO), an inference-time approach that can reason about part-based 3D mesh generation of unconventional objects that current data-driven diffusion models struggle with. |
YUTARO YAMADA et. al. | arxiv-cs.AI | 2024-02-14 |
1345 | Leveraging Large Language Models for Enhanced NLP Task Performance Through Knowledge Distillation and Optimized Training Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our approach presents a scalable methodology that reduces manual annotation costs and increases efficiency, making it especially pertinent in resource-limited and closed-network environments. |
Yining Huang; Keke Tang; Meilian Chen; | arxiv-cs.CL | 2024-02-14 |
1346 | Measuring and Controlling Instruction (In)Stability in Language Model Dialogs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To combat attention decay and instruction drift, we propose a lightweight method called split-softmax, which compares favorably against two strong baselines. |
KENNETH LI et. al. | arxiv-cs.CL | 2024-02-13 |
1347 | Combining Insights From Multiple Large Language Models Improves Diagnostic Accuracy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Background: Large language models (LLMs) such as OpenAI’s GPT-4 or Google’s PaLM 2 are proposed as viable diagnostic support tools or even spoken of as replacements for curbside consults. |
Gioele Barabucci; Victor Shia; Eugene Chu; Benjamin Harack; Nathan Fu; | arxiv-cs.AI | 2024-02-13 |
1348 | Addressing Cognitive Bias in Medical Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we developed BiasMedQA, a benchmark for evaluating cognitive biases in LLMs applied to medical tasks. |
SAMUEL SCHMIDGALL et. al. | arxiv-cs.CL | 2024-02-12 |
1349 | Investigating The Impact of Data Contamination of Large Language Models in Text-to-SQL Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the impact of Data Contamination on the performance of GPT-3.5 in the Text-to-SQL code-generating tasks. |
FEDERICO RANALDI et. al. | arxiv-cs.CL | 2024-02-12 |
1350 | Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We next present the M2-BERT retrieval encoder, an 80M parameter state-space encoder model built from the Monarch Mixer architecture, capable of scaling to documents up to 32K tokens long. We describe a pretraining data mixture which allows this encoder to process both short and long context sequences, and a finetuning approach that adapts this base model to retrieval with only single-sample batches. |
Jon Saad-Falcon; Daniel Y. Fu; Simran Arora; Neel Guha; Christopher Ré; | arxiv-cs.IR | 2024-02-12 |
1351 | Lissard: Long and Simple Sequential Reasoning Datasets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Lissard, a benchmark comprising seven tasks whose goal is to assess the ability of models to process and generate wide-range sequence lengths, requiring repetitive procedural execution. |
Mirelle Bueno; Roberto Lotufo; Rodrigo Nogueira; | arxiv-cs.CL | 2024-02-12 |
1352 | CyberMetric: A Benchmark Dataset Based on Retrieval-Augmented Generation for Evaluating LLMs in Cybersecurity Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To accurately test the general knowledge of LLMs in cybersecurity, the research community needs a diverse, accurate, and up-to-date dataset. To address this gap, we present CyberMetric-80, CyberMetric-500, CyberMetric-2000, and CyberMetric-10000, which are multiple-choice Q&A benchmark datasets comprising 80, 500, 2000, and 10,000 questions respectively. |
Norbert Tihanyi; Mohamed Amine Ferrag; Ridhi Jain; Tamas Bisztray; Merouane Debbah; | arxiv-cs.AI | 2024-02-12 |
1353 | Transformer Mechanisms Mimic Frontostriatal Gating Operations When Trained on Human Working Memory Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we analyze the mechanisms that emerge within a vanilla attention-only Transformer trained on a simple sequence modeling task inspired by a task explicitly designed to study working memory gating in computational cognitive neuroscience. |
Aaron Traylor; Jack Merullo; Michael J. Frank; Ellie Pavlick; | arxiv-cs.AI | 2024-02-12 |
1354 | Enhancing Programming Error Messages in Real Time with Generative AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We extend this work by implementing feedback from ChatGPT for all programs submitted to our automated assessment tool, Athene, providing help for compiler, run-time, and logic errors. |
BAILEY KIMMEL et. al. | arxiv-cs.HC | 2024-02-12 |
1355 | Enhancing Multi-Criteria Decision Analysis with AI: Integrating Analytic Hierarchy Process and GPT-4 for Automated Decision Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study presents a new framework that incorporates the Analytic Hierarchy Process (AHP) and Generative Pre-trained Transformer 4 (GPT-4) large language model (LLM), bringing novel approaches to cybersecurity Multiple-criteria Decision Making (MCDA). |
Igor Svoboda; Dmytro Lande; | arxiv-cs.AI | 2024-02-11 |
1356 | Leveraging AI to Advance Science and Computing Education Across Africa: Challenges, Progress and Opportunities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this chapter, we discuss challenges with using AI to advance education across Africa. |
George Boateng; | arxiv-cs.CY | 2024-02-11 |
1357 | Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection Via Retrieval-Augmented GPT-4 and LLaMA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study details our approach for the CASE 2024 Shared Task on Climate Activism Stance and Hate Event Detection, focusing on Hate Speech Detection, Hate Speech Target Identification, and Stance Detection as classification challenges. |
MAREK ŠUPPA et. al. | arxiv-cs.CL | 2024-02-09 |
1358 | Sparse-VQ Transformer: An FFN-Free Framework with Vector Quantization for Enhanced Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, time series data are uniquely challenging due to significant distribution shifts and intrinsic noise levels. To address these two challenges, we introduce the Sparse Vector Quantized FFN-Free Transformer (Sparse-VQ). |
YANJUN ZHAO et. al. | arxiv-cs.LG | 2024-02-08 |
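For readers unfamiliar with vector quantization, the PyTorch sketch below shows a generic VQ bottleneck with a straight-through gradient estimator; it is background for the idea only, not the paper's exact FFN-free Sparse-VQ design.

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    """Generic VQ bottleneck: snap each input vector to its nearest codebook entry."""
    def __init__(self, num_codes: int = 64, dim: int = 32):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, x):  # x: (batch, seq, dim)
        flat = x.reshape(-1, x.shape[-1])                 # (batch*seq, dim)
        dists = torch.cdist(flat, self.codebook.weight)   # distances to every code
        codes = dists.argmin(dim=-1)                      # nearest code index
        quantized = self.codebook(codes).view_as(x)
        # Straight-through estimator: gradients flow around the argmin.
        return x + (quantized - x).detach()

vq = VectorQuantizer()
print(vq(torch.randn(2, 10, 32)).shape)  # torch.Size([2, 10, 32])
```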
1359 | FACT-GPT: Fact-Checking Augmentation Via Claim Matching with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the societal challenge, we introduce FACT-GPT, a system leveraging Large Language Models (LLMs) to automate the claim matching stage of fact-checking. |
Eun Cheol Choi; Emilio Ferrara; | arxiv-cs.CL | 2024-02-08 |
1360 | Efficient Models for The Detection of Hate, Abuse and Profanity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This is unacceptable in civil discourse. The detection of Hate, Abuse and Profanity in text is a vital component of creating civil and unbiased LLMs, which is needed not only for English, but for all languages. In this article, we briefly describe the creation of HAP detectors and various ways of using them to make models civil and acceptable in the output they generate. |
Christoph Tillmann; Aashka Trivedi; Bishwaranjan Bhattacharjee; | arxiv-cs.CL | 2024-02-08 |
1361 | Traditional Machine Learning Models and Bidirectional Encoder Representations From Transformer (BERT)-Based Automatic Classification of Tweets About Eating Disorders: Algorithm Development and Validation Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: Our goal was to identify efficient machine learning models for categorizing tweets related to eating disorders. |
José Alberto Benítez-Andrades; José-Manuel Alija-Pérez; Maria-Esther Vidal; Rafael Pastor-Vargas; María Teresa García-Ordás; | arxiv-cs.CL | 2024-02-08 |
1362 | Generative Pre-Trained Transformer (GPT) in Research: A Systematic Review on Data Augmentation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: GPT (Generative Pre-trained Transformer) represents advanced language models that have significantly reshaped the academic writing landscape. These sophisticated language models … |
F. Sufi; | Inf. | 2024-02-08 |
1363 | Limits of Transformer Language Models on Learning to Compose Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we evaluate training LLaMA models and prompting GPT-4 and Gemini on four tasks that require learning a composition of several discrete sub-tasks. |
JONATHAN THOMM et. al. | arxiv-cs.LG | 2024-02-08 |
1364 | Exploring Visual Culture Awareness in GPT-4V: A Comprehensive Probing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we introduced three vision-related tasks, i.e., caption classification, pairwise captioning, and culture tag selection, to systematically delve into fine-grained visual cultural evaluation. |
YONG CAO et. al. | arxiv-cs.CL | 2024-02-08 |
1365 | Named Entity Recognition for Address Extraction in Speech-to-Text Transcriptions Using Synthetic Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an approach for building a Named Entity Recognition (NER) model built upon a Bidirectional Encoder Representations from Transformers (BERT) architecture, specifically utilizing the SlovakBERT model. |
Bibiána Lajčinová; Patrik Valábek; Michal Spišiak; | arxiv-cs.CL | 2024-02-08 |
1366 | Opening The AI Black Box: Program Synthesis Via Mechanistic Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present MIPS, a novel method for program synthesis based on automated mechanistic interpretability of neural networks trained to perform the desired task, auto-distilling the learned algorithm into Python code. |
ERIC J. MICHAUD et. al. | arxiv-cs.LG | 2024-02-07 |
1367 | Improving Cross-Domain Low-Resource Text Generation Through LLM Post-Editing: A Programmer-Interpreter Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, the editing strategies in these methods are not optimally designed for text-generation tasks. To address these limitations, we propose a neural programmer-interpreter approach that preserves the domain generalization ability of LLMs when editing their output. |
Zhuang Li; Levon Haroutunian; Raj Tumuluri; Philip Cohen; Gholamreza Haffari; | arxiv-cs.CL | 2024-02-07 |
1368 | Spiking-PhysFormer: Camera-Based Remote Photoplethysmography with Parallel Spike-driven Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To the best of our knowledge, we are the first to introduce SNNs into the realm of rPPG, proposing a hybrid neural network (HNN) model, the Spiking-PhysFormer, aimed at reducing power consumption. |
MINGXUAN LIU et. al. | arxiv-cs.CV | 2024-02-07 |
1369 | The Use of A Large Language Model for Cyberbullying Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Several machine learning (ML) algorithms have been proposed for this purpose. |
Bayode Ogunleye; Babitha Dharmaraj; | arxiv-cs.CL | 2024-02-06 |
1370 | The Hedgehog & The Porcupine: Expressive Linear Attentions with Softmax Mimicry IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We thus propose Hedgehog, a learnable linear attention that retains the spiky and monotonic properties of softmax attention while maintaining linear complexity. |
Michael Zhang; Kush Bhatia; Hermann Kumbong; Christopher Ré; | arxiv-cs.LG | 2024-02-06 |
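Linear attention, the setting of the entry above, replaces the softmax score matrix with feature-mapped queries and keys so cost grows linearly in sequence length. Below is a minimal sketch using the common elu(x)+1 feature map as a stand-in; Hedgehog itself learns a spikier, softmax-mimicking map.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v):
    """Linear attention in O(n * d^2). The feature map here is the elu(x)+1
    stand-in from earlier linear-attention work, not Hedgehog's learned map."""
    phi = lambda t: F.elu(t) + 1                       # keeps features positive
    q, k = phi(q), phi(k)
    kv = torch.einsum("bnd,bne->bde", k, v)            # aggregate keys/values once
    num = torch.einsum("bnd,bde->bne", q, kv)          # per-query numerator
    den = torch.einsum("bnd,bd->bn", q, k.sum(dim=1))  # per-query normalizer
    return num / den.unsqueeze(-1).clamp(min=1e-6)

out = linear_attention(torch.randn(2, 128, 16), torch.randn(2, 128, 16),
                       torch.randn(2, 128, 16))
print(out.shape)  # torch.Size([2, 128, 16])
```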
1371 | Chinese Cyberbullying Detection Using XLNet and Deep Bi-LSTM Hybrid Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The popularization of the internet and the widespread use of smartphones have led to a rapid growth in the number of social media users. While information technology has brought … |
Shifeng Chen; Jialin Wang; Ketai He; | Inf. | 2024-02-06 |
1372 | CEHR-GPT: Generating Electronic Health Records with Chronological Patient Timelines Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on synthetic data generation and demonstrate the capability of training a GPT model using a particular patient representation derived from CEHR-BERT, enabling us to generate patient sequences that can be seamlessly converted to the Observational Medical Outcomes Partnership (OMOP) data format. |
CHAO PANG et. al. | arxiv-cs.LG | 2024-02-06 |
1373 | Behind The Screen: Investigating ChatGPT’s Dark Personality Traits and Conspiracy Beliefs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: ChatGPT is notorious for its opaque behavior. This paper aims to shed light on it, providing an in-depth analysis of the dark personality traits and conspiracy beliefs of GPT-3.5 and GPT-4. |
Erik Weber; Jérôme Rutinowski; Markus Pauly; | arxiv-cs.CL | 2024-02-06 |
1374 | MobilityGPT: Enhanced Human Mobility Modeling with A GPT Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these issues, we reformat human mobility modeling as an autoregressive generation task, leveraging Generative Pre-trained Transformer (GPT). To ensure its controllable generation to alleviate the above challenges, we propose a geospatially-aware generative model, MobilityGPT. |
Ammar Haydari; Dongjie Chen; Zhengfeng Lai; Chen-Nee Chuah; | arxiv-cs.LG | 2024-02-05 |
1375 | Reconstruct Your Previous Conversations! Comprehensively Investigating Privacy Leakage Risks in Conversations with GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a straightforward yet potent Conversation Reconstruction Attack. |
Junjie Chu; Zeyang Sha; Michael Backes; Yang Zhang; | arxiv-cs.CR | 2024-02-05 |
1376 | Self-Discover: Large Language Models Self-Compose Reasoning Structures IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the task-intrinsic reasoning structures to tackle complex reasoning problems that are challenging for typical prompting methods. |
PEI ZHOU et. al. | arxiv-cs.AI | 2024-02-05 |
1377 | Pard: Permutation-Invariant Autoregressive Diffusion for Graph Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce PARD, a Permutation-invariant Auto Regressive Diffusion model that integrates diffusion models with autoregressive methods. |
Lingxiao Zhao; Xueying Ding; Leman Akoglu; | arxiv-cs.LG | 2024-02-05 |
1378 | Comparing Abstraction in Humans and Large Language Models Using Multimodal Serial Reproduction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To investigate the effect of language on the formation of abstractions, we implement a novel multimodal serial reproduction framework by asking people who receive a visual stimulus to reproduce it in a linguistic format, and vice versa. |
SREEJAN KUMAR et. al. | arxiv-cs.AI | 2024-02-05 |
1379 | UniMem: Towards A Unified View of Long-Context Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We re-formulate 16 existing methods based on UniMem and cast four representative methods, Transformer-XL, Memorizing Transformer, RMT, and Longformer, into equivalent UniMem forms to reveal their design principles and strengths. Based on these analyses, we propose UniMix, an innovative approach that integrates the strengths of these algorithms. |
JUNJIE FANG et. al. | arxiv-cs.CL | 2024-02-05 |
1380 | A Survey on Transformer Compression IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer plays a vital role in the realms of natural language processing (NLP) and computer vision (CV), especially for constructing large language models (LLM) and large vision … |
YEHUI TANG et. al. | ArXiv | 2024-02-05 |
1381 | Illuminate: A Novel Approach for Depression Detection with Explainable Analysis and Proactive Therapy Using Prompt Engineering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper introduces a novel paradigm for depression detection and treatment using advanced Large Language Models (LLMs): Generative Pre-trained Transformer 4 (GPT-4), Llama 2 … |
Aryan Agrawal; | ArXiv | 2024-02-05 |
1382 | SWAG: Storytelling With Action Guidance Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Storytelling With Action Guidance (SWAG), a novel approach to storytelling with LLMs. |
Zeeshan Patel; Karim El-Refai; Jonathan Pei; Tianle Li; | arxiv-cs.CL | 2024-02-05 |
1383 | Harnessing PubMed User Query Logs for Post Hoc Explanations of Recommended Similar Articles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our major contribution is building PubCLogs by repurposing 5.6 million pairs of coclicked articles from PubMed’s user query logs. |
Ashley Shin; Qiao Jin; James Anibal; Zhiyong Lu; | arxiv-cs.IR | 2024-02-05 |
1384 | Identifying Reasons for Contraceptive Switching from Real-World Data Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We demonstrate that GPT-4 can accurately extract reasons for contraceptive switching, outperforming baseline BERT-based models with microF1 scores of 0.849 and 0.881 for contraceptive start and stop extraction, respectively. |
BRENDA Y. MIAO et. al. | arxiv-cs.CL | 2024-02-05 |
1385 | Evaluating Large Language Models in Analysing Classroom Dialogue Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the application of Large Language Models (LLMs), specifically GPT-4, in the analysis of classroom dialogue, a crucial research task for both teaching diagnosis and quality improvement. |
Yun Long; Haifeng Luo; Yu Zhang; | arxiv-cs.CL | 2024-02-04 |
1386 | DenseFormer: Enhancing Information Flow in Transformers Via Depth Weighted Averaging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The transformer architecture by Vaswani et al. (2017) is now ubiquitous across application domains, from natural language processing to speech processing and image understanding. We propose DenseFormer, a simple modification to the standard architecture that improves the perplexity of the model without increasing its size, adding only a few thousand parameters for large-scale models in the 100B-parameter range. |
Matteo Pagliardini; Amirkeivan Mohtashami; Francois Fleuret; Martin Jaggi; | arxiv-cs.CL | 2024-02-04 |
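A minimal sketch of depth-weighted averaging follows, under the assumption that each block's output is mixed with all earlier representations through a handful of learned scalars; the authors' exact wiring may differ.

```python
import torch
import torch.nn as nn

class DepthWeightedStack(nn.Module):
    """Sketch of DenseFormer-style depth-weighted averaging (details assumed):
    after each block, mix all representations seen so far with learned scalars."""
    def __init__(self, num_blocks: int, dim: int):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
            for _ in range(num_blocks)
        )
        # One scalar per (block, earlier representation) pair: only a few
        # extra parameters regardless of model width.
        self.alphas = nn.ParameterList(
            nn.Parameter(torch.ones(i + 2) / (i + 2)) for i in range(num_blocks)
        )

    def forward(self, x):
        history = [x]
        for block, alpha in zip(self.blocks, self.alphas):
            history.append(block(history[-1]))
            stacked = torch.stack(history, dim=0)          # (depth, batch, seq, dim)
            history[-1] = (alpha.view(-1, 1, 1, 1) * stacked).sum(0)
        return history[-1]

model = DepthWeightedStack(num_blocks=2, dim=32)
print(model(torch.randn(2, 8, 32)).shape)  # torch.Size([2, 8, 32])
```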
1387 | Improving Assessment of Tutoring Practices Using Retrieval-Augmented Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: One-on-one tutoring is an effective instructional method for enhancing learning, yet its efficacy hinges on tutor competencies. Novice math tutors often prioritize … |
ZIFEI HAN et. al. | ArXiv | 2024-02-04 |
1388 | Spin: An Efficient Secure Computation Framework with GPU Acceleration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose optimized protocols for non-linear functions that are critical for machine learning, as well as several novel optimizations specific to attention that is the fundamental unit of Transformer models, allowing Spin to perform non-trivial CNNs training and Transformer inference without sacrificing security. |
WUXUAN JIANG et. al. | arxiv-cs.CR | 2024-02-03 |
1389 | GPT-4V As Traffic Assistant: An In-depth Look at Vision Language Model on Complex Traffic Events Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The advent of large vision-language models (VLMs) such as GPT-4V has introduced innovative approaches to addressing this issue. In this paper, we explore the ability of GPT-4V with a set of representative traffic incident videos and delve into the model’s capacity for understanding these complex traffic situations. |
Xingcheng Zhou; Alois C. Knoll; | arxiv-cs.CV | 2024-02-03 |
1390 | User Intent Recognition and Satisfaction with Large Language Models: A User Study with ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on a fine-grained intent taxonomy and intent-based prompt reformulations, we analyze (1) the quality of intent recognition and (2) user satisfaction with answers from intent-based prompt reformulations for two recent ChatGPT models, GPT-3.5 Turbo and GPT-4 Turbo. |
Anna Bodonhelyi; Efe Bozkir; Shuo Yang; Enkelejda Kasneci; Gjergji Kasneci; | arxiv-cs.HC | 2024-02-03 |
1391 | Data Quality Matters: Suicide Intention Detection on Social Media Posts Using A RoBERTa-CNN Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on identifying suicidal intentions in SuicideWatch Reddit posts and present a novel approach to suicide detection using the cutting-edge RoBERTa-CNN model, a variant of RoBERTa (Robustly optimized BERT approach). |
Emily Lin; Jian Sun; Hsingyu Chen; Mohammad H. Mahoor; | arxiv-cs.CL | 2024-02-03 |
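One plausible reading of a RoBERTa-CNN is a 1-D convolutional head pooled over RoBERTa's token representations; the sketch below assumes that layout, and the kernel size, pooling, and classifier head are illustrative choices rather than the authors' exact architecture.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class RobertaCNN(nn.Module):
    """Assumed RoBERTa-CNN layout: Conv1d + global max pooling over
    RoBERTa token embeddings, then a linear classifier."""
    def __init__(self, num_labels: int = 2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("roberta-base")
        hidden = self.encoder.config.hidden_size
        self.conv = nn.Conv1d(hidden, 128, kernel_size=3, padding=1)
        self.classifier = nn.Linear(128, num_labels)

    def forward(self, **inputs):
        states = self.encoder(**inputs).last_hidden_state      # (batch, seq, hidden)
        feats = torch.relu(self.conv(states.transpose(1, 2)))  # (batch, 128, seq)
        pooled = feats.max(dim=-1).values                      # global max pool
        return self.classifier(pooled)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
batch = tokenizer(["I feel hopeless lately."], return_tensors="pt")
print(RobertaCNN()(**batch).shape)  # torch.Size([1, 2])
```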
1392 | MasonPerplexity at Multimodal Hate Speech Event Detection 2024: Hate Speech and Target Detection Using Transformer Ensembles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the MasonPerplexity submission for the Shared Task on Multimodal Hate Speech Event Detection at CASE 2024 at EACL 2024. |
AMRITA GANGULY et. al. | arxiv-cs.CL | 2024-02-02 |
1393 | ScribFormer: Transformer Makes CNN Work Better for Scribble-based Medical Image Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite its multiple benefits, this framework generally captures only short-range feature dependencies, because convolutional layers have local receptive fields, which makes it difficult to learn global shape information from the limited information provided by scribble annotations. To address this issue, this paper proposes a new CNN-Transformer hybrid solution for scribble-supervised medical image segmentation called ScribFormer. |
ZIHAN LI et. al. | arxiv-cs.CV | 2024-02-02 |
1394 | LLM-Detector: Improving AI-Generated Chinese Text Detection with Open-Source LLM Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: ChatGPT and other general large language models (LLMs) have achieved remarkable success, but they have also raised concerns about the misuse of AI-generated texts. |
RONGSHENG WANG et. al. | arxiv-cs.CL | 2024-02-02 |
1395 | COMET: Generating Commit Messages Using Delta Graph Context Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Comet (Context-Aware Commit Message Generation), a novel approach that captures context of code changes using a graph-based representation and leverages a transformer-based model to generate high-quality commit messages. |
Abhinav Reddy Mandli; Saurabhsingh Rajput; Tushar Sharma; | arxiv-cs.SE | 2024-02-02 |
1396 | Faster Inference of Integer SWIN Transformer By Removing The GELU Activation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we improve upon the inference latency of the state-of-the-art methods by removing the floating-point operations, which are associated with the GELU activation in Swin Transformer. |
Mohammadreza Tayaranian; Seyyed Hasan Mozafari; James J. Clark; Brett Meyer; Warren Gross; | arxiv-cs.CV | 2024-02-02 |
1397 | Generation, Distillation and Evaluation of Motivational Interviewing-Style Reflections with A Foundational Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a method for distilling the generation of reflections from a Foundational Language Model (GPT-4) into smaller models. |
ANDREW BROWN et. al. | arxiv-cs.CL | 2024-02-01 |
1398 | A Transformer-CNN Parallel Network for Image Guided Depth Completion Related Papers Related Patents Related Grants Related Venues Related Experts View |
Tao Li; Xiucheng Dong; Jie Lin; Yonghong Peng; | Pattern Recognit. | 2024-02-01 |
1399 | Ultra Fast Transformers on FPGAs for Particle Physics Experiments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we have implemented critical components of a transformer model, such as multi-head attention and softmax layers. |
ZHIXING JIANG et. al. | arxiv-cs.LG | 2024-02-01 |
1400 | Self-Supervised Contrastive Pre-Training for Multivariate Point Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a new paradigm for self-supervised learning for multivariate point processes using a transformer encoder. |
Xiao Shou; Dharmashankar Subramanian; Debarun Bhattacharjya; Tian Gao; Kristin P. Bennet; | arxiv-cs.LG | 2024-02-01 |
1401 | Rail Surface Defect Detection Using A Transformer-based Network Related Papers Related Patents Related Grants Related Venues Related Experts View |
Feng Guo; Jian Liu; Yu Qian; Quanyi Xie; | J. Ind. Inf. Integr. | 2024-02-01 |
1402 | Spatiotemporal Fusion Transformer for Large-scale Traffic Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
ZHENGHONG WANG et. al. | Inf. Fusion | 2024-02-01 |
1403 | Efficient Image Analysis with Triple Attention Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
GeHui Li; Tongtong Zhao; | Pattern Recognit. | 2024-02-01 |
1404 | Intelligent Fault Diagnosis of Consumer Electronics Sensor in IoE Via Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The IoE era is arriving with the development of information and communication technology. As a typical representative of the intelligent IoE era, consumer electronics products have … |
Wen-Chieh Lin; | IEEE Transactions on Consumer Electronics | 2024-02-01 |
1405 | Comparative Study of Large Language Model Architectures on Frontier Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Employing the same materials science text corpus and a comprehensive end-to-end pipeline, we conduct a comparative analysis of their training and downstream performance. |
Junqi Yin; Avishek Bose; Guojing Cong; Isaac Lyngaas; Quentin Anthony; | arxiv-cs.DC | 2024-02-01 |
1406 | Understanding The Expressive Power and Mechanisms of Transformer for Sequence Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct a systematic study of the approximation properties of Transformer for sequence modeling with long, sparse and complicated memory. |
Mingze Wang; Weinan E; | arxiv-cs.LG | 2024-02-01 |
1407 | Transformer-based Sensor Failure Prediction and Classification Framework for UAVs Related Papers Related Patents Related Grants Related Venues Related Experts View |
MUHAMMAD WAQAS AHMAD et. al. | Expert Syst. Appl. | 2024-02-01 |
1408 | Masked Siamese Prompt Tuning for Few-Shot Natural Language Understanding Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recently, prompt-based learning has shown excellent performance on few-shot scenarios. Using frozen language models to tune trainable continuous prompt embeddings has become a … |
Shiwen Ni; Hung-Yu Kao; | IEEE Transactions on Artificial Intelligence | 2024-02-01 |
1409 | TFMFT: Transformer-based Multiple Fish Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View |
Weiran Li; Yeqiang Liu; Wenxu Wang; Zhenbo Li; Jun Yue; | Comput. Electron. Agric. | 2024-02-01 |
1410 | RobinNet: A Multimodal Speech Emotion Recognition System With Speaker Recognition for Social Interactions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: It is essential to understand the underlying emotions that are imparted through speech in order to study social communications as well as to generate seamless human–computer … |
Yash Khurana; Swamita Gupta; R. Sathyaraj; S. Raja; | IEEE Transactions on Computational Social Systems | 2024-02-01 |
1411 | Identification Method of Interturn Short Circuit Fault for Distribution Transformer Based on Power Loss Variation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Interturn short circuit faults in distribution transformer windings occur frequently and are difficult to monitor accurately in real time, which seriously affects the reliability of … |
R. XIAN et. al. | IEEE Transactions on Industrial Informatics | 2024-02-01 |
1412 | Dendritic Learning-Incorporated Vision Transformer for Image Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View |
Zhiming Zhang; Zhenyu Lei; M. Omura; Hideyuki Hasegawa; Shangce Gao; | IEEE CAA J. Autom. Sinica | 2024-02-01 |
1413 | STFormer: A Dual-stage Transformer Model Utilizing Spatio-temporal Graph Embedding for Multivariate Time Series Forecasting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Multivariate Time Series (MTS) forecasting has gained significant importance in diverse domains. Although Recurrent Neural Network (RNN)-based approaches have made notable … |
Yuteng Xiao; Zhaoyang Liu; Hongsheng Yin; Xingang Wang; Yudong Zhang; | J. Intell. Fuzzy Syst. | 2024-02-01 |
1414 | HARDSEA: Hybrid Analog-ReRAM Clustering and Digital-SRAM In-Memory Computing Accelerator for Dynamic Sparse Self-Attention in Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Self-attention-based transformers have outperformed recurrent and convolutional neural networks (RNNs/CNNs) in many applications. Despite the effectiveness, calculating … |
SHIWEI LIU et. al. | IEEE Transactions on Very Large Scale Integration (VLSI) … | 2024-02-01 |
1415 | COA-GPT: Generative Pre-trained Transformers for Accelerated Course of Action Development in Military Operations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The development of Courses of Action (COAs) in military operations is traditionally a time-consuming and intricate process. Addressing this challenge, this study introduces COA-GPT, a novel algorithm employing Large Language Models (LLMs) for rapid and efficient generation of valid COAs. |
Vinicius G. Goecks; Nicholas Waytowich; | arxiv-cs.AI | 2024-02-01 |
1416 | Lightweight Transformer Image Feature Extraction Network IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In recent years, the image feature extraction method based on Transformer has become a research hotspot. However, when using Transformer for image feature extraction, the model’s … |
Wenfeng Zheng; Siyu Lu; Youshuai Yang; Zhengtong Yin; Lirong Yin; | PeerJ Comput. Sci. | 2024-01-31 |
1417 | Global-Liar: Factuality of LLMs Over Time and Geographic Regions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce ‘Global-Liar,’ a dataset uniquely balanced in terms of geographic and temporal representation, facilitating a more nuanced evaluation of LLM biases. |
Shujaat Mirza; Bruno Coelho; Yuyuan Cui; Christina Pöpper; Damon McCoy; | arxiv-cs.CL | 2024-01-31 |
1418 | Paramanu: A Family of Novel Efficient Indic Generative Foundation Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We present Gyan AI Paramanu (atom), a family of novel language models for Indian languages. It is a collection of auto-regressive monolingual, bilingual, and multilingual Indic … |
Mitodru Niyogi; Arnab Bhattacharya; | ArXiv | 2024-01-31 |
1419 | Program Code Generation with Generative AIs Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Our paper compares the correctness, efficiency, and maintainability of human-generated and AI-generated program code. For that, we analyzed the computational resources of AI- and … |
Baskhad Idrisov; Tim Schlippe; | Algorithms | 2024-01-31 |
1420 | Evaluating The Effectiveness of GPT-4 Turbo in Creating Defeaters for Assurance Cases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Identifying defeaters, arguments that refute these ACs, is essential for improving the robustness and confidence in ACs. To automate this task, we introduce a novel method that leverages the capabilities of GPT-4 Turbo, an advanced Large Language Model (LLM) developed by OpenAI, to identify defeaters within ACs formalized using the Eliminative Argumentation (EA) notation. |
KIMYA KHAKZAD SHAHANDASHTI et. al. | arxiv-cs.SE | 2024-01-31 |
1421 | Fine-Tuning and Prompt Engineering for Large Language Models-based Code Review Automation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: We aim to investigate the performance of LLMs-based code review automation based on two contexts, i.e., when LLMs are leveraged by fine-tuning and prompting. |
Chanathip Pornprasit; Chakkrit Tantithamthavorn; | arxiv-cs.SE | 2024-01-31 |
1422 | Human-mediated Large Language Models for Robotic Intervention in Children with Autism Spectrum Disorders Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This practice restricts the use of robots to limited, pre-mediated instructional curricula. In this paper, we increase robot autonomy in one such robotic intervention for children with ASD by implementing perspective-taking teaching. |
Ruchik Mishra; Karla Conn Welch; Dan O Popa; | arxiv-cs.RO | 2024-01-31 |
1423 | Spatial-Spectral BERT for Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Several deep learning and transformer models have been recommended in previous research to deal with the classification of hyperspectral images (HSIs). Among them, one of the most … |
MAHMOOD ASHRAF et. al. | Remote. Sens. | 2024-01-31 |
1424 | Towards AI-Assisted Synthesis of Verified Dafny Methods IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we demonstrate how to improve two pretrained models’ proficiency in the Dafny verification-aware language. |
Md Rakib Hossain Misu; Cristina V. Lopes; Iris Ma; James Noble; | arxiv-cs.SE | 2024-01-31 |
1425 | ConSmax: Hardware-Friendly Alternative Softmax with Learnable Parameters Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Apart from the non-linearity, the low arithmetic intensity greatly reduces the processing parallelism, which becomes the bottleneck especially when dealing with a longer context. To address this challenge, we propose Constant Softmax (ConSmax), a software-hardware co-design as an efficient Softmax alternative. |
SHIWEI LIU et. al. | arxiv-cs.AR | 2024-01-31 |
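Going by the description above, a softmax alternative with learnable parameters could replace the per-row max and sum reductions, which serialize hardware pipelines, with trained scalars. The sketch below encodes that assumption and should not be read as the paper's precise ConSmax formulation.

```python
import torch
import torch.nn as nn

class ConSmaxSketch(nn.Module):
    """Assumed softmax alternative: learnable constants stand in for the
    per-row max (beta) and the normalizing denominator (gamma), removing
    the two reduction passes that ordinary softmax requires."""
    def __init__(self):
        super().__init__()
        self.beta = nn.Parameter(torch.zeros(1))   # trained stand-in for the max
        self.gamma = nn.Parameter(torch.ones(1))   # trained stand-in for the sum

    def forward(self, scores):
        return torch.exp(scores - self.beta) / self.gamma

attn = ConSmaxSketch()(torch.randn(2, 4, 4))
print(attn.shape)  # rows need not sum exactly to 1, unlike true softmax
```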
1426 | BurstGPT: A Real-world Workload Dataset to Optimize LLM Serving Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work presents BurstGPT, an LLM serving workload with 5.29 million traces from regional Azure OpenAI GPT services over 121 days. |
YUXIN WANG et. al. | arxiv-cs.DC | 2024-01-31 |
1427 | Scavenging Hyena: Distilling Transformers Into Long Convolution Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a pioneering approach to address the efficiency concerns associated with LLM pre-training, proposing the use of knowledge distillation for cross-architecture transfer. |
Tokiniaina Raharison Ralambomihanta; Shahrad Mohammadzadeh; Mohammad Sami Nur Islam; Wassim Jabbour; Laurence Liang; | arxiv-cs.CL | 2024-01-30 |
1428 | Arabic Tweet Act: A Weighted Ensemble Pre-Trained Transformer Model for Classifying Arabic Speech Acts on Twitter Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a Twitter dialectal Arabic speech act classification approach based on a transformer deep learning neural network. |
Khadejaa Alshehri; Areej Alhothali; Nahed Alowidi; | arxiv-cs.CL | 2024-01-30 |
1429 | SAL-PIM: A Subarray-level Processing-in-Memory Architecture with LUT-based Linear Interpolation for Transformer-based Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SAL-PIM, a subarray-level, HBM-based processing-in-memory architecture for the end-to-end acceleration of transformer-based text generation. |
Wontak Han; Hyunjun Cho; Donghyuk Kim; Joo-Young Kim; | arxiv-cs.AR | 2024-01-30 |
1430 | Enhancing Product Design Through AI-Driven Sentiment Analysis of Amazon Reviews Using BERT Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Understanding customer emotions and preferences is paramount for success in the dynamic product design landscape. This paper presents a study to develop a prediction pipeline to … |
Mahammad Khalid Shaik Vadla; Mahima Agumbe Suresh; V. Viswanathan; | Algorithms | 2024-01-30 |
1431 | Fine-tuning Transformer-based Encoder for Turkish Language Understanding Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we provide a Transformer-based model and a baseline benchmark for the Turkish Language. |
Savas Yildirim; | arxiv-cs.CL | 2024-01-30 |
1432 | 3DG: A Framework for Using Generative AI for Handling Sparse Learner Performance Data From Intelligent Tutoring Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Learning performance data (e.g., quiz scores and attempts) is significant for understanding learner engagement and knowledge mastery level. However, the learning performance data … |
Liang Zhang; Jionghao Lin; Conrad Borchers; Meng Cao; Xiangen Hu; | ArXiv | 2024-01-29 |
1433 | More Than Meets The AI: Evaluating The Performance of GPT-4 on Computer Graphics Assessment Questions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies have showcased the exceptional performance of LLMs (Large Language Models) on assessment questions across various discipline areas. This can be helpful if used to … |
Tony Haoran Feng; Paul Denny; Burkhard C. Wünsche; Andrew Luxton-Reilly; Steffan Hooper; | Proceedings of the 26th Australasian Computing Education … | 2024-01-29 |
1434 | Response Generation for Cognitive Behavioral Therapy with Large Language Models: Comparative Study with Socratic Questioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we construct dialogue modules based on a CBT scenario focused on conventional Socratic questioning using two kinds of LLMs: a Transformer-based dialogue model further trained with a social media empathetic counseling dataset, provided by Osaka Prefecture (OsakaED), and GPT-4, a state-of-the-art LLM created by OpenAI. |
KENTA IZUMI et. al. | arxiv-cs.CL | 2024-01-29 |
1435 | TQCompressor: Improving Tensor Decomposition Methods in Neural Networks Via Permutations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce TQCompressor, a novel method for neural network model compression with improved tensor decompositions. |
V. ABRONIN et. al. | arxiv-cs.LG | 2024-01-29 |
1436 | Breaking Free Transformer Models: Task-specific Context Attribution Promises Improved Generalizability Without Fine-tuning Pre-trained LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a framework that allows for maintaining generalizability, and enhances the performance on the downstream task by utilizing task-specific context attribution. |
Stepan Tytarenko; Mohammad Ruhul Amin; | arxiv-cs.CL | 2024-01-29 |
1437 | You Tell Me: A Dataset of GPT-4-Based Behaviour Change Support Conversations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Conversational agents are increasingly used to address emotional needs on top of information needs. One use case of increasing interest are counselling-style mental health and … |
Selina Meyer; David Elsweiler; | arxiv-cs.HC | 2024-01-29 |
1438 | Leveraging Professional Radiologists’ Expertise to Enhance LLMs’ Evaluation for Radiology Reports Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In radiology, Artificial Intelligence (AI) has significantly advanced report generation, but automatic evaluation of these AI-produced reports remains challenging. Current … |
QINGQING ZHU et. al. | ArXiv | 2024-01-29 |
1439 | Leveraging Professional Radiologists’ Expertise to Enhance LLMs’ Evaluation for Radiology Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Experimental results show that our Detailed GPT-4 (5-shot) model achieves a 0.48 score, outperforming the METEOR metric by 0.19, while our Regressed GPT-4 model shows even greater alignment with expert evaluations, exceeding the best existing metric by a 0.35 margin. |
QINGQING ZHU et. al. | arxiv-cs.CL | 2024-01-29 |
1440 | An Insight Into Security Code Review with LLMs: Capabilities, Obstacles and Influential Factors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we conducted an empirical study to explore the potential of LLMs in detecting security defects during code review. |
JIAXIN YU et. al. | arxiv-cs.SE | 2024-01-29 |
1441 | Evaluating LLM – Generated Multimodal Diagnosis from Medical Images and Symptom Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) constitute a breakthrough state-of-the-art Artificial Intelligence technology which is rapidly evolving and promises to aid in medical diagnosis. … |
Dimitrios P. Panagoulias; M. Virvou; G. Tsihrintzis; | ArXiv | 2024-01-28 |
1442 | Contrastive Learning and Mixture of Experts Enables Precise Vector Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We apply a novel Mixture of Experts (MoE) extension pipeline to pretrained BERT models, where every multi-layer perceptron section is enlarged and copied into multiple distinct experts. |
Logan Hallee; Rohan Kapur; Arjun Patel; Jason P. Gleghorn; Bohdan Khomtchouk; | arxiv-cs.LG | 2024-01-28 |
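The copy-into-experts step described above can be illustrated with a small mixture-of-experts layer: a pretrained feed-forward block is cloned into several experts and combined by a learned router. The dense softmax router below is a common choice assumed for illustration, not necessarily the authors' routing scheme.

```python
import copy
import torch
import torch.nn as nn

class MoEFFN(nn.Module):
    """Sketch (details assumed): clone a pretrained feed-forward block into
    several experts and mix their outputs with a learned softmax router."""
    def __init__(self, ffn: nn.Module, num_experts: int = 4, dim: int = 768):
        super().__init__()
        self.experts = nn.ModuleList(copy.deepcopy(ffn) for _ in range(num_experts))
        self.router = nn.Linear(dim, num_experts)

    def forward(self, x):  # x: (batch, seq, dim)
        weights = self.router(x).softmax(dim=-1)                  # (batch, seq, E)
        outputs = torch.stack([e(x) for e in self.experts], -1)   # (batch, seq, dim, E)
        return (outputs * weights.unsqueeze(2)).sum(-1)

ffn = nn.Sequential(nn.Linear(768, 3072), nn.GELU(), nn.Linear(3072, 768))
print(MoEFFN(ffn)(torch.randn(2, 5, 768)).shape)  # torch.Size([2, 5, 768])
```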
1443 | Identifying and Improving Disability Bias in GPT-Based Resume Screening Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, without examining the potential of bias, this may negatively impact marginalized populations, including people with disabilities. To address this important concern, we present a resume audit study, in which we ask ChatGPT (specifically, GPT-4) to rank a resume against the same resume enhanced with an additional leadership award, scholarship, panel presentation, and membership that are disability related. |
Kate Glazko; Yusuf Mohammed; Ben Kosa; Venkatesh Potluri; Jennifer Mankoff; | arxiv-cs.CY | 2024-01-28 |
1444 | UnMASKed: Quantifying Gender Biases in Masked Language Models Through Linguistically Informed Job Market Prompts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluated six prominent models: BERT, RoBERTa, DistilBERT, BERT-multilingual, XLM-RoBERTa, and DistilBERT-multilingual. |
Iñigo Parra; | arxiv-cs.CL | 2024-01-28 |
1445 | Semantics of Multiword Expressions in Transformer-Based Models: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Addressing this gap, we provide the first in-depth survey of MWE processing with transformer models. We overall find that they capture MWE semantics inconsistently, as shown by reliance on surface patterns and memorized information. |
Filip Miletić; Sabine Schulte im Walde; | arxiv-cs.CL | 2024-01-27 |
1446 | A New Method for Vehicle Logo Recognition Based on Swin Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we implement real-time VLR using Swin Transformer and fine-tune it for optimal performance. |
Yang Li; Doudou Zhang; Jianli Xiao; | arxiv-cs.CV | 2024-01-27 |
1447 | Large Language Model for Vulnerability Detection: Emerging Results and Future Directions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the effectiveness of LLMs in detecting software vulnerabilities is largely unexplored. This paper aims to bridge this gap by exploring how LLMs perform with various prompts, particularly focusing on two state-of-the-art LLMs: GPT-3.5 and GPT-4. |
Xin Zhou; Ting Zhang; David Lo; | arxiv-cs.SE | 2024-01-27 |
1448 | Scalable Qualitative Coding with LLMs: Chain-of-Thought Reasoning Matches Human Performance in Some Hermeneutic Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Importantly, we find that coding fidelity improves considerably when the LLM is prompted to give rationale justifying its coding decisions (chain-of-thought reasoning). We present these and other findings along with a set of best practices for adapting traditional codebooks for LLMs. |
Zackary Okun Dunivin; | arxiv-cs.CL | 2024-01-26 |
1449 | Dynamic Transformer Architecture for Continual Learning of Multimodal Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a transformer-based CL framework focusing on learning tasks that involve both vision and language, known as Vision-and-Language (VaL) tasks. |
Yuliang Cai; Mohammad Rostami; | arxiv-cs.CV | 2024-01-26 |
1450 | From GPT-4 to Gemini and Beyond: Assessing The Landscape of MLLMs on Generalizability, Trustworthiness and Causality Through Four Modalities IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Multi-modal Large Language Models (MLLMs) have shown impressive abilities in generating reasonable responses with respect to multi-modal contents. |
CHAOCHAO LU et. al. | arxiv-cs.CV | 2024-01-26 |
1451 | (Chat)GPT V BERT: Dawn of Justice for Semantic Change Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we specifically focus on the temporal problem of semantic change, and evaluate their ability to solve two diachronic extensions of the Word-in-Context (WiC) task: TempoWiC and HistoWiC. |
Francesco Periti; Haim Dubossarsky; Nina Tahmasebi; | arxiv-cs.CL | 2024-01-25 |
1452 | Investigate-Consolidate-Exploit: A General Strategy for Inter-Task Agent Self-Evolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Investigate-Consolidate-Exploit (ICE), a novel strategy for enhancing the adaptability and flexibility of AI agents through inter-task self-evolution. |
CHENG QIAN et. al. | arxiv-cs.CL | 2024-01-25 |
1453 | Chat GPT for Professional English Course Development Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The digitalization of all spheres of life is a reality of modern world development. Global digitization creates a powerful information environment, and navigating it requires serious … |
I. KOSTIKOVA et. al. | Int. J. Interact. Mob. Technol. | 2024-01-25 |
1454 | Relative Value Biases in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Studies of reinforcement learning in humans and animals have demonstrated a preference for options that yielded relatively better outcomes in the past, even when those options are associated with lower absolute reward. |
William M. Hayes; Nicolas Yax; Stefano Palminteri; | arxiv-cs.CL | 2024-01-25 |
1455 | MEDs for PETs: Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We train a multilingual transformer model (XLM-RoBERTa) to disambiguate potentially euphemistic terms (PETs) in multilingual and cross-lingual settings. |
PATRICK LEE et. al. | arxiv-cs.CL | 2024-01-25 |
1456 | Evaluating GPT-3.5’s Awareness and Summarization Abilities for European Constitutional Texts with Shared Topics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, using the renowned GPT-3.5, we leverage generative large language models to understand constitutional passages that transcend national boundaries. |
Candida M. Greco; A. Tagarelli; | arxiv-cs.CL | 2024-01-25 |
1457 | When Geoscience Meets Generative AI and Large Language Models: Foundations, Trends, and Future Challenges Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative Artificial Intelligence (GAI) represents an emerging field that promises the creation of synthetic data and outputs in different modalities. GAI has recently shown … |
A. Hadid; Tanujit Chakraborty; Daniel Busby; | ArXiv | 2024-01-25 |
1458 | An In-Depth Review of ChatGPT’s Pros and Cons for Learning and Teaching in Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: As technology progresses, there has been an increasing interest in using Chatbot GPT (Generative Pre-trained Transformer) in education. Chatbot GPT, or ChatGPT, gained one million … |
A. Samala; Xiaoming Zhai; Kumiko Aoki; Ljubiša Bojić; Simona Žikić; | Int. J. Interact. Mob. Technol. | 2024-01-25 |
1459 | Unmasking and Quantifying Racial Bias of Large Language Models in Medical Report Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite few attempts in the past, the precise impact and extent of these biases remain uncertain. Through both qualitative and quantitative analyses, we find that these models tend to project higher costs and longer hospitalizations for White populations and exhibit optimistic views in challenging medical scenarios with much higher survival rates. |
Yifan Yang; Xiaoyu Liu; Qiao Jin; Furong Huang; Zhiyong Lu; | arxiv-cs.CL | 2024-01-24 |
1460 | A Comparative Study of Zero-shot Inference with Large Language Models and Supervised Modeling in Breast Cancer Pathology Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explored whether recent LLMs can reduce the need for large-scale data annotations. |
MADHUMITA SUSHIL et. al. | arxiv-cs.CL | 2024-01-24 |
1461 | Automated Root Causing of Cloud Incidents Using In-Context Learning with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the high cost of fine-tuning LLM, we propose an in-context learning approach for automated root causing, which eliminates the need for fine-tuning. |
XUCHAO ZHANG et. al. | arxiv-cs.CL | 2024-01-24 |
1462 | Learning Daily Human Mobility with A Transformer-Based Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The generation and prediction of daily human mobility patterns have raised significant interest in many scientific disciplines. Using various data sources, previous studies have … |
Weiying Wang; T. Osaragi; | ISPRS Int. J. Geo Inf. | 2024-01-24 |
1463 | Discovering Mathematical Formulas from Data Via GPT-guided Monte Carlo Tree Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To optimize the trade-off between efficiency and versatility, we introduce SR-GPT, a novel algorithm for symbolic regression that integrates Monte Carlo Tree Search (MCTS) with a Generative Pre-Trained Transformer (GPT). |
YANJIE LI et. al. | arxiv-cs.LG | 2024-01-24 |
1464 | Can GPT-3.5 Generate and Code Discharge Summaries? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We report micro- and macro-F1 scores on the full codeset, generation codes, and their families. |
MATÚŠ FALIS et. al. | arxiv-cs.CL | 2024-01-24 |
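For readers comparing the two averaging schemes named in the highlight: micro-F1 pools every individual decision into one contingency table, while macro-F1 averages per-class scores, so rare codes weigh more heavily under the macro view. A short scikit-learn example on toy labels:

```python
from sklearn.metrics import f1_score

# Toy multi-class predictions illustrating the two averaging schemes.
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 1, 1, 1]

print(f1_score(y_true, y_pred, average="micro"))  # pools all decisions equally
print(f1_score(y_true, y_pred, average="macro"))  # averages per-class F1 scores
```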
1465 | ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce ConTextual, a novel dataset featuring human-crafted instructions that require context-sensitive reasoning for text-rich images. |
Rohan Wadhawan; Hritik Bansal; Kai-Wei Chang; Nanyun Peng; | arxiv-cs.CV | 2024-01-24 |
1466 | Convolutional Initialization for Data-Efficient Vision Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In contrast, convolutional neural networks (CNNs) can achieve state-of-the-art performance by leveraging their architectural inductive bias. In this paper, we investigate whether this inductive bias can be reinterpreted as an initialization bias within a vision transformer network. |
Jianqiao Zheng; Xueqian Li; Simon Lucey; | arxiv-cs.CV | 2024-01-23 |
1467 | TAT-LLM: A Specialized Language Model for Discrete Reasoning Over Tabular and Textual Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address question answering (QA) over a hybrid of tabular and textual data that are very common content on the Web (e.g. SEC filings), where discrete reasoning capabilities are often required. |
FENGBIN ZHU et. al. | arxiv-cs.CL | 2024-01-23 |
1468 | Contrastive Learning in Distilled Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these models have yet to perform well on Semantic Textual Similarity, and may be too large to be deployed as lightweight edge applications. We seek to apply a suitable contrastive learning method based on the SimCSE paper to a model architecture adapted from a knowledge distillation based model, DistilBERT, to address these two issues. |
Valerie Lim; Kai Wen Ng; Kenneth Lim; | arxiv-cs.CL | 2024-01-22 |
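The SimCSE recipe referenced above treats two dropout-perturbed encodings of the same sentence as a positive pair and trains with an InfoNCE objective. The sketch below applies the core loss computation to DistilBERT; the first-token pooling and the temperature value are assumed choices, not the authors' settings.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
encoder = AutoModel.from_pretrained("distilbert-base-uncased")
encoder.train()  # keep dropout active: the two passes then differ only by dropout

batch = tokenizer(["a cat sits", "stocks fell today"],
                  return_tensors="pt", padding=True)
z1 = encoder(**batch).last_hidden_state[:, 0]  # first-token embeddings, view 1
z2 = encoder(**batch).last_hidden_state[:, 0]  # view 2 (different dropout mask)

# InfoNCE: each sentence should match its own second view, not the others'.
sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / 0.05
loss = F.cross_entropy(sim, torch.arange(len(sim)))
print(loss.item())
```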
1469 | Enhancing In-context Learning Via Linear Probe Calibration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This approach uses prompts that include in-context demonstrations to generate the corresponding output for a new query input. |
MOMIN ABBAS et. al. | arxiv-cs.CL | 2024-01-22 |
1470 | Fine-tuning Large Language Models for Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on Subtask A & B. Each subtask is supported by three datasets for training, development, and testing. |
FENG XIONG et. al. | arxiv-cs.CL | 2024-01-22 |
1471 | Freely Long-Thinking Transformer (FraiLT) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Freely Long-Thinking Transformer (FraiLT) is an improved transformer model designed to enhance processing capabilities without scaling up size. It utilizes a recursive approach, iterating over a subset of layers multiple times, and introduces iteration encodings to maintain awareness across these cycles. |
Akbay Tabak; | arxiv-cs.LG | 2024-01-21 |
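Based on the description above, reusing a shared layer subset across several passes with iteration encodings might look like the following sketch; exactly where FraiLT injects the encodings is an assumption here.

```python
import torch
import torch.nn as nn

class RecursiveBlockSketch(nn.Module):
    """Assumed FraiLT-style recursion: one shared layer is applied several
    times, with an iteration embedding telling it which pass it is on."""
    def __init__(self, dim: int = 32, num_iters: int = 3):
        super().__init__()
        self.shared = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.iter_embed = nn.Embedding(num_iters, dim)
        self.num_iters = num_iters

    def forward(self, x):
        for i in range(self.num_iters):
            # Same weights each pass; only the iteration encoding changes.
            x = self.shared(x + self.iter_embed.weight[i])
        return x

print(RecursiveBlockSketch()(torch.randn(2, 8, 32)).shape)  # torch.Size([2, 8, 32])
```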
1472 | Revolutionizing Finance with LLMs: An Overview of Applications and Insights IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we provide a comprehensive overview of the emerging integration of LLMs into various financial tasks. |
HUAQIN ZHAO et. al. | arxiv-cs.CL | 2024-01-21 |
1473 | CheX-GPT: Harnessing Large Language Models for Enhanced Chest X-ray Report Labeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Free-text radiology reports present a rich data source for various medical tasks, but effectively labeling these texts remains challenging. |
JAWOOK GU et. al. | arxiv-cs.CL | 2024-01-21 |
1474 | Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a fault protection mechanism that incurs zero space cost. |
BINGBING LI et. al. | arxiv-cs.LG | 2024-01-21 |
1475 | Unfair TOS: An Automated Approach Using Customized BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research paper, we present SOTA (state-of-the-art) results on unfair clause detection from ToS documents, based on unprecedented custom BERT fine-tuning in conjunction with an SVC (Support Vector Classifier). |
Bathini Sai Akash; Akshara Kupireddy; Lalita Bhanu Murthy; | arxiv-cs.CL | 2024-01-20 |
1476 | Visualization Generation with Large Language Models: An Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we evaluate the capability of a large language model to generate visualization specifications on the task of natural language to visualization (NL2VIS). |
GUOZHENG LI et. al. | arxiv-cs.HC | 2024-01-20 |
1477 | Evaluating and Enhancing Large Language Models Performance in Domain-specific Medicine: Osteoarthritis Management with DocOA Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The efficacy of large language models (LLMs) in domain-specific medicine, particularly for managing complex diseases such as osteoarthritis (OA), remains largely unexplored. This … |
XI CHEN et. al. | ArXiv | 2024-01-20 |
1478 | Mining Experimental Data from Materials Science Literature with Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study is dedicated to assessing the capabilities of large language models (LLMs) such as GPT-3.5-Turbo, GPT-4, and GPT-4-Turbo in extracting structured information from … |
Luca Foppiano; Guillaume Lambard; Toshiyuki Amagasa; Masashi Ishii; | ArXiv | 2024-01-19 |
1479 | Cross-lingual Editing in Multilingual Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For more comprehensive information, the dataset used in this research and the associated code are publicly available at the following URL: https://github.com/lingo-iitgn/XME |
Himanshu Beniwal; Kowsik Nandagopan D; Mayank Singh; | arxiv-cs.CL | 2024-01-19 |
1480 | Mining Experimental Data from Materials Science Literature with Large Language Models: An Evaluation Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel methodology for the comparative analysis of intricate material expressions, emphasising the standardisation of chemical formulas to tackle the complexities inherent in materials science information assessment. |
Luca Foppiano; Guillaume Lambard; Toshiyuki Amagasa; Masashi Ishii; | arxiv-cs.CL | 2024-01-19 |
1481 | Emotion Classification In Software Engineering Texts: A Comparative Analysis of Pre-trained Transformers Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a comparative analysis of state-of-the-art Pre-trained Language Models (PTMs) for fine-grained emotion classification on two benchmark datasets from GitHub and Stack Overflow. |
Mia Mohammad Imran; | arxiv-cs.SE | 2024-01-19 |
1482 | Speech Swin-Transformer: Exploring A Hierarchical Transformer with Shifted Windows for Speech Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In speech signals, emotional information is distributed across different scales of speech features, e.g., word, phrase, and utterance. Drawing on this inspiration, this paper presents a hierarchical speech Transformer with shifted windows to aggregate multi-scale emotion features for speech emotion recognition (SER), called Speech Swin-Transformer. |
YONG WANG et. al. | arxiv-cs.CL | 2024-01-19 |
1483 | DB-GPT: Large Language Model Meets Database IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xuanhe Zhou; Zhaoyan Sun; Guoliang Li; | Data Sci. Eng. | 2024-01-19 |
1484 | Custom Developer GPT for Ethical AI Solutions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main goal of this project is to create a new software artefact: a custom Generative Pre-trained Transformer (GPT) for developers to discuss and solve ethical issues through AI engineering. |
Lauren Olson; | arxiv-cs.SE | 2024-01-19 |
1485 | Image Recoloring for Color Vision Deficiency Compensation Using Swin Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
LIGENG CHEN et. al. | Neural Comput. Appl. | 2024-01-18 |
1486 | Improving The Accuracy of Analog-Based In-Memory Computing Accelerators Post-Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose two Post-Training (PT) optimization methods to improve accuracy after training is performed. |
COREY LAMMIE et. al. | arxiv-cs.ET | 2024-01-18 |
1487 | ChatQA: Surpassing GPT-4 on Conversational QA and RAG IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce ChatQA, a suite of models that outperform GPT-4 on retrieval-augmented generation (RAG) and conversational question answering (QA). |
ZIHAN LIU et. al. | arxiv-cs.CL | 2024-01-18 |
1488 | Gender Bias in Machine Translation and The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This chapter examines the role of Machine Translation in perpetuating gender bias, highlighting the challenges posed by cross-linguistic settings and statistical dependencies. |
Eva Vanmassenhove; | arxiv-cs.CL | 2024-01-18 |
1489 | GPT in Sheep’s Clothing: The Risk of Customized GPTs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We aim to raise awareness of the fact that GPTs can be used maliciously, posing privacy and security risks to their users. |
SAGIV ANTEBI et. al. | arxiv-cs.CR | 2024-01-17 |
1490 | Machines Do See Color: A Guideline to Classify Different Forms of Racist Discourse in Large Corpora Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article provides a step-by-step generalizable guideline to identify and classify different forms of racist discourse in large corpora. In our approach, we start by conceptualizing racism and its different manifestations. |
Diana Davila Gordillo; Joan Timoneda; Sebastian Vallejo Vera; | arxiv-cs.CL | 2024-01-17 |
1491 | Efficient Slot Labelling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a lightweight method which performs on par with or better than the state-of-the-art PLM-based methods, while having almost 10x fewer trainable parameters. |
Vladimir Vlasov; | arxiv-cs.CL | 2024-01-17 |
1492 | Land Cover Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We compare convolutional neural networks (CNN) against transformer-based methods, showcasing their applications and advantages in LC studies. |
Antonio Rangel; Juan Terven; Diana M. Cordova-Esparza; E. A. Chavez-Urbiola; | arxiv-cs.CV | 2024-01-17 |
1493 | Human Vs. LMMs: Exploring The Discrepancy in Emoji Interpretation and Usage in Digital Communication Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Leveraging Large Multimodal Models (LMMs) to simulate human behaviors when processing multimodal information, especially in the context of social media, has garnered immense interest due to its broad potential and far-reaching implications. |
Hanjia Lyu; Weihong Qi; Zhongyu Wei; Jiebo Luo; | arxiv-cs.CV | 2024-01-16 |
1494 | Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This communication bottleneck exacerbates the already complex computational landscape, hindering the efficient utilization of high-performance computing resources. In this paper, we propose a lightweight optimization technique called ExFlow to substantially accelerate the inference of these MoE models. |
Jinghan Yao; Quentin Anthony; Aamir Shafi; Hari Subramoni; Dhabaleswar K.; | arxiv-cs.LG | 2024-01-16 |
1495 | Hidden Flaws Behind Expert-level Accuracy of Multimodal GPT-4 Vision in Medicine Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we discovered that GPT-4V frequently presents flawed rationales in cases where it makes the correct final choices (35.5%), most prominently in image comprehension (27.2%). Despite GPT-4V’s high accuracy on multiple-choice questions, our findings emphasize the necessity for further in-depth evaluation of its rationales before integrating such multimodal AI models into clinical workflows. |
QIAO JIN et. al. | arxiv-cs.CV | 2024-01-16 |
1496 | Enhancing Robustness of LLM-Synthetic Text Detectors for Academic Writing: A Comprehensive Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a comprehensive analysis of the impact of prompts on the text generated by LLMs and highlight the potential lack of robustness in one of the current state-of-the-art GPT detectors. |
Zhicheng Dou; Yuchen Guo; Ching-Chun Chang; Huy H. Nguyen; Isao Echizen; | arxiv-cs.CL | 2024-01-15 |
1497 | Cascaded Cross-Modal Transformer for Audio-Textual Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To attain superior classification performance, we propose to harness the inherent value of multimodal representations by transcribing speech using automatic speech recognition (ASR) models and translating the transcripts into different languages via pretrained translation models (a minimal sketch of this cascade appears after the table). |
Nicolae-Catalin Ristea; Andrei Anghel; Radu Tudor Ionescu; | arxiv-cs.CL | 2024-01-15 |
1498 | Towards Efficient Methods in Medical Question Answering Using Knowledge Graph Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, in-domain pre-training is expensive in terms of time and resources. In this paper, we propose a resource-efficient approach for injecting domain knowledge into a model without relying on such domain-specific pre-training. |
Saptarshi Sengupta; Connor Heaton; Suhan Cui; Soumalya Sarkar; Prasenjit Mitra; | arxiv-cs.CL | 2024-01-15 |
1499 | Exploiting GPT-4 Vision for Zero-shot Point Cloud Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we tackle the challenge of classifying the object category in point clouds, which previous works like PointCLIP struggle to address due to the inherent limitations of the CLIP architecture. |
Qi Sun; Xiao Cui; Wengang Zhou; Houqiang Li; | arxiv-cs.CV | 2024-01-15 |
1500 | Interference-Robust Millimeter-Wave Radar-Based Dynamic Hand Gesture Recognition Using 2-D CNN-Transformer Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Dynamic gesture recognition using millimeter-wave radar has a broad application prospect in the industrial Internet of Things (IoT) field. However, the existing methods in the … |
Biao Jin; Xiao Ma; Zhenkai Zhang; Zhuxian Lian; Biao Wang; | IEEE Internet of Things Journal | 2024-01-15 |
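For entry 1468, a minimal sketch of the unsupervised SimCSE objective applied to DistilBERT, as the highlight describes. This is not the authors' code: the checkpoint, mean pooling, and batch are assumptions, and the temperature value follows the SimCSE paper.

```python
# Minimal sketch: unsupervised SimCSE-style contrastive loss on DistilBERT.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
encoder = AutoModel.from_pretrained("distilbert-base-uncased")
encoder.train()  # keep dropout active: it supplies the two "views" of each sentence

def embed(batch):
    # Mean-pool token states into one vector per sentence.
    out = encoder(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    return (out * mask).sum(1) / mask.sum(1)

sentences = ["a man is playing guitar", "a woman reads a book"]  # toy batch
batch = tokenizer(sentences, padding=True, return_tensors="pt")

# Two forward passes; dropout makes z1 != z2 for the same sentence.
z1, z2 = embed(batch), embed(batch)

# InfoNCE: each sentence's second view is its positive, the rest are negatives.
temperature = 0.05  # value used in the SimCSE paper
sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / temperature
loss = F.cross_entropy(sim, torch.arange(len(sentences)))
loss.backward()
```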
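For entry 1471, a hypothetical sketch of the recursion the FraiLT highlight describes: a shared block of layers applied several times, with a learned iteration encoding added on each pass. The block size, iteration count, and the additive form of the encoding are assumptions, not FraiLT's actual design.

```python
# Hypothetical sketch: recursive layer reuse with per-iteration encodings.
import torch
import torch.nn as nn

class RecurrentBlock(nn.Module):
    def __init__(self, d_model=256, n_heads=4, n_layers=2, n_iters=3):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.shared = nn.TransformerEncoder(layer, num_layers=n_layers)
        # One learned vector per iteration so the block "knows" which pass it is on.
        self.iter_enc = nn.Embedding(n_iters, d_model)
        self.n_iters = n_iters

    def forward(self, x):
        for i in range(self.n_iters):
            x = x + self.iter_enc.weight[i]  # broadcast over batch and sequence
            x = self.shared(x)               # same parameters reused each pass
        return x

x = torch.randn(8, 16, 256)                 # (batch, seq, d_model)
print(RecurrentBlock()(x).shape)            # torch.Size([8, 16, 256])
```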
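For entry 1475, a hypothetical sketch of the pipeline the highlight names: sentence embeddings from a BERT encoder used as features for an SVC. The checkpoint, the [CLS] pooling choice, and the toy labels are assumptions; the paper's setup fine-tunes a customized BERT first.

```python
# Hypothetical sketch: BERT features feeding a Support Vector Classifier.
import torch
from sklearn.svm import SVC
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased").eval()

clauses = [
    "The provider may terminate your account at any time without notice.",
    "You may cancel your subscription from the account settings page.",
]
labels = [1, 0]  # 1 = unfair clause, 0 = fair (toy labels for illustration)

with torch.no_grad():
    batch = tokenizer(clauses, padding=True, truncation=True, return_tensors="pt")
    feats = encoder(**batch).last_hidden_state[:, 0]  # [CLS] token embedding

clf = SVC(kernel="rbf").fit(feats.numpy(), labels)
print(clf.predict(feats.numpy()))
```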
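For entry 1497, a minimal sketch of the front end of the cascade the highlight describes: transcribe speech with an ASR model, then translate the transcript with a pretrained translation model. The checkpoints, the target language, and the audio file name are assumptions; the paper's classifier consuming these representations is not shown.

```python
# Hypothetical sketch: ASR transcription followed by pretrained translation.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
translate = pipeline("translation_en_to_fr", model="t5-small")

transcript = asr("sample_call.wav")["text"]             # speech -> English text (placeholder file)
french = translate(transcript)[0]["translation_text"]   # English -> French
print(transcript, "->", french)
```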