Paper Digest: Recent Papers on Transformer
The Paper Digest Team extracted all recent Transformer (NLP) related papers on our radar and generated highlight sentences for them. The results are sorted by relevance and date. In addition to this static page, we also provide a real-time version of this article, which offers broader coverage and is continuously updated with the most recent work on this topic.
This list is created by the Paper Digest Team.
TABLE 1: Paper Digest: Recent Papers on Transformer
# | Paper | Author(s) | Source | Date
---|---|---|---|---
1 | Application of Tabular Transformer Architectures for Operating System Fingerprinting. Highlight: This study investigates the application of Tabular Transformer architectures, specifically TabTransformer and FT-Transformer, for OS fingerprinting, leveraging structured network data from three publicly available datasets. | Rubén Pérez-Jove; Cristian R. Munteanu; Alejandro Pazos; Jose Vázquez-Naya | arxiv-cs.CR | 2025-02-13
2 | Zero-shot Generation of Synthetic Neurosurgical Data with Large Language Models. Highlight: This study aims to evaluate the capability of zero-shot generation of synthetic neurosurgical data with a large language model (LLM), GPT-4o, by benchmarking with the conditional tabular generative adversarial network (CTGAN). | Austin A. Barr; Eddie Guo; Emre Sezgin | arxiv-cs.CL | 2025-02-13
3 | AttentionSmithy: A Modular Framework for Rapid Transformer Development and Customization. Highlight: We introduce AttentionSmithy, a modular software package that simplifies transformer innovation by breaking down key components into reusable building blocks: attention modules, feed-forward networks, normalization layers, and positional encodings. | Caleb Cranney; Jesse G. Meyer | arxiv-cs.LG | 2025-02-13
4 | From Occupations to Tasks: A New Perspective on Automatability Prediction Using BERT. Highlight: While existing research has primarily focused on the potential impact of automation at the occupation level, there has been a lack of investigation into the automatability of individual tasks. This paper addresses this gap by proposing a BERT-based classifier that predicts the automatability of tasks in the forthcoming decade at a granular level, leveraging the context and semantics of each task. | Dawei Xu; Haoran Yang; Marian-Andrei Rizoiu; Guandong Xu | arxiv-cs.CY | 2025-02-13
5 | Mechanistic Unveiling of Transformer Circuits: Self-Influence As A Key to Model Reasoning. Highlight: While previous research suggests that these models implicitly encode reasoning structures, it is still unclear which specific multi-step thought processes they employ to solve complex tasks. To address this gap, we propose a novel mechanistic interpretability framework, SICAF, designed to trace and analyze the reasoning strategies that language models use in multi-step inference tasks. | Lin Zhang; Lijie Hu; Di Wang | arxiv-cs.AI | 2025-02-13
6 | APT-LLM: Embedding-Based Anomaly Detection of Cyber Advanced Persistent Threats Using Large Language Models. Highlight: This paper introduces APT-LLM, a novel embedding-based anomaly detection framework that integrates large language models (BERT, ALBERT, DistilBERT, and RoBERTa) with autoencoder architectures to detect APTs. | Sidahmed Benabderrahmane; Petko Valtchev; James Cheney; Talal Rahwan | arxiv-cs.CR | 2025-02-13
7 | MTDP: Modulated Transformer Diffusion Policy Model. Highlight: In this paper, we investigate key architectural designs of Transformers and improve the traditional Transformer architecture by proposing the Modulated Transformer Diffusion Policy (MTDP) model for diffusion policy. | Qianhao Wang; Yinqian Sun; Enmeng Lu; Qian Zhang; Yi Zeng | arxiv-cs.RO | 2025-02-13
8 | A Hybrid Transformer Model for Fake News Detection: Leveraging Bayesian Optimization and Bidirectional Recurrent Unit. Highlight: In this paper, we propose an optimized Transformer model that integrates Bayesian algorithms with a Bidirectional Gated Recurrent Unit (BiGRU), and apply it to fake news classification for the first time. | Tianyi Huang; Zeqiu Xu; Peiyang Yu; Jingyuan Yi; Xiaochuan Xu | arxiv-cs.CL | 2025-02-13
9 | Optimizing GPT for Video Understanding: Zero-Shot Performance and Prompt Engineering. Highlight: In this study, we tackle industry challenges in video content classification by exploring and optimizing GPT-based models for zero-shot classification across seven critical categories of video quality. | Mark Beliaev; Victor Yang; Madhura Raju; Jiachen Sun; Xinghai Hu | arxiv-cs.CV | 2025-02-13
10 | Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers. Highlight: In this work, we present a novel four-dimensional hybrid parallel algorithm implemented in a highly scalable, portable, open-source framework called AxoNN. | Siddharth Singh et al. | arxiv-cs.LG | 2025-02-12
11 | Can Uniform Meaning Representation Help GPT-4 Translate from Indigenous Languages? Highlight: In this work, we explore the downstream technical utility of UMR for low-resource languages by incorporating it into GPT-4 prompts. | Shira Wein | arxiv-cs.CL | 2025-02-12
12 | FoQA: A Faroese Question-Answering Dataset. Highlight: We present FoQA, a Faroese extractive question-answering (QA) dataset with 2,000 samples, created using a semi-automated approach combining Large Language Models (LLMs) and human validation. | Annika Simonsen; Dan Saattrup Nielsen; Hafsteinn Einarsson | arxiv-cs.CL | 2025-02-11
13 | WHODUNIT: Evaluation Benchmark for Culprit Detection in Mystery Stories. Highlight: We present a novel dataset, WhoDunIt, to assess the deductive reasoning capabilities of large language models (LLMs) within narrative contexts. | Kshitij Gupta | arxiv-cs.CL | 2025-02-11
14 | Large Language Models Perpetuate Bias in Palliative Care: Development and Analysis of The Palliative Care Adversarial Dataset (PCAD). Abstract: Bias and inequity in palliative care disproportionately affect marginalised groups. Large language models (LLMs), such as GPT-4o, hold potential to enhance care but risk … | Naomi Akhras et al. | arxiv-cs.CY | 2025-02-11
15 | Making Language Models Robust Against Negation. Highlight: In this work, we propose a self-supervised method to make language models more robust against negation. | MohammadHossein Rezaei; Eduardo Blanco | arxiv-cs.CL | 2025-02-11
16 | RideKE: Leveraging Low-Resource, User-Generated Twitter Content for Sentiment and Emotion Detection in Kenyan Code-Switched Dataset. Highlight: We detail the methodology behind data collection and annotation, and the challenges encountered during the data curation phase. | Naome A. Etori; Maria L. Gini | arxiv-cs.CL | 2025-02-10
17 | A Large-Scale Benchmark for Vietnamese Sentence Paraphrases. Highlight: This paper presents ViSP, a high-quality Vietnamese dataset for sentence paraphrasing, consisting of 1.2M original-paraphrase pairs collected from various domains. | Sang Quang Nguyen; Kiet Van Nguyen | arxiv-cs.CL | 2025-02-10
18 | Benchmarking Prompt Engineering Techniques for Secure Code Generation with GPT Models. Highlight: However, its effectiveness in mitigating vulnerabilities in LLM-generated code remains underexplored. To address this gap, we implemented a benchmark to automatically assess the impact of various prompt engineering strategies on code security. | Marc Bruni; Fabio Gabrielli; Mohammad Ghafari; Martin Kropp | arxiv-cs.SE | 2025-02-09
19 | Learning to Substitute Words with Model-based Score Ranking. Highlight: To circumvent this issue, we instead employ a model-based score (BARTScore) to quantify sentence quality, thus forgoing the need for human annotations. Specifically, we use this score to define a distribution for each word substitution, allowing one to test whether a substitution is statistically superior relative to others. | Hongye Liu; Ricardo Henao | arxiv-cs.CL | 2025-02-09
20 | Provably Overwhelming Transformer Models with Designed Inputs. Highlight: We develop an algorithm which, given a trained transformer model $\mathcal{M}$ as input, as well as a string of tokens $s$ of length $n_{fix}$ and an integer $n_{free}$, can generate a mathematical proof that $\mathcal{M}$ is “overwhelmed” by $s$, in time and space $\widetilde{O}(n_{fix}^2 + n_{free}^3)$. | Lev Stambler; Seyed Sajjad Nezhadi; Matthew Coudron | arxiv-cs.LG | 2025-02-09
21 | Flowing Through Layers: A Continuous Dynamical Systems Perspective on Transformers. Highlight: We show that the standard discrete update rule of transformer layers can be naturally interpreted as a forward Euler discretization of a continuous dynamical system. | Jacob Fein-Ashley | arxiv-cs.LG | 2025-02-08
22 | Lowering The Barrier of Machine Learning: Achieving Zero Manual Labeling in Review Classification Using LLMs. Highlight: This paper introduces an approach that integrates large language models (LLMs), specifically Generative Pre-trained Transformer (GPT) and Bidirectional Encoder Representations from Transformers (BERT)-based models, making it accessible to a wider audience. | Yejian Zhang; Shingo Takada | arxiv-cs.CL | 2025-02-05
23 | A Systematic Approach for Assessing Large Language Models’ Test Case Generation Capability. Highlight: To address the assessment of LLMs’ test case generation ability and the lack of a dataset for evaluation, we propose the Generated Benchmark from Control-Flow Structure and Variable Usage Composition (GBCV) approach, which systematically generates programs used for evaluating LLMs’ test generation capabilities. | Hung-Fu Chang; Mohammad Shokrolah Shirazi | arxiv-cs.SE | 2025-02-04
24 | FewTopNER: Integrating Few-Shot Learning with Topic Modeling and Named Entity Recognition in A Multilingual Framework. Highlight: We introduce FewTopNER, a novel framework that integrates few-shot named entity recognition (NER) with topic-aware contextual modeling to address the challenges of cross-lingual and low-resource scenarios. | Ibrahim Bouabdallaoui; Fatima Guerouate; Samya Bouhaddour; Chaimae Saadi; Mohammed Sbihi | arxiv-cs.CL | 2025-02-04
25 | Annotation Tool and Dataset for Fact-Checking Podcasts. Highlight: Fact-checking podcasts is a challenging task, requiring transcription, annotation, and claim verification, all while preserving the contextual details of spoken content. Our tool offers a novel approach to tackling these challenges by enabling real-time annotation of podcasts during playback. | Vinay Setty; Adam James Becker | arxiv-cs.CL | 2025-02-03
26 | Towards Safer Chatbots: A Framework for Policy Compliance Evaluation of Custom GPTs. Highlight: However, their black-box nature introduces significant safety and compliance risks. In this work, we present a scalable framework for the automated evaluation of Custom GPTs against OpenAI’s usage policies, which define the permissible behaviors of these systems. | David Rodriguez; William Seymour; Jose M. Del Alamo; Jose Such | arxiv-cs.CL | 2025-02-03
27 | The Jumping Reasoning Curve? Tracking The Evolution of Reasoning Performance in GPT-[n] and O-[n] Models on Multimodal Puzzles. Highlight: We plan to continuously track new models in the series and update our results in this paper accordingly. | Vernon Y. H. Toh; Yew Ken Chia; Deepanway Ghosal; Soujanya Poria | arxiv-cs.CV | 2025-02-03
28 | Optimal Sensor Placement in Power Transformers Using Physics-Informed Neural Networks. Highlight: Our work aims at simulating and predicting the temperature conditions inside a power transformer using Physics-Informed Neural Networks (PINNs). | Sirui Li; Federica Bragone; Matthieu Barreau; Tor Laneryd; Kateryna Morozovska | arxiv-cs.LG | 2025-02-01
29 | Explainable AI for Sentiment Analysis of Human Metapneumovirus (HMPV) Using XLNet. Highlight: We apply transformer models, particularly XLNet, achieving 93.50% accuracy in sentiment classification. | Md. Shahriar Hossain Apu; Md Saiful Islam; Tanjim Taharat Aurpa | arxiv-cs.CL | 2025-02-01
30 | Large Language Models’ Accuracy in Emulating Human Experts’ Evaluation of Public Sentiments About Heated Tobacco Products on Social Media. Highlight: This study examined the accuracy of LLMs in replicating human sentiment evaluation of social media messages about heated tobacco products (HTPs). | Kwanho Kim; Soojong Kim | arxiv-cs.CL | 2025-01-31
31 | Structure Development in List-Sorting Transformers. Highlight: Interestingly, vocabulary-splitting is present regardless of whether we use weight decay, a common regularization technique thought to drive simplification, supporting the thesis that neural networks naturally prefer simpler solutions. | Einar Urdshals; Jasmina Urdshals | arxiv-cs.LG | 2025-01-30
32 | OT-Transformer: A Continuous-time Transformer Architecture with Optimal Transport Regularization. Highlight: In this paper, we propose a continuous-time formulation of transformers. | Kelvin Kan; Xingjian Li; Stanley Osher | arxiv-cs.LG | 2025-01-30
33 | A Multi-Layered Large Language Model Framework for Disease Prediction. Highlight: This study explores three Arabic medical text preprocessing techniques: text summarization, text refinement, and Named Entity Recognition (NER). | Malak Mohamed; Rokaia Emad; Ali Hamdi | arxiv-cs.CL | 2025-01-30
34 | Economic Rationality Under Specialization: Evidence of Decision Bias in AI Agents. Highlight: In the study by Chen et al. (2023) [01], the large language model GPT demonstrated economic rationality comparable to or exceeding the average human level in tasks such as budget allocation and risk preference. | ShuiDe Wen; Juan Feng | arxiv-cs.AI | 2025-01-30
35 | Cross-Language Approach for Quranic QA. Highlight: However, these systems face unique challenges, including the linguistic disparity between questions written in Modern Standard Arabic and answers found in Quranic verses written in Classical Arabic, and the small size of existing datasets, which further restricts model performance. To address these challenges, we adopt a cross-language approach by (1) Dataset Augmentation: expanding and enriching the dataset through machine translation to convert Arabic questions into English, paraphrasing questions to create linguistic diversity, and retrieving answers from an English translation of the Quran to align with multilingual training requirements; and (2) Language Model Fine-Tuning: utilizing pre-trained models such as BERT-Medium, RoBERTa-Base, DeBERTa-v3-Base, ELECTRA-Large, Flan-T5, Bloom, and Falcon to address the specific requirements of Quranic QA. | Islam Oshallah; Mohamed Basem; Ali Hamdi; Ammar Mohammed | arxiv-cs.CL | 2025-01-29
36 | Towards Supporting Penetration Testing Education with Large Language Models: An Evaluation and Comparison. Highlight: This study evaluates the effectiveness of LLMs in conducting a variety of penetration testing tasks. | Martin Nizon-Deladoeuille; Brynjólfur Stefánsson; Helmut Neukirchen; Thomas Welsh | arxiv-cs.CR | 2025-01-29
37 | DINT Transformer. Highlight: However, it has two critical limitations: the lack of global context modeling, which is essential for identifying globally significant tokens, and numerical instability due to the absence of strict row normalization in the attention matrix. To overcome these challenges, we propose DINT Transformer, which extends DIFF Transformer by incorporating a differential-integral mechanism. | Yueyang Cang; Yuhang Liu; Xiaoteng Zhang; Erlu Zhao; Li Shi | arxiv-cs.CL | 2025-01-29
38 | AlphaAdam: Asynchronous Masked Optimization with Dynamic Alpha for Selective Updates. Highlight: In this work, we propose AlphaAdam, an optimization framework for LLMs from the perspective of intra-layer parameter updates. | Da Chang; Yu Li; Ganzhao Yuan | arxiv-cs.LG | 2025-01-29
39 | Shared DIFF Transformer. Highlight: In this work, we propose Shared DIFF Transformer, which draws on the idea of a differential amplifier by introducing a shared base matrix to model global patterns and incorporating low-rank updates to enhance task-specific flexibility. | Yueyang Cang; Yuhang Liu; Xiaoteng Zhang; Xiangju Wang | arxiv-cs.LG | 2025-01-29
40 | MIDI-GPT: A Controllable Generative Model for Computer-Assisted Multitrack Music Composition. Highlight: We present and release MIDI-GPT, a generative system based on the Transformer architecture that is designed for computer-assisted music composition workflows. | Philippe Pasquier et al. | arxiv-cs.SD | 2025-01-28
41 | Divergent Emotional Patterns in Disinformation on Social Media? An Analysis of Tweets and TikToks About The DANA in Valencia. Highlight: This study investigates the dissemination of disinformation on social media platforms during the DANA event (DANA is a Spanish acronym for Depresión Aislada en Niveles Altos, translating to high-altitude isolated depression) that resulted in extremely heavy rainfall and devastating floods in Valencia, Spain, on October 29, 2024. | Iván Arcos; Paolo Rosso; Ramón Salaverría | arxiv-cs.CL | 2025-01-28
42 | Comparing Human and LLM Generated Code: The Jury Is Still Out! Highlight: However, there has been limited evaluation effort in the research domain aimed at validating the true utility of such techniques, especially when compared to human coding outputs. We bridge this gap, using a benchmark dataset comprising 72 distinct software engineering tasks to compare the effectiveness of large language models (LLMs) and human programmers in producing Python software code. | Sherlock A. Licorish; Ansh Bajpai; Chetan Arora; Fanyu Wang; Kla Tantithamthavorn | arxiv-cs.SE | 2025-01-28
43 | Detecting Harassment and Defamation in Cyberbullying with Emotion-adaptive Training. Highlight: However, their performance is substantially lower on harassment and denigration multi-classification tasks. Therefore, we propose an emotion-adaptive training framework (EAT) that transfers knowledge from the domain of emotion detection to the domain of cyberbullying detection to help detect indirect cyberbullying events. | Peiling Yi; Arkaitz Zubiaga; Yunfei Long | arxiv-cs.CL | 2025-01-28
44 | Leveraging In-Context Learning and Retrieval-Augmented Generation for Automatic Question Generation in Educational Domains. Highlight: In this work, we explore advanced techniques for automated question generation in educational contexts, focusing on In-Context Learning (ICL), Retrieval-Augmented Generation (RAG), and a novel Hybrid Model that merges both methods. | Subhankar Maity; Aniket Deroy; Sudeshna Sarkar | arxiv-cs.CL | 2025-01-28
45 | MEL: Legal Spanish Language Model. Highlight: This paper presents the development and evaluation of MEL, a legal language model based on XLM-RoBERTa-large, fine-tuned on legal documents such as the BOE (Boletín Oficial del Estado, Spain's official state gazette) and congress texts. | David Betancur Sánchez et al. | arxiv-cs.CL | 2025-01-27
46 | Optimizing Sentence Embedding with Pseudo-Labeling and Model Ensembles: A Hierarchical Framework for Enhanced NLP Tasks. Highlight: This paper presents a framework that combines pseudo-label generation and model ensemble techniques to improve sentence embeddings. | Ziwei Liu; Qi Zhang; Lifu Gao | arxiv-cs.CL | 2025-01-27
47 | Optimizing Deep Learning Models to Address Class Imbalance in Code Comment Classification. Highlight: This work investigates the use of different weighting strategies of the loss function to mitigate the scarcity of certain classes in the dataset. | Moritz Mock; Thomas Borsani; Giuseppe Di Fatta; Barbara Russo | arxiv-cs.SE | 2025-01-27
48 | A Comprehensive Study on Fine-Tuning Large Language Models for Medical Question Answering Using Classification Models and Comparative Analysis. Highlight: This paper presents an overview of the development and fine-tuning of large language models (LLMs) designed specifically for answering medical questions. | Aysegul Ucar; Soumik Nayak; Anunak Roy; Burak Taşcı; Gülay Taşcı | arxiv-cs.CL | 2025-01-26
49 | Identifying Critical Tokens for Accurate Predictions in Transformer-based Medical Imaging Models. Highlight: In this work, we take a step towards demystifying the decision-making process of transformer-based medical imaging models and propose Token Insight, a novel method that identifies the critical tokens that contribute to the prediction made by the model. | Solha Kang; Joris Vankerschaver; Utku Ozbulak | arxiv-cs.CV | 2025-01-26
50 | TractoGPT: A GPT Architecture for White Matter Segmentation. Highlight: White matter segmentation remains challenging due to structural similarity in streamlines, subject variability, symmetry between the two hemispheres, etc. To address these challenges, we propose TractoGPT, a GPT-based architecture trained separately on streamline, cluster, and fusion data representations. | Anoushkrit Goel et al. | arxiv-cs.CV | 2025-01-26
51 | Evaluating Simple Debiasing Techniques in RoBERTa-based Hate Speech Detection Models. Highlight: This leads to a disparity where normal AAE text is more likely to be misclassified as abusive/hateful compared to non-AAE text. Simple debiasing techniques have been developed in the past to counter this sort of disparity, and in this work, we apply and evaluate these techniques in the scope of RoBERTa-based encoders. | Diana Iftimie; Erik Zinn | arxiv-cs.CL | 2025-01-26
52 | Multi-stage Large Language Model Pipelines Can Outperform GPT-4o in Relevance Assessment. Highlight: We propose an LLM-based modular classification pipeline that divides the relevance assessment task into multiple stages, each utilising different prompts and models of varying sizes and capabilities. | Julian A. Schnabel; Johanne R. Trippas; Falk Scholer; Danula Hettiachchi | arxiv-cs.IR | 2025-01-24
53 | An Empirical Study on LLM-based Classification of Requirements-related Provisions in Food-safety Regulations. Highlight: In this article, we pursue two main goals. | Shabnam Hassani; Mehrdad Sabetzadeh; Daniel Amyot | arxiv-cs.SE | 2025-01-24
54 | Idiom Detection in Sorani Kurdish Texts. Highlight: This research provides a dataset, three optimized models, and insights into idiom detection, laying a foundation for advancing Kurdish NLP. | Skala Kamaran Omer; Hossein Hassani | arxiv-cs.CL | 2025-01-24
55 | Assessing Large Language Models in Comprehending and Verifying Concurrent Programs Across Memory Models. Highlight: This study evaluates the performance of several leading large language models (LLMs), including GPT-3.5-turbo, GPT-4, GPT-4o, GPT-4o-mini, and Mistral AI's Large 2, in understanding and analyzing concurrency issues within software programs. | Ridhi Jain; Rahul Purandare | arxiv-cs.SE | 2025-01-24
56 | GPT-HTree: A Decision Tree Framework Integrating Hierarchical Clustering and Large Language Models for Explainable Classification. Highlight: This paper introduces GPT-HTree, a framework combining hierarchical clustering, decision trees, and large language models (LLMs) to address this challenge. | Te Pei; Fuat Alican; Aaron Ontoyin Yin; Yigit Ihlamur | arxiv-cs.LG | 2025-01-23
57 | Quantized Spike-driven Transformer. Highlight: However, recent research in the SNN domain has mainly focused on enhancing accuracy by designing large-scale Transformer structures, which typically rely on substantial computational resources, limiting their deployment on resource-constrained devices. To overcome this challenge, we propose a quantized spike-driven Transformer baseline (QSD-Transformer), which achieves reduced resource demands by utilizing low bit-width parameters. | Xuerui Qiu et al. | arxiv-cs.CV | 2025-01-23
58 | A Transformer-based Autoregressive Decoder Architecture for Hierarchical Text Classification. Highlight: In this paper, we introduce RADAr (Transformer-based Autoregressive Decoder Architecture), an effective hierarchical text classifier based only on an off-the-shelf RoBERTa transformer to process the input and a custom autoregressive decoder with two decoder layers for generating the classification output. | Younes Yousef; Lukas Galke; Ansgar Scherp | arxiv-cs.LG | 2025-01-23
59 | Multi-Level Attention and Contrastive Learning for Enhanced Text Classification with An Optimized Transformer. Highlight: The method proposed in this paper provides a new idea for algorithm optimization in the field of text classification and has good application potential and practical value. | Jia Gao et al. | arxiv-cs.CL | 2025-01-23
60 | Question Answering on Patient Medical Records with Private Fine-Tuned LLMs. Highlight: However, ensuring privacy and compliance requires edge and private deployments of LLMs. This paper proposes a novel approach to semantic QA over EHRs by first identifying the most relevant FHIR resources for a user query (Task 1) and subsequently answering the query based on these resources (Task 2). | Sara Kothari; Ayush Gupta | arxiv-cs.CL | 2025-01-23
61 | 5G LDPC Linear Transformer for Channel Decoding. Highlight: We propose a scalable approach to decoding linear block codes with $O(n)$ complexity rather than the $O(n^2)$ of regular transformers. | Mario Hernandez; Fernando Pinero | arxiv-cs.LG | 2025-01-23
62 | MedSlice: Fine-Tuned Large Language Models for Secure Clinical Note Sectioning. Highlight: This study develops a pipeline for automated note sectioning using open-source LLMs, focusing on three sections: History of Present Illness, Interval History, and Assessment and Plan. | Joshua Davis et al. | arxiv-cs.CL | 2025-01-23
63 | LiT: Delving Into A Simplified Linear Diffusion Transformer for Image Generation. Highlight: In this paper, we offer a suite of ready-to-use solutions for efficient linear diffusion Transformers. | Jiahao Wang et al. | arxiv-cs.CV | 2025-01-22
64 | Comparative Approaches to Sentiment Analysis Using Datasets in Major European and Arabic Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores transformer-based models such as BERT, mBERT, and XLM-R for multi-lingual sentiment analysis across diverse linguistic structures. |
Mikhail Krasitskii; Olga Kolesnikova; Liliana Chanona Hernandez; Grigori Sidorov; Alexander Gelbukh; | arxiv-cs.CL | 2025-01-21 |
65 | Harnessing Generative Pre-Trained Transformer for Datacenter Packet Trace Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Researchers often use simplified mathematical models that lack the depth needed to recreate intricate traffic patterns and, thus, miss optimization opportunities found in realistic traffic. In this preliminary work, we introduce DTG-GPT, a packet-level Datacenter Traffic Generator (DTG), based on the generative pre-trained transformer (GPT) architecture used by many state-of-the-art large language models. |
Chen Griner; | arxiv-cs.NI | 2025-01-21 |
66 | Vision-Language Models for Automated Chest X-ray Interpretation: Leveraging ViT and GPT-2 Highlight: In this study we have evaluated different combinations of multimodal models that integrate Computer Vision and Natural Language Processing to generate comprehensive radiology reports. |
Md. Rakibul Islam; Md. Zahid Hossain; Mustofa Ahmed; Most. Sharmin Sultana Samu; | arxiv-cs.CV | 2025-01-21 |
67 | FuocChuVIP123 at CoMeDi Shared Task: Disagreement Ranking with XLM-Roberta Sentence Embeddings and Deep Neural Regression Highlight: This paper presents the results of our system for the CoMeDi Shared Task, focusing on Subtask 2: Disagreement Ranking. |
Phuoc Duong Huy Chu; | arxiv-cs.CL | 2025-01-21 |
68 | LuxVeri at GenAI Detection Task 1: Inverse Perplexity Weighted Ensemble for Robust Detection of AI-Generated Text Across English and Multilingual Contexts Highlight: This paper presents a system developed for Task 1 of the COLING 2025 Workshop on Detecting AI-Generated Content, focusing on the binary classification of machine-generated versus human-written text. |
Md Kamrujjaman Mobin; Md Saiful Islam; | arxiv-cs.CL | 2025-01-21 |
69 | LuxVeri at GenAI Detection Task 3: Cross-Domain Detection of AI-Generated Text Using Inverse Perplexity-Weighted Ensemble of Fine-Tuned Transformer Models Highlight: This paper presents our approach for Task 3 of the GenAI content detection workshop at COLING-2025, focusing on Cross-Domain Machine-Generated Text (MGT) Detection. |
Md Kamrujjaman Mobin; Md Saiful Islam; | arxiv-cs.CL | 2025-01-21 |
70 | KEIR @ ECIR 2025: The Second Workshop on Knowledge-Enhanced Information Retrieval Highlight: The goal of this workshop is to bring together researchers from academia and industry to discuss various aspects of knowledge-enhanced information retrieval. |
ZIHAN WANG et. al. | arxiv-cs.IR | 2025-01-20 |
71 | Trustformer: A Trusted Federated Transformer Highlight: This paper introduces a novel FL method that reduces communication overhead while maintaining competitive utility. |
Ali Abbasi Tadi; Dima Alhadidi; Luis Rueda; | arxiv-cs.LG | 2025-01-20 |
72 | Irony in Emojis: A Comparative Study of Human and LLM Interpretation Highlight: This study examines the ability of GPT-4o to interpret irony in emojis. By prompting GPT-4o to evaluate the likelihood of specific emojis being used to express irony on social media and comparing its interpretations with human perceptions, we aim to bridge the gap between machine and human understanding. |
Yawen Zheng; Hanjia Lyu; Jiebo Luo; | arxiv-cs.CL | 2025-01-19 |
73 | PaSa: An LLM Agent for Comprehensive Academic Paper Search Highlight: We introduce PaSa, an advanced Paper Search agent powered by large language models. |
YICHEN HE et. al. | arxiv-cs.IR | 2025-01-17 |
74 | Improving Automated Feedback Systems for Tutor Training in Low-Resource Scenarios Through Data Augmentation Highlight: Our results demonstrate that our data augmentation approach generalizes effectively to identify other types of praise, compared to the same model fine-tuned without augmentation. |
Chentianye Xu; Jionghao Lin; Tongshuang Wu; Vincent Aleven; Kenneth R. Koedinger; | arxiv-cs.HC | 2025-01-16 |
75 | Exploring AI-based System Design for Pixel-level Protected Health Information Detection in Medical Images Highlight: Purpose: This study aims to evaluate different setups of an AI-based solution to detect Protected Health Information (PHI) in medical images. |
Tuan Truong; Ivo M. Baltruschat; Mark Klemens; Grit Werner; Matthias Lenga; | arxiv-cs.CV | 2025-01-16 |
76 | Demo: Interactive Visualization of Semantic Relationships in A Biomedical Project’s Talent Knowledge Graph Highlight: We present an interactive visualization of the Cell Map for AI Talent Knowledge Graph (CM4AI TKG), a detailed semantic space comprising approximately 28,000 experts and 1,000 datasets focused on the biomedical field. |
JIAWEI XU et. al. | arxiv-cs.SI | 2025-01-16 |
77 | Towards Multilingual LLM Evaluation for Baltic and Nordic Languages: A Study on Lithuanian History Highlight: In this work, we evaluated Lithuanian and general history knowledge of multilingual Large Language Models (LLMs) on a multiple-choice question-answering task. |
Yevhen Kostiuk; Oxana Vitman; Łukasz Gagała; Artur Kiulian; | arxiv-cs.CL | 2025-01-15 |
78 | Exploring ChatGPT for Face Presentation Attack Detection in Zero and Few-Shot In-Context Learning Highlight: This study highlights the potential of ChatGPT (specifically GPT-4o) as a competitive alternative for Face Presentation Attack Detection (PAD), outperforming several PAD models, including commercial solutions, in specific scenarios. |
Alain Komaty; Hatef Otroshi Shahreza; Anjith George; Sebastien Marcel; | arxiv-cs.CV | 2025-01-15 |
79 | Expanding Vietnamese SentiWordNet to Improve Performance of Vietnamese Sentiment Analysis Models Highlight: In this paper, we introduce a novel approach that combines PhoBERT-V2 and SentiWordnet for Sentiment Analysis of Vietnamese reviews. |
Hong-Viet Tran; Van-Tan Bui; Lam-Quan Tran; | arxiv-cs.CL | 2025-01-15 |
80 | Exploring Narrative Clustering in Large Language Models: A Layerwise Analysis of BERT Highlight: Using a dataset of narratives developed via GPT-4, featuring diverse semantic content and stylistic variations, we analyze BERT’s layerwise activations to uncover patterns of localized neural processing. Through dimensionality reduction techniques such as Principal Component Analysis (PCA) and Multidimensional Scaling (MDS), we reveal that BERT exhibits strong clustering based on narrative content in its later layers, with progressively compact and distinct clusters. |
Awritrojit Banerjee; Achim Schilling; Patrick Krauss; | arxiv-cs.CL | 2025-01-14 |
81 | Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding Highlight: We introduce Tarsier2, a state-of-the-art large vision-language model (LVLM) designed for generating detailed and accurate video descriptions, while also exhibiting superior general video understanding capabilities. |
Liping Yuan; Jiawei Wang; Haomiao Sun; Yuchen Zhang; Yuan Lin; | arxiv-cs.CV | 2025-01-14 |
82 | Enhancing The De-identification of Personally Identifiable Information in Educational Data Highlight: Motivated by recent advancements in artificial intelligence, our study investigates the GPT-4o-mini model as a cost-effective and efficient solution for PII detection tasks. |
Y. Shen; Z. Ji; J. Lin; K. R. Koedinger; | arxiv-cs.CL | 2025-01-14 |
83 | Comparative Analysis of Efficient Adapter-Based Fine-Tuning of State-of-the-Art Transformer Models Highlight: In this work, we investigate the efficacy of various adapter architectures on supervised binary classification tasks from the SuperGLUE benchmark as well as a supervised multi-class news category classification task from Kaggle. |
Saad Mashkoor Siddiqui; Mohammad Ali Sheikh; Muhammad Aleem; Kajol R Singh; | arxiv-cs.CL | 2025-01-14 |
84 | Investigating Energy Efficiency and Performance Trade-offs in LLM Inference Across Tasks and DVFS Settings Highlight: To that end, in this work, we investigate the effect of important parameters on the performance and energy efficiency of LLMs during inference and examine their trade-offs. |
Paul Joe Maliakel; Shashikant Ilager; Ivona Brandic; | arxiv-cs.LG | 2025-01-14 |
85 | GPT As A Monte Carlo Language Tree: A Probabilistic Perspective Highlight: In this paper, we propose a novel perspective that any language dataset can be represented by a Monte Carlo Language Tree (abbreviated as “Data-Tree”), where each node denotes a token, each edge denotes a token transition probability, and each sequence has a unique path. |
Kun-Peng Ning; Jia-Yu Yao; Yu-Yang Liu; Mu-Nan Ning; Li Yuan; | arxiv-cs.CL | 2025-01-13 |
86 | An Efficient Sparse Hardware Accelerator for Spike-Driven Transformer Highlight: In this paper, we propose an efficient sparse hardware accelerator for Spike-driven Transformer. |
Zhengke Li; Wendong Mao; Siyu Zhang; Qiwei Dong; Zhongfeng Wang; | arxiv-cs.AR | 2025-01-13 |
87 | Transforming Role Classification in Scientific Teams Using LLMs and Advanced Predictive Analytics Highlight: Thus, we present a transformative approach to classifying author roles in scientific teams using advanced large language models (LLMs), which offers a more refined analysis compared to traditional clustering methods. |
Wonduk Seo; Yi Bu; | arxiv-cs.DL | 2025-01-13 |
88 | Investigating Large Language Models in Inferring Personality Traits from User Conversations Highlight: Large Language Models (LLMs) are demonstrating remarkable human-like capabilities across diverse domains, including psychological assessment. |
Jianfeng Zhu; Ruoming Jin; Karin G. Coifman; | arxiv-cs.CL | 2025-01-13 |
89 | Robust Hybrid Classical-Quantum Transfer Learning Model for Text Classification Using GPT-Neo 125M with LoRA & SMOTE Enhancement Highlight: This research introduces a hybrid classical-quantum framework for text classification, integrating GPT-Neo 125M with Low-Rank Adaptation (LoRA) and Synthetic Minority Over-sampling Technique (SMOTE) using quantum computing backends. |
Santanam Wishal; | arxiv-cs.LG | 2025-01-12 |
90 | Generative Artificial Intelligence-Supported Pentesting: A Comparison Between Claude Opus, GPT-4, and Copilot Highlight: In this paper, we have analyzed the potential of leading general-purpose GenAI tools (Claude Opus, GPT-4 from ChatGPT, and Copilot) in augmenting the penetration testing process as defined by the Penetration Testing Execution Standard (PTES). |
Antonio López Martínez; Alejandro Cano; Antonio Ruiz-Martínez; | arxiv-cs.CR | 2025-01-12 |
91 | Comparing Few-Shot Prompting of GPT-4 LLMs with BERT Classifiers for Open-Response Assessment in Tutor Equity Training Highlight: Here, we study whether fine-tuning BERT on human annotations outperforms state-of-the-art LLMs (GPT-4o and GPT-4-Turbo) with few-shot prompting and instruction. |
Sanjit Kakarla; Conrad Borchers; Danielle Thomas; Shambhavi Bhushan; Kenneth R. Koedinger; | arxiv-cs.HC | 2025-01-11 |
92 | Assessing Instructor-AI Cooperation for Grading Essay-type Questions in An Introductory Sociology Course Highlight: This study explores the use of artificial intelligence (AI) as a complementary tool for grading essay-type questions in higher education, focusing on its consistency with human grading and potential to reduce biases. |
Francisco Olivos; Tobias Kamelski; Sebastián Ascui-Gac; | arxiv-cs.AI | 2025-01-11 |
93 | ZNO-Eval: Benchmarking Reasoning Capabilities of Large Language Models in Ukrainian Highlight: The purpose of this work is to establish a comprehensive benchmark for evaluating the reasoning capabilities of large language models in the Ukrainian language. |
Mykyta Syromiatnikov; Victoria Ruvinskaya; Anastasiya Troynina; | arxiv-cs.CL | 2025-01-11 |
94 | Model Inversion in Split Learning for Personalized LLMs: New Insights from Information Bottleneck Theory Highlight: For the first time, we introduce mutual information entropy to understand the information propagation of Transformer-based LLMs and assess privacy attack performance for LLM blocks. |
Yunmeng Shu; Shaofeng Li; Tian Dong; Yan Meng; Haojin Zhu; | arxiv-cs.LG | 2025-01-10 |
95 | Aligning Brain Activity with Advanced Transformer Models: Exploring The Role of Punctuation in Semantic Processing Highlight: Utilizing an innovative approach originally proposed by Toneva and Wehbe, we evaluate four advanced transformer models, RoBERTa, DistilBERT, ALBERT, and ELECTRA, against neural activity data. |
Zenon Lamprou; Frank Pollick; Yashar Moshfeghi; | arxiv-cs.CL | 2025-01-10 |
96 | From Conversation to Automation: Leveraging Large Language Models to Analyze Strategies in Problem Solving Therapy Highlight: This study leverages anonymized therapy transcripts to analyze and classify therapeutic interventions using various LLMs and transformer-based models. |
ELHAM AGHAKHANI et. al. | arxiv-cs.CL | 2025-01-10 |
97 | UAV-VLA: Vision-Language-Action System for Large Scale Aerial Mission Generation Highlight: The newly developed method showed a 22% difference in the length of the created trajectory and a mean error of 34.22 m (Euclidean distance) in finding objects of interest on a map with the K-Nearest Neighbors (KNN) approach. |
OLEG SAUTENKOV et. al. | arxiv-cs.RO | 2025-01-09 |
98 | OpenAI ChatGPT Interprets Radiological Images: GPT-4 As A Medical Doctor for A Fast Check-Up Highlight: In this way, we addressed the question of whether artificial intelligence (AI) can replace a healthcare professional (e.g., a medical doctor) or whether it can be used as a decision-support tool that makes decisions easier and more reliable. |
Omer Aydin; Enis Karaarslan; | arxiv-cs.CV | 2025-01-09 |
99 | MB-TaylorFormer V2: Improved Multi-branch Linear Transformer Expanded By Taylor Formula for Image Restoration Highlight: However, the quadratic computational complexity of Softmax-attention poses a significant limitation on its extensive application in image restoration tasks, particularly for high-resolution images. To tackle this challenge, we propose a novel variant of the Transformer. |
Zhi Jin; Yuwei Qiu; Kaihao Zhang; Hongdong Li; Wenhan Luo; | arxiv-cs.CV | 2025-01-08 |
100 | IntegrityAI at GenAI Detection Task 2: Detecting Machine-Generated Academic Essays in English and Arabic Using ELECTRA and Stylometry Highlight: Recent research has investigated the problem of detecting machine-generated essays for academic purposes. |
Mohammad AL-Smadi; | arxiv-cs.CL | 2025-01-07 |
101 | A Case Study on The Transformative Potential of AI in Software Engineering on LeetCode and ChatGPT Highlight: This contribution presents a first large-scale study comparing generated code with human-written code on the LeetCode platform, based on multiple measures including code quality, code understandability, time behaviour and resource utilisation. |
Manuel Merkel; Jens Dörpinghaus; | arxiv-cs.DB | 2025-01-07 |
102 | Three-dimensional Attention Transformer for State Evaluation in Real-time Strategy Games Highlight: Here we propose a tri-dimensional Space-Time-Feature Transformer (TSTF Transformer) architecture, which efficiently models battlefield situations through three independent but cascaded modules: spatial attention, temporal attention, and feature attention. |
Yanqing Ye; Weilong Yang; Kai Qiu; Jie Zhang; | arxiv-cs.LG | 2025-01-07 |
103 | Text to Band Gap: Pre-trained Language Models As Encoders for Semiconductor Band Gap Prediction Highlight: In this study, we explore the use of a transformer-based language model as an encoder to predict the band gaps of semiconductor materials directly from their text descriptions. |
Ying-Ting Yeh; Janghoon Ock; Amir Barati Farimani; | arxiv-cs.CL | 2025-01-06 |
104 | Empowering Bengali Education with AI: Solving Bengali Math Word Problems Through Transformer Models Highlight: This poses a significant challenge in natural language processing, particularly for low-resource languages such as Bengali. This paper addresses this challenge by developing an innovative approach to solving Bengali MWPs using transformer-based models, including Basic Transformer, mT5, BanglaT5, and mBART50. |
Jalisha Jashim Era; Bidyarthi Paul; Tahmid Sattar Aothoi; Mirazur Rahman Zim; Faisal Muhammad Shah; | arxiv-cs.CL | 2025-01-05 |
105 | LeetDecoding: A PyTorch Library for Exponentially Decaying Causal Linear Attention with CUDA Implementations Highlight: The machine learning and data science community has made significant, though dispersed, progress in accelerating transformer-based large language models (LLMs), and one promising approach is to replace the original causal attention in a generative pre-trained transformer (GPT) with exponentially decaying causal linear attention. In this paper, we present LeetDecoding, the first Python package that provides a large set of computation routines for this fundamental operator. |
Jiaping Wang; Simiao Zhang; Qiao-Chu He; Yifan Chen; | arxiv-cs.LG | 2025-01-05 |
106 | A Completely Uniform Transformer for Parity Highlight: We construct a 3-layer constant-dimension transformer, recognizing the parity language, where neither parameter matrices nor the positional encoding depend on the input length. |
Alexander Kozachinskiy; Tomasz Steifer; | arxiv-cs.LG | 2025-01-05 |
107 | Sensorformer: Cross-patch Attention with Global-patch Compression Is Effective for High-dimensional Multivariate Time Series Forecasting Highlight: We attribute this issue to the dynamic time lags in the causal relationships between different variables. Therefore, we propose a new multivariate time series forecasting Transformer, Sensorformer, which first compresses the global patch information and then simultaneously extracts cross-variable and cross-time dependencies from the compressed representations. |
Liyang Qin; Xiaoli Wang; Chunhua Yang; Huaiwen Zou; Haochuan Zhang; | arxiv-cs.LG | 2025-01-05 |
108 | Anonymization By Design of Language Modeling Highlight: This paper presents a privacy-by-design language modeling approach to address the problem of anonymizing language models and thus promote their sharing. |
Antoine Boutet; Zakaria El Kazdam; Lucas Magnana; Helain Zimmermann; | arxiv-cs.CL | 2025-01-04 |
109 | LLMs & Legal Aid: Understanding Legal Needs Exhibited Through User Queries Highlight: The paper presents a preliminary analysis of an experiment conducted by Frank Bold, a Czech expert group, to explore user interactions with GPT-4 for addressing legal queries. |
Michal Kuk; Jakub Harasta; | arxiv-cs.HC | 2025-01-03 |
110 | VidFormer: A Novel End-to-end Framework Fused By 3DCNN and Transformer for Video-based Remote Physiological Measurement Highlight: In this paper, we introduce VidFormer, a novel end-to-end framework that integrates 3-Dimensional Convolutional Neural Network (3DCNN) and Transformer models for rPPG tasks. |
JIACHEN LI et. al. | arxiv-cs.CV | 2025-01-03 |
111 | End-to-End Long Document Summarization Using Gradient Caching Highlight: In this work, we propose CachED (Gradient Caching for Encoder-Decoder models), an approach that enables end-to-end training of existing transformer-based encoder-decoder models, using the entire document without truncation. |
Rohit Saxena; Hao Tang; Frank Keller; | arxiv-cs.CL | 2025-01-03 |
112 | Predicting The Performance of Black-box LLMs Through Self-Queries Highlight: In this paper, we extract features of LLMs in a black-box manner by using follow-up prompts and taking the probabilities of different responses as representations to train reliable predictors of model behavior. |
Dylan Sam; Marc Finzi; J. Zico Kolter; | arxiv-cs.LG | 2025-01-02 |
113 | Towards Interactive Deepfake Analysis Highlight: This paper aims to explore interactive deepfake analysis by performing instruction tuning on multi-modal large language models (MLLMs). |
LIXIONG QIN et. al. | arxiv-cs.CV | 2025-01-02 |
114 | Digital Guardians: Can GPT-4, Perspective API, and Moderation API Reliably Detect Hate Speech in Reader Comments of German Online Newspapers? Highlight: Some providers of large language models already offer solutions for automated hate speech detection or the identification of toxic content. |
MANUEL WEBER et. al. | arxiv-cs.CL | 2025-01-02 |
115 | Multi-Head Explainer: A General Framework to Improve Explainability in CNNs and Transformers Highlight: In this study, we introduce the Multi-Head Explainer (MHEX), a versatile and modular framework that enhances both the explainability and accuracy of Convolutional Neural Networks (CNNs) and Transformer-based models. |
Bohang Sun; Pietro Liò; | arxiv-cs.CV | 2025-01-02 |
116 | An Efficient Attention Mechanism for Sequential Recommendation Tasks: HydraRec Highlight: Building on the idea of Hydra attention, we introduce an efficient Transformer-based Sequential RS (HydraRec) which significantly improves the theoretical complexity of computing attention for longer sequences and bigger datasets while preserving the temporal context. |
Uzma Mushtaque; | arxiv-cs.IR | 2025-01-02 |
117 | Multiscaled Multi-Head Attention-based Video Transformer Network for Hand Gesture Recognition Highlight: In this letter, Multiscaled Multi-Head Attention Video Transformer Network (MsMHA-VTN) for dynamic hand gesture recognition is proposed. |
Mallika Garg; Debashis Ghosh; Pyari Mohan Pradhan; | arxiv-cs.CV | 2025-01-01 |
118 | Why Are Positional Encodings Nonessential for Deep Autoregressive Transformers? Revisiting A Petroglyph Highlight: This may be partially due to a sudden growth of the language modeling community after the advent of GPT-2, but perhaps also due to the lack of a clear explanation in prior publications, despite being commonly understood by practitioners in the past. Here we review this long-forgotten explanation of why explicit PEs are nonessential for multi-layer autoregressive Transformers (in contrast, one-layer models require PEs to discern the order of their input tokens). |
Kazuki Irie; | arxiv-cs.LG | 2024-12-31 |
119 | ReFormer: Generating Radio Fakes for Data Augmentation Highlight: We present ReFormer, a generative AI (GAI) model that can efficiently generate synthetic radio-frequency (RF) data, or RF fakes, statistically similar to the data it was trained on, or with modified statistics, in order to augment datasets collected in real-world experiments. |
Yagna Kaasaragadda; Silvija Kokalj-Filipovic; | arxiv-cs.LG | 2024-12-31 |
120 | Text Classification: Neural Networks VS Machine Learning Models VS Pre-trained Models Highlight: In this work, we present a comparison between different techniques to perform text classification. |
Christos Petridis; | arxiv-cs.LG | 2024-12-30 |
121 | GPT-4 on Clinic Depression Assessment: An LLM-Based Pilot Study Highlight: In this study, we explore the use of GPT-4 for clinical depression assessment based on transcript analysis. |
Giuliano Lorenzoni; Pedro Elkind Velmovitsky; Paulo Alencar; Donald Cowan; | arxiv-cs.CL | 2024-12-30 |
122 | Comparative Performance of Advanced NLP Models and LLMs in Multilingual Geo-Entity Detection Highlight: This paper presents a comprehensive evaluation of leading NLP models (SpaCy, XLM-RoBERTa, mLUKE, GeoLM) and LLMs, specifically OpenAI’s GPT 3.5 and GPT 4, within the context of multilingual geo-entity detection. |
Kalin Kopanov; | arxiv-cs.CL | 2024-12-29 |
123 | NLP-based Regulatory Compliance: Using GPT 4.0 to Decode Regulatory Documents Highlight: Large Language Models (LLMs) such as GPT-4.0 have shown significant promise in addressing the semantic complexities of regulatory documents, particularly in detecting inconsistencies and contradictions. |
Bimal Kumar; Dmitri Roussinov; | arxiv-cs.CL | 2024-12-29 |
124 | Building A Rich Dataset to Empower The Persian Question Answering Systems Highlight: In this study, a comprehensive open-domain dataset is presented for Persian. |
Mohsen Yazdinejad; Marjan Kaedi; | arxiv-cs.CL | 2024-12-28 |
125 | Distilled Transformers with Locally Enhanced Global Representations for Face Forgery Detection Highlight: In this paper, we propose a distilled transformer network (DTN) to capture both rich local and global forgery traces and learn general and common representations for different forgery faces. |
Yaning Zhang; Qiufu Li; Zitong Yu; Linlin Shen; | arxiv-cs.CV | 2024-12-28 |
126 | CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs Highlight: This work introduces CAD-GPT, a CAD synthesis method with spatial reasoning-enhanced MLLM that takes either a single image or a textual description as input. |
SIYU WANG et. al. | arxiv-cs.CV | 2024-12-27 |
127 | Generative Pretrained Embedding and Hierarchical Irregular Time Series Representation for Daily Living Activity Recognition Highlight: To further refine recognition, we incorporate an hour-of-the-day embedding into our proposed architecture. |
Damien Bouchabou; Sao Mai Nguyen; | arxiv-cs.LG | 2024-12-27 |
128 | Assessing Text Classification Methods for Cyberbullying Detection on Social Media Platforms Highlight: This research aims to conduct a comparative study by adapting and evaluating existing text classification techniques within the cyberbullying detection domain. |
Adamu Gaston Philipo; Doreen Sebastian Sarwatt; Jianguo Ding; Mahmoud Daneshmand; Huansheng Ning; | arxiv-cs.CL | 2024-12-27 |
129 | DrivingWorld: Constructing World Model for Autonomous Driving Via Video GPT Highlight: In this paper, we present DrivingWorld, a GPT-style world model for autonomous driving, featuring several spatial-temporal fusion mechanisms. |
XIAOTAO HU et. al. | arxiv-cs.CV | 2024-12-27 |
130 | DAPoinTr: Domain Adaptive Point Transformer for Point Cloud Completion Highlight: To this end, we propose a pioneering Domain Adaptive Point Transformer (DAPoinTr) framework for point cloud completion. |
YINGHUI LI et. al. | arxiv-cs.CV | 2024-12-26 |
131 | Feature Alignment-Based Knowledge Distillation for Efficient Compression of Large Language Models Highlight: This study proposes a knowledge distillation algorithm based on large language models and feature alignment, aiming to effectively transfer the knowledge of large pre-trained models into lightweight student models, thereby reducing computational costs while maintaining high model performance. |
SHUO WANG et. al. | arxiv-cs.CL | 2024-12-26 |
132 | Injecting Bias Into Text Classification Models Using Backdoor Attacks Highlight: In this paper, we propose to utilize backdoor attacks for a new purpose: bias injection. |
A. Dilara Yavuz; M. Emre Gursoy; | arxiv-cs.CR | 2024-12-25 |
133 | Whose Morality Do They Speak? Unraveling Cultural Bias in Multilingual Language Models Highlight: This study investigates whether multilingual LLMs, such as GPT-3.5-Turbo, GPT-4o-mini, Llama 3.1, and MistralNeMo, reflect culturally specific moral values or impose dominant moral norms, particularly those rooted in English. |
Meltem Aksoy; | arxiv-cs.CL | 2024-12-25 |
134 | LoGFiLM: Fine-Tuning A Large Language Model for Automated Generation of Log Statements Highlight: Fine-tuning LLMs requires task-specific training data and custom-designed processing algorithms, which, however, have not been thoroughly explored for the log statement generation task. This paper fills this gap by contributing such a fine-tuning method, LoGFiLM, and an exemplar model obtained by using it to fine-tune Llama-3-8B. |
HAO ZHANG et. al. | arxiv-cs.SE | 2024-12-25 |
135 | Ister: Inverted Seasonal-Trend Decomposition Transformer for Explainable Multivariate Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing models face challenges in identifying critical components for prediction, leading to limited interpretability and suboptimal performance. To address these issues, we propose the Inverted Seasonal-Trend Decomposition Transformer (Ister), a novel Transformer-based model for multivariate time series forecasting. |
Fanpu Cao; Shu Yang; Zhengjian Chen; Ye Liu; Laizhong Cui; | arxiv-cs.LG | 2024-12-25 |
136 | Open-Vocabulary Panoptic Segmentation Using BERT Pre-Training of Vision-Language Multiway Transformer Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose OMTSeg for open-vocabulary segmentation using another large-scale vision-language pre-trained model called BEiT-3 and leveraging the cross-modal attention between visual and linguistic features in BEiT-3 to achieve better performance. |
Yi-Chia Chen; Wei-Hua Li; Chu-Song Chen; | arxiv-cs.CV | 2024-12-25 |
137 | Unlocking The Potential of Multiple BERT Models for Bangla Question Answering in NCTB Textbooks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the capability of state-of-the-art language models-RoBERTa Base, Bangla-BERT, and BERT Base-in automatically assessing Bangla passage-based question-answering from the National Curriculum and Textbook Board (NCTB) textbooks for classes 6-10. |
Abdullah Khondoker; Enam Ahmed Taufik; Md Iftekhar Islam Tashik; S M Ishtiak Mahmud; Antara Firoz Parsa; | arxiv-cs.CL | 2024-12-24 |
138 | Combining GPT and Code-Based Similarity Checking for Effective Smart Contract Vulnerability Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present SimilarGPT, a unique vulnerability identification tool for smart contracts, which combines Generative Pretrained Transformer (GPT) models with code-based similarity checking methods. |
Jango Zhang; | arxiv-cs.SE | 2024-12-24 |
139 | Optimizing Large Language Models with An Enhanced LoRA Fine-Tuning Algorithm for Efficiency and Robustness in NLP Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a large language model optimization method based on the improved LoRA fine-tuning algorithm, aiming to improve the accuracy and computational efficiency of the model in natural language processing tasks. |
JIACHENG HU et. al. | arxiv-cs.CL | 2024-12-24 |
140 | Segment-Based Attention Masking for GPTs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, attention is masked based on the known block structure at the prefill phase, followed by the conventional token-by-token autoregressive process. |
Shahar Katz; Liran Ringel; Yaniv Romano; Lior Wolf; | arxiv-cs.CL | 2024-12-24 |
141 | IITR-CIOL@NLU of Devanagari Script Languages 2025: Multilingual Hate Speech Detection and Target Identification in Devanagari-Scripted Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose the MultilingualRobertaClass model, a deep neural network built on the pretrained multilingual transformer model ia-multilingual-transliterated-roberta, optimized for classification tasks in multilingual and transliterated contexts. |
Siddhant Gupta; Siddh Singhal; Azmine Toushik Wasi; | arxiv-cs.CL | 2024-12-23 |
142 | Token Statistics Transformer: Linear-Time Attention Via Variational Rate Reduction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel transformer attention operator whose computational complexity scales linearly with the number of tokens. |
ZIYANG WU et. al. | arxiv-cs.LG | 2024-12-23 |
143 | SubstationAI: Multimodal Large Model-Based Approaches for Analyzing Substation Equipment Faults Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a substation equipment fault analysis method based on a multimodal large language model (MLLM). |
JINZHI WANG et. al. | arxiv-cs.AI | 2024-12-22 |
144 | PsychAdapter: Adapting LLM Transformers to Reflect Traits, Personality and Mental Health Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we propose a lightweight modification to the standard language model transformer architecture – PsychAdapter – that uses empirically derived trait-language patterns to generate natural language for specified personality, demographic, and mental health characteristics (with or without prompting). |
HUY VU et. al. | arxiv-cs.AI | 2024-12-22 |
145 | TAR3D: Creating High-Quality 3D Assets Via Next-Part Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present TAR3D, a novel framework that consists of a 3D-aware Vector Quantized-Variational AutoEncoder (VQ-VAE) and a Generative Pre-trained Transformer (GPT) to generate high-quality 3D assets. |
XUYING ZHANG et. al. | arxiv-cs.CV | 2024-12-22 |
146 | Reversed Attention: On The Gradient Descent Of Attention Layers In GPT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we study the mathematics of the backward pass of attention, revealing that it implicitly calculates an attention matrix we refer to as Reversed Attention. |
Shahar Katz; Lior Wolf; | arxiv-cs.CL | 2024-12-22 |
147 | Development of A Large-scale Dataset of Chest Computed Tomography Reports in Japanese and A High-performance Finding Classification Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: To develop a comprehensive Japanese CT report dataset through machine translation and establish a specialized language model for structured finding classification. |
YOSUKE YAMAGISHI et. al. | arxiv-cs.CL | 2024-12-20 |
148 | BabyHGRN: Exploring RNNs for Sample-Efficient Training of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the potential of recurrent neural networks (RNNs) and other subquadratic architectures as competitive alternatives to transformer-based models in low-resource language modeling scenarios. |
Patrick Haller; Jonas Golde; Alan Akbik; | arxiv-cs.CL | 2024-12-20 |
149 | Demystifying The Potential of ChatGPT-4 Vision for Construction Progress Monitoring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The integration of Large Vision-Language Models (LVLMs) such as OpenAI’s GPT-4 Vision into various sectors has marked a significant evolution in the field of artificial intelligence, particularly in the analysis and interpretation of visual data. This paper explores the practical application of GPT-4 Vision in the construction industry, focusing on its capabilities in monitoring and tracking the progress of construction projects. |
Ahmet Bahaddin Ersoz; | arxiv-cs.CV | 2024-12-20 |
150 | Identifying Cyberbullying Roles in Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the use of machine learning models to detect the roles involved in cyberbullying interactions. |
MANUEL SANDOVAL et. al. | arxiv-cs.LG | 2024-12-20 |
151 | Linguistic Features Extracted By GPT-4 Improve Alzheimer’s Disease Detection Based on Spontaneous Speech Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we leverage GPT-4 to extract five semantic features from transcripts of spontaneous patient speech. |
Jonathan Heitz; Gerold Schneider; Nicolas Langer; | arxiv-cs.CL | 2024-12-20 |
152 | Graph-Convolutional Networks: Named Entity Recognition and Large Language Model Embedding in Document Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel approach that integrates Named Entity Recognition (NER) and LLM embeddings within a graph-based framework for document clustering. |
Imed Keraghel; Mohamed Nadif; | arxiv-cs.CL | 2024-12-19 |
153 | How Good Is GPT at Writing Political Speeches for The White House? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using large language models (LLMs), computers are able to generate written text in response to a user request. |
Jacques Savoy; | arxiv-cs.CL | 2024-12-19 |
154 | A Full Transformer-based Framework for Automatic Pain Estimation Using Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present a novel full transformer-based framework consisting of a Transformer in Transformer (TNT) model and a Transformer leveraging cross-attention and self-attention blocks. |
Stefanos Gkikas; Manolis Tsiknakis; | arxiv-cs.CV | 2024-12-19 |
155 | LLMs As Mediators: Can They Diagnose Conflicts Accurately? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Prior research indicates that to be able to mediate conflict, observers of disagreements between parties must be able to reliably distinguish the sources of their disagreement as stemming from differences in beliefs about what is true (causality) vs. differences in what they value (morality). In this paper, we test if OpenAI’s Large Language Models GPT-3.5 and GPT-4 can perform this task and whether one or other type of disagreement proves particularly challenging for LLMs to diagnose. |
Özgecan Koçak; Phanish Puranam; Afşar Yegin; | arxiv-cs.CL | 2024-12-19 |
156 | FarExStance: Explainable Stance Detection for Farsi Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce FarExStance, a new dataset for explainable stance detection in Farsi. |
MAJID ZARHARAN et. al. | arxiv-cs.CL | 2024-12-18 |
157 | Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce ModernBERT, bringing modern model optimizations to encoder-only models and representing a major Pareto improvement over older encoders. |
BENJAMIN WARNER et. al. | arxiv-cs.CL | 2024-12-18 |
158 | Fake News Detection: Comparative Evaluation of BERT-like Models and Large Language Models with Generative AI-Annotated Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study presents a comparative evaluation of BERT-like encoder-only models and autoregressive decoder-only large language models (LLMs) for fake news detection. |
Shaina Raza; Drai Paulen-Patterson; Chen Ding; | arxiv-cs.CL | 2024-12-18 |
159 | Exploring Transformer-Augmented LSTM for Temporal and Spatial Feature Learning in Trajectory Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, a hybrid model that combines LSTMs for temporal encoding with a Transformer encoder for capturing complex interactions between vehicles is proposed. |
Chandra Raskoti; Weizi Li; | arxiv-cs.RO | 2024-12-17 |
160 | Lightweight Safety Classification Using Pruned Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel technique for content safety and prompt injection classification for Large Language Models. |
Mason Sawtell; Tula Masterman; Sandi Besen; Jim Brown; | arxiv-cs.CL | 2024-12-17 |
161 | No More Adam: Learning Rate Scaling at Initialization Is All You Need Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we question the necessity of adaptive gradient methods for training deep neural networks. |
Minghao Xu; Lichuan Xiang; Xu Cai; Hongkai Wen; | arxiv-cs.LG | 2024-12-16 |
162 | Investigating Mixture of Experts in Dense Retrieval Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: While Dense Retrieval Models (DRMs) have advanced Information Retrieval (IR), one limitation of these neural models is their narrow generalizability and robustness. To cope with … |
Effrosyni Sokli; Pranav Kasela; Georgios Peikos; Gabriella Pasi; | arxiv-cs.IR | 2024-12-16 |
163 | Causal Diffusion Transformers for Generative Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Causal Diffusion as the autoregressive (AR) counterpart of Diffusion models. |
Chaorui Deng; Deyao Zhu; Kunchang Li; Shi Guang; Haoqi Fan; | arxiv-cs.CV | 2024-12-16 |
164 | Seeing The Forest and The Trees: Solving Visual Graph and Tree Based Data Structure Problems Using Large Multimodal Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research not only introduces an LMM benchmark to facilitate replication and further exploration but also underscores the potential of LMMs in solving complex computing problems, with important implications for pedagogy and assessment practices. |
SEBASTIAN GUTIERREZ et. al. | arxiv-cs.AI | 2024-12-15 |
165 | Optimized Quran Passage Retrieval Using An Expanded QA Dataset and Fine-Tuned Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The Qur’an QA 2023 shared task dataset contained a limited number of questions, and models trained on it showed weak retrieval performance. To address this challenge, this work updates the original dataset and improves model accuracy. |
Mohamed Basem; Islam Oshallah; Baraa Hikal; Ali Hamdi; Ammar Mohamed; | arxiv-cs.CL | 2024-12-15 |
166 | Do Tutors Learn from Equity Training and Can Generative AI Assess It? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We apply a mixed-method approach to analyze the performance of 81 undergraduate remote tutors. |
DANIELLE R. THOMAS et. al. | arxiv-cs.HC | 2024-12-15 |
167 | Tokens, The Oft-overlooked Appetizer: Large Language Models, The Distributional Hypothesis, and Meaning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides creating sub-optimal semantic building blocks and obscuring the model’s access to the necessary distributional patterns, we describe how tokenization pretraining can be a backdoor for bias and other unwanted content, which current alignment practices may not remediate. |
JULIA WITTE ZIMMERMAN et. al. | arxiv-cs.CL | 2024-12-14 |
168 | SusGen-GPT: A Data-Centric LLM for Financial NLP and Sustainability Report Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, open-source LLMs proficient in both finance and ESG domains remain scarce. To address this gap, we introduce SusGen-30K, a category-balanced dataset comprising seven financial NLP tasks and ESG report generation, and propose TCFD-Bench, a benchmark for evaluating sustainability report generation. |
QILONG WU et. al. | arxiv-cs.CL | 2024-12-14 |
169 | Does Multiple Choice Have A Future in The Age of Generative AI? A Posttest-only RCT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Using a posttest-only randomized control design, we compare the performance of 234 tutors (790 lesson completions) across three conditions: MCQ only, open response only, and a combination of both. |
DANIELLE R. THOMAS et. al. | arxiv-cs.HC | 2024-12-13 |
170 | Evaluation of GPT-4o and GPT-4o-mini’s Vision Capabilities for Compositional Analysis from Dried Solution Drops Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using OpenAI’s image-enabled language models, we analyzed deposits from 12 salts with 200 images per salt and per model. |
Deven B. Dangi; Beni B. Dangi; Oliver Steinbock; | arxiv-cs.CV | 2024-12-13 |
171 | SPT: Sequence Prompt Transformer for Interactive Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing methods typically process one image at a time, failing to consider the sequential nature of the images. To overcome this limitation, we propose a novel method called Sequence Prompt Transformer (SPT), the first to utilize sequential image information for interactive segmentation. |
Senlin Cheng; Haopeng Sun; | arxiv-cs.CV | 2024-12-13 |
172 | Adaptive Principal Components Allocation with The $\ell_{2,g}$-regularized Gaussian Graphical Model for Efficient Fine-Tuning Large Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel Parameter-Efficient Fine-Tuning (PEFT) approach based on Gaussian Graphical Models (GGMs), marking the first application of GGMs to PEFT tasks, to the best of our knowledge. |
Jingjing Zheng; Yankai Cao; | arxiv-cs.LG | 2024-12-11 |
173 | NLPineers@ NLU of Devanagari Script Languages 2025: Hate Speech Detection Using Ensembling of BERT-based Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work emphasizes the need for hate speech detection in Devanagari-scripted languages and presents a foundation for further research. |
Anmol Guragain; Nadika Poudel; Rajesh Piryani; Bishesh Khanal; | arxiv-cs.CL | 2024-12-11 |
174 | Advancing Single- and Multi-task Text Classification Through Large Language Model Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study employed a diverse range of models and methods, varying in size and architecture, and including both fine-tuned and pre-trained approaches. |
Hang Zhao; Qile P. Chen; Yijing Barry Zhang; Gang Yang; | arxiv-cs.CL | 2024-12-11 |
175 | A Survey on Private Transformer Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their use in Machine Learning as a Service (MLaaS) raises significant privacy concerns, as centralized servers process sensitive user data. Private Transformer Inference (PTI) addresses these issues using cryptographic techniques such as Secure Multi-Party Computation (MPC) and Homomorphic Encryption (HE), enabling secure model inference without exposing inputs or models. |
Yang Li; Xinyu Zhou; Yitong Wang; Liangxin Qian; Jun Zhao; | arxiv-cs.CR | 2024-12-11 |
176 | Assessing Personalized AI Mentoring with Large Language Models in The Computing Field Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper provides an in-depth evaluation of three state-of-the-art Large Language Models (LLMs) for personalized career mentoring in the computing field, using three distinct student profiles that consider gender, race, and professional levels. |
Xiao Luo; Sean O’Connell; Shamima Mithun; | arxiv-cs.CL | 2024-12-11 |
177 | GPT-2 Through The Lens of Vector Symbolic Architectures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the resemblance between decoder-only transformer architecture and vector symbolic architectures (VSA) and presents experiments indicating that GPT-2 uses mechanisms involving nearly orthogonal vector bundling and binding operations similar to VSA for computation and communication between layers. |
Johannes Knittel; Tushaar Gangavarapu; Hendrik Strobelt; Hanspeter Pfister; | arxiv-cs.LG | 2024-12-10 |
178 | Generating Knowledge Graphs from Large Language Models: A Comparative Study of GPT-4, LLaMA 2, and BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel approach leveraging large language models (LLMs) like GPT-4, LLaMA 2 (13B), and BERT to generate KGs directly from unstructured data, bypassing traditional pipelines. |
Ahan Bhatt; Nandan Vaghela; Kush Dudhia; | arxiv-cs.CL | 2024-12-10 |
179 | Causal World Representation in The GPT Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Are generative pre-trained transformer (GPT) models only trained to predict the next token, or do they implicitly learn a world model from which a sequence is generated one token at a time? We examine this question by deriving a causal interpretation of the attention mechanism in GPT, and suggesting a causal world model that arises from this interpretation. |
Raanan Y. Rohekar; Yaniv Gurwicz; Sungduk Yu; Vasudev Lal; | arxiv-cs.AI | 2024-12-10 |
180 | TrojanWhisper: Evaluating Pre-trained LLMs to Detect and Localize Hardware Trojans Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For the first time, this paper explores the potential of general-purpose LLMs in detecting various HTs inserted in Register Transfer Level (RTL) designs, including SRAM, AES, and UART modules. We propose a novel tool for this goal that systematically assesses state-of-the-art LLMs (GPT-4o, Gemini 1.5 pro, and Llama 3.1) in detecting HTs without prior fine-tuning. |
Md Omar Faruque; Peter Jamieson; Ahmad Patooghy; Abdel-Hameed A. Badawy; | arxiv-cs.CR | 2024-12-10 |
181 | Rethinking Emotion Annotations in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we analyze the complexities of emotion annotation in the context of LLMs, focusing on GPT-4 as a leading model. |
Minxue Niu; Yara El-Tawil; Amrit Romana; Emily Mower Provost; | arxiv-cs.CL | 2024-12-10 |
182 | Towards Predictive Communication with Brain-Computer Interfaces Integrating Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This perspective article aims at providing an outline of the state of the art and future developments towards the integration of cutting-edge predictive language models with BCI. |
Andrea Caria; | arxiv-cs.HC | 2024-12-10 |
183 | Optimizing Multi-Task Learning for Enhanced Performance in Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study aims to explore the performance improvement method of large language models based on GPT-4 under the multi-task learning framework and conducts experiments on two … |
ZHEN QI et. al. | ArXiv | 2024-12-09 |
184 | Inverting Visual Representations with Detection Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we apply the approach of training inverse models to reconstruct input images from intermediate layers within a Detection Transformer, showing that this approach is efficient and feasible for transformer-based vision models. |
Jan Rathjens; Shirin Reyhanian; David Kappel; Laurenz Wiskott; | arxiv-cs.CV | 2024-12-09 |
185 | CARP: Visuomotor Policy Learning Via Coarse-to-Fine Autoregressive Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Coarse-to-Fine AutoRegressive Policy (CARP), a novel paradigm for visuomotor policy learning that redefines the autoregressive action generation process as a coarse-to-fine, next-scale approach. |
ZHEFEI GONG et. al. | arxiv-cs.RO | 2024-12-09 |
186 | SplaXBERT: Leveraging Mixed Precision Training and Context Splitting for Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: SplaXBERT, built on ALBERT-xlarge with context-splitting and mixed precision training, achieves high efficiency in question-answering tasks on lengthy texts. Tested on SQuAD v1.1, … |
Zhu Yufan; Hao Zeyu; Li Siqi; Niu Boqian; | arxiv-cs.CL | 2024-12-06 |
187 | Exploring Transformer-Based Music Overpainting for Jazz Piano Variations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce VAR4000, a subset of a larger dataset for jazz piano performances, consisting of 4,352 training pairs. |
Eleanor Row; Ivan Shanin; György Fazekas; | arxiv-cs.SD | 2024-12-05 |
188 | FANAL — Financial Activity News Alerting Language Modeling Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces FANAL (Financial Activity News Alerting Language Modeling Framework), a specialized BERT-based framework engineered for real-time financial event detection and analysis, categorizing news into twelve distinct financial categories. |
Urjitkumar Patel; Fang-Chun Yeh; Chinmay Gondhalekar; Hari Nalluri; | arxiv-cs.CL | 2024-12-04 |
189 | Controlling The Mutation in Large Language Models for The Efficient Evolution of Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel approach to mutation control within LLM-driven evolutionary frameworks, inspired by theory of genetic algorithms. |
Haoran Yin; Anna V. Kononova; Thomas Bäck; Niki van Stein; | arxiv-cs.NE | 2024-12-04 |
190 | A Water Efficiency Dataset for African Data Centers Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: AI computing and data centers consume a large amount of freshwater, both directly for cooling and indirectly for electricity generation. While most attention has been paid to … |
Noah Shumba; Opelo Tshekiso; Pengfei Li; Giulia Fanti; Shaolei Ren; | arxiv-cs.LG | 2024-12-04 |
191 | The Asymptotic Behavior of Attention in Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we provide a rigorous, mathematical analysis of the asymptotic properties of attention in transformers. |
Álvaro Rodríguez Abella; João Pedro Silvestre; Paulo Tabuada; | arxiv-cs.AI | 2024-12-03 |
192 | Transformer-Based Auxiliary Loss for Face Recognition Across Age Variations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a technique for loss evaluation that uses a transformer network as an additive loss in the face recognition domain. |
Pritesh Prakash; Ashish Jacob Sam; S Umamaheswaran; | arxiv-cs.CV | 2024-12-03 |
193 | Achieving Semantic Consistency: Contextualized Word Representations for Political Text Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares Word2Vec and BERT using 20 years of People’s Daily articles to evaluate their performance in semantic representations across different timeframes. |
Ruiyu Zhang; Lin Nie; Ce Zhao; Qingyang Chen; | arxiv-cs.CL | 2024-12-03 |
Su-RoBERTa: A Semi-supervised Approach to Predicting Suicide Risk Through Social Media Using Base Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Su-RoBERTa, a RoBERTa model fine-tuned on the suicide risk prediction task that utilized both the labeled and unlabeled Reddit data and tackled class imbalance by data augmentation using the GPT-2 model. |
CHAYAN TANK et. al. | arxiv-cs.HC | 2024-12-02 |
195 | Assessing GPT Model Uncertainty in Mathematical OCR Tasks Via Entropy Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates the uncertainty of Generative Pre-trained Transformer (GPT) models in extracting mathematical equations from images of varying resolutions and converting them into LaTeX code. |
Alexei Kaltchenko; | arxiv-cs.IT | 2024-12-02 |
196 | Impact of Data Snooping on Deep Learning Models for Locating Vulnerabilities in Lifted Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study examines the impact of data snooping on neural networks used to detect vulnerabilities in lifted code, and builds on previous research that used word2vec and unidirectional and bidirectional transformer-based embeddings. |
Gary A. McCully; John D. Hastings; Shengjie Xu; | arxiv-cs.CR | 2024-12-02 |
197 | TGTOD: A Global Temporal Graph Transformer for Outlier Detection at Scale Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we rethink temporal graph Transformers and propose TGTOD, a novel end-to-end Temporal Graph Transformer for Outlier Detection. |
Kay Liu; Jiahao Ding; MohamadAli Torkamani; Philip S. Yu; | arxiv-cs.LG | 2024-12-01 |
198 | Sequence Length Independent Norm-Based Generalization Bounds for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper provides norm-based generalization bounds for the Transformer architecture that do not depend on the input sequence length. |
Jacob Trauger; Ambuj Tewari; | aistats | 2024-12-01 |
199 | Analysis of Privacy Leakage in Federated Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: With the rapid adoption of Federated Learning (FL) as the training and tuning protocol for applications utilizing Large Language Models (LLMs), recent research highlights the need for significant modifications to FL to accommodate the large-scale of LLMs. |
Minh Vu; Truc Nguyen; Tre’ Jeter; My T. Thai; | aistats | 2024-12-01 |
200 | Enhancing In-context Learning Via Linear Probe Calibration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This approach uses prompts that include in-context demonstrations to generate the corresponding output for a new query input. |
MOMIN ABBAS et. al. | aistats | 2024-12-01 |
201 | Automated Extraction of Acronym-Expansion Pairs from Scientific Papers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This project addresses challenges posed by the widespread use of abbreviations and acronyms in digital texts. We propose a novel method that combines document preprocessing, regular expressions, and a large language model to identify abbreviations and map them to their corresponding expansions. |
Izhar Ali; Million Haileyesus; Serhiy Hnatyshyn; Jan-Lucas Ott; Vasil Hnatyshin; | arxiv-cs.CL | 2024-12-01 |
202 | Understanding Complex-Valued Transformer for Modulation Recognition Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Complex-valued convolution neural networks (CVCNNs) have been recently applied for modulation recognition (MR), due to its ability to capture the relationship between the real and … |
JINGRENG LEI et. al. | IEEE Wireless Communications Letters | 2024-12-01 |
203 | How Does GPT-2 Predict Acronyms? Extracting and Understanding A Circuit Via Mechanistic Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on understanding how GPT-2 Small performs the task of predicting three-letter acronyms. |
Jorge García-Carrasco; Alejandro Maté; Juan Carlos Trujillo; | aistats | 2024-12-01 |
204 | Homeostasis and Sparsity in Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The transformer architecture has become an integral part of the field of modern neural networks, playing a crucial role in a variety of tasks, such as text generation, machine … |
Leonid Kotyuzanskiy; Artem Klimov; | arxiv-cs.LG | 2024-11-30 |
205 | Forma Mentis Networks Predict Creativity Ratings of Short Texts Via Interpretable Artificial Intelligence in Human and GPT-simulated Raters Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use textual forma mentis networks (TFMN) to extract network (semantic/syntactic associations) and emotional features from approximately one thousand human- and GPT3.5-generated stories. |
Edith Haim; Natalie Fischer; Salvatore Citraro; Giulio Rossetti; Massimo Stella; | arxiv-cs.AI | 2024-11-30 |
206 | LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: With the ever-increasing number of news stories available online, classifying them by topic, regardless of the language they are written in, has become crucial for enhancing readers’ access to relevant content. To address this challenge, we propose a teacher-student framework based on large language models (LLMs) for developing multilingual news classification models of reasonable size with no need for manual data annotation. |
Taja Kuzman; Nikola Ljubešić; | arxiv-cs.CL | 2024-11-29 |
207 | Habit Coach: Customising RAG-based Chatbots to Support Behavior Change Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the iterative development of Habit Coach, a GPT-based chatbot designed to support users in habit change through personalized interaction. |
Arian Fooroogh Mand Arabi; Cansu Koyuturk; Michael O’Mahony; Raffaella Calati; Dimitri Ognibene; | arxiv-cs.HC | 2024-11-28 |
208 | Waterfall Transformer for Multi-person Pose Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose the Waterfall Transformer architecture for Pose estimation (WTPose), a single-pass, end-to-end trainable framework designed for multi-person pose estimation. |
Navin Ranjan; Bruno Artacho; Andreas Savakis; | arxiv-cs.CV | 2024-11-28 |
209 | Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Self-Cross diffusion guidance to penalize the overlap between cross-attention maps and aggregated self-attention maps. |
Weimin Qiu; Jieke Wang; Meng Tang; | arxiv-cs.CV | 2024-11-28 |
210 | The Impact of Example Selection in Few-Shot Prompting on Automated Essay Scoring Using GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the impact of example selection on the performance of automated essay scoring (AES) using few-shot prompting with GPT models. |
Lui Yoshida; | arxiv-cs.CL | 2024-11-28 |
211 | SmartLLMSentry: A Comprehensive LLM Based Smart Contract Vulnerability Detection Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces SmartLLMSentry, a novel framework that leverages large language models (LLMs), specifically ChatGPT with in-context training, to advance smart contract vulnerability detection. |
Oualid Zaazaa; Hanan El Bakkali; | arxiv-cs.CR | 2024-11-28 |
212 | Automated Literature Review Using NLP Techniques and LLM-Based Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Developing a system capable of automatically generating literature reviews from only PDF files as input is the primary objective of this research work. |
Nurshat Fateh Ali; Md. Mahdi Mohtasim; Shakil Mosharrof; T. Gopi Krishna; | arxiv-cs.CL | 2024-11-27 |
213 | Training and Evaluating Language Models with Template-based Data Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these models often struggle with tasks requiring complex reasoning, particularly in mathematical problem-solving, due in part to the scarcity of large-scale, high-quality, domain-specific datasets necessary for training sophisticated reasoning abilities. To address this limitation, we introduce Template-based Data Generation (TDG), a novel approach that leverages LLMs (GPT-4) to automatically generate parameterized meta-templates, which are then used to synthesize a vast array of high-quality problems and solutions. |
Yifan Zhang; | arxiv-cs.CL | 2024-11-27 |
214 | CLOVER: Cross-Layer Orthogonal Vectors Pruning and Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Decoder-only models generate tokens autoregressively by caching key/value vectors, but as the cache grows, inference becomes memory-bound. To address this issue, we introduce CLOVER (Cross-Layer Orthogonal Vectors), a novel approach that treats pairs of attention layers as a set of low-rank decompositions. |
Fanxu Meng; Pingzhi Tang; Fan Jiang; Muhan Zhang; | arxiv-cs.LG | 2024-11-26 |
215 | On Limitations of LLM As Annotator for Low Resource Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on Marathi, a low-resource language, and evaluate the performance of both closed-source and open-source LLMs as annotators. |
Suramya Jadhav; Abhay Shanbhag; Amogh Thakurdesai; Ridhima Sinare; Raviraj Joshi; | arxiv-cs.CL | 2024-11-26 |
216 | An Attempt to Develop A Neural Parser Based on Simplified Head-Driven Phrase Structure Grammar on Vietnamese Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aimed to develop a neural parser for Vietnamese based on simplified Head-Driven Phrase Structure Grammar (HPSG). |
Duc-Vu Nguyen; Thang Chau Phan; Quoc-Nam Nguyen; Kiet Van Nguyen; Ngan Luu-Thuy Nguyen; | arxiv-cs.CL | 2024-11-26 |
217 | The Importance of Visual Modelling Languages in Generative Software Engineering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: GPT-4 accepts image and text inputs, rather than simply natural language. We investigate relevant use cases stemming from these enhanced capabilities of GPT-4. |
Roberto Rossi; | arxiv-cs.SE | 2024-11-26 |
218 | Give Me The Code — Log Analysis of First-Year CS Students’ Interactions With GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite using unsophisticated prompting techniques, our findings suggest that the majority of students successfully leveraged GPT, incorporating the suggested solutions into their projects. |
Pedro Alves; Bruno Pereira Cipriano; | arxiv-cs.CY | 2024-11-26 |
219 | Distributed Sign Momentum with Local Steps for Training Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates a novel communication-efficient distributed sign momentum method with local updates. |
SHUHUA YU et. al. | arxiv-cs.LG | 2024-11-26 |
220 | Can Artificial Intelligence Predict Clinical Trial Outcomes? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates the predictive capabilities of large language models (LLMs) such as GPT-3.5, GPT-4, and HINT in determining clinical trial outcomes. |
Shuyi Jin; Lu Chen; Hongru Ding; Meijie Wang; Lun Yu; | arxiv-cs.LG | 2024-11-26 |
221 | What Differentiates Educational Literature? A Multimodal Fusion Approach of Transformers and Computational Linguistics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The integration of new literature into the English curriculum remains a challenge since educators often lack scalable tools to rapidly evaluate readability and adapt texts for diverse classroom needs. This study proposes to address this gap through a multimodal approach that combines transformer-based text classification with linguistic feature analysis to align texts with UK Key Stages. |
Jordan J. Bird; | arxiv-cs.CL | 2024-11-26 |
222 | Can Bidirectional Encoder Become The Ultimate Winner for Downstream Applications of Foundation Models? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article analyzes one-way and bidirectional models based on GPT and BERT and compares their differences based on the purpose of the model. |
Lewen Yang; Xuanyu Zhou; Juao Fan; Xinyi Xie; Shengxin Zhu; | arxiv-cs.CL | 2024-11-26 |
223 | Can AI Grade Your Essays? A Comparative Analysis of Large Language Models and Teacher Ratings in Multidimensional Essay Scoring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent developments in generative AI, such as large language models, offer potential solutions to facilitate essay-scoring tasks for teachers. |
Kathrin Seßler; Maurice Fürstenberg; Babette Bühler; Enkelejda Kasneci; | arxiv-cs.CL | 2024-11-25 |
224 | Development of Pre-Trained Transformer-based Models for The Nepali Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While existing efforts have predominantly concentrated on basic encoder-based models, there is a notable gap in the exploration of decoder-based architectures. To address this gap, we have collected 27.5 GB of Nepali text data, approximately 2.4x larger than any previously available Nepali language corpus. |
Prajwal Thapa; Jinu Nyachhyon; Mridul Sharma; Bal Krishna Bal; | arxiv-cs.CL | 2024-11-24 |
225 | Improving Next Tokens Via Second-Last Predictions with Generate and Refine Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use our model to improve the next token predictions of a standard GPT by combining both predictions in a “generate-then-refine” approach. |
Johannes Schneider; | arxiv-cs.CL | 2024-11-23 |
226 | Automatic Evaluation for Text-to-image Generation: Task-decomposed Framework, Distilled Training, and Meta-evaluation Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To tackle these problems, we first propose a task decomposition evaluation framework based on GPT-4o to automatically construct a new training dataset, where the complex evaluation task is decoupled into simpler sub-tasks, effectively reducing the learning complexity. Based on this dataset, we design innovative training strategies to effectively distill GPT-4o’s evaluation capabilities into a 7B open-source MLLM, MiniCPM-V-2.6. |
RONG-CHENG TU et. al. | arxiv-cs.CL | 2024-11-23 |
227 | Nimbus: Secure and Efficient Two-Party Inference for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work presents a new two-party inference framework $\mathsf{Nimbus}$ for Transformer models. |
ZHENGYI LI et. al. | arxiv-cs.CR | 2024-11-23 |
228 | All That Glitters: Approaches to Evaluations with Unreliable Model and Human Annotations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The effects of this error can escape commonly reported metrics of label quality or obscure questions of accuracy, bias, fairness, and usefulness during model evaluation. This study demonstrates methods for answering such questions even in the context of very low reliabilities from expert humans. |
Michael Hardy; | arxiv-cs.CL | 2024-11-23 |
229 | Enhancing Grammatical Error Detection Using BERT with Cleaned Lang-8 Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents an improved LLM based model for Grammatical Error Detection (GED), which is a very challenging and equally important problem for many applications. |
Rahul Nihalani; Kushal Shah; | arxiv-cs.CL | 2024-11-23 |
230 | Astro-HEP-BERT: A Bidirectional Language Model for Studying The Meanings of Concepts in Astrophysics and High Energy Physics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: I present Astro-HEP-BERT, a transformer-based language model specifically designed for generating contextualized word embeddings (CWEs) to study the meanings of concepts in astrophysics and high-energy physics. |
Arno Simons; | arxiv-cs.CL | 2024-11-22 |
231 | Inducing Human-like Biases in Moral Reasoning Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the alignment (BrainScore) of large language models (LLMs) fine-tuned for moral reasoning on behavioral data and/or brain data of humans performing the same task. |
ARTEM KARPOV et. al. | arxiv-cs.AI | 2024-11-22 |
232 | Purrfessor: A Fine-tuned Multimodal LLaVA Diet Health Chatbot Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces Purrfessor, an innovative AI chatbot designed to provide personalized dietary guidance through interactive, multimodal engagement. |
Linqi Lu; Yifan Deng; Chuan Tian; Sijia Yang; Dhavan Shah; | arxiv-cs.HC | 2024-11-22 |
233 | A Comparative Analysis of Transformer and LSTM Models for Detecting Suicidal Ideation on Reddit Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our findings show that transformer-based models have the potential to improve suicide ideation detection, thereby providing a path to develop robust mental health monitoring tools from social media. This research, therefore, underlines the undeniable prospect of advanced techniques in Natural Language Processing (NLP) while improving suicide prevention efforts. |
Khalid Hasan; Jamil Saquer; | arxiv-cs.LG | 2024-11-22 |
234 | Who Can Withstand Chat-Audio Attacks? An Evaluation Benchmark for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While existing research has primarily focused on model-specific adversarial methods, real-world applications demand a more generalizable and universal approach to audio adversarial attacks. In this paper, we introduce the Chat-Audio Attacks (CAA) benchmark including four distinct types of audio attacks, which aims to explore the vulnerabilities of LLMs to these audio attacks in conversational scenarios. |
WANQI YANG et. al. | arxiv-cs.SD | 2024-11-22 |
235 | Multiset Transformer: Advancing Representation Learning in Persistence Diagrams Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To improve persistence diagram representation learning, we propose Multiset Transformer. |
Minghua Wang; Ziyun Huang; Jinhui Xu; | arxiv-cs.LG | 2024-11-21 |
236 | Comparative Analysis of Pooling Mechanisms in LLMs: A Sentiment Analysis Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite their widespread use, the comparative performance of these strategies on different LLM architectures remains underexplored. To address this gap, this paper investigates the effects of these pooling mechanisms on two prominent LLM families — BERT and GPT, in the context of sentence-level sentiment analysis. |
Jinming Xing; Dongwen Luo; Chang Xue; Ruilin Xing; | arxiv-cs.CL | 2024-11-21 |
237 | GPT Versus Humans: Uncovering Ethical Concerns in Conversational Generative AI-empowered Multi-Robot Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The objectives of the study were to examine novel ethical issues arising from the application of LLMs in multi-robot systems. |
REBEKAH ROUSI et. al. | arxiv-cs.RO | 2024-11-21 |
238 | Evaluating The Robustness of Analogical Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: On digit-matrix problems, we find a similar pattern but only on one out of the two types of variants we tested. |
Martha Lewis; Melanie Mitchell; | arxiv-cs.CL | 2024-11-21 |
239 | BERT-Based Approach for Automating Course Articulation Matrix Construction with Explainable AI Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, We experiment with four models from the BERT family: BERT Base, DistilBERT, ALBERT, and RoBERTa, and use multiclass classification to assess the alignment between CO and PO/PSO pairs. |
Natenaile Asmamaw Shiferaw; Simpenzwe Honore Leandre; Aman Sinha; Dillip Rout; | arxiv-cs.LG | 2024-11-21 |
240 | Explaining GPT-4’s Schema of Depression Using Machine Behavior Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we leveraged contemporary measurement theory to decode how GPT-4 interrelates depressive symptoms to inform both clinical utility and theoretical understanding. |
ADITHYA V GANESAN et. al. | arxiv-cs.CL | 2024-11-20 |
241 | Exploring Large Language Models for Climate Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the capability of GPT-4 in predicting rainfall at short-term (15-day) and long-term (12-month) scales. |
Yang Wang; Hassan A. Karimi; | arxiv-cs.LG | 2024-11-20 |
242 | Topkima-Former: Low-energy, Low-Latency Inference for Transformers Using Top-k In-memory ADC Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we propose innovations at the circuit, architecture, and algorithm levels to accelerate the transformer. |
SHUAI DONG et. al. | arxiv-cs.AR | 2024-11-20 |
243 | AI-Driven Agents with Prompts Designed for High Agreeableness Increase The Likelihood of Being Mistaken for A Human in The Turing Test Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Various explanations in the literature address why these GPT agents were perceived as human, including psychological frameworks for understanding anthropomorphism. These findings highlight the importance of personality engineering as an emerging discipline in artificial intelligence, calling for collaboration with psychology to develop ergonomic psychological models that enhance system adaptability in collaborative activities. |
U. LEÓN-DOMÍNGUEZ et. al. | arxiv-cs.AI | 2024-11-20 |
244 | SynEHRgy: Synthesizing Mixed-Type Structured Electronic Health Records Using Decoder-Only Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel tokenization strategy tailored for structured EHR data, which encompasses diverse data types such as covariates, ICD codes, and irregularly sampled time series. |
Hojjat Karami; David Atienza; Anisoara Ionescu; | arxiv-cs.LG | 2024-11-20 |
245 | Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Video Retrieval-Augmented Generation (Video-RAG), a training-free and cost-effective pipeline that employs visually-aligned auxiliary texts to help facilitate cross-modality alignment while providing additional information beyond the visual content. |
YONGDONG LUO et. al. | arxiv-cs.CV | 2024-11-20 |
246 | Transformer-Based Contextualized Language Models Joint with Neural Networks for Natural Language Inference in Vietnamese Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article aims to introduce a novel approach or model that attains improved performance for Vietnamese NLI. |
Dat Van-Thanh Nguyen; Tin Van Huynh; Kiet Van Nguyen; Ngan Luu-Thuy Nguyen; | arxiv-cs.CL | 2024-11-20 |
247 | Benchmarking GPT-4 Against Human Translators: A Comprehensive Evaluation Across Languages, Domains, and Expertise Levels Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study presents a comprehensive evaluation of GPT-4’s translation capabilities compared to human translators of varying expertise levels. |
JIANHAO YAN et. al. | arxiv-cs.CL | 2024-11-20 |
248 | Leveraging Virtual Reality and AI Tutoring for Language Learning: A Case Study of A Virtual Campus Environment with OpenAI GPT Integration with Unity 3D Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new approach to multiple language learning, with Hindi as the language to be learnt in our case, by integrating virtual reality environments with AI-enabled tutoring systems using OpenAI's GPT API calls. |
Adithya TG; Abhinavaram N; Gowri Srinivasa; | arxiv-cs.HC | 2024-11-19 |
249 | Evaluating Tokenizer Performance of Large Language Models Across Official Indian Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive evaluation of tokenizers used by 12 LLMs across all 22 official languages of India, with a focus on comparing the efficiency of their tokenization processes. |
S. Tamang; D. J. Bora; | arxiv-cs.CL | 2024-11-19 |
250 | Enhancing Multi-Class Disease Classification: Neoplasms, Cardiovascular, Nervous System, and Digestive Disorders Using Advanced LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we explored the improvement in terms of multi-class disease classification via pre-trained language models over Medical-Abstracts-TC-Corpus that spans five medical conditions. |
Ahmed Akib Jawad Karim; Muhammad Zawad Mahmud; Samiha Islam; Aznur Azam; | arxiv-cs.CL | 2024-11-19 |
251 | Strengthening Fake News Detection: Leveraging SVM and Sophisticated Text Vectorization Techniques. Defying BERT? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We employ three distinct text vectorization methods for SVM: Term Frequency Inverse Document Frequency (TF-IDF), Word2Vec, and Bag of Words (BoW), evaluating their effectiveness in distinguishing between genuine and fake news. |
Ahmed Akib Jawad Karim; Kazi Hafiz Md Asad; Aznur Azam; | arxiv-cs.CL | 2024-11-19 |
252 | Chapter 7 Review of Data-Driven Generative AI Models for Knowledge Extraction from Scientific Literature in Healthcare Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This review examines the development of abstractive NLP-based text summarization approaches and compares them to existing techniques for extractive summarization. |
Leon Kopitar; Primoz Kocbek; Lucija Gosak; Gregor Stiglic; | arxiv-cs.CL | 2024-11-18 |
253 | Automatic A-C. Network Switching Units Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The desirable characteristics of automatic switching units designed for application in secondary a-c. distribution networks are discussed in this paper. Descriptions are given of … |
G. G. Grissinger; | Journal of the A.I.E.E. | |
254 | Re-examining Learning Linear Functions in Context Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore a simple model of ICL in a controlled setup with synthetic training data to investigate ICL of univariate linear functions. |
Omar Naim; Guilhem Fouilhé; Nicholas Asher; | arxiv-cs.LG | 2024-11-18 |
255 | CNMBERT: A Model for Converting Hanyu Pinyin Abbreviations to Chinese Characters Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This task typically involves text-length alignment and seems easy to solve; however, due to the limited information content in pinyin abbreviations, achieving accurate conversion is challenging. In this paper, we treat this as a fill-mask task and propose CNMBERT, which stands for zh-CN Pinyin Multi-mask BERT Model, as a solution to this issue. |
Zishuo Feng; Feng Cao; | arxiv-cs.CL | 2024-11-18 |
256 | Multi-Grained Preference Enhanced Transformer for Multi-Behavior Sequential Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: On the other hand, the dynamic multi-grained behavior-aware preference is hard to capture in interaction sequences, which reflects interaction-aware sequential pattern. To tackle these challenges, we propose a Multi-Grained Preference enhanced Transformer framework (M-GPT). |
CHUAN HE et. al. | arxiv-cs.IR | 2024-11-18 |
257 | A Combined Encoder and Transformer Approach for Coherent and High-Quality Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research introduces a novel text generation model that combines BERT’s semantic interpretation strengths with GPT-4’s generative capabilities, establishing a high standard in generating coherent, contextually accurate language. |
JIAJING CHEN et. al. | arxiv-cs.CL | 2024-11-18 |
258 | Knowledge-enhanced Transformer for Multivariate Long Sequence Time-series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel approach that encapsulates conceptual relationships among variables within a well-defined knowledge graph, forming dynamic and learnable KGEs for seamless integration into the transformer architecture. |
Shubham Tanaji Kakde; Rony Mitra; Jasashwi Mandal; Manoj Kumar Tiwari; | arxiv-cs.LG | 2024-11-17 |
259 | Does Prompt Formatting Have Any Impact on LLM Performance? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although previous research has explored aspects like rephrasing prompt contexts, using various prompting techniques (like in-context learning and chain-of-thought), and ordering few-shot examples, our understanding of LLM sensitivity to prompt templates remains limited. Therefore, this paper examines the impact of different prompt templates on LLM performance. |
JIA HE et. al. | arxiv-cs.CL | 2024-11-15 |
260 | Brain-inspired Action Generation with Spiking Transformer Diffusion Policy Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the Can task in particular, we achieved an improvement of 8%. |
Qianhao Wang; Yinqian Sun; Enmeng Lu; Qian Zhang; Yi Zeng; | arxiv-cs.RO | 2024-11-15 |
261 | KuaiFormer: Transformer-Based Retrieval at Kuaishou Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce KuaiFormer, a novel transformer-based retrieval framework deployed in a large-scale content recommendation system. |
CHI LIU et. al. | arxiv-cs.IR | 2024-11-15 |
262 | CMATH: Cross-Modality Augmented Transformer with Hierarchical Variational Distillation for Multimodal Emotion Recognition in Conversation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel Cross-Modality Augmented Transformer with Hierarchical Variational Distillation, called CMATH, which consists of two major components, i.e., Multimodal Interaction Fusion and Hierarchical Variational Distillation. |
XIAOFEI ZHU et. al. | arxiv-cs.MM | 2024-11-15 |
263 | Re-Parameterization of Lightweight Transformer for On-Device Speech Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although many model compression approaches have been explored, they often suffer from notorious performance degradation. To address this issue, we introduce a new method, namely Transformer Re-parameterization, to boost the performance of lightweight Transformer models. |
Zixing Zhang; Zhongren Dong; Weixiang Xu; Jing Han; | arxiv-cs.SD | 2024-11-14 |
264 | Adopting RAG for LLM-Aided Future Vehicle Design Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the integration of Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) to enhance automated design and software development in the automotive industry. |
Vahid Zolfaghari; Nenad Petrovic; Fengjunjie Pan; Krzysztof Lebioda; Alois Knoll; | arxiv-cs.SE | 2024-11-14 |
265 | BabyLM Challenge: Exploring The Effect of Variation Sets on Language Model Training Efficiency Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the context of the BabyLM Challenge, we focus on Variation Sets (VSs), sets of consecutive utterances expressing a similar intent with slightly different words and structures, which are ubiquitous in CDS. |
Akari Haga; Akiyo Fukatsu; Miyu Oba; Arianna Bisazza; Yohei Oseki; | arxiv-cs.CL | 2024-11-14 |
266 | LoRA-LiteE: A Computationally Efficient Framework for Chatbot Preference-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, RLHF methods are often computationally intensive and resource-demanding, limiting their scalability and accessibility for broader applications. To address these challenges, this study introduces LoRA-Lite Ensemble (LoRA-LiteE), an innovative framework that combines Supervised Fine-tuning (SFT) with Low-Rank Adaptation (LoRA) and Ensemble Learning techniques to effectively aggregate predictions of lightweight models, which aim to achieve a balance between the performance and computational cost. |
Yahe Yang; Chunliang Tao; Xiaojing Fan; | arxiv-cs.CL | 2024-11-14 |
267 | Evaluating World Models with LLM for Decision Making Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a comprehensive evaluation of the world models with LLMs from the decision making perspective. |
Chang Yang; Xinrun Wang; Junzhe Jiang; Qinggang Zhang; Xiao Huang; | arxiv-cs.AI | 2024-11-13 |
268 | LSH-MoE: Communication-efficient MoE Training Via Locality-Sensitive Hashing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose LSH-MoE, a communication-efficient MoE training framework using locality-sensitive hashing (LSH). |
XIAONAN NIE et. al. | arxiv-cs.DC | 2024-11-13 |
269 | Towards Optimizing A Retrieval Augmented Generation Using Large Language Model on Academic Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given the growing trend of many organizations integrating Retrieval Augmented Generation (RAG) into their operations, we assess RAG on domain-specific data and test state-of-the-art models across various optimization techniques. |
Anum Afzal; Juraj Vladika; Gentrit Fazlija; Andrei Staradubets; Florian Matthes; | arxiv-cs.AI | 2024-11-13 |
270 | CamemBERT 2.0: A Smarter French Language Model Aged to Perfection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This issue emphasizes the need for updated models that reflect current linguistic trends. In this paper, we introduce two new versions of the CamemBERT base model-CamemBERTav2 and CamemBERTv2-designed to address these challenges. |
WISSAM ANTOUN et. al. | arxiv-cs.CL | 2024-11-13 |
271 | TRACE: Transformer-based Risk Assessment for Clinical Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present TRACE (Transformer-based Risk Assessment for Clinical Evaluation), a novel method for clinical risk assessment based on clinical data, leveraging the self-attention mechanism for enhanced feature interaction and result interpretation. |
Dionysis Christopoulos; Sotiris Spanos; Valsamis Ntouskos; Konstantinos Karantzalos; | arxiv-cs.CV | 2024-11-13 |
272 | Responsible AI in Construction Safety: Systematic Evaluation of Large Language Models and Prompt Engineering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using 385 questions spanning seven safety knowledge areas, the study analyzes the models’ accuracy, consistency, and reliability. |
Farouq Sammour; Jia Xu; Xi Wang; Mo Hu; Zhenyu Zhang; | arxiv-cs.AI | 2024-11-12 |
273 | Circuit Complexity Bounds for RoPE-based Transformer Architecture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we establish a circuit complexity bound for Transformers with RoPE attention. |
BO CHEN et. al. | arxiv-cs.LG | 2024-11-12 |
274 | Derivational Morphology Reveals Analogical Generalization in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a new method for investigating linguistic generalization in LLMs: focusing on GPT-J, we fit cognitive models that instantiate rule-based and analogical learning to the LLM training data and compare their predictions on a set of nonce adjectives with those of the LLM, allowing us to draw direct conclusions regarding underlying mechanisms. |
Valentin Hofmann; Leonie Weissweiler; David Mortensen; Hinrich Schütze; Janet Pierrehumbert; | arxiv-cs.CL | 2024-11-12 |
275 | Split and Merge: Aligning Position Biases in LLM-based Evaluators IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, LLM-based evaluators exhibit position bias, or inconsistency, when used to evaluate candidate answers in pairwise comparisons, favoring either the first or second answer regardless of content. To address this limitation, we propose PORTIA, an alignment-based system designed to mimic human comparison strategies to calibrate position bias in a lightweight yet effective manner. |
ZONGJIE LI et. al. | emnlp | 2024-11-11 |
276 | MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we show how to build small fact-checking models that have GPT-4-level performance but for 400x lower cost. |
Liyan Tang; Philippe Laban; Greg Durrett; | emnlp | 2024-11-11 |
277 | Generalizing Clinical De-identification Models By Privacy-safe Data Augmentation Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, labeling standards and the formats of patient records vary across different institutions. Our study addresses these issues by exploiting GPT-4 for data augmentation through one-shot and zero-shot prompts. |
Woojin Kim; Sungeun Hahm; Jaejin Lee; | emnlp | 2024-11-11 |
278 | MTLS: Making Texts Into Linguistic Symbols Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we shift the focus to the symbolic properties and introduce MTLS: a pre-training method to improve the multilingual capability of models by Making Texts into Linguistic Symbols. |
Wenlong Fei; Xiaohua Wang; Min Hu; Qingyu Zhang; Hongbo Li; | emnlp | 2024-11-11 |
279 | Fine-Grained Detection of Solidarity for Women and Migrants in 155 Years of German Parliamentary Debates Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we investigate the frequency of (anti-)solidarity towards women and migrants in German parliamentary debates between 1867 and 2022. |
AIDA KOSTIKOVA et. al. | emnlp | 2024-11-11 |
280 | White-Box Diffusion Transformer for Single-cell RNA-seq Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the process of data acquisition is often constrained by high cost and limited sample availability. To overcome these limitations, we propose a hybrid model based on Diffusion model and White-Box transformer that aims to generate synthetic and biologically plausible scRNA-seq data. |
Zhuorui Cui; Shengze Dong; Ding Liu; | arxiv-cs.LG | 2024-11-11 |
281 | Advancing Semantic Textual Similarity Modeling: A Regression Framework with Translated ReLU and Smooth K2 Loss Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite its efficiency, Sentence-BERT tackles STS tasks from a classification perspective, overlooking the progressive nature of semantic relationships, which results in suboptimal performance. To bridge this gap, this paper presents an innovative regression framework and proposes two simple yet effective loss functions: Translated ReLU and Smooth K2 Loss. |
Bowen Zhang; Chunping Li; | emnlp | 2024-11-11 |
282 | Universal Response and Emergence of Induction in LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By applying our method, we observe signatures of induction behavior within the residual stream of Gemma-2-2B, Llama-3.2-3B, and GPT-2-XL. Across all models, we find that these induction signatures gradually emerge within intermediate layers and identify the relevant model sections composing this behavior. |
Niclas Luick; | arxiv-cs.LG | 2024-11-11 |
283 | Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a large-scale study of linguistic bias exhibited by ChatGPT covering ten dialects of English (Standard American English, Standard British English, and eight widely spoken non-standard varieties from around the world). |
EVE FLEISIG et. al. | emnlp | 2024-11-11 |
284 | Can LLMs Replace Neil DeGrasse Tyson? Evaluating The Reliability of LLMs As Science Communicators Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on evaluating the reliability of current LLMs as science communicators. |
Prasoon Bajpai; Niladri Chatterjee; Subhabrata Dutta; Tanmoy Chakraborty; | emnlp | 2024-11-11 |
285 | A Unified Multi-Task Learning Architecture for Hate Detection Leveraging User-Based Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Most existing hate speech detection solutions have utilized the features by treating each post as an isolated input instance for the classification. This paper addresses this issue by introducing a unique model that improves hate speech identification for the English language by utilising intra-user and inter-user-based information. |
Prashant Kapil; Asif Ekbal; | arxiv-cs.CL | 2024-11-11 |
286 | Foundational Autoraters: Taming Large Language Models for Better Automatic Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As large language models (LLMs) evolve, evaluating their output reliably becomes increasingly difficult due to the high cost of human evaluation. To address this, we introduce FLAMe, a family of Foundational Large Autorater Models. |
TU VU et. al. | emnlp | 2024-11-11 |
287 | Reconstruct Your Previous Conversations! Comprehensively Investigating Privacy Leakage Risks in Conversations with GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a straightforward yet potent Conversation Reconstruction Attack. |
Junjie Chu; Zeyang Sha; Michael Backes; Yang Zhang; | emnlp | 2024-11-11 |
288 | BudgetMLAgent: A Cost-Effective LLM Multi-Agent System for Automating Machine Learning Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With the motivation of developing a cost-efficient LLM based solution for solving ML tasks, we propose an LLM Multi-Agent based system which leverages combination of experts using profiling, efficient retrieval of past observations, LLM cascades, and ask-the-expert calls. |
Shubham Gandhi; Manasi Patwardhan; Lovekesh Vig; Gautam Shroff; | arxiv-cs.MA | 2024-11-11 |
289 | Comparing A BERT Classifier and A GPT Classifier for Detecting Connective Language Across Multiple Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents an approach for detecting connective language-defined as language that facilitates engagement, understanding, and conversation-from social media discussions. |
Josephine Lukito; Bin Chen; Gina M. Masullo; Natalie Jomini Stroud; | emnlp | 2024-11-11 |
290 | SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: So in this work, we leverage 100B+ GPT variants to act as synthetic feedback experts offering expert-level edit feedback, that is used to reduce hallucinations and align weaker (<10B parameter) LLMs with medical facts using two distinct alignment algorithms (DPO & SALT), endeavoring to narrow the divide between AI-generated content and factual accuracy. |
PRAKAMYA MISHRA et. al. | emnlp | 2024-11-11 |
291 | DA3: A Distribution-Aware Adversarial Attack Against Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, they are easy to detect using straightforward detection methods, diminishing the efficacy of such attacks. To address this issue, we propose a Distribution-Aware Adversarial Attack (DA3) method. |
Yibo Wang; Xiangjue Dong; James Caverlee; Philip S. Yu; | emnlp | 2024-11-11 |
292 | Annotation Alignment: Comparing LLM and Human Annotations of Conversational Safety Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that larger datasets are needed to resolve whether GPT-4 exhibits disparities in how well it correlates with different demographic groups. |
Rajiv Movva; Pang Wei Koh; Emma Pierson; | emnlp | 2024-11-11 |
293 | On The Reliability of Psychological Scales on Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study aims to determine the reliability of applying personality assessments to LLMs, explicitly investigating whether LLMs demonstrate consistent personality traits. |
JEN-TSE HUANG et. al. | emnlp | 2024-11-11 |
294 | TreeCoders: Trees of Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce TreeCoders, a novel family of transformer trees. |
Pierre Colonna D’Istria; Abdulrahman Altahhan; | arxiv-cs.CL | 2024-11-11 |
295 | TempCharBERT: Keystroke Dynamics for Continuous Access Control Based on Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A particular interest lies on keystroke dynamics (KD), which refers to the task of recognizing individuals’ identity based on their unique typing style. In this work, we propose the use of pre-trained language models (PLMs) to recognize such patterns. |
Matheus Simão; Fabiano Prado; Omar Abdul Wahab; Anderson Avila; | arxiv-cs.CR | 2024-11-11 |
296 | MaLei at The PLABA Track of TAC-2024: RoBERTa for Task 1 — LLaMA3.1 and GPT-4o for Task 2 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In task one, we applied fine-tuned RoBERTa-Base models to identify and classify the difficult terms, jargon and acronyms in the biomedical abstracts and reported the F1 score. |
Zhidong Ling; Zihao Li; Pablo Romero; Lifeng Han; Goran Nenadic; | arxiv-cs.CL | 2024-11-11 |
297 | GPT Vs RETRO: Exploring The Intersection of Retrieval and Parameter-Efficient Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we apply PEFT methods (P-tuning, Adapters, and LoRA) to a modified Retrieval-Enhanced Transformer (RETRO) and a baseline GPT model across several sizes, ranging from 823 million to 48 billion parameters. |
Aleksander Ficek; Jiaqi Zeng; Oleksii Kuchaiev; | emnlp | 2024-11-11 |
298 | On Training Data Influence of GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents GPTfluence, a novel approach that leverages a featurized simulation to assess the impact of training examples on the training dynamics of GPT models. |
YEKUN CHAI et. al. | emnlp | 2024-11-11 |
299 | Leveraging Pre-trained Language Models for Linguistic Analysis: A Case of Argument Structure Constructions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates the effectiveness of pre-trained language models in identifying argument structure constructions, important for modeling both first and second language learning. |
Hakyung Sung; Kristopher Kyle; | emnlp | 2024-11-11 |
300 | Unraveling The Gradient Descent Dynamics of Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While the Transformer architecture has achieved remarkable success across various domains, a thorough theoretical foundation explaining its optimization dynamics is yet to be fully developed. In this study, we aim to bridge this understanding gap by answering the following two core questions: (1) Which types of Transformer architectures allow Gradient Descent (GD) to achieve guaranteed convergence? |
Bingqing Song; Boran Han; Shuai Zhang; Jie Ding; Mingyi Hong; | arxiv-cs.LG | 2024-11-11 |
301 | DAMRO: Dive Into The Attention Mechanism of LVLM to Reduce Object Hallucination Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the issue, we propose DAMRO, a novel training-free strategy that **D**ive into **A**ttention **M**echanism of LVLM to **R**educe **O**bject Hallucination. |
Xuan Gong; Tianshi Ming; Xinpeng Wang; Zhihua Wei; | emnlp | 2024-11-11 |
302 | ConvMixFormer- A Resource-efficient Convolution Mixer for Transformer-based Dynamic Hand Gesture Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Transformer models have demonstrated remarkable success in many domains such as natural language processing (NLP) and computer vision. |
Mallika Garg; Debashis Ghosh; Pyari Mohan Pradhan; | arxiv-cs.CV | 2024-11-11 |
303 | Knowledge Graph Enhanced Large Language Model Editing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing editing methods struggle to track and incorporate changes in knowledge associated with edits, which limits the generalization ability of post-edit LLMs in processing edited knowledge. To tackle these problems, we propose a novel model editing method that leverages knowledge graphs for enhancing LLM editing, namely GLAME. |
MENGQI ZHANG et. al. | emnlp | 2024-11-11 |
304 | Still Not Quite There! Evaluating Large Language Models for Comorbid Mental Health Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce ANGST, a novel, first of its kind benchmark for depression-anxiety comorbidity classification from social media posts. |
AMEY HENGLE et. al. | emnlp | 2024-11-11 |
305 | Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We extend this research by analyzing and comparing circuits for similar sequence continuation tasks, which include increasing sequences of Arabic numerals, number words, and months. |
Michael Lan; Philip Torr; Fazl Barez; | emnlp | 2024-11-11 |
306 | Will LLMs Replace The Encoder-Only Models in Temporal Relation Classification? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate LLMs’ performance and decision process in the Temporal Relation Classification task. |
Gabriel Roccabruna; Massimo Rizzoli; Giuseppe Riccardi; | emnlp | 2024-11-11 |
307 | Surveying The Dead Minds: Historical-Psychological Text Analysis with Contextualized Construct Representation (CCR) for Classical Chinese Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we develop a pipeline for historical-psychological text analysis in classical Chinese. |
Yuqi Chen; Sixuan Li; Ying Li; Mohammad Atari; | emnlp | 2024-11-11 |
308 | SparseGrad: A Selective Method for Efficient Fine-tuning of MLP Layers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a new selective PEFT method, namely SparseGrad, that performs well on MLP blocks. |
VIKTORIIA A. CHEKALINA et. al. | emnlp | 2024-11-11 |
309 | BiasWipe: Mitigating Unintended Bias in Text Classifiers Through Model Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a robust and generalizable technique BiasWipe to mitigate unintended bias in language models. |
Mamta Mamta; Rishikant Chigrupaatii; Asif Ekbal; | emnlp | 2024-11-11 |
310 | FOOL ME IF YOU CAN! An Adversarial Dataset to Investigate The Robustness of LMs in Word Sense Disambiguation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these models still struggle with recognizing semantic boundaries and often misclassify homonyms in adversarial context. Therefore, we propose FOOL: FOur-fold Obscure Lexical, a new coarse-grained WSD dataset, which includes four different test sets designed to assess the robustness of language models in WSD tasks. |
MOHAMAD BALLOUT et. al. | emnlp | 2024-11-11 |
311 | Evaluating ChatGPT-3.5 Efficiency in Solving Coding Problems of Different Complexity Levels: An Empirical Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We assess the performance of ChatGPT’s GPT-3.5-turbo model on LeetCode, a popular platform with algorithmic coding challenges for technical interview practice, across three difficulty levels: easy, medium, and hard. |
Minda Li; Bhaskar Krishnamachari; | arxiv-cs.SE | 2024-11-11 |
312 | High-Fidelity Cellular Network Control-Plane Traffic Generation Without Domain Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the feasibility of developing a high-fidelity MCN control plane traffic generator by leveraging generative ML models. |
Z. Jonny Kong; Nathan Hu; Y. Charlie Hu; Jiayi Meng; Yaron Koral; | arxiv-cs.NI | 2024-11-11 |
313 | Using Language Models to Disambiguate Lexical Choices in Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We work with native speakers of nine languages to create DTAiLS, a dataset of 1,377 sentence pairs that exhibit cross-lingual concept variation when translating from English. |
Josh Barua; Sanjay Subramanian; Kayo Yin; Alane Suhr; | emnlp | 2024-11-11 |
314 | Pron Vs Prompt: Can Large Language Models Already Challenge A World-Class Fiction Author at Creative Text Writing? Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Are LLMs ready to compete in creative writing skills with a top (rather than average) novelist? To provide an initial answer for this question, we have carried out a contest … |
Guillermo Marco; Julio Gonzalo; M. Teresa Mateo-Girona; Ramón Del Castillo Santos; | emnlp | 2024-11-11 |
315 | Is Child-Directed Speech Effective Training Data for Language Models? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: What are the features of the data they receive, and how do these features support language modeling objectives? To investigate this question, we train GPT-2 and RoBERTa models on 29M words of English child-directed speech and a new matched, synthetic dataset (TinyDialogues), comparing to OpenSubtitles, Wikipedia, and a heterogeneous blend of datasets from the BabyLM challenge. |
Steven Y. Feng; Noah Goodman; Michael Frank; | emnlp | 2024-11-11 |
316 | Subword Segmentation in LLMs: Looking at Inflection and Consistency Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study two criteria: (i) adherence to morpheme boundaries and (ii) the segmentation consistency of the different inflected forms of a lemma. |
Marion Di Marco; Alexander Fraser; | emnlp | 2024-11-11 |
317 | GPT-4 Jailbreaks Itself with Near-Perfect Success Using Self-Explanation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Iterative Refinement Induced Self-Jailbreak (IRIS), a novel approach that leverages the reflective capabilities of LLMs for jailbreaking with only black-box access. |
Govind Ramesh; Yao Dou; Wei Xu; | emnlp | 2024-11-11 |
318 | Evaluating Psychological Safety of Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we designed unbiased prompts to systematically evaluate the psychological safety of large language models (LLMs). |
Xingxuan Li; Yutong Li; Lin Qiu; Shafiq Joty; Lidong Bing; | emnlp | 2024-11-11 |
319 | Ambient AI Scribing Support: Comparing The Performance of Specialized AI Agentic Architecture to Leading Foundational Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares Sporo Health’s AI Scribe, a proprietary model fine-tuned for medical scribing, with various LLMs (GPT-4o, GPT-3.5, Gemma-9B, and Llama-3.2-3B) in clinical documentation. |
Chanseo Lee; Sonu Kumar; Kimon A. Vogt; Sam Meraj; | arxiv-cs.AI | 2024-11-10 |
320 | LProtector: An LLM-driven Vulnerability Detection System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents LProtector, an automated vulnerability detection system for C/C++ codebases driven by the large language model (LLM) GPT-4o and Retrieval-Augmented Generation (RAG). |
ZE SHENG et. al. | arxiv-cs.CR | 2024-11-10 |
321 | Prompt-Efficient Fine-Tuning for GPT-like Deep Models to Reduce Hallucination and to Improve Reproducibility in Scientific Text Generation Using Stochastic Optimisation Techniques Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This thesis introduces a Parameter-Efficient Fine-Tuning (PEFT) approach tailored for GPT-like models, aiming to mitigate hallucinations and enhance reproducibility, particularly in the computational domain of mass spectrometry. |
Daniil Sulimov; | arxiv-cs.CL | 2024-11-10 |
322 | Golden Touchstone: A Comprehensive Bilingual Benchmark for Evaluating Financial Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing finance benchmarks often suffer from limited language and task coverage, as well as challenges such as low-quality datasets and inadequate adaptability for LLM evaluation. To address these limitations, we propose Golden Touchstone, the first comprehensive bilingual benchmark for financial LLMs, which incorporates representative datasets from both Chinese and English across eight core financial NLP tasks. |
XIAOJUN WU et. al. | arxiv-cs.CL | 2024-11-09 |
323 | AI’s Spatial Intelligence: Evaluating AI’s Understanding of Spatial Transformations in PSVT:R and Augmented Reality Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent studies show Artificial Intelligence (AI) with language and vision capabilities still face limitations in spatial reasoning. In this paper, we have studied generative AI’s spatial capabilities of understanding rotations of objects utilizing its image and language processing features. |
Uttamasha Monjoree; Wei Yan; | arxiv-cs.AI | 2024-11-09 |
324 | GPT Semantic Cache: Reducing LLM Costs and Latency Via Semantic Embedding Caching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce GPT Semantic Cache, a method that leverages semantic caching of query embeddings in in-memory storage (Redis). |
Sajal Regmi; Chetan Phakami Pun; | arxiv-cs.LG | 2024-11-07 |
325 | FineTuneBench: How Well Do Commercial Fine-tuning APIs Infuse Knowledge Into LLMs? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce FineTuneBench, an evaluation framework and dataset for understanding how well commercial fine-tuning APIs can successfully learn new and updated knowledge. |
Eric Wu; Kevin Wu; James Zou; | arxiv-cs.CL | 2024-11-07 |
326 | High Entropy Alloy Property Predictions Using Transformer-based Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a language transformer-based machine learning model to predict key mechanical properties of high-entropy alloys (HEAs), addressing the challenges due to their complex, multi-principal element compositions and limited experimental data. |
Spyros Kamnis; Konstantinos Delibasis; | arxiv-cs.CE | 2024-11-07 |
327 | Lightning IR: Straightforward Fine-tuning and Inference of Transformer-based Language Models for Information Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Lightning IR, an easy-to-use PyTorch Lightning-based framework for applying transformer-based language models in retrieval scenarios. |
Ferdinand Schlatt; Maik Fröbe; Matthias Hagen; | arxiv-cs.IR | 2024-11-07 |
328 | Evaluating GPT-4 at Grading Handwritten Solutions in Math Exams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we leverage state-of-the-art multi-modal AI models, in particular GPT-4o, to automatically grade handwritten responses to college-level math exams. |
Adriana Caraeni; Alexander Scarlatos; Andrew Lan; | arxiv-cs.CY | 2024-11-07 |
329 | A Comparative Study of Recent Large Language Models on Generating Hospital Discharge Summaries for Lung Cancer Patients Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research aims to explore how LLMs can alleviate the burden of manual summarization, streamline workflow efficiencies, and support informed decision-making in healthcare settings. |
YIMING LI et. al. | arxiv-cs.CL | 2024-11-06 |
330 | On-Device Emoji Classifier Trained with GPT-based Data Augmentation for A Mobile Keyboard Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes an on-device emoji classifier based on MobileBert with reasonable memory and latency requirements for SwiftKey. |
Hossam Amer; Joe Osborne; Michael Zaki; Mohamed Afify; | arxiv-cs.CL | 2024-11-06 |
331 | Understanding The Effects of Human-written Paraphrases in LLM-generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we devise a new data collection strategy to collect Human & LLM Paraphrase Collection (HLPC), a first-of-its-kind dataset that incorporates human-written texts and paraphrases, as well as LLM-generated texts and paraphrases. |
Hiu Ting Lau; Arkaitz Zubiaga; | arxiv-cs.CL | 2024-11-06 |
332 | Towards Scalable Automated Grading: Leveraging Large Language Models for Conceptual Question Evaluation in Engineering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the feasibility of using large language models (LLMs), specifically GPT-4o (ChatGPT), for automated grading of conceptual questions in an undergraduate Mechanical Engineering course. |
RUJUN GAO et. al. | arxiv-cs.CY | 2024-11-05 |
333 | Automatic Generation of Question Hints for Mathematics Problems Using Large Language Models in Educational Technology Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present here the study of several dimensions: 1) identifying error patterns made by simulated students on secondary-level math exercises; 2) developing various prompts for GPT-4o as a teacher and evaluating their effectiveness in generating hints that enable simulated students to self-correct; and 3) testing the best-performing prompts, based on their ability to produce relevant hints and facilitate error correction, with Llama-3-8B-Instruct as the teacher, allowing for a performance comparison with GPT-4o. |
Junior Cedric Tonga; Benjamin Clement; Pierre-Yves Oudeyer; | arxiv-cs.CL | 2024-11-05 |
334 | Enhancing Transformer Training Efficiency with Dynamic Dropout Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Dynamic Dropout, a novel regularization technique designed to enhance the training efficiency of Transformer models by dynamically adjusting the dropout rate based on training epochs or validation loss improvements. |
Hanrui Yan; Dan Shao; | arxiv-cs.LG | 2024-11-05 |
335 | From Medprompt to O1: Exploration of Run-Time Strategies for Medical Challenge Problems and Beyond Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Following on the Medprompt study with GPT-4, we systematically evaluate the o1-preview model across various medical benchmarks. |
HARSHA NORI et. al. | arxiv-cs.CL | 2024-11-05 |
336 | Seq-VCR: Preventing Collapse in Intermediate Transformer Representations for Enhanced Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we identify representation collapse in the model’s intermediate layers as a key factor limiting their reasoning capabilities. |
MD RIFAT AREFIN et. al. | arxiv-cs.LG | 2024-11-04 |
337 | Wave Network: An Ultra-Small Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an innovative token representation and update method in a new ultra-small language model: the Wave network. |
Xin Zhang; Victor S. Sheng; | arxiv-cs.CL | 2024-11-04 |
338 | Ask, and It Shall Be Given: Turing Completeness of Prompting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Since the success of GPT, large language models (LLMs) have been revolutionizing machine learning and have initiated the so-called LLM prompting paradigm. In the era of LLMs, … |
Ruizhong Qiu; Zhe Xu; Wenxuan Bao; Hanghang Tong; | ArXiv | 2024-11-04 |
339 | Ask, and It Shall Be Given: On The Turing Completeness of Prompting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we present the first theoretical study on the LLM prompting paradigm to the best of our knowledge. In this work, we show that prompting is in fact Turing-complete: there exists a finite-size Transformer such that for any computable function, there exists a corresponding prompt following which the Transformer computes the function. |
Ruizhong Qiu; Zhe Xu; Wenxuan Bao; Hanghang Tong; | arxiv-cs.LG | 2024-11-04 |
340 | Advancements and Limitations of LLMs in Replicating Human Color-word Associations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We compared multiple generations of LLMs (from GPT-3 to GPT-4o) against human color-word associations using data collected from over 10,000 Japanese participants, involving 17 colors and words from eight categories in Japanese. |
Makoto Fukushima; Shusuke Eshita; Hiroshige Fukuhara; | arxiv-cs.CL | 2024-11-04 |
341 | Enriching Tabular Data with Contextual LLM Embeddings: A Comprehensive Ablation Study for Ensemble Classifiers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Leveraging advancements in natural language processing, this study presents a systematic approach to enrich tabular datasets with features derived from large language model embeddings. |
Gjergji Kasneci; Enkelejda Kasneci; | arxiv-cs.LG | 2024-11-03 |
342 | Can Large Language Model Predict Employee Attrition? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Machine learning (ML) advancements offer more scalable and accurate solutions, but large language models (LLMs) introduce new potential in human resource management by interpreting nuanced employee communication and detecting subtle turnover cues. |
Xiaoye Ma; Weiheng Liu; Changyi Zhao; Liliya R. Tukhvatulina; | arxiv-cs.LG | 2024-11-02 |
343 | Transformer-CNN for Small Image Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yan-Lin Chen; Chun-Liang Lin; Yu-Chen Lin; Tzu-Chun Chen; | Signal Process. Image Commun. | 2024-11-01 |
344 | Infant Agent: A Tool-Integrated, Logic-Driven Agent with Cost-Effective API Usage Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: II: They remain challenged in reasoning through complex logic problems. To address these challenges, we developed the Infant Agent, integrating task-aware functions, operators, a hierarchical management system, and a memory retrieval mechanism. |
BIN LEI et al. | arxiv-cs.AI | 2024-11-01 |
345 | LLMs: A Game-Changer for Software Engineers? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through a critical analysis of technical strengths, limitations, real-world case studies, and future research directions, this paper argues that LLMs are not just reshaping how software is developed but are redefining the role of developers. |
Md Asraful Haque; | arxiv-cs.SE | 2024-11-01 |
346 | Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Consequently, we introduce the Lingma SWE-GPT series, comprising Lingma SWE-GPT 7B and 72B. |
YINGWEI MA et al. | arxiv-cs.SE | 2024-11-01 |
347 | GameGen-X: Interactive Open-world Game Video Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce GameGen-X, the first diffusion transformer model specifically designed for both generating and interactively controlling open-world game videos. |
Haoxuan Che; Xuanhua He; Quande Liu; Cheng Jin; Hao Chen; | arxiv-cs.CV | 2024-11-01 |
348 | A Lightweight CNN-Transformer Network for Pixel-based Crop Mapping Using Time-series Sentinel-2 Imagery Related Papers Related Patents Related Grants Related Venues Related Experts View |
YUMIAO WANG et al. | Comput. Electron. Agric. | 2024-11-01 |
349 | Online Semi-Supervised Transformer for Resilient Vehicle GNSS/INS Navigation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The Inertial Navigation Systems (INS) and Global Navigation Satellite Systems (GNSS) integrated navigation system is widely employed for vehicular positioning. However, obstacles … |
HAOWEN WANG et al. | IEEE Transactions on Vehicular Technology | 2024-11-01 |
350 | GPT for Games: An Updated Scoping Review (2020-2024) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This review aims to illustrate the state of the art in innovative GPT applications in games, offering a foundation to enrich game development and enhance player experiences through cutting-edge AI innovations. |
Daijin Yang; Erica Kleinman; Casper Harteveld; | arxiv-cs.AI | 2024-10-31 |
351 | Exploring Vision Language Models for Facial Attribute Recognition: Emotion, Race, Gender, and Age Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to utilize vision language models (VLMs) such as generative pre-trained transformer (GPT), GEMINI, large language and vision assistant (LLAVA), PaliGemma, and Microsoft Florence2 to recognize facial attributes such as race, gender, age, and emotion from images with human faces. |
Nouar AlDahoul; Myles Joshua Toledo Tan; Harishwar Reddy Kasireddy; Yasir Zaki; | arxiv-cs.CV | 2024-10-31 |
352 | GPT or BERT: Why Not Both? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a simple way to merge masked language modeling with causal language modeling. |
Lucas Georges Gabriel Charpentier; David Samuel; | arxiv-cs.CL | 2024-10-31 |
353 | IO Transformer: Evaluating SwinV2-Based Reward Models for Computer Vision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper examines SwinV2-based reward models, called the Input-Output Transformer (IO Transformer) and the Output Transformer. |
Maxwell Meyer; Jack Spruyt; | arxiv-cs.CV | 2024-10-31 |
354 | Handwriting Recognition in Historical Documents with Multimodal LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, I evaluate the accuracy of handwritten document transcriptions generated by Gemini against the current state of the art Transformer based methods. |
Lucian Li; | arxiv-cs.CV | 2024-10-31 |
355 | Desert Camels and Oil Sheikhs: Arab-Centric Red Teaming of Frontier LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) are widely used but raise ethical concerns due to embedded social biases. |
MUHAMMED SAEED et al. | arxiv-cs.CL | 2024-10-31 |
356 | Aerial Flood Scene Classification Using Fine-Tuned Attention-based Architecture for Flood-Prone Countries in South Asia Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For the classification, we propose a fine-tuned Compact Convolutional Transformer (CCT)-based approach, along with several other cutting-edge transformer-based and Convolutional Neural Network (CNN)-based architectures. |
IBNE HASSAN et al. | arxiv-cs.CV | 2024-10-31 |
357 | EDT: An Efficient Diffusion Transformer Framework Inspired By Human-like Sketching Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To reduce the computation budget of transformer-based DPMs, this work proposes the Efficient Diffusion Transformer (EDT) framework. |
Xinwang Chen; Ning Liu; Yichen Zhu; Feifei Feng; Jian Tang; | arxiv-cs.CV | 2024-10-31 |
358 | An Empirical Analysis of GPT-4V’s Performance on Fashion Aesthetic Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fashion aesthetic evaluation is the task of estimating how well the outfits worn by individuals in images suit them. In this work, we examine the zero-shot performance of GPT-4V on this task for the first time. |
YUKI HIRAKAWA et al. | arxiv-cs.CV | 2024-10-31 |
359 | LoFLAT: Local Feature Matching Using Focused Linear Attention Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to enhance representations of attention mechanisms while preserving low computational complexity, we propose the LoFLAT, a novel Local Feature matching using Focused Linear Attention Transformer in this paper. |
Naijian Cao; Renjie He; Yuchao Dai; Mingyi He; | arxiv-cs.CV | 2024-10-30 |
360 | A Comprehensive Study on Quantization Techniques for Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) have been extensively researched and used in both academia and industry since the rise in popularity of the Transformer model, which demonstrates … |
Jiedong Lang; Zhehao Guo; Shuyu Huang; | ArXiv | 2024-10-30 |
361 | ETO:Efficient Transformer-based Local Feature Matching By Organizing Multiple Homography Hypotheses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an efficient transformer-based network architecture for local feature matching. |
JUNJIE NI et al. | arxiv-cs.CV | 2024-10-30 |
362 | Automated Personnel Selection for Software Engineers Using LLM-Based Profile Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents a fresh dataset and technique as well as shows how transformer models could improve recruiting procedures. |
Ahmed Akib Jawad Karim; Shahria Hoque; Md. Golam Rabiul Alam; Md. Zia Uddin; | arxiv-cs.SE | 2024-10-30 |
363 | ProTransformer: Robustify Transformers Via Plug-and-Play Paradigm Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a novel robust attention mechanism designed to enhance the resilience of transformer-based architectures. |
Zhichao Hou; Weizhi Gao; Yuchen Shen; Feiyi Wang; Xiaorui Liu; | arxiv-cs.LG | 2024-10-30 |
364 | EvoCodeBench: An Evolving Code Generation Benchmark with Domain-Specific Evaluations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The former hurts the fairness of benchmarks, and the latter hinders practitioners from selecting superior LLMs for specific programming domains. To address these two limitations, we propose a new benchmark – EvoCodeBench, which has the following advances: (1) Evolving data. |
JIA LI et al. | arxiv-cs.CL | 2024-10-30 |
365 | GPT-4o Reads The Mind in The Eyes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using two versions of a widely used theory of mind test, the Reading the Mind in Eyes Test and the Multiracial Reading the Mind in the Eyes Test, we found that GPT-4o outperformed humans in interpreting mental states from upright faces but underperformed humans when faces were inverted. |
JAMES W. A. STRACHAN et al. | arxiv-cs.HC | 2024-10-29 |
366 | Attention Speaks Volumes: Localizing and Mitigating Bias in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the internal mechanisms of how bias emerges in large language models (LLMs) when provided with ambiguous comparative prompts: inputs that compare or enforce choosing between two or more entities without providing clear context for preference. |
Rishabh Adiga; Besmira Nushi; Varun Chandrasekaran; | arxiv-cs.CL | 2024-10-29 |
367 | AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recent work, AmpleGCG (Liao et al., 2024), demonstrates that a generative model can quickly produce numerous customizable gibberish adversarial suffixes for any harmful query, exposing a range of alignment gaps in out-of-distribution (OOD) language spaces. To bring more attention to this area, we introduce AmpleGCG-Plus, an enhanced version that achieves better performance in fewer attempts. |
Vishal Kumar; Zeyi Liao; Jaylen Jones; Huan Sun; | arxiv-cs.CL | 2024-10-29 |
368 | Is GPT-4 Less Politically Biased Than GPT-3.5? A Renewed Investigation of ChatGPT’s Political Biases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work investigates the political biases and personality traits of ChatGPT, specifically comparing GPT-3.5 to GPT-4. |
Erik Weber; Jérôme Rutinowski; Niklas Jost; Markus Pauly; | arxiv-cs.CL | 2024-10-28 |
369 | Sequential Choice in Ordered Bundles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate several predictive models, including two custom Transformers using decoder-only and encoder-decoder architectures, fine-tuned GPT-3, a custom LSTM model, a reinforcement learning model, two Markov models, and a zero-order model. |
Rajeev Kohli; Kriste Krstovski; Hengyu Kuang; Hengxu Lin; | arxiv-cs.LG | 2024-10-28 |
370 | A Simple Yet Effective Corpus Construction Framework for Indonesian Grammatical Error Correction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: How to efficiently construct high-quality evaluation corpora for GEC in low-resource languages has become a significant challenge. To fill this gap, in this paper, we present a framework for constructing GEC corpora. |
NANKAI LIN et al. | arxiv-cs.CL | 2024-10-28 |
371 | SepMamba: State-space Models for Speaker Separation Using Mamba Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose SepMamba, a U-Net-based architecture composed primarily of bidirectional Mamba layers. |
THOR HØJHUS AVENSTRUP et al. | arxiv-cs.SD | 2024-10-28 |
372 | KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present KD-LoRA, a novel fine-tuning method that combines LoRA with KD. |
Rambod Azimi; Rishav Rishav; Marek Teichmann; Samira Ebrahimi Kahou; | arxiv-cs.CL | 2024-10-28 |
373 | Deep Learning for Medical Text Processing: BERT Model Fine-Tuning and Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a medical literature summary generation method based on the BERT model to address the challenges brought by the current explosion of medical information. |
JIACHENG HU et al. | arxiv-cs.CL | 2024-10-28 |
374 | MultiTok: Variable-Length Tokenization for Efficient LLMs Adapted from LZW Compression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, current methodologies to train such LLMs require extensive resources including but not limited to large amounts of data, expensive machinery, and lengthy training. To solve this problem, this paper proposes a new tokenization method inspired by universal Lempel-Ziv-Welch data compression that compresses repetitive phrases into multi-word tokens. |
Noel Elias; Homa Esfahanizadeh; Kaan Kale; Sriram Vishwanath; Muriel Medard; | arxiv-cs.CL | 2024-10-28 |
375 | UOttawa at LegalLens-2024: Transformer-based Classification Experiments Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents the methods used for LegalLens-2024 shared task, which focused on detecting legal violations within unstructured textual data and associating these violations with potentially affected individuals. |
Nima Meghdadi; Diana Inkpen; | arxiv-cs.CL | 2024-10-28 |
376 | Gender Bias in LLM-generated Interview Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our findings reveal that gender bias is consistent, and closely aligned with gender stereotypes and the dominance of jobs. Overall, this study contributes to the systematic examination of gender bias in LLM-generated interview responses, highlighting the need for a mindful approach to mitigate such biases in related applications. |
Haein Kong; Yongsu Ahn; Sangyub Lee; Yunho Maeng; | arxiv-cs.CL | 2024-10-28 |
377 | Exploring The Potential of Large Language Models for Red Teaming in Military Coalition Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper reports on an ongoing investigation comparing the performance of large language models (LLMs) in generating penetration test scripts for realistic red agents. The goal … |
ERIK ADLER et al. | MILCOM 2024 – 2024 IEEE Military Communications Conference … | 2024-10-28 |
378 | Fine-tuned Large Language Models (LLMs): Improved Prompt Injection Attacks Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This project explores the security vulnerabilities in relation to prompt injection attacks. |
Md Abdur Rahman; Fan Wu; Alfredo Cuzzocrea; Sheikh Iqbal Ahamed; | arxiv-cs.CL | 2024-10-27 |
379 | SeisGPT: A Physics-Informed Data-Driven Large Model for Real-Time Seismic Response Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional methods, which rely on complex finite element models, often struggle with balancing computational efficiency and accuracy. To address this challenge, we introduce SeisGPT, a data-driven, large physics-informed model that leverages deep neural networks based on the Generative Pre-trained Transformer (GPT) architecture. |
SHIQIAO MENG et al. | arxiv-cs.CE | 2024-10-26 |
380 | Notes on The Mathematical Structure of GPT LLM Architectures Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: An exposition of the mathematics underpinning the neural network architecture of a GPT-3-style LLM. … |
Spencer Becker-Kahn; | arxiv-cs.LG | 2024-10-25 |
381 | GPT-4o System Card IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: GPT-4o is an autoregressive omni model that accepts as input any combination of text, audio, image, and video, and generates any combination of text, audio, and image outputs. … |
OPENAI AARON HURST et al. | ArXiv | 2024-10-25 |
382 | No Argument Left Behind: Overlapping Chunks for Faster Processing of Arbitrarily Long Legal Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce uBERT, a hybrid model that combines Transformer and Recurrent Neural Network architectures to effectively handle long legal texts. |
ISRAEL FAMA et al. | arxiv-cs.CL | 2024-10-24 |
383 | Probing Ranking LLMs: Mechanistic Interpretability in Information Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we delve into the mechanistic workings of state-of-the-art, fine-tuning-based passage-reranking transformer networks. |
Tanya Chowdhury; James Allan; | arxiv-cs.IR | 2024-10-24 |
384 | Integrating Large Language Models with Internet of Things Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper identifies and analyzes applications in which Large Language Models (LLMs) can make Internet of Things (IoT) networks more intelligent and responsive through three case studies from critical topics: DDoS attack detection, macroprogramming over IoT systems, and sensor data processing. |
Mingyu Zong; Arvin Hekmati; Michael Guastalla; Yiyi Li; Bhaskar Krishnamachari; | arxiv-cs.AI | 2024-10-24 |
385 | GPT-Signal: Generative AI for Semi-automated Feature Engineering in The Alpha Research Process Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With the recent development of Generative Artificial Intelligence (Gen AI) and Large Language Models (LLMs), we present a novel way of leveraging GPT-4 to generate new return-predictive formulaic alphas, making alpha mining a semi-automated process, and saving time and energy for investors and traders. |
Yining Wang; Jinman Zhao; Yuri Lawryshyn; | arxiv-cs.CE | 2024-10-24 |
386 | Lightweight Neural App Control Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel mobile phone control architecture, Lightweight Multi-modal App Control (LiMAC), for efficient interactions and control across various Android apps. |
FILIPPOS CHRISTIANOS et al. | arxiv-cs.AI | 2024-10-23 |
387 | OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a novel End-to-End GPT-based model OmniFlatten for full-duplex conversation, capable of effectively modeling the complex behaviors inherent to natural conversations with low latency. |
QINGLIN ZHANG et al. | arxiv-cs.CL | 2024-10-23 |
388 | Striking A New Chord: Neural Networks in Music Information Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, we compare LSTM, Transformer, and GPT models against a widely-used markov model to predict a chord event following a sequence of chords. |
Farshad Jafari; Claire Arthur; | arxiv-cs.IT | 2024-10-23 |
389 | An Eye for An AI: Evaluating GPT-4o’s Visual Perception Skills and Geometric Reasoning Skills Using Computer Graphics Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find that although GPT-4o exhibits great potential in solving questions with visual information independently, major limitations remain in the accuracy and quality of the generated results. We propose several novel approaches for CG educators to incorporate GenAI into CG teaching despite these limitations. |
Tony Haoran Feng; Paul Denny; Burkhard C. Wünsche; Andrew Luxton-Reilly; Jacqueline Whalley; | arxiv-cs.AI | 2024-10-22 |
390 | Interpreting Affine Recurrence Learning in GPT-style Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In-context learning allows transformers to generalize during inference without modifying their weights, yet the precise operations driving this capability remain largely opaque. This paper presents an investigation into the mechanistic interpretability of these transformers, focusing specifically on their ability to learn and predict affine recurrences as an ICL task. |
Samarth Bhargav; Alexander Gu; | arxiv-cs.LG | 2024-10-22 |
391 | In Context Learning and Reasoning for Symbolic Regression with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we explore the potential of LLMs to perform symbolic regression — a machine-learning method for finding simple and accurate equations from datasets. |
Samiha Sharlin; Tyler R. Josephson; | arxiv-cs.CL | 2024-10-22 |
392 | GeoCode-GPT: A Large Language Model for Geospatial Code Generation Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although large language models (LLMs) have demonstrated potential in code generation tasks, they often encounter issues such as refusal to code or hallucination in geospatial code generation due to a lack of domain-specific knowledge and code corpora. To address these challenges, this paper presents and open-sources the GeoCode-PT and GeoCode-SFT corpora, along with the GeoCode-Eval evaluation dataset. |
SHUYANG HOU et al. | arxiv-cs.SE | 2024-10-22 |
393 | Graph Transformers Dream of Electric Flow Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The input to the Transformer is simply the graph incidence matrix; no other explicit positional encoding information is provided. We present explicit weight configurations for implementing each such graph algorithm, and we bound the errors of the constructed Transformers by the errors of the underlying algorithms. |
Xiang Cheng; Lawrence Carin; Suvrit Sra; | arxiv-cs.LG | 2024-10-22 |
394 | Learning to Differentiate Pairwise-Argument Representations for Implicit Discourse Relation Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enable encoders to produce clearly distinguishable representations, we propose a joint learning framework. |
ZHIPANG WANG et al. | cikm | 2024-10-21 |
395 | Exploring Pretraining Via Active Forgetting for Improving Cross Lingual Transfer for Decoder Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a pretraining strategy that uses active forgetting to achieve similar cross lingual transfer in decoder-only LLMs. |
Divyanshu Aggarwal; Ashutosh Sathe; Sunayana Sitaram; | arxiv-cs.CL | 2024-10-21 |
396 | Comparative Study of Multilingual Idioms and Similes in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study addresses the gap in the literature concerning the comparative performance of LLMs in interpreting different types of figurative language across multiple languages. |
PARIA KHOSHTAB et al. | arxiv-cs.CL | 2024-10-21 |
397 | BART-based Hierarchical Attentional Network for Sentence Ordering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel BART-based Hierarchical Attentional Ordering Network (BHAONet), aiming to address the coherence modeling challenge within paragraphs, which stands as a cornerstone in comprehension, generation, and reasoning tasks. |
Yiping Yang; Baiyun Cui; Yingming Li; | cikm | 2024-10-21 |
398 | Inferring Visualization Intent from Conversation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We consider a conversational approach to visualization, where users specify their needs at each step in natural language, with a visualization being returned in turn. |
Haotian Li; Nithin Chalapathi; Huamin Qu; Alvin Cheung; Aditya G. Parameswaran; | cikm | 2024-10-21 |
399 | Large Language Models in Computer Science Education: A Systematic Literature Review Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Large language models (LLMs) are becoming increasingly better at a wide range of Natural Language Processing tasks (NLP), such as text generation and understanding. Recently, … |
Nishat Raihan; Mohammed Latif Siddiq; Joanna C. S. Santos; Marcos Zampieri; | ArXiv | 2024-10-21 |
400 | Improving Neuron-level Interpretability with White-box Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our study, we introduce a white-box transformer-like architecture named Coding RAte TransformEr (CRATE), explicitly engineered to capture sparse, low-dimensional structures within data distributions. |
Hao Bai; Yi Ma; | arxiv-cs.CL | 2024-10-21 |
401 | Using GPT Models for Qualitative and Quantitative News Analytics in The 2024 US Presidental Election Process Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The paper considers an approach of using Google Search API and GPT-4o model for qualitative and quantitative analyses of news through retrieval-augmented generation (RAG). |
Bohdan M. Pavlyshenko; | arxiv-cs.CL | 2024-10-21 |
402 | Diffusion Transformer Policy: Scaling Diffusion Transformer for Generalist Vision-Language-Action Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent large vision-language action models pretrained on diverse robot datasets have demonstrated the potential for generalizing to new environments with a few in-domain data. |
ZHI HOU et al. | arxiv-cs.RO | 2024-10-21 |
403 | Does ChatGPT Have A Poetic Style? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that the GPT models, especially GPT-4, can successfully produce poems in a range of both common and uncommon English-language forms in superficial yet noteworthy ways, such as by producing poems of appropriate lengths for sonnets (14 lines), villanelles (19 lines), and sestinas (39 lines). |
Melanie Walsh; Anna Preus; Elizabeth Gronski; | arxiv-cs.CL | 2024-10-20 |
404 | Exploring Social Desirability Response Bias in Large Language Models: Evidence from GPT-4 Simulations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) are employed to simulate human-like responses in social surveys, yet it remains unclear if they develop biases like social desirability response (SDR) bias. |
Sanguk Lee; Kai-Qi Yang; Tai-Quan Peng; Ruth Heo; Hui Liu; | arxiv-cs.AI | 2024-10-20 |
405 | BERTtime Stories: Investigating The Role of Synthetic Story Data in Language Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We describe our contribution to the Strict and Strict-Small tracks of the 2nd iteration of the BabyLM Challenge. |
Nikitas Theodoropoulos; Giorgos Filandrianos; Vassilis Lyberatos; Maria Lymperaiou; Giorgos Stamou; | arxiv-cs.CL | 2024-10-20 |
406 | DTPPO: Dual-Transformer Encoder-based Proximal Policy Optimization for Multi-UAV Navigation in Unseen Complex Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing multi-agent deep reinforcement learning (MADRL) methods for multi-UAV navigation face challenges in generalization, particularly when applied to unseen complex environments. To address these limitations, we propose a Dual-Transformer Encoder-based Proximal Policy Optimization (DTPPO) method. |
Anning Wei; Jintao Liang; Kaiyuan Lin; Ziyue Li; Rui Zhao; | arxiv-cs.MA | 2024-10-19 |
407 | Bias Amplification: Language Models As Increasingly Biased Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address the gap in understanding the bias amplification of LLMs with four main contributions. Firstly, we propose a theoretical framework, defining the necessary and sufficient conditions for its occurrence, and emphasizing that it occurs independently of model collapse. |
ZE WANG et al. | arxiv-cs.AI | 2024-10-19 |
408 | Medical-GAT: Cancer Document Classification Leveraging Graph-Based Residual Network for Scenarios with Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This scarcity of annotated data impedes the development of effective machine learning models for cancer document classification. To address this challenge, we present a curated dataset of 1,874 biomedical abstracts, categorized into thyroid cancer, colon cancer, lung cancer, and generic topics. |
ELIAS HOSSAIN et al. | arxiv-cs.AI | 2024-10-19 |
409 | From Solitary Directives to Interactive Encouragement! LLM Secure Code Generation By Natural Language Prompting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces SecCode, a framework that leverages an innovative interactive encouragement prompting (EP) technique for secure code generation with only NL prompts. |
SHIGANG LIU et al. | arxiv-cs.CR | 2024-10-18 |
410 | Automated Genre-Aware Article Scoring and Feedback Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on the development of an advanced intelligent article scoring system that not only assesses the overall quality of written work but also offers detailed feature-based scoring tailored to various article genres. |
CHIHANG WANG et. al. | arxiv-cs.CL | 2024-10-18 |
411 | XPerT: Extended Persistence Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel transformer architecture called the \textit{Extended Persistence Transformer (xPerT)}, which is highly scalable compared to Persformer, an existing transformer for persistence diagrams. |
Sehun Kim; | arxiv-cs.LG | 2024-10-18 |
412 | Harmony: A Home Agent for Responsive Management and Action Optimization with A Locally Deployed Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to optimize the privacy and economy of data processing while maintaining the powerful functions of LLMs, we propose Harmony, a smart home assistant framework that uses a locally deployable small-scale LLM. |
Ziqi Yin; Mingxin Zhang; Daisuke Kawahara; | arxiv-cs.HC | 2024-10-18 |
413 | Paths-over-Graph: Knowledge Graph Empowered Large Language Model Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing KG-based LLM reasoning methods face challenges like handling multi-hop reasoning, multi-entity questions, and effectively utilizing graph structures. To address these issues, we propose Paths-over-Graph (PoG), a novel method that enhances LLM reasoning by integrating knowledge reasoning paths from KGs, improving the interpretability and faithfulness of LLM outputs. |
XINGYU TAN et. al. | arxiv-cs.CL | 2024-10-18 |
414 | Detecting AI-Generated Texts in Cross-Domains Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing tools to detect text generated by a large language model (LLM) have met with certain success, but their performance can drop when dealing with texts in new domains. To tackle this issue, we train a ranking classifier called RoBERTa-Ranker, a modified version of RoBERTa, as a baseline model using a dataset we constructed that includes a wider variety of texts written by humans and generated by various LLMs. |
You Zhou; Jie Wang; | arxiv-cs.CL | 2024-10-17 |
415 | Transformer-Based Approaches for Sensor-Based Human Activity Recognition: Opportunities and Challenges Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We observe that transformer-based solutions pose higher computational demands, consistently yield inferior performance, and experience significant performance degradation when quantized to accommodate resource-constrained devices. |
Clayton Souza Leite; Henry Mauranen; Aziza Zhanabatyrova; Yu Xiao; | arxiv-cs.LG | 2024-10-17 |
416 | FaithBench: A Diverse Hallucination Benchmark for Summarization By Modern LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces FaithBench, a summarization hallucination benchmark comprising challenging hallucinations made by 10 modern LLMs from 8 different families, with ground truth annotations by human experts. |
FORREST SHENG BAO et. al. | arxiv-cs.CL | 2024-10-17 |
417 | Transfer Learning on Transformers for Building Energy Consumption Forecasting — A Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the application of Transfer Learning (TL) on Transformer architectures to enhance building energy consumption forecasting. |
Robert Spencer; Surangika Ranathunga; Mikael Boulic; Andries van Heerden; Teo Susnjak; | arxiv-cs.LG | 2024-10-17 |
418 | SBI-RAG: Enhancing Math Word Problem Solving for Students Through Schema-Based Instruction and Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Schema-based instruction (SBI) is an evidence-based strategy that helps students categorize problems based on their structure, improving problem-solving accuracy. Building on this, we propose a Schema-Based Instruction Retrieval-Augmented Generation (SBI-RAG) framework that incorporates a large language model (LLM). |
Prakhar Dixit; Tim Oates; | arxiv-cs.LG | 2024-10-17 |
419 | Judgment of Learning: A Human Ability Beyond Generative Artificial Intelligence Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we introduce a cross-agent prediction model to assess whether ChatGPT-based LLMs align with human judgments of learning (JOL), a metacognitive measure where individuals predict their own future memory performance. |
Markus Huff; Elanur Ulakçı; | arxiv-cs.CL | 2024-10-17 |
420 | Linguistically Grounded Analysis of Language Models Using Shapley Head Values Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the processing of morphosyntactic phenomena, by leveraging a recently proposed method for probing language models via Shapley Head Values (SHVs). |
Marcell Fekete; Johannes Bjerva; | arxiv-cs.CL | 2024-10-17 |
421 | Measuring and Modifying The Readability of English Texts with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Then, in a pre-registered human experiment (N = 59), we ask whether Turbo can reliably make text easier or harder to read. We find evidence to support this hypothesis, though considerable variance in human judgments remains unexplained. |
Sean Trott; Pamela D. Rivière; | arxiv-cs.CL | 2024-10-17 |
422 | Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Scaling up autoregressive models in vision has not proven as beneficial as in large language models. In this work, we investigate this scaling problem in the context of text-to-image generation, focusing on two critical factors: whether models use discrete or continuous tokens, and whether tokens are generated in a random or fixed raster order using BERT- or GPT-like transformer architectures. |
LIJIE FAN et. al. | arxiv-cs.CV | 2024-10-17 |
423 | Unifying Economic and Language Models for Enhanced Sentiment Analysis of The Oil Market Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these LMs often have difficulty with domain-specific terminology, limiting their effectiveness in the crude oil sector. Addressing this gap, we introduce CrudeBERT, a fine-tuned LM specifically for the crude oil market. |
Himmet Kaplan; Ralf-Peter Mundani; Heiko Rölke; Albert Weichselbraun; Martin Tschudy; | arxiv-cs.IR | 2024-10-16 |
424 | Stabilize The Latent Space for Image Autoregressive Modeling: A Unified Perspective Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This finding contrasts sharply with the field of NLP, where the autoregressive model GPT has established a commanding presence. To address this discrepancy, we introduce a unified perspective on the relationship between latent space and generative models, emphasizing the stability of latent space in image generative modeling. |
YONGXIN ZHU et. al. | arxiv-cs.CV | 2024-10-16 |
425 | With A Grain of SALT: Are LLMs Fair Across Social Dimensions? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents an analysis of biases in open-source Large Language Models (LLMs) across various genders, religions, and races. |
Samee Arif; Zohaib Khan; Agha Ali Raza; Awais Athar; | arxiv-cs.CL | 2024-10-16 |
426 | Context-Scaling Versus Task-Scaling in In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Transformers exhibit In-Context Learning (ICL), where these models solve new tasks by using examples in the prompt without additional training. |
Amirhesam Abedsoltan; Adityanarayanan Radhakrishnan; Jingfeng Wu; Mikhail Belkin; | arxiv-cs.LG | 2024-10-16 |
427 | Reconstruction of Differentially Private Text Sanitization Via Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose two attacks (black-box and white-box) based on the accessibility to LLMs and show that LLMs could connect the pair of DP-sanitized text and the corresponding private training data of LLMs by giving sample text pairs as instructions (in the black-box attacks) or fine-tuning data (in the white-box attacks). |
SHUCHAO PANG et. al. | arxiv-cs.CR | 2024-10-16 |
428 | When Not to Answer: Evaluating Prompts on GPT Models for Effective Abstention in Unanswerable Math Word Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate whether GPTs can appropriately respond to unanswerable math word problems by applying prompts typically used in solvable mathematical scenarios. |
Asir Saadat; Tasmia Binte Sogir; Md Taukir Azam Chowdhury; Syem Aziz; | arxiv-cs.CL | 2024-10-16 |
429 | SELF-BART : A Transformer-based Molecular Representation Model Using SELFIES Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we develop an encoder-decoder model based on BART that is capable of learning molecular representations and generating new molecules. |
INDRA PRIYADARSINI et. al. | arxiv-cs.CE | 2024-10-16 |
430 | Table-LLM-Specialist: Language Model Specialists for Tables Using Iterative Generator-Validator Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Table-LLM-Specialist, or Table-Specialist for short, as a new self-trained fine-tuning paradigm specifically designed for table tasks. |
JUNJIE XING et. al. | arxiv-cs.CL | 2024-10-15 |
431 | In-Context Learning for Long-Context Sentiment Analysis on Infrastructure Project Opinions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) have achieved impressive results across various tasks. |
Alireza Shamshiri; Kyeong Rok Ryu; June Young Park; | arxiv-cs.CL | 2024-10-15 |
432 | Jigsaw Puzzles: Splitting Harmful Questions to Jailbreak Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Jigsaw Puzzles (JSP), a straightforward yet effective multi-turn jailbreak strategy against the advanced LLMs. |
Hao Yang; Lizhen Qu; Ehsan Shareghi; Gholamreza Haffari; | arxiv-cs.CL | 2024-10-15 |
433 | TraM : Enhancing User Sleep Prediction with Transformer-based Multivariate Time Series Modeling and Machine Learning Ensembles Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a novel approach that leverages a Transformer-based multivariate time series model and Machine Learning Ensembles to predict the quality of human sleep, emotional states, and stress levels. |
Jinjae Kim; Minjeong Ma; Eunjee Choi; Keunhee Cho; Chanwoo Lee; | arxiv-cs.LG | 2024-10-15 |
434 | De-jargonizing Science for Journalists with GPT-4: A Pilot Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study offers an initial evaluation of a human-in-the-loop system leveraging GPT-4 (a large language model or LLM), and Retrieval-Augmented Generation (RAG) to identify and define jargon terms in scientific abstracts, based on readers’ self-reported knowledge. |
Sachita Nishal; Eric Lee; Nicholas Diakopoulos; | arxiv-cs.CL | 2024-10-15 |
435 | Embedding Self-Correction As An Inherent Ability in Large Language Models for Enhanced Mathematical Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, LLMs often encounter difficulties in certain aspects of mathematical reasoning, leading to flawed reasoning and erroneous results. To mitigate these issues, we introduce a novel mechanism, the Chain of Self-Correction (CoSC), specifically designed to embed self-correction as an inherent ability in LLMs, enabling them to validate and rectify their own results. |
Kuofeng Gao; Huanqia Cai; Qingyao Shuai; Dihong Gong; Zhifeng Li; | arxiv-cs.AI | 2024-10-14 |
436 | Rethinking Legal Judgement Prediction in A Realistic Scenario in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The LLMs also provide explanations for their predictions. To evaluate the quality of these predictions and explanations, we introduce two human evaluation metrics: Clarity and Linking. |
Shubham Kumar Nigam; Aniket Deroy; Subhankar Maity; Arnab Bhattacharya; | arxiv-cs.CL | 2024-10-14 |
437 | Open6DOR: Benchmarking Open-instruction 6-DoF Object Rearrangement and A VLM-based Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The integration of large-scale Vision-Language Models (VLMs) with embodied AI can greatly enhance the generalizability and the capacity to follow open instructions for robots. … |
YUFEI DING et. al. | 2024 IEEE/RSJ International Conference on Intelligent … | 2024-10-14 |
438 | Performance in A Dialectal Profiling Task of LLMs for Varieties of Brazilian Portuguese Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The results offer sociolinguistic contributions for an equity-fluent NLP technology. |
Raquel Meister Ko Freitag; Túlio Sousa de Gois; | arxiv-cs.CL | 2024-10-14 |
439 | RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose RoCoFT, a parameter-efficient fine-tuning method for large-scale language models (LMs) based on updating only a few rows and columns of the weight matrices in transformers. |
Md Kowsher; Tara Esmaeilbeig; Chun-Nam Yu; Mojtaba Soltanalian; Niloofar Yousefi; | arxiv-cs.CL | 2024-10-13 |
440 | Evaluating Gender Bias of LLMs in Making Morality Judgements Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work investigates whether current closed and open-source LLMs possess gender bias, especially when asked to give moral opinions. To evaluate these models, we curate and introduce a new dataset GenMO (Gender-bias in Morality Opinions) comprising parallel short stories featuring male and female characters respectively. |
Divij Bajaj; Yuanyuan Lei; Jonathan Tong; Ruihong Huang; | arxiv-cs.CL | 2024-10-13 |
441 | Transformer-based Language Models for Reasoning in The Description Logic ALCQ Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this way, we systematically investigate the logical reasoning capabilities of a supervised fine-tuned DeBERTa-based model and two large language models (GPT-3.5, GPT-4) with few-shot prompting. |
Angelos Poulis; Eleni Tsalapati; Manolis Koubarakis; | arxiv-cs.CL | 2024-10-12 |
442 | \llinstruct: An Instruction-tuned Model for English Language Proficiency Assessments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present \llinstruct: An 8B instruction-tuned model that is designed to generate content for English Language Proficiency Assessments (ELPA) and related applications. |
Debanjan Ghosh; Sophia Chan; | arxiv-cs.CL | 2024-10-11 |
443 | Hypothesis-only Biases in Large Language Model-Elicited Natural Language Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our analysis provides empirical evidence that well-attested biases in NLI can persist in LLM-generated data. |
Grace Proebsting; Adam Poliak; | arxiv-cs.CL | 2024-10-11 |
444 | Synth-SONAR: Sonar Image Synthesis with Enhanced Diversity and Realism Via Dual Diffusion Models and GPT Prompting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Traditional methods often rely on extensive and costly data collection using sonar sensors, jeopardizing data quality and diversity. To overcome these limitations, this study proposes a new sonar image synthesis framework, Synth-SONAR, leveraging diffusion models and GPT prompting. |
Purushothaman Natarajan; Kamal Basha; Athira Nambiar; | arxiv-cs.CV | 2024-10-11 |
445 | Fine-Tuning In-House Large Language Models to Infer Differential Diagnosis from Radiology Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a pipeline for developing in-house LLMs tailored to identify differential diagnoses from radiology reports. |
LUOYAO CHEN et. al. | arxiv-cs.CL | 2024-10-11 |
446 | Improving Legal Entity Recognition Using A Hybrid Transformer Model and Semantic Filtering Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel hybrid model that enhances the accuracy and precision of Legal-BERT, a transformer model fine-tuned for legal text processing, by introducing a semantic similarity-based filtering mechanism. |
Duraimurugan Rajamanickam; | arxiv-cs.CL | 2024-10-11 |
447 | Extra Global Attention Designation Using Keyword Detection in Sparse Transformer Architectures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an extension to Longformer Encoder-Decoder, a popular sparse transformer architecture. |
Evan Lucas; Dylan Kangas; Timothy C Havens; | arxiv-cs.CL | 2024-10-11 |
448 | AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For instance, attacks tend to be less effective when models pay more attention to system prompts designed to ensure LLM safety alignment. Building on this discovery, we introduce an enhanced method that manipulates models’ attention scores to facilitate LLM jailbreaking, which we term AttnGCG. |
ZIJUN WANG et. al. | arxiv-cs.CL | 2024-10-11 |
449 | The Rise of AI-Generated Content in Wikipedia Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We use GPTZero, a proprietary AI detector, and Binoculars, an open-source alternative, to establish lower bounds on the presence of AI-generated content in recently created Wikipedia pages. |
Creston Brooks; Samuel Eggert; Denis Peskoff; | arxiv-cs.CL | 2024-10-10 |
450 | Robust AI-Generated Text Detection By Restricted Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on the robustness of classifier-based detectors of AI-generated text, namely their ability to transfer to unseen generators or semantic domains. |
KRISTIAN KUZNETSOV et. al. | arxiv-cs.CL | 2024-10-10 |
451 | HorGait: A Hybrid Model for Accurate Gait Recognition in LiDAR Point Cloud Planar Projections Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a method named HorGait, which utilizes a hybrid model with a Transformer architecture for gait recognition on the planar projection of 3D point clouds from LiDAR. |
JIAXING HAO et. al. | arxiv-cs.CV | 2024-10-10 |
452 | Evaluating Transformer Models for Suicide Risk Detection on Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a study on leveraging state-of-the-art natural language processing solutions for identifying suicide risk in social media posts as a submission for the IEEE BigData 2024 Cup: Detection of Suicide Risk on Social Media conducted by the kubapok team. |
Jakub Pokrywka; Jeremi I. Kaczmarek; Edward J. Gorzelańczyk; | arxiv-cs.CL | 2024-10-10 |
453 | Fine-Tuning Language Models for Ethical Ambiguity: A Comparative Study of Alignment with Human Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While morally clear scenarios are more discernible to LLMs, greater difficulty is encountered in morally ambiguous contexts. In this investigation, we explored LLM calibration to show that human and LLM judgments are poorly aligned in such scenarios. |
PRANAV SENTHILKUMAR et. al. | arxiv-cs.CL | 2024-10-10 |
454 | VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce VibeCheck, a system for automatically comparing a pair of LLMs by discovering identifying traits of a model (vibes) that are well-defined, differentiating, and user-aligned. |
Lisa Dunlap; Krishna Mandal; Trevor Darrell; Jacob Steinhardt; Joseph E Gonzalez; | arxiv-cs.CL | 2024-10-10 |
455 | SWE-Bench+: Enhanced Coding Benchmark for LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, a systematic evaluation of the quality of SWE-bench remains missing. In this paper, we addressed this gap by presenting an empirical analysis of the SWE-bench dataset. |
REEM ALEITHAN et. al. | arxiv-cs.SE | 2024-10-09 |
456 | Stanceformer: Target-Aware Transformer for Stance Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Consequently, these models yield similar performance regardless of whether we utilize or disregard target information, undermining the task’s significance. To address this challenge, we introduce Stanceformer, a target-aware transformer model that incorporates enhanced attention towards the targets during both training and inference. |
Krishna Garg; Cornelia Caragea; | arxiv-cs.CL | 2024-10-09 |
457 | Optimized Spatial Architecture Mapping Flow for Transformer Accelerators Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the design process for existing spatial architectures is predominantly manual, and it often involves time-consuming redesigns for new applications and new problem dimensions, which greatly limits the development of optimally designed accelerators for Transformer models. To address these challenges, we propose SAMT (Spatial Architecture Mapping for Transformers), a comprehensive framework designed to optimize the dataflow mapping of Transformer inference workloads onto spatial accelerators. |
HAOCHENG XU et. al. | arxiv-cs.AR | 2024-10-09 |
458 | SparseGrad: A Selective Method for Efficient Fine-tuning of MLP Layers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a new selective PEFT method, namely SparseGrad, that performs well on MLP blocks. |
VIKTORIIA CHEKALINA et. al. | arxiv-cs.CL | 2024-10-09 |
459 | InAttention: Linear Context Scaling for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we modify the decoder-only transformer, replacing self-attention with InAttention, which scales linearly with context length during inference by having tokens attend only to initial states. |
Joseph Eisner; | arxiv-cs.LG | 2024-10-09 |
460 | Unveiling Transformer Perception By Exploring Input Manifolds Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a general method for the exploration of equivalence classes in the input space of Transformer models. |
Alessandro Benfenati; Alfio Ferrara; Alessio Marta; Davide Riva; Elisabetta Rocchetti; | arxiv-cs.LG | 2024-10-08 |
461 | Solving Multi-Goal Robotic Tasks with Decision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, no existing methods effectively combine offline training, multi-goal learning, and transformer-based architectures. In this paper, we address these challenges by introducing a novel adaptation of the decision transformer architecture for offline multi-goal reinforcement learning in robotics. |
Paul Gajewski; Dominik Żurek; Marcin Pietroń; Kamil Faber; | arxiv-cs.RO | 2024-10-08 |
462 | SC-Bench: A Large-Scale Dataset for Smart Contract Auditing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present SC-Bench, the first dataset for automated smart-contract auditing research. |
Shihao Xia; Mengting He; Linhai Song; Yiying Zhang; | arxiv-cs.CR | 2024-10-08 |
463 | Honesty to Subterfuge: In-Context Reinforcement Learning Can Make Honest Models Reward Hack Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that gpt-4o, gpt-4o-mini, o1-preview, and o1-mini – frontier models trained to be helpful, harmless, and honest – can engage in specification gaming without training on a curriculum of tasks, purely from in-context iterative reflection (which we call in-context reinforcement learning, ICRL). |
Leo McKee-Reid; Christoph Sträter; Maria Angelica Martinez; Joe Needham; Mikita Balesni; | arxiv-cs.AI | 2024-10-08 |
464 | A GPT-based Decision Transformer for Multi-Vehicle Coordination at Unsignalized Intersections Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the application of the Decision Transformer, a decision-making algorithm based on the Generative Pre-trained Transformer (GPT) architecture, to multi-vehicle coordination at unsignalized intersections. |
Eunjae Lee; Minhee Kang; Yoojin Choi; Heejin Ahn; | arxiv-cs.RO | 2024-10-08 |
465 | A Comparative Study of Hybrid Models in Health Misinformation Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates the effectiveness of machine learning (ML) and deep learning (DL) models in detecting COVID-19-related misinformation on online social networks (OSNs), aiming to develop more effective tools for countering the spread of health misinformation during the pandemic. |
Mkululi Sikosana; Oluwaseun Ajao; Sean Maudsley-Barton; | arxiv-cs.IR | 2024-10-08 |
466 | Leveraging Free Energy in Pretraining Model Selection for Improved Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a Bayesian model selection criterion, called the downstream free energy, which quantifies a checkpoint’s adaptability by measuring the concentration of nearby favorable parameters for the downstream task. |
Michael Munn; Susan Wei; | arxiv-cs.LG | 2024-10-07 |
467 | Timer-XL: Long-Context Transformers for Unified Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Timer-XL, a generative Transformer for unified time series forecasting. |
Yong Liu; Guo Qin; Xiangdong Huang; Jianmin Wang; Mingsheng Long; | arxiv-cs.LG | 2024-10-07 |
468 | SpikedAttention: Training-Free and Fully Spike-Driven Transformer-to-SNN Conversion with Winner-Oriented Spike Shift for Softmax Operation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel transformer-to-SNN conversion method that outputs an end-to-end spike-based transformer, named SpikedAttention. |
Sangwoo Hwang; Seunghyun Lee; Dahoon Park; Donghun Lee; Jaeha Kung; | nips | 2024-10-07 |
469 | Can Large Language Models Explore In-context? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making. |
Akshay Krishnamurthy; Keegan Harris; Dylan J Foster; Cyril Zhang; Aleksandrs Slivkins; | nips | 2024-10-07 |
470 | LSH-MoE: Communication-efficient MoE Training Via Locality-Sensitive Hashing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose LSH-MoE, a communication-efficient MoE training framework using locality-sensitive hashing (LSH). |
XIAONAN NIE et. al. | nips | 2024-10-07 |
471 | Narrative-of-Thought: Improving Temporal Reasoning of Large Language Models Via Recounted Narratives Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a new prompting technique tailored for temporal reasoning, Narrative-of-Thought (NoT), that first converts the events set to a Python class, then prompts a small model to generate a temporally grounded narrative, guiding the final generation of a temporal graph. |
Xinliang Frederick Zhang; Nick Beauchamp; Lu Wang; | arxiv-cs.CL | 2024-10-07 |
472 | InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, this research advances the patch division paradigm by introducing a novel extension: dynamic resolution with automatic patch configuration. |
XIAOYI DONG et. al. | nips | 2024-10-07 |
473 | JiuZhang3.0: Efficiently Improving Mathematical Reasoning By Training Small Data Synthesis Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To reduce the cost, based on open-source available texts, we propose an efficient way that trains a small LLM for math problem synthesis, to efficiently generate sufficient high-quality pre-training data. |
KUN ZHOU et. al. | nips | 2024-10-07 |
474 | Achieving Efficient Alignment Through Learned Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce *Aligner*, a novel and simple alignment paradigm that learns the correctional residuals between preferred and dispreferred answers using a small model. |
JIAMING JI et. al. | nips | 2024-10-07 |
475 | HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Identifying optimal model configurations under specific hardware constraints is becoming essential but remains challenging due to the computational load of exhaustive training and evaluation on multiple devices. To address this, we introduce HW-GPT-Bench, a hardware-aware benchmark that utilizes surrogate predictions to approximate various hardware metrics across 13 devices for architectures in the GPT-2 family, with architectures containing up to 774M parameters. |
RHEA SUKTHANKER et. al. | nips | 2024-10-07 |
476 | In-Context Learning State Vector with Inner and Momentum Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the working mechanisms and optimization of these vectors are yet to be thoroughly explored. In this paper, we address this gap by presenting a comprehensive analysis of these compressed vectors, drawing parallels to the parameters trained with gradient descent, and introducing the concept of state vector. |
DONGFANG LI et. al. | nips | 2024-10-07 |
477 | Permutation-Invariant Autoregressive Diffusion for Graph Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce PARD, a Permutation-invariant Auto Regressive Diffusion model that integrates diffusion models with autoregressive methods. |
Lingxiao Zhao; Xueying Ding; Leman Akoglu; | nips | 2024-10-07 |
478 | Finding Transformer Circuits With Edge Pruning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we frame circuit discovery as an optimization problem and propose _Edge Pruning_ as an effective and scalable solution. |
Adithya Bhaskar; Alexander Wettig; Dan Friedman; Danqi Chen; | nips | 2024-10-07 |
479 | SAND: Smooth Imputation of Sparse and Noisy Functional Data with Transformer Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although the transformer architecture has come to dominate other models for text and image data, its application to irregularly-spaced longitudinal data has been limited. We introduce a variant of the transformer that enables it to more smoothly impute such functional data. |
Ju-Sheng Hong; Junwen Yao; Jonas Mueller; Jane-Ling Wang; | nips | 2024-10-07 |
480 | Transforming Vision Transformer: Towards Efficient Multi-Task Asynchronous Learner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Efficient Multi-Task Learning (EMTAL), a novel approach that transforms a pre-trained Transformer into an efficient multi-task learner during training, and reparameterizes the knowledge back to the original Transformer for efficient inference. |
Hanwen Zhong; Jiaxin Chen; Yutong Zhang; Di Huang; Yunhong Wang; | nips | 2024-10-07 |
481 | VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel approach to reduce vision compute by letting redundant vision tokens “skip layers” rather than decreasing the number of vision tokens. |
SHIWEI WU et. al. | nips | 2024-10-07 |
482 | $M^3$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents $M^3$GPT, an advanced Multimodal, Multitask framework for Motion comprehension and generation. |
MINGSHUANG LUO et. al. | nips | 2024-10-07 |
483 | Query-Based Adversarial Prompt Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We improve on prior work with a query-based attack that leverages API access to a remote language model to construct adversarial examples that cause the model to emit harmful strings with (much) higher probability than with transfer-only attacks. |
Jonathan Hayase; Ema Borevković; Nicholas Carlini; Florian Tramer; Milad Nasr; | nips | 2024-10-07 |
484 | A Critical Evaluation of AI Feedback for Aligning Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that the improvements of the RL step are virtually entirely due to the widespread practice of using a weaker teacher model (e.g. GPT-3.5) for SFT data collection than the critic (e.g., GPT-4) used for AI feedback generation. |
ARCHIT SHARMA et. al. | nips | 2024-10-07 |
485 | SELF-DISCOVER: Large Language Models Self-Compose Reasoning Structures IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the task-intrinsic reasoning structures to tackle complex reasoning problems that are challenging for typical prompting methods. |
PEI ZHOU et. al. | nips | 2024-10-07 |
486 | Limits of Transformer Language Models on Learning to Compose Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we evaluate training LLaMA models and prompting GPT-4 and Gemini on four tasks that require learning a composition of several discrete sub-tasks. |
JONATHAN THOMM et. al. | nips | 2024-10-07 |
487 | Does RoBERTa Perform Better Than BERT in Continual Learning: An Attention Sink Perspective Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we observe that pre-trained models may allocate high attention scores to some ‘sink’ tokens, such as [SEP] tokens, which are ubiquitous across various tasks. |
Xueying Bai; Yifan Sun; Niranjan Balasubramanian; | arxiv-cs.LG | 2024-10-07 |
488 | The Mamba in The Llama: Distilling and Accelerating Hybrid Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recent research suggests that state-space models (SSMs) like Mamba can be competitive with Transformer models for language modeling with advantageous deployment characteristics. Given the focus and expertise on training large-scale Transformer models, we consider the challenge of converting these pretrained models into SSMs for deployment. |
Junxiong Wang; Daniele Paliotta; Avner May; Alexander Rush; Tri Dao; | nips | 2024-10-07 |
489 | Make-it-Real: Unleashing Large Multimodal Model for Painting 3D Objects with Realistic Materials Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we exploit advancements in Multimodal Large Language Models (MLLMs), particularly GPT-4V, to present a novel approach, Make-it-Real: 1) We demonstrate that GPT-4V can effectively recognize and describe materials, allowing the construction of a detailed material library. |
YE FANG et. al. | nips | 2024-10-07 |
490 | SemCoder: Training Code Language Models with Comprehensive Semantics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to bridge the gap between Code LLMs’ reliance on static text data and the need for thorough semantic understanding for complex tasks like debugging and program repair. |
YANGRUIBO DING et. al. | nips | 2024-10-07 |
491 | Weak-to-Strong Search: Align Large Language Models Via Searching Over Small Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce *weak-to-strong search*, framing the alignment of a large language model as a test-time greedy search to maximize the log-likelihood difference between small tuned and untuned models while sampling from the frozen large model. |
ZHANHUI ZHOU et. al. | nips | 2024-10-07 |
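The weak-to-strong search highlight above describes a concrete scoring rule: candidates sampled from the frozen large model are ranked by the log-likelihood gap between a small tuned model and its untuned counterpart. A minimal sketch of that reranking step (the function and its arguments are placeholders for illustration, not the paper's API):

```python
def weak_to_strong_rerank(candidates, logp_tuned, logp_untuned):
    """Rank candidate responses by the log-likelihood difference between
    a small tuned model and its untuned counterpart (hedged sketch of
    the weak-to-strong search scoring rule; in the paper this guides a
    test-time greedy search rather than a one-shot rerank)."""
    scored = [(logp_tuned(c) - logp_untuned(c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored]
```

A candidate the tuned model prefers far more than the untuned one scores highest, steering the large model's outputs toward the alignment signal learned by the small pair.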
492 | LLaVA Needs More Knowledge: Retrieval Augmented Natural Language Generation with Knowledge Graph for Explaining Thoracic Pathologies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The KG datastore is designed as a plug-and-play module, allowing for seamless integration with various model architectures. We introduce and evaluate three distinct frameworks within this paradigm: KG-LLaVA, which integrates the pre-trained LLaVA model with KG-RAG; Med-XPT, a custom framework combining MedCLIP, a transformer-based projector, and GPT-2; and Bio-LLaVA, which adapts LLaVA by incorporating the Bio-ViT-L vision model. |
Ameer Hamza; Yong Hyun Ahn; Sungyoung Lee; Seong Tae Kim; | arxiv-cs.CV | 2024-10-07 |
493 | FinBen: An Holistic Financial Benchmark for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce FinBen, the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks, covering seven critical aspects: information extraction (IE), textual analysis, question answering (QA), text generation, risk management, forecasting, and decision-making. |
QIANQIAN XIE et. al. | nips | 2024-10-07 |
494 | ETO: Efficient Transformer-based Local Feature Matching By Organizing Multiple Homography Hypotheses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an efficient transformer-based network architecture for local feature matching. |
JUNJIE NI et. al. | nips | 2024-10-07 |
495 | Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present a method that is able to distill a pre-trained Transformer architecture into alternative architectures such as state space models (SSMs). |
Aviv Bick; Kevin Li; Eric Xing; J. Zico Kolter; Albert Gu; | nips | 2024-10-07 |
496 | Accelerating Transformers with Spectrum-Preserving Token Merging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Prior work has proposed algorithms based on Bipartite Soft Matching (BSM), which divides tokens into distinct sets and merges the top $k$ similar tokens. |
CHAU TRAN et. al. | nips | 2024-10-07 |
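The Bipartite Soft Matching step mentioned in the highlight above can be sketched in a few lines: tokens are split into two sets, each token in one set is matched to its most similar counterpart in the other, and the top-k pairs are merged by averaging. This is a simplified toy (alternating split, cosine similarity, mean merge), not the exact algorithm of ToMe or of the spectrum-preserving variant this paper proposes:

```python
import numpy as np

def bipartite_soft_matching(tokens: np.ndarray, k: int) -> np.ndarray:
    """Merge the k most similar token pairs across a bipartite split
    (hedged sketch of BSM-style token merging).

    tokens: (N, D) array of token embeddings.
    """
    a, b = tokens[0::2], tokens[1::2]                # alternating bipartite split
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    sim = a_n @ b_n.T                                # cosine similarity (|A|, |B|)
    best = sim.argmax(axis=1)                        # best partner in B for each A token
    score = sim.max(axis=1)
    merge_idx = np.argsort(-score)[:k]               # k most similar A tokens
    keep_idx = np.setdiff1d(np.arange(len(a)), merge_idx)

    merged = [(a[i] + b[best[i]]) / 2 for i in merge_idx]  # average each merged pair
    absorbed = set(best[merge_idx])                  # B tokens consumed by a merge
    kept_b = [b[j] for j in range(len(b)) if j not in absorbed]
    return np.vstack([a[keep_idx], *merged, *kept_b]) if merged else tokens
```

Each merge removes one token from the sequence, so later attention layers run over fewer tokens; the paper's contribution is choosing the merge so the attention spectrum is preserved.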
497 | RL-GPT: Integrating Reinforcement Learning and Code-as-policy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To seamlessly integrate both modalities, we introduce a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent. |
SHAOTENG LIU et. al. | nips | 2024-10-07 |
498 | Understanding Transformers Via N-Gram Statistics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper takes a first step in this direction by considering families of functions (i.e. rules) formed out of simple N-gram based statistics of the training data. By studying how well these rulesets approximate transformer predictions, we obtain a variety of novel discoveries: a simple method to detect overfitting during training without using a holdout set, a quantitative measure of how transformers progress from learning simple to more complex statistical rules over the course of training, a model-variance criterion governing when transformer predictions tend to be described by N-gram rules, and insights into how well transformers can be approximated by N-gram rulesets in the limit where these rulesets become increasingly complex. |
Timothy Nguyen; | nips | 2024-10-07 |
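The rulesets described in the highlight above are built from N-gram statistics of the training data and compared against transformer predictions. A toy bigram version of that idea (a stand-in for illustration; the paper's rulesets and approximation metrics are richer than this):

```python
from collections import Counter, defaultdict

def fit_bigram_rules(corpus):
    """Collect bigram continuation counts from tokenized sequences."""
    counts = defaultdict(Counter)
    for seq in corpus:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def rule_prediction(counts, prev):
    """Most frequent continuation under the bigram rule, or None if unseen."""
    return counts[prev].most_common(1)[0][0] if counts[prev] else None

def agreement(counts, model_predict, contexts):
    """Fraction of contexts where a model's top-1 prediction matches the
    bigram rule -- the kind of approximation score the paper studies."""
    hits = sum(rule_prediction(counts, c) == model_predict(c) for c in contexts)
    return hits / len(contexts)
```

Tracking how this agreement changes over training is one way to see a model move from simple statistical rules to more complex ones.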
499 | APIGen: Automated PIpeline for Generating Verifiable and Diverse Function-Calling Datasets IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents APIGen, an automated data generation pipeline designed to produce verifiable high-quality datasets for function-calling applications. |
ZUXIN LIU et. al. | nips | 2024-10-07 |
500 | MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by the empirical findings, we propose a novel LLM-based **M**ulti-**A**gent framework for **G**itHub **I**ssue re**S**olution, **MAGIS**, consisting of four agents customized for software evolution: Manager, Repository Custodian, Developer, and Quality Assurance Engineer agents. |
WEI TAO et. al. | nips | 2024-10-07 |
501 | OccamLLM: Fast and Exact Language Model Arithmetic in A Single Step Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a framework that enables exact arithmetic in a single autoregressive step, providing faster, more secure, and more interpretable LLM systems with arithmetic capabilities. |
OWEN DUGAN et. al. | nips | 2024-10-07 |
502 | Differential Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Diff Transformer, which amplifies attention to the relevant context while canceling noise. |
TIANZHU YE et. al. | arxiv-cs.CL | 2024-10-07 |
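The noise-canceling mechanism in the Diff Transformer highlight above is the difference of two softmax attention maps. A single-head numpy sketch (the learnable scalar is fixed here, and the paper's multi-head normalization is omitted):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def diff_attention(q1, k1, q2, k2, v, lam=0.5):
    """Differential attention (hedged sketch): subtracting a second
    softmax attention map cancels noise common to both maps, amplifying
    attention on the relevant context."""
    d = q1.shape[-1]
    a1 = softmax(q1 @ k1.T / np.sqrt(d))
    a2 = softmax(q2 @ k2.T / np.sqrt(d))
    return (a1 - lam * a2) @ v
```

With identical query/key projections and lam = 1 the two maps cancel exactly, which makes the common-mode-rejection intuition easy to check.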
503 | Perception of Knowledge Boundary for Large Language Models Through Semi-open-ended Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we perceive the LLMs’ knowledge boundary with semi-open-ended questions by discovering more ambiguous answers. |
ZHIHUA WEN et. al. | nips | 2024-10-07 |
504 | UrbanKGent: A Unified Large Language Model Agent Framework for Urban Knowledge Graph Construction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents UrbanKGent, a unified large language model agent framework, for urban knowledge graph construction. |
Yansong Ning; Hao Liu; | nips | 2024-10-07 |
505 | Visual Autoregressive Modeling: Scalable Image Generation Via Next-Scale Prediction IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Visual AutoRegressive modeling (VAR), a new generation paradigm that redefines the autoregressive learning on images as coarse-to-fine next-scale prediction or next-resolution prediction, diverging from the standard raster-scan next-token prediction. |
Keyu Tian; Yi Jiang; Zehuan Yuan; BINGYUE PENG; Liwei Wang; | nips | 2024-10-07 |
506 | Seshat Global History Databank Text Dataset and Benchmark of Large Language Models’ History Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This benchmarking is particularly challenging, given that human knowledge of history is inherently unbalanced, with more information available on Western history and recent periods. To address this challenge, we introduce a curated sample of the Seshat Global History Databank, which provides a structured representation of human historical knowledge, containing 36,000 data points across 600 historical societies and over 600 scholarly references. |
JAKOB HAUSER et. al. | nips | 2024-10-07 |
507 | DenseFormer: Enhancing Information Flow in Transformers Via Depth Weighted Averaging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The transformer architecture by Vaswani et al. (2017) is now ubiquitous across application domains, from natural language processing to speech processing and image understanding. We propose DenseFormer, a simple modification to the standard architecture that improves the perplexity of the model without increasing its size—adding only a few thousand parameters even for large-scale models in the 100B-parameter range. |
Matteo Pagliardini; Amirkeivan Mohtashami; François Fleuret; Martin Jaggi; | nips | 2024-10-07 |
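The depth-weighted averaging in the DenseFormer highlight above can be sketched as follows: after each block, the stream is replaced by a learned weighted average over the current output and every earlier representation, embedding included. The function below is an illustrative simplification (dense weights only; the paper also studies sparse/dilated variants):

```python
import numpy as np

def denseformer_forward(x, blocks, weights):
    """Depth-weighted averaging (hedged sketch). After block i the
    history holds i + 2 representations (embedding plus i + 1 block
    outputs), so weights[i] must have i + 2 entries."""
    hist = [x]
    for i, block in enumerate(blocks):
        hist.append(block(hist[-1]))                      # run the next block
        hist[-1] = sum(w * h for w, h in zip(weights[i], hist))
    return hist[-1]
```

Setting every weight vector to (0, ..., 0, 1) recovers the plain transformer, which is why the modification adds only a handful of parameters per block.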
508 | ProtocoLLM: Automatic Evaluation Framework of LLMs on Domain-Specific Scientific Protocol Formulation Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we propose ProtocoLLM, a flexible, automatic framework to evaluate LLMs’ capability on scientific protocol formulation tasks (SPFT). |
Seungjun Yi; Jaeyoung Lim; Juyong Yoon; | arxiv-cs.CL | 2024-10-06 |
509 | Selective Transformer for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing Transformer models face two key challenges when dealing with HSI scenes characterized by diverse land cover types and rich spectral information: (1) fixed receptive field representation overlooks effective contextual information; (2) redundant self-attention feature representation. To address these limitations, we propose a novel Selective Transformer (SFormer) for HSI classification. |
Yichu Xu; Di Wang; Lefei Zhang; Liangpei Zhang; | arxiv-cs.CV | 2024-10-04 |
510 | How Language Models Prioritize Contextual Grammatical Cues? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate how language models handle gender agreement when multiple gender cue words are present, each capable of independently disambiguating a target gender pronoun. |
Hamidreza Amirzadeh; Afra Alishahi; Hosein Mohebbi; | arxiv-cs.CL | 2024-10-04 |
511 | Still Not Quite There! Evaluating Large Language Models for Comorbid Mental Health Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce ANGST, a novel, first-of-its kind benchmark for depression-anxiety comorbidity classification from social media posts. |
AMEY HENGLE et. al. | arxiv-cs.CL | 2024-10-04 |
512 | Learning Semantic Structure Through First-Order-Logic Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study whether transformer-based language models can extract predicate argument structure from simple sentences. |
Akshay Chaturvedi; Nicholas Asher; | arxiv-cs.CL | 2024-10-04 |
513 | Dynamic Diffusion Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, we introduce a Timestep-wise Dynamic Width (TDW) approach that adapts model width conditioned on the generation timesteps. |
WANGBO ZHAO et. al. | arxiv-cs.CV | 2024-10-04 |
514 | Towards Linguistically-Aware and Language-Independent Tokenization for Large Language Models (LLMs) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive study on the tokenization techniques employed by state-of-the-art large language models (LLMs) and their implications on the cost and availability of services across different languages, especially low resource languages. |
Abrar Rahman; Garry Bowlin; Binit Mohanty; Sean McGunigal; | arxiv-cs.CL | 2024-10-04 |
515 | IndicSentEval: How Effectively Do Multilingual Transformer Models Encode Linguistic Properties for Indic Languages? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate similar questions regarding encoding capability and robustness for 8 linguistic properties across 13 different perturbations in 6 Indic languages, using 9 multilingual Transformer models (7 universal and 2 Indic-specific). |
Akhilesh Aravapalli; Mounika Marreddy; Subba Reddy Oota; Radhika Mamidi; Manish Gupta; | arxiv-cs.CL | 2024-10-03 |
516 | AlphaIntegrator: Transformer Action Search for Symbolic Integration Proofs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first correct-by-construction learning-based system for step-by-step mathematical integration. |
Mert Ünsal; Timon Gehr; Martin Vechev; | arxiv-cs.LG | 2024-10-03 |
517 | CulturalBench: A Robust, Diverse and Challenging Benchmark on Measuring The (Lack Of) Cultural Knowledge of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CulturalBench: a set of 1,227 human-written and human-verified questions for effectively assessing LLMs’ cultural knowledge, covering 45 global regions including the underrepresented ones like Bangladesh, Zimbabwe, and Peru. |
YU YING CHIU et. al. | arxiv-cs.CL | 2024-10-03 |
518 | GPT-4o As The Gold Standard: A Scalable and General Purpose Approach to Filter Language Model Pretraining Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose SIEVE, a lightweight alternative that matches GPT-4o accuracy at less than 1\% of the cost. |
Jifan Zhang; Ziyue Luo; Jia Liu; Ness Shroff; Robert Nowak; | arxiv-cs.CL | 2024-10-03 |
519 | Intrinsic Evaluation of RAG Systems for Deep-Logic Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the Overall Performance Index (OPI), an intrinsic metric to evaluate retrieval-augmented generation (RAG) mechanisms for applications involving deep-logic queries. |
Junyi Hu; You Zhou; Jie Wang; | arxiv-cs.AI | 2024-10-03 |
520 | AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose AutoDAN-Turbo, a black-box jailbreak method that can automatically discover as many jailbreak strategies as possible from scratch, without any human intervention or predefined scopes (e.g., specified candidate strategies), and use them for red-teaming. |
XIAOGENG LIU et. al. | arxiv-cs.CR | 2024-10-03 |
521 | Coal Mining Question Answering with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel approach to coal mining question answering (QA) using large language models (LLMs) combined with tailored prompt engineering techniques. |
Antonio Carlos Rivera; Anthony Moore; Steven Robinson; | arxiv-cs.CL | 2024-10-03 |
522 | TSOTSALearning at LLMs4OL Tasks A and B : Combining Rules to Large Language Model for Ontology Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents our contribution to the Large Language Model For Ontology Learning (LLMs4OL) challenge hosted by ISWC conference. The challenge involves extracting and … |
Carick Appolinaire Atezong Ymele; Azanzi Jiomekong; | LLMs4OL@ISWC | 2024-10-02 |
523 | Financial Sentiment Analysis on News and Reports Using Large Language Models and FinBERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the application of LLMs and FinBERT for FSA, comparing their performance on news articles, financial reports and company announcements. |
Yanxin Shen; Pulin Kirin Zhang; | arxiv-cs.IR | 2024-10-02 |
524 | A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work tackles the information loss bottleneck of vector-quantization (VQ) autoregressive image generation by introducing a novel model architecture called the 2-Dimensional Autoregression (DnD) Transformer. |
LIANG CHEN et. al. | arxiv-cs.CV | 2024-10-02 |
525 | Automatic Deductive Coding in Discourse Analysis: An Application of Large Language Models in Learning Analytics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To evaluate the usefulness of large language models in automatic deductive coding, we employed three classification methods driven by different artificial intelligence technologies: a traditional text classification method with text feature engineering, a BERT-like pretrained language model, and a GPT-like pretrained large language model (LLM). We applied these methods to two different datasets and explored the potential of GPT and prompt engineering in automatic deductive coding. |
Lishan Zhang; Han Wu; Xiaoshan Huang; Tengfei Duan; Hanxiang Du; | arxiv-cs.CL | 2024-10-02 |
526 | Emotion-Aware Response Generation Using Affect-Enriched Embeddings with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel framework that integrates multiple emotion lexicons, including NRC Emotion Lexicon, VADER, WordNet, and SentiWordNet, with state-of-the-art LLMs such as LLAMA 2, Flan-T5, ChatGPT 3.0, and ChatGPT 4.0. |
Abdur Rasool; Muhammad Irfan Shahzad; Hafsa Aslam; Vincent Chan; | arxiv-cs.CL | 2024-10-02 |
527 | ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, even state-of-the-art vision-language models (VLMs), such as GPT-4o, still fall short of human-level performance, particularly in intricate web environments and long-horizon tasks. To address these limitations, we present ExACT, an approach to combine test-time search and self-learning to build o1-like models for agentic applications. |
XIAO YU et. al. | arxiv-cs.CL | 2024-10-02 |
528 | Creative and Context-Aware Translation of East Asian Idioms with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, compiling a dictionary of candidate translations demands much time and creativity even for expert translators. To alleviate such burden, we evaluate if GPT-4 can help generate high-quality translations. |
Kenan Tang; Peiyang Song; Yao Qin; Xifeng Yan; | arxiv-cs.CL | 2024-10-01 |
529 | Ayaka: A Versatile Transformer Accelerator With Low-Rank Estimation and Heterogeneous Dataflow Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer model has demonstrated outstanding performance in the field of artificial intelligence. However, its remarkable performance comes at the cost of substantial … |
YUBIN QIN et. al. | IEEE Journal of Solid-State Circuits | 2024-10-01 |
530 | CIMFormer: A Systolic CIM-Array-Based Transformer Accelerator With Token-Pruning-Aware Attention Reformulating and Principal Possibility Gathering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer models have achieved impressive performance in various artificial intelligence (AI) applications. However, the high cost of computation and memory footprint make its … |
RUIQI GUO et. al. | IEEE Journal of Solid-State Circuits | 2024-10-01 |
531 | APT: Alarm Prediction Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Nika Strem; D. Dhami; Benedikt Schmidt; Benjamin Klöpper; K. Kersting; | Expert Syst. Appl. | 2024-10-01 |
532 | SIGMA: Secure GPT Inference with Function Secret Sharing IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Secure 2-party computation (2PC) enables secure inference that offers protection for both proprietary machine learning (ML) models and sensitive inputs to them. However, the … |
KANAV GUPTA et. al. | Proc. Priv. Enhancing Technol. | 2024-10-01 |
533 | When Transformer Meets Large Graphs: An Expressive and Efficient Two-View Architecture Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The successes of applying Transformer to graphs have been witnessed on small graphs (e.g., molecular graphs), yet two barriers prevent its adoption on large graphs (e.g., citation … |
Weirui Kuang; Zhen Wang; Zhewei Wei; Yaliang Li; Bolin Ding; | IEEE Transactions on Knowledge and Data Engineering | 2024-10-01 |
534 | MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone’s Potential with Masked Autoregressive Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on this analysis, we propose Masked Autoregressive Pretraining (MAP) to pretrain a hybrid Mamba-Transformer vision backbone network. |
Yunze Liu; Li Yi; | arxiv-cs.CV | 2024-10-01 |
535 | TaskComplexity: A Dataset for Task Complexity Classification with In-Context Learning, FLAN-T5 and GPT-4o Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper addresses the challenge of classifying and assigning programming tasks to experts, a process that typically requires significant effort, time, and cost. |
Areeg Fahad Rasheed; M. Zarkoosh; Safa F. Abbas; Sana Sabah Al-Azzawi; | arxiv-cs.CL | 2024-09-30 |
536 | Optimizing Factorized Encoder Models: Time and Memory Reduction for Scalable and Efficient Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the challenges posed by the substantial training time and memory consumption associated with video transformers, focusing on the ViViT (Video Vision Transformer) model, in particular the Factorised Encoder version, as our baseline for action recognition tasks. |
Shreyank N Gowda; Anurag Arnab; Jonathan Huang; | eccv | 2024-09-30 |
537 | GiT: Towards Generalist Vision Transformer Through Universal Language Interface Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a simple yet effective framework, called GiT, simultaneously applicable to various vision tasks with only a vanilla ViT. Interestingly, GiT builds a new benchmark in generalist performance and fosters mutual enhancement across tasks, leading to significant improvements compared to isolated training. |
HAIYANG WANG et. al. | eccv | 2024-09-30 |
538 | An Explainable Vision Question Answer Model Via Diffusion Chain-of-Thought Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This means that generating explanations solely for the answer can lead to a semantic discrepancy between the content of the explanation and the question-answering content. To address this, we propose a step-by-step reasoning approach to reduce such semantic discrepancies. |
Chunhao LU; Qiang Lu; Jake Luo; | eccv | 2024-09-30 |
539 | AutoEval-Video: An Automatic Benchmark for Assessing Large Vision Language Models in Open-Ended Video Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel and challenging benchmark, AutoEval-Video, to comprehensively evaluate large vision-language models in open-ended video question answering. |
Weiran Huang; Xiuyuan Chen; Yuan Lin; Yuchen Zhang; | eccv | 2024-09-30 |
540 | GENIXER: Empowering Multimodal Large Language Models As A Powerful Data Generator Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Genixer, a comprehensive data generation pipeline consisting of four key steps: (i) instruction data collection, (ii) instruction template design, (iii) empowering MLLMs, and (iv) data generation and filtering. |
Henry Hengyuan Zhao; Pan Zhou; Mike Zheng Shou; | eccv | 2024-09-30 |
541 | Evaluating The Fairness of Task-adaptive Pretraining on Unlabeled Test Data Before Few-shot Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Few-shot learning benchmarks are critical for evaluating modern NLP techniques. |
Kush Dubey; | arxiv-cs.CL | 2024-09-30 |
542 | HyTAS: A Hyperspectral Image Transformer Architecture Search Benchmark and Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Transformer architectures have significantly enhanced HSI task performance, while advancements in Transformer Architecture Search (TAS) have improved model discovery. To harness these advancements for HSI classification, we make the following contributions: i) We propose HyTAS, the first benchmark on transformer architecture search for Hyperspectral imaging, ii) We comprehensively evaluate 12 different methods to identify the optimal transformer over 5 different datasets, iii) We perform an extensive factor analysis on the Hyperspectral transformer search performance, greatly motivating future research in this direction. |
Fangqin Zhou; Mert Kilickaya; Joaquin Vanschoren; Ran Piao; | eccv | 2024-09-30 |
543 | LingoQA: Video Question Answering for Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce LingoQA, a novel dataset and benchmark for visual question answering in autonomous driving.We release our dataset and benchmark as an evaluation platform for vision-language models in autonomous driving. |
ANA-MARIA MARCU et. al. | eccv | 2024-09-30 |
544 | Sparse Attention Decomposition Applied to Circuit Tracing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we seek to isolate and identify the features used to effect communication and coordination among attention heads in GPT-2 small. |
Gabriel Franco; Mark Crovella; | arxiv-cs.LG | 2024-09-30 |
545 | OccWorld: Learning A 3D Occupancy World Model for Autonomous Driving IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore a new framework of learning a world model, OccWorld, in the 3D occupancy space to simultaneously predict the movement of the ego car and the evolution of the surrounding scenes. |
WENZHAO ZHENG et. al. | eccv | 2024-09-30 |
546 | Analysis-by-Synthesis Transformer for Single-View 3D Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing methods face limitations in both shape reconstruction and texture generation. This paper introduces an innovative Analysis-by-Synthesis Transformer that addresses these limitations in a unified framework by effectively modeling pixel-to-shape and pixel-to-texture relationships. |
DIAN JIA et. al. | eccv | 2024-09-30 |
547 | MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce MaskMamba, a novel hybrid model that combines Mamba and Transformer architectures, utilizing Masked Image Modeling for non-autoregressive image synthesis. |
Wenchao Chen; Liqiang Niu; Ziyao Lu; Fandong Meng; Jie Zhou; | arxiv-cs.CV | 2024-09-30 |
548 | MART: MultiscAle Relational Transformer Networks for Multi-agent Trajectory Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the hypergraph transformer-based method for trajectory prediction is yet to be explored. Therefore, we present a MultiscAle Relational Transformer (MART) network for multi-agent trajectory prediction. |
Seongju Lee; Junseok Lee; Yeonguk Yu; Taeri Kim; Kyoobin Lee; | eccv | 2024-09-30 |
549 | ACE: All-round Creator and Editor Following Instructions Via Diffusion Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose ACE, an All-round Creator and Editor, which achieves performance comparable to expert models across a wide range of visual generation tasks. |
ZHEN HAN et. al. | arxiv-cs.CV | 2024-09-30 |
550 | Depression Detection in Social Media Posts Using Transformer-based Models and Auxiliary Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing studies have explored various approaches to this problem but often fall short in terms of accuracy and robustness. To address these limitations, this research proposes a neural network architecture leveraging transformer-based models combined with metadata and linguistic markers. |
Marios Kerasiotis; Loukas Ilias; Dimitris Askounis; | arxiv-cs.CL | 2024-09-30 |
551 | Multimodal Misinformation Detection By Learning from Synthetic Data with Multimodal LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the generalizability of detectors trained on synthetic data to real-world scenarios remains unclear due to the distribution gap. To address this, we propose learning from synthetic data for detecting real-world multimodal misinformation through two model-agnostic data selection methods that match synthetic and real-world data distributions. |
Fengzhu Zeng; Wenqian Li; Wei Gao; Yan Pang; | arxiv-cs.CL | 2024-09-29 |
552 | INC-Math: Integrating Natural Language and Code for Enhanced Mathematical Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore fundamental questions related to solving mathematical reasoning problems using natural language and code with state-of-the-art LLMs, including GPT-4o-mini and Llama-3.1-8b-Turbo. |
Xuyuan Xiong; Simeng Han; Ziyue Zhou; Arman Cohan; | arxiv-cs.CL | 2024-09-28 |
553 | 3D-CT-GPT: Generating 3D Radiology Reports Through Integration of Large Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model specifically designed for generating radiology reports from 3D CT scans, particularly chest CTs. |
HAO CHEN et. al. | arxiv-cs.CV | 2024-09-28 |
554 | Analog In-Memory Computing Attention Mechanism for Fast and Energy-Efficient Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a custom self-attention in-memory computing architecture based on emerging charge-based memories called gain cells, which can be efficiently written to store new tokens during sequence generation and enable parallel analog dot-product computation required for self-attention. |
NATHAN LEROUX et. al. | arxiv-cs.NE | 2024-09-28 |
555 | INSIGHTBUDDY-AI: Medication Extraction and Entity Linking Using Large Language Models and Ensemble Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate state-of-the-art LLMs in text mining tasks on medications and their related attributes such as dosage, route, strength, and adverse effects. |
Pablo Romero; Lifeng Han; Goran Nenadic; | arxiv-cs.CL | 2024-09-28 |
556 | Efficient Federated Intrusion Detection in 5G Ecosystem Using Optimized BERT-based Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a robust intrusion detection system (IDS) using federated learning and large language models (LLMs). |
Frederic Adjewa; Moez Esseghir; Leila Merghem-Boulahia; | arxiv-cs.CR | 2024-09-28 |
557 | FoodMLLM-JP: Leveraging Multimodal Large Language Models for Japanese Recipe Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Research on food image understanding using recipe data has been a long-standing focus due to the diversity and complexity of the data. |
Yuki Imajuku; Yoko Yamakata; Kiyoharu Aizawa; | arxiv-cs.CV | 2024-09-27 |
558 | Cottention: Linear Transformers With Cosine Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Cottention, a novel attention mechanism that replaces the softmax operation with cosine similarity. |
Gabriel Mongaras; Trevor Dohm; Eric C. Larson; | arxiv-cs.LG | 2024-09-27 |
559 | Experimental Evaluation of Machine Learning Models for Goal-oriented Customer Service Chatbot with Pipeline Architecture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a tailored experimental evaluation approach for goal-oriented customer service chatbots with pipeline architecture, focusing on three key components: Natural Language Understanding (NLU), dialogue management (DM), and Natural Language Generation (NLG). |
Nurul Ain Nabilah Mohd Isa; Siti Nuraishah Agos Jawaddi; Azlan Ismail; | arxiv-cs.AI | 2024-09-27 |
560 | Suicide Phenotyping from Clinical Notes in Safety-Net Psychiatric Hospital Using Multi-Label Classification with Pre-Trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Pre-trained language models offer promise for identifying suicidality from unstructured clinical narratives. |
ZEHAN LI et. al. | arxiv-cs.CL | 2024-09-27 |
561 | Predicting Anchored Text from Translation Memories for Machine Translation Using Deep Learning Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we show that, for a large portion of anchored words, we can apply other machine-learning approaches such as Word2Vec. |
Richard Yue; John E. Ortega; | arxiv-cs.CL | 2024-09-26 |
562 | The Application of GPT-4 in Grading Design University Students’ Assignment and Providing Feedback: An Exploratory Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to investigate whether GPT-4 can effectively grade assignments for design university students and provide useful feedback. |
Qian Huang; Thijs Willems; King Wang Poon; | arxiv-cs.AI | 2024-09-26 |
563 | Retrospective Comparative Analysis of Prostate Cancer In-Basket Messages: Responses from Closed-Domain LLM Vs. Clinical Teams Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, responding to these patients’ inquiries has become a significant burden on healthcare workflows, consuming considerable time for clinical care teams. To address this, we introduce RadOnc-GPT, a specialized Large Language Model (LLM) powered by GPT-4 that has been designed with a focus on radiotherapeutic treatment of prostate cancer with advanced prompt engineering, and specifically designed to assist in generating responses. |
YUEXING HAO et. al. | arxiv-cs.AI | 2024-09-26 |
564 | Reducing and Exploiting Data Augmentation Noise Through Meta Reweighting Contrastive Learning for Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To boost deep learning models’ performance given augmented data/samples in text classification tasks, we propose a novel framework, which leverages both meta learning and contrastive learning techniques as parts of our design for reweighting the augmented samples and refining their feature representations based on their quality. |
Guanyi Mou; Yichuan Li; Kyumin Lee; | arxiv-cs.CL | 2024-09-25 |
565 | Comparing Unidirectional, Bidirectional, and Word2vec Models for Discovering Vulnerabilities in Compiled Lifted Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using a dataset of LLVM functions, we trained a GPT-2 model to generate embeddings, which were subsequently used to build LSTM neural networks to differentiate between vulnerable and non-vulnerable code. |
Gary A. McCully; John D. Hastings; Shengjie Xu; Adam Fortier; | arxiv-cs.CR | 2024-09-25 |
566 | Assessing The Level of Toxicity Against Distinct Groups in Bangla Social Media Comments: A Comprehensive Investigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study focuses on identifying toxic comments in the Bengali language targeting three specific groups: transgender people, indigenous people, and migrant people, from multiple social media sources. |
Mukaffi Bin Moin; Pronay Debnath; Usafa Akther Rifa; Rijeet Bin Anis; | arxiv-cs.CL | 2024-09-25 |
567 | Beyond Turing Test: Can GPT-4 Sway Experts’ Decisions? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the post-Turing era, evaluating large language models (LLMs) involves assessing generated text based on readers’ reactions rather than merely its indistinguishability from human-produced content. |
Takehiro Takayanagi; Hiroya Takamura; Kiyoshi Izumi; Chung-Chi Chen; | arxiv-cs.CE | 2024-09-25 |
568 | SynChart: Synthesizing Charts from Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We construct a large-scale chart dataset, SynChart, which contains approximately 4 million diverse chart images with over 75 million dense annotations, including data tables, code, descriptions, and question-answer sets. We trained a 4.2B chart-expert model using this dataset and achieve near-GPT-4o performance on the ChartQA task, surpassing GPT-4V. |
MENGCHEN LIU et. al. | arxiv-cs.AI | 2024-09-24 |
569 | GPT-4 As A Homework Tutor Can Improve Student Engagement and Learning Outcomes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work contributes to the scarce empirical literature on LLM-based interactive homework in real-world educational settings and offers a practical, scalable solution for improving homework in schools. |
Alessandro Vanzo; Sankalan Pal Chowdhury; Mrinmaya Sachan; | arxiv-cs.CY | 2024-09-24 |
570 | Unveiling Language Competence Neurons: A Psycholinguistic Approach to Model Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As large language models (LLMs) advance in their linguistic capacity, understanding how they capture aspects of language competence remains a significant challenge. This study therefore employs psycholinguistic paradigms in English, which are well-suited for probing deeper cognitive aspects of language processing, to explore neuron-level representations in language models across three tasks: sound-shape association, sound-gender association, and implicit causality. |
Xufeng Duan; Xinyu Zhou; Bei Xiao; Zhenguang G. Cai; | arxiv-cs.CL | 2024-09-24 |
571 | MonoFormer: One Transformer for Both Diffusion and Autoregression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose to study a simple idea: share one transformer for both autoregression and diffusion. |
CHUYANG ZHAO et. al. | arxiv-cs.CV | 2024-09-24 |
572 | SDBA: A Stealthy and Long-Lasting Durable Backdoor Attack in Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces SDBA, a novel backdoor attack mechanism designed for NLP tasks in FL environments. |
Minyeong Choe; Cheolhee Park; Changho Seo; Hyunil Kim; | arxiv-cs.LG | 2024-09-23 |
573 | SOFI: Multi-Scale Deformable Transformer for Camera Calibration with Enhanced Line Queries Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce the multi-Scale defOrmable transFormer for camera calibratIon with enhanced line queries, SOFI. |
Sebastian Janampa; Marios Pattichis; | arxiv-cs.CV | 2024-09-23 |
574 | Towards A Realistic Long-Term Benchmark for Open-Web Research Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present initial results of a forthcoming benchmark for evaluating LLM agents on white-collar tasks of economic value. |
Peter Mühlbacher; Nikos I. Bosse; Lawrence Phillips; | arxiv-cs.CL | 2024-09-23 |
575 | Improving Academic Skills Assessment with NLP and Ensemble Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study addresses the critical challenges of assessing foundational academic skills by leveraging advancements in natural language processing (NLP). |
Xinyi Huang; Yingyi Wu; Danyang Zhang; Jiacheng Hu; Yujian Long; | arxiv-cs.CL | 2024-09-23 |
576 | Can Pre-trained Language Models Generate Titles for Research Papers? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we fine-tune pre-trained language models to generate titles of papers from their abstracts. |
Tohida Rehman; Debarshi Kumar Sanyal; Samiran Chattopadhyay; | arxiv-cs.CL | 2024-09-22 |
577 | Evaluating The Quality of Code Comments Generated By Large Language Models for Novice Programmers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) show promise in generating code comments for novice programmers, but their educational effectiveness remains under-evaluated. |
Aysa Xuemo Fan; Arun Balajiee Lekshmi Narayanan; Mohammad Hassany; Jiaze Ke; | arxiv-cs.SE | 2024-09-22 |
578 | The Use of GPT-4o and Other Large Language Models for The Improvement and Design of Self-Assessment Scales for Measurement of Interpersonal Communication Skills Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: OpenAI’s ChatGPT (GPT-4 and GPT-4o) and other Large Language Models (LLMs) like Microsoft’s Copilot, Google’s Gemini 1.5 Pro, and Anthropic’s Claude 3.5 Sonnet can be effectively used in various phases of scientific research. |
Goran Bubaš; | arxiv-cs.AI | 2024-09-21 |
579 | Normalized Narrow Jump To Conclusions: Normalized Narrow Shortcuts for Parameter Efficient Early Exit Transformer Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Narrow Jump to Conclusions (NJTC) and Normalized Narrow Jump to Conclusions (N-NJTC) – parameter-efficient alternatives to standard linear shortcutting that reduce shortcut parameter count by over 97%. |
Amrit Diggavi Seshadri; | arxiv-cs.AI | 2024-09-21 |
580 | AI Assistants for Spaceflight Procedures: Combining Generative Pre-Trained Transformer and Retrieval-Augmented Generation on Knowledge Graphs With Augmented Reality Cues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes the capabilities and potential of the intelligent personal assistant (IPA) CORE (Checklist Organizer for Research and Exploration), designed to support astronauts during procedures onboard the International Space Station (ISS), the Lunar Gateway station, and beyond. |
OLIVER BENSCH et. al. | arxiv-cs.AI | 2024-09-21 |
581 | Loop Neural Networks for Parameter Sharing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel Loop Neural Network, which achieves better performance by utilizing longer computational time without increasing the model size. |
Kei-Sing Ng; Qingchen Wang; | arxiv-cs.AI | 2024-09-21 |
582 | On Importance of Pruning and Distillation for Efficient Low Resource NLP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explore the case of the low-resource Indic language Marathi. |
AISHWARYA MIRASHI et. al. | arxiv-cs.CL | 2024-09-21 |
583 | HUT: A More Computation Efficient Fine-Tuning Method With Hadamard Updated Transformation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This approach ensures that the correlation between the original and updated parameters is preserved, leveraging the semantic features learned during pre-training. Building on this paradigm, we present the Hadamard Updated Transformation (HUT) method. |
Geyuan Zhang; Xiaofei Zhou; Chuheng Chen; | arxiv-cs.CL | 2024-09-20 |
584 | T2M-X: Learning Expressive Text-to-Motion Generation from Partially Annotated Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose T2M-X, a two-stage method that learns expressive text-to-motion generation from partially annotated data. |
Mingdian Liu; Yilin Liu; Gurunandan Krishnan; Karl S Bayer; Bing Zhou; | arxiv-cs.CV | 2024-09-20 |
585 | Prompting Large Language Models for Supporting The Differential Diagnosis of Anemia Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study aimed to develop diagnostic pathways similar to those found in clinical guidelines. |
Elisa Castagnari; Lillian Muyama; Adrien Coulet; | arxiv-cs.CL | 2024-09-20 |
586 | FAIR GPT: A Virtual Consultant for Research Data Management in ChatGPT Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: FAIR GPT is the first virtual consultant in ChatGPT designed to help researchers and organizations make their data and metadata compliant with the FAIR (Findable, Accessible, … |
R. Shigapov; Irene Schumm; | ArXiv | 2024-09-20 |
587 | Applying Pre-trained Multilingual BERT in Embeddings for Improved Malicious Prompt Injection Attacks Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) are renowned for their exceptional capabilities and are applied across a wide range of applications. |
Md Abdur Rahman; Hossain Shahriar; Fan Wu; Alfredo Cuzzocrea; | arxiv-cs.CL | 2024-09-20 |
588 | Drift to Remember Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We hypothesize that representational drift can alleviate catastrophic forgetting in AI during new task acquisition. To test this, we introduce DriftNet, a network designed to constantly explore various local minima in the loss landscape while dynamically retrieving relevant tasks. |
JIN DU et. al. | arxiv-cs.AI | 2024-09-20 |
589 | ‘Since Lawyers Are Males..’: Examining Implicit Gender Bias in Hindi Language Generation By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) are increasingly being used to generate text across various languages, for tasks such as translation, customer support, and education. |
Ishika Joshi; Ishita Gupta; Adrita Dey; Tapan Parikh; | arxiv-cs.CL | 2024-09-20 |
590 | 3DTopia-XL: Scaling High-quality 3D Asset Generation Via Primitive Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite recent advancements in 3D generative models, existing methods still face challenges with optimization speed, geometric fidelity, and the lack of assets for physically based rendering (PBR). In this paper, we introduce 3DTopia-XL, a scalable native 3D generative model designed to overcome these limitations. |
ZHAOXI CHEN et. al. | arxiv-cs.CV | 2024-09-19 |
591 | M^6(GPT)^3: Generating Multitrack Modifiable Multi-Minute MIDI Music from Text Using Genetic Algorithms, Probabilistic Methods and GPT Models in Any Progression and Time Signature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a genetic algorithm for the generation of melodic elements. |
Jakub Poćwiardowski; Mateusz Modrzejewski; Marek S. Tatara; | arxiv-cs.SD | 2024-09-19 |
592 | TACO-RL: Task Aware Prompt Compression Optimization with Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing prompt compression techniques either rely on sub-optimal metrics such as information entropy or model it as a task-agnostic token classification problem that fails to capture task-specific information. To address these issues, we propose a novel and efficient reinforcement learning (RL) based task-aware prompt compression method. |
SHIVAM SHANDILYA et. al. | arxiv-cs.CL | 2024-09-19 |
593 | Introducing The Large Medical Model: State of The Art Healthcare Cost and Risk Prediction with Transformers Trained on Patient Event Sequences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces the Large Medical Model (LMM), a generative pre-trained transformer (GPT) designed to guide and predict the broad facets of patient care and healthcare administration. |
RICKY SAHU et. al. | arxiv-cs.LG | 2024-09-19 |
594 | Program Slicing in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the application of large language models (LLMs) to both static and dynamic program slicing, with a focus on Java programs. |
Kimya Khakzad Shahandashti; Mohammad Mahdi Mohajer; Alvine Boaye Belle; Song Wang; Hadi Hemmati; | arxiv-cs.SE | 2024-09-18 |
595 | Recommendation with Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a taxonomy that categorizes DGMs into three types: ID-driven models, large language models (LLMs), and multimodal models. |
YASHAR DELDJOO et. al. | arxiv-cs.IR | 2024-09-18 |
596 | AlignBot: Aligning VLM-powered Customized Task Planning with User Reminders Through Fine-Tuning for Household Robots Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents AlignBot, a novel framework designed to optimize VLM-powered customized task planning for household robots by effectively aligning with user reminders. |
PENGAN CHEN et. al. | arxiv-cs.RO | 2024-09-18 |
597 | American Sign Language to Text Translation Using Transformer and Seq2Seq with LSTM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares the Transformer with the Sequence-to-Sequence (Seq2Seq) model in translating sign language to text. |
Gregorius Guntur Sunardi Putra; Adifa Widyadhani Chanda D’Layla; Dimas Wahono; Riyanarto Sarno; Agus Tri Haryono; | arxiv-cs.CL | 2024-09-17 |
598 | Small Language Models Can Outperform Humans in Short Creative Writing: A Study Comparing SLMs with Humans and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we evaluate the creative fiction writing abilities of a fine-tuned small language model (SLM), BART-large, and compare its performance to human writers and two large language models (LLMs): GPT-3.5 and GPT-4o. |
Guillermo Marco; Luz Rello; Julio Gonzalo; | arxiv-cs.CL | 2024-09-17 |
599 | Adaptive Large Language Models By Layerwise Attention Shortcuts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they are based on simply stacking the same blocks in dozens of layers and processing information sequentially from one block to another. In this paper, we propose to challenge this and introduce adaptive computations for LLM-like setups, which allow the final layer to attend to all of the intermediate layers as it deems fit through the attention mechanism, thereby introducing computational attention shortcuts. |
Prateek Verma; Mert Pilanci; | arxiv-cs.CL | 2024-09-16 |
600 | SelECT-SQL: Self-correcting Ensemble Chain-of-Thought for Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce SelECT-SQL, a novel in-context learning solution that uses an algorithmic combination of chain-of-thought (CoT) prompting, self-correction, and ensemble methods to yield a new state-of-the-art result on challenging Text-to-SQL benchmarks. |
Ke Shen; Mayank Kejriwal; | arxiv-cs.CL | 2024-09-16 |
601 | Code Vulnerability Detection: A Comparative Analysis of Emerging Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study aims to shed light on the capabilities of LLMs in vulnerability detection, contributing to the enhancement of software security practices across diverse open-source repositories. |
Shaznin Sultana; Sadia Afreen; Nasir U. Eisty; | arxiv-cs.SE | 2024-09-16 |
602 | Can GPT-O1 Kill All Bugs? An Evaluation of GPT-Family LLMs on QuixBugs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, inspired by the recent public release of the GPT-o1 models, we conduct the first study to compare the effectiveness of different versions of the GPT-family models in APR. |
Haichuan Hu; Ye Shang; Guolin Xu; Congqing He; Quanjun Zhang; | arxiv-cs.SE | 2024-09-16 |
603 | LLMs for Clinical Risk Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares the efficacy of GPT-4 and clinalytix Medical AI in predicting the clinical risk of delirium development. |
Mohamed Rezk; Patricia Cabanillas Silva; Fried-Michael Dahlweid; | arxiv-cs.CL | 2024-09-16 |
604 | Investigating The Impact of Code Comment Inconsistency on Bug Introducing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our research investigates the impact of code-comment inconsistency on bug introduction using large language models, specifically GPT-3.5. |
Shiva Radmanesh; Aaron Imani; Iftekhar Ahmed; Mohammad Moshirpour; | arxiv-cs.SE | 2024-09-16 |
605 | GP-GPT: Large Language Model for Gene-Phenotype Mapping Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the complex traits and heterogeneity of multi-source genomics data pose significant challenges when adapting these models to the bioinformatics and biomedical field. To address these challenges, we present GP-GPT, the first specialized large language model for genetic-phenotype knowledge representation and genomics relation analysis. |
YANJUN LYU et. al. | arxiv-cs.CL | 2024-09-15 |
606 | Detection Made Easy: Potentials of Large Language Models for Solidity Vulnerabilities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive investigation of the use of large language models (LLMs) and their capabilities in detecting OWASP Top Ten vulnerabilities in Solidity. |
Md Tauseef Alam; Raju Halder; Abyayananda Maiti; | arxiv-cs.CR | 2024-09-15 |
607 | Leveraging Open-Source Large Language Models for Native Language Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Native Language Identification (NLI) – the task of identifying the native language (L1) of a person based on their writing in the second language (L2) – has applications in forensics, marketing, and second language acquisition. Historically, conventional machine learning approaches that heavily rely on extensive feature engineering have outperformed transformer-based language models on this task. |
Yee Man Ng; Ilia Markov; | arxiv-cs.CL | 2024-09-15 |
608 | CAT: Customized Transformer Accelerator Framework on Versal ACAP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It is far more flexible than a GPU for hardware customization and offers a better, smaller design solution space than a traditional FPGA. Therefore, this paper proposes the Customized Transformer Accelerator Framework (CAT). Through the CAT framework, a customized Transformer accelerator family can be derived on Versal ACAP; the framework follows an abstract accelerator architecture design that deconstructs the Transformer and efficiently maps it into hardware, with a variety of customizable properties. |
Wenbo Zhang; Yiqi Liu; Zhenshan Bao; | arxiv-cs.AR | 2024-09-15 |
609 | Enhancing LLM Problem Solving with REAP: Reflection, Explicit Problem Deconstruction, and Advanced Prompting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) have transformed natural language processing, yet improving their problem-solving capabilities, particularly for complex, reasoning-intensive tasks, … |
Ryan Lingo; Martin Arroyo; Rajeev Chhajer; | ArXiv | 2024-09-14 |
610 | Evaluating Authenticity and Quality of Image Captions Via Sentiment and Semantic Analyses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes an evaluation method focused on sentiment and semantic richness. |
Aleksei Krotov; Alison Tebo; Dylan K. Picart; Aaron Dean Algave; | arxiv-cs.CV | 2024-09-14 |
611 | Autoregressive + Chain of Thought = Recurrent: Recurrence’s Role in Language Models’ Computability and A Revisit of Recurrent Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we thoroughly investigate the influence of recurrent structures in neural models on their reasoning abilities and computability, contrasting the role autoregression plays in the neural models’ computational power. |
Xiang Zhang; Muhammad Abdul-Mageed; Laks V. S. Lakshmanan; | arxiv-cs.CL | 2024-09-13 |
612 | Undergrads Are All You Have Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The paper also demonstrates that GPT-UGRD is cheaper and easier to train and operate than transformer models. In this paper, we outline the implementation, application, multi-tenanting, and social implications of using this new model in research and other contexts. |
Ashe Neth; | arxiv-cs.CY | 2024-09-13 |
613 | Enhanced Online Grooming Detection Employing Context Determination and Message-Level Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper’s contributions include improved detection methodologies and the potential for application in various scenarios, addressing gaps in current literature and practices. |
Jake Street; Isibor Ihianle; Funminiyi Olajide; Ahmad Lotfi; | arxiv-cs.LG | 2024-09-12 |
614 | SDformer: Efficient End-to-End Transformer for Depth Completion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a different window-based Transformer architecture for depth completion tasks named Sparse-to-Dense Transformer (SDformer). |
JIAN QIAN et. al. | arxiv-cs.CV | 2024-09-12 |
615 | Towards Fairer Health Recommendations: Finding Informative Unbiased Samples Via Word Sense Disambiguation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, some of these terms, especially those related to race and ethnicity, can carry different meanings (e.g., white matter of spinal cord). To address this issue, we propose the use of Word Sense Disambiguation models to refine dataset quality by removing irrelevant sentences. |
GAVIN BUTTS et. al. | arxiv-cs.CL | 2024-09-11 |
616 | How Effectively Do LLMs Extract Feature-Sentiment Pairs from App Reviews? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The goal of our study is to explore the capabilities of LLMs to perform feature-specific sentiment analysis of user reviews. |
Faiz Ali Shah; Ahmed Sabir; Rajesh Sharma; Dietmar Pfahl; | arxiv-cs.CL | 2024-09-11 |
617 | A Novel Mathematical Framework for Objective Characterization of Ideas Through Vector Embeddings in LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This method suffers from limitations such as human judgment errors, bias, and oversight. Addressing this gap, our study introduces a comprehensive mathematical framework for automated analysis to objectively evaluate the plethora of ideas generated by CAI systems and/or humans. |
B. Sankar; Dibakar Sen; | arxiv-cs.AI | 2024-09-11 |
618 | Analysis of Responses of GPT-4 V to The Japanese National Clinical Engineer Licensing Examination Related Papers Related Patents Related Grants Related Venues Related Experts View |
Kai Ishida; Naoya Arisaka; Kiyotaka Fujii; | Journal of medical systems | 2024-09-11 |
619 | GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we address the challenges of using LLM-as-a-Judge when evaluating grounded answers generated by RAG systems. |
Sacha Muller; António Loison; Bilel Omrani; Gautier Viaud; | arxiv-cs.CL | 2024-09-10 |
620 | Identifying The Sources of Ideological Bias in GPT Models Through Linguistic Variation in Output Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we provide an original approach to identifying ideological bias in generative models, showing that bias can stem from both the training data and the filtering algorithm. |
Christina Walker; Joan C. Timoneda; | arxiv-cs.CL | 2024-09-09 |
621 | Harmonic Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) are becoming very popular and are used for many different purposes, including creative tasks in the arts. |
Anna Kruspe; | arxiv-cs.CL | 2024-09-09 |
622 | FairHome: A Fair Housing and Fair Lending Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a Fair Housing and Fair Lending dataset (FairHome): A dataset with around 75,000 examples across 9 protected categories. |
Anusha Bagalkotkar; Aveek Karmakar; Gabriel Arnson; Ondrej Linda; | arxiv-cs.LG | 2024-09-09 |
623 | NOVI : Chatbot System for University Novice with BERT and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To mitigate the difficulties of university freshmen in adapting to university life, we developed NOVI, a chatbot system based on GPT-4o. |
Yoonji Nam; TaeWoong Seo; Gyeongcheol Shin; Sangji Lee; JaeEun Im; | arxiv-cs.CL | 2024-09-09 |
624 | Can Large Language Models Unlock Novel Scientific Research Ideas? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study explores the capability of LLMs in generating novel research ideas based on information from research papers. |
Sandeep Kumar; Tirthankar Ghosal; Vinayak Goyal; Asif Ekbal; | arxiv-cs.CL | 2024-09-09 |
625 | Low Latency Transformer Inference on FPGAs for Physics Applications with Hls4ml Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents an efficient implementation of transformer architectures in Field-Programmable Gate Arrays (FPGAs) using hls4ml. |
ZHIXING JIANG et. al. | arxiv-cs.LG | 2024-09-08 |
626 | TracrBench: Generating Interpretability Testbeds with Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Achieving a mechanistic understanding of transformer-based language models is an open challenge, especially due to their large number of parameters. Moreover, the lack of ground … |
Hannes Thurnherr; Jérémy Scheurer; | ArXiv | 2024-09-07 |
627 | You Can Remove GPT2’s LayerNorm By Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we show that it is possible to remove the LN layers from a pre-trained GPT2-small model by fine-tuning on a fraction (500M tokens) of the training data. |
Stefan Heimersheim; | arxiv-cs.CL | 2024-09-06 |
628 | The Emergence of Large Language Models (LLM) As A Tool in Literature Reviews: An LLM Automated Systematic Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: This study aims to summarize the usage of Large Language Models (LLMs) in the process of creating a scientific review. |
Dmitry Scherbakov; Nina Hubig; Vinita Jansari; Alexander Bakumenko; Leslie A. Lenert; | arxiv-cs.DL | 2024-09-06 |
629 | Hermes: Memory-Efficient Pipeline Inference for Large Models on Edge Devices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on PIPELOAD mechanism, we present Hermes, a framework optimized for large model inference on edge devices. |
XUEYUAN HAN et. al. | arxiv-cs.DC | 2024-09-06 |
630 | CA-BERT: Leveraging Context Awareness for Enhanced Multi-Turn Chat Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional models often struggle with determining when additional context is necessary for generating appropriate responses. This paper introduces Context-Aware BERT (CA-BERT), a transformer-based model specifically fine-tuned to address this challenge. |
Minghao Liu; Mingxiu Sui; Yi Nan; Cangqing Wang; Zhijie Zhou; | arxiv-cs.CL | 2024-09-05 |
631 | From GPT-3.5 to GPT-4.o: A Leap in AI’s Medical Exam Performance Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: ChatGPT is a large language model trained on increasingly large datasets to perform diverse language-based tasks. It is capable of answering multiple-choice questions, such as … |
Markus Kipp; | Inf. | 2024-09-05 |
632 | Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: A popular new method in mechanistic interpretability is to train high-dimensional sparse autoencoders (SAEs) on neuron activations and use SAE features as the atomic units of … |
Maheep Chaudhary; Atticus Geiger; | ArXiv | 2024-09-05 |
633 | LLM-based Multi-agent Poetry Generation in Non-cooperative Environments Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Under the rationale that the learning process of the poetry generation systems should be more human-like and their output more diverse and novel, we introduce a framework based on social learning where we emphasize non-cooperative interactions besides cooperative interactions to encourage diversity. |
Ran Zhang; Steffen Eger; | arxiv-cs.CL | 2024-09-05 |
634 | CACER: Clinical Concept Annotations for Cancer Events and Relations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Clinical Concept Annotations for Cancer Events and Relations (CACER), a novel corpus with fine-grained annotations for over 48,000 medical problems and drug events and 10,000 drug-problem and problem-problem relations. |
YUJUAN FU et. al. | arxiv-cs.CL | 2024-09-05 |
635 | Detecting Calls to Action in Multimodal Content: Analysis of The 2021 German Federal Election Campaign on Instagram Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the automated classification of Calls to Action (CTAs) within the 2021 German Instagram election campaign to advance the understanding of mobilization in social media contexts. |
Michael Achmann-Denkler; Jakob Fehle; Mario Haim; Christian Wolff; | arxiv-cs.SI | 2024-09-04 |
636 | A Comparative Study of Sentiment Classification Models for Greek Reviews Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In recent years, people have expressed their opinions and sentiments about products, services, and other issues on social media platforms and review websites. These sentiments are … |
Panagiotis D. Michailidis; | Big Data Cogn. Comput. | 2024-09-04 |
637 | OpenFact at CheckThat! 2024: Combining Multiple Attack Methods for Effective Adversarial Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the experiments and results for the CheckThat! |
WŁODZIMIERZ LEWONIEWSKI et. al. | arxiv-cs.CL | 2024-09-04 |
638 | MobileUNETR: A Lightweight End-To-End Hybrid Vision Transformer For Efficient Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Finally, many Transformer-based approaches rely primarily on CNN-based decoders, overlooking the benefits of Transformer-based decoding models. Recognizing these limitations, we address the need for efficient, lightweight solutions by introducing MobileUNETR, which aims to overcome the performance constraints associated with both CNNs and Transformers while minimizing model size, presenting a promising stride towards efficient image segmentation. |
Shehan Perera; Yunus Erzurumlu; Deepak Gulati; Alper Yilmaz; | arxiv-cs.CV | 2024-09-04 |
639 | LifeGPT: Topology-Agnostic Generative Pretrained Transformer Model for Cellular Automata Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Modeling and predicting such intricate behavior without explicit knowledge of the system’s underlying topology presents a significant challenge, motivating the development of algorithms that can generalize across various grid configurations and boundary conditions. We develop a decoder-only generative pretrained transformer (GPT) model to solve this problem, showing that our model can simulate Life on a toroidal grid with no prior knowledge on the size of the grid, or its periodic boundary conditions (LifeGPT). |
Jaime A. Berkovich; Markus J. Buehler; | arxiv-cs.AI | 2024-09-03 |
640 | How Privacy-Savvy Are Large Language Models? A Case Study on Compliance and Privacy Technical Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper seeks to address this gap by providing a comprehensive case study evaluating LLMs’ performance in privacy-related tasks such as privacy information extraction (PIE), legal and regulatory key point detection (KPD), and question answering (QA) with respect to privacy policies and data protection regulations. We introduce a Privacy Technical Review (PTR) framework, highlighting its role in mitigating privacy risks during the software development life-cycle. |
XICHOU ZHU et. al. | arxiv-cs.CL | 2024-09-03 |
641 | Dialogue You Can Trust: Human and AI Perspectives on Generated Conversations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Utilizing the GPT-4o API, we generated a diverse dataset of conversations and conducted a two-part experimental analysis. |
Ike Ebubechukwu; Johane Takeuchi; Antonello Ceravola; Frank Joublin; | arxiv-cs.CL | 2024-09-03 |
642 | Beyond ChatGPT: Enhancing Software Quality Assurance Tasks with Diverse LLMs and Validation Techniques Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: There remains a gap in understanding the performance of various LLMs in this critical domain. This paper aims to address this gap by conducting a comprehensive investigation into the capabilities of several LLMs across two SQA tasks: fault localization and vulnerability detection. |
Ratnadira Widyasari; David Lo; Lizi Liao; | arxiv-cs.SE | 2024-09-02 |
643 | Research on LLM Acceleration Using The High-Performance RISC-V Processor Xiangshan (Nanhu Version) Based on The Open-Source Matrix Instruction Set Extension (Vector Dot Product) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main contributions of this paper are as follows: For the characteristics of large language models, custom instructions were extended based on the RISC-V instruction set to perform vector dot product calculations, accelerating the computation of large language models on dedicated vector dot product acceleration hardware. |
XU-HAO CHEN et. al. | arxiv-cs.AR | 2024-09-01 |
644 | A Current-Fed Transformer-Based High-Gain DC–DC Converter With Inverse Gain Characteristic for Renewable Energy Applications Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This article proposes a current-fed high-gain high-efficiency dc–dc converter for renewable energy applications such as photovoltaic energy. The converter is based on the modified … |
E. A. O. Barbosa; Mário Lúcio da Silva Martins; L. Limongi; R. Neto; E. J. Barbosa; | IEEE Transactions on Industrial Electronics | 2024-09-01 |
645 | Towards Faster Graph Partitioning Via Pre-training and Inductive Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Following the IEEE HPEC Graph Challenge and recent advances in pre-training techniques (e.g., large-language models), we propose PR-GPT (Pre-trained & Refined Graph ParTitioning) based on a novel pre-training & refinement paradigm. |
MENG QIN et. al. | arxiv-cs.LG | 2024-09-01 |
646 | Selective Information Flow for Transformer Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View |
J. Kugarajeevan; T. Kokul; Amirthalingam Ramanan; S. Fernando; | Expert Syst. Appl. | 2024-09-01 |
647 | An Empirical Study on Information Extraction Using Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our results suggest a visible performance gap between GPT-4 and state-of-the-art (SOTA) IE methods. To alleviate this problem, considering the LLMs’ human-like characteristics, we propose and analyze the effects of a series of simple prompt-based methods, which can be generalized to other LLMs and NLP tasks. |
RIDONG HAN et. al. | arxiv-cs.CL | 2024-08-31 |
648 | Finding Frames with BERT: A Transformer-based Approach to Generic News Frame Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The availability of digital data offers new possibilities for studying how specific aspects of social reality are made more salient in online communication, but also raises challenges related to the scaling of framing analysis and its adaptation to new research areas (e.g. studying the impact of artificial intelligence-powered systems on representation of societally relevant issues). To address these challenges, we introduce a transformer-based approach for generic news frame detection in Anglophone online content. |
Vihang Jumle; Mykola Makhortykh; Maryna Sydorova; Victoria Vziatysheva; | arxiv-cs.CL | 2024-08-30 |
649 | From Text to Emotion: Unveiling The Emotion Annotation Capabilities of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the potential of Large Language Models (LLMs), specifically GPT4, in automating or assisting emotion annotation. |
Minxue Niu; Mimansa Jaiswal; Emily Mower Provost; | arxiv-cs.CL | 2024-08-30 |
650 | Retrieval-Augmented Natural Language Reasoning for Explainable Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a new VQA-NLE model, ReRe (Retrieval-augmented natural language Reasoning), which leverages retrieval information from memory to aid in generating accurate answers and persuasive explanations without relying on complex networks and extra datasets. |
Su Hyeon Lim; Minkuk Kim; Hyeon Bae Kim; Seong Tae Kim; | arxiv-cs.CV | 2024-08-30 |
651 | Improving Extraction of Clinical Event Contextual Properties from Electronic Health Records: A Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study performs a comparative analysis of various natural language models for medical text classification. |
SHUBHAM AGARWAL et. al. | arxiv-cs.CL | 2024-08-30 |
652 | Assessing Generative Language Models in Classification Tasks: Performance and Self-Evaluation Capabilities in The Environmental and Climate Change Domain Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Through this research, we aim to contribute to the ongoing discussion on the utility and effectiveness of generative LMs in addressing some of the planet’s most urgent issues, highlighting their strengths and limitations in the context of ecology and CC. |
Francesca Grasso; Stefano Locci; | arxiv-cs.CL | 2024-08-30 |
653 | ProGRes: Prompted Generative Rescoring on ASR N-Best Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a novel method that uses instruction-tuned LLMs to dynamically expand the n-best speech recognition hypotheses with new hypotheses generated through appropriately-prompted LLMs. |
Ada Defne Tur; Adel Moumen; Mirco Ravanelli; | arxiv-cs.CL | 2024-08-30 |
654 | Assessing Large Language Models for Online Extremism Research: Identification, Explanation, and New Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates the performance of Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-Trained Transformers (GPT) in detecting and classifying online domestic extremist posts. We collected social media posts containing far-right and far-left ideological keywords and manually labeled them as extremist or non-extremist. |
Beidi Dong; Jin R. Lee; Ziwei Zhu; Balassubramanian Srinivasan; | arxiv-cs.CL | 2024-08-29 |
655 | MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Orthogonally, in this work we rely solely on imitation learning that leverages a large dataset of expert MAPF solutions and transformer-based neural network to create a foundation model for MAPF called MAPF-GPT. |
Anton Andreychuk; Konstantin Yakovlev; Aleksandr Panov; Alexey Skrynnik; | arxiv-cs.MA | 2024-08-29 |
656 | Training Ultra Long Context Language Model with Fully Pipelined Distributed Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Fully Pipelined Distributed Transformer (FPDT) for efficiently training long-context LLMs with extreme hardware efficiency. |
JINGHAN YAO et. al. | arxiv-cs.DC | 2024-08-29 |
657 | Can AI Replace Human Subjects? A Large-Scale Replication of Psychological Experiments with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find that GPT-4 successfully replicates 76.0 percent of main effects and 47.0 percent of interaction effects observed in the original studies, closely mirroring human responses in both direction and significance. |
Ziyan Cui; Ning Li; Huaikang Zhou; | arxiv-cs.CL | 2024-08-29 |
658 | Unleashing The Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an Audio-Language-Referenced SAM 2 (AL-Ref-SAM 2) pipeline to explore the training-free paradigm for audio and language-referenced video object segmentation, namely AVS and RVOS tasks. |
SHAOFEI HUANG et. al. | arxiv-cs.CV | 2024-08-28 |
659 | FRACTURED-SORRY-Bench: Framework for Revealing Attacks in Conversational Turns Undermining Refusal Efficacy and Defenses Over SORRY-Bench (Automated Multi-shot Jailbreaks) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces FRACTURED-SORRY-Bench, a framework for evaluating the safety of Large Language Models (LLMs) against multi-turn conversational attacks. |
Aman Priyanshu; Supriti Vijay; | arxiv-cs.CL | 2024-08-28 |
660 | Speech Recognition Transformers: Topological-lingualism Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The paper presents a comprehensive survey of transformer techniques applied to the speech modality. |
Shruti Singh; Muskaan Singh; Virender Kadyan; | arxiv-cs.CL | 2024-08-27 |
661 | A Review of Transformer-Based Models for Computer Vision Tasks: Capturing Global Context and Spatial Relationships Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this review paper, we provide an extensive overview of various transformer architectures adapted for computer vision tasks. |
Gracile Astlin Pereira; Muhammad Hussain; | arxiv-cs.CV | 2024-08-27 |
662 | Non-instructional Fine-tuning: Enabling Instruction-Following Capabilities in Pre-trained Language Models Without Instruction-Following Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Instruction fine-tuning is crucial for today’s large language models (LLMs) to learn to follow instructions and align with human preferences. Conventionally, supervised data, … |
Juncheng Xie; Shensian Syu; Hung-yi Lee; | ArXiv | 2024-08-27 |
663 | The Mamba in The Llama: Distilling and Accelerating Hybrid Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Linear RNN architectures, like Mamba, can be competitive with Transformer models in language modeling while having advantageous deployment characteristics. Given the focus on training large-scale Transformer models, we consider the challenge of converting these pretrained models for deployment. |
Junxiong Wang; Daniele Paliotta; Avner May; Alexander M. Rush; Tri Dao; | arxiv-cs.LG | 2024-08-27 |
664 | Evaluating Large Language Models on Spatial Tasks: A Multi-Task Benchmarking Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluated multiple models, including OpenAI’s gpt-3.5-turbo, gpt-4-turbo, gpt-4o, ZhipuAI’s glm-4, Anthropic’s claude-3-sonnet-20240229, and MoonShot’s moonshot-v1-8k, using a two-phase testing approach. |
LIUCHANG XU et. al. | arxiv-cs.CL | 2024-08-26 |
665 | One-layer Transformers Fail to Solve The Induction Heads Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A simple communication complexity argument proves that no one-layer transformer can solve the induction heads task unless its size is exponentially larger than the size sufficient … |
Clayton Sanford; Daniel Hsu; Matus Telgarsky; | arxiv-cs.LG | 2024-08-26 |
666 | Beyond Detection: Leveraging Large Language Models for Cyber Attack Prediction in IoT Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite considerable efforts in attack detection, intrusion detection systems remain mostly reactive, responding to specific patterns or observed anomalies. This work proposes a proactive approach to anticipate and mitigate malicious activities before they cause damage. |
Alaeddine Diaf; Abdelaziz Amara Korba; Nour Elislem Karabadji; Yacine Ghamri-Doudane; | arxiv-cs.CR | 2024-08-26 |
667 | Bidirectional Awareness Induction in Autoregressive Seq2Seq Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Bidirectional Awareness Induction (BAI), a training method that leverages a subset of elements in the network, the Pivots, to perform bidirectional learning without breaking the autoregressive constraints. |
Jia Cheng Hu; Roberto Cavicchioli; Alessandro Capotondi; | arxiv-cs.CL | 2024-08-25 |
668 | Vision-Language and Large Language Model Performance in Gastroenterology: GPT, Claude, Llama, Phi, Mistral, Gemma, and Quantized Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Methods: We used 300 gastroenterology board exam-style multiple-choice questions, 138 of which contain images to systematically assess the impact of model configurations and parameters and prompt engineering strategies utilizing GPT-3.5. |
SEYED AMIR AHMAD SAFAVI-NAINI et. al. | arxiv-cs.CL | 2024-08-25 |
669 | LowCLIP: Adapting The CLIP Model Architecture for Low-Resource Languages in Multimodal Image Retrieval Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address challenges in vision-language retrieval for low-resource languages, we integrated the CLIP model architecture and employed several techniques to balance computational efficiency with performance. |
Ali Asgarov; Samir Rustamov; | arxiv-cs.CV | 2024-08-25 |
670 | Utilizing Large Language Models for Named Entity Recognition in Traditional Chinese Medicine Against COVID-19 Literature: Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: To explore and compare the performance of ChatGPT and other state-of-the-art LLMs on domain-specific NER tasks covering different entity types and domains in TCM against COVID-19 literature. Methods: We established a dataset of 389 articles on TCM against COVID-19, and manually annotated 48 of them with 6 types of entities belonging to 3 domains as the ground truth, against which the NER performance of LLMs can be assessed. |
XU TONG et. al. | arxiv-cs.CL | 2024-08-24 |
671 | Preliminary Investigations of A Multi-Faceted Robust and Synergistic Approach in Semiconductor Electron Micrograph Analysis: Integrating Vision Transformers with Large Language and Multimodal Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces an innovative architecture that leverages the generative capabilities of zero-shot prompting in Large Language Models (LLMs) such as GPT-4(language only), the predictive ability of few-shot (in-context) learning in Large Multimodal Models (LMMs) such as GPT-4(V)ision, and fuses knowledge across image based and linguistic insights for accurate nanomaterial category prediction. |
Sakhinana Sagar Srinivas; Geethan Sannidhi; Sreeja Gangasani; Chidaksh Ravuru; Venkataramana Runkana; | arxiv-cs.CV | 2024-08-24 |
672 | CNN-Transformer Rectified Collaborative Learning for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this paper proposes a CNN-Transformer rectified collaborative learning (CTRCL) framework to learn stronger CNN-based and Transformer-based models for MIS tasks via the bi-directional knowledge transfer between them. |
LANHU WU et. al. | arxiv-cs.CV | 2024-08-24 |
673 | Enhancing Multi-hop Reasoning Through Knowledge Erasure in Large Language Model Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building on the validated hypothesis, we propose a novel knowledge editing method that incorporates a Knowledge Erasure mechanism for Large language model Editing (KELE). |
MENGQI ZHANG et. al. | arxiv-cs.CL | 2024-08-22 |
674 | Enhancing Automated Program Repair with Solution Design Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This raises a compelling question: How can we leverage DR scattered across the issue logs to efficiently enhance APR? To investigate this premise, we introduce DRCodePilot, an approach designed to augment GPT-4-Turbo’s APR capabilities by incorporating DR into the prompt instruction. |
JIUANG ZHAO et. al. | arxiv-cs.SE | 2024-08-21 |
675 | The Self-Contained Negation Test Set Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we build on Gubelmann and Handschuh (2022), which studies the modification of PLMs’ predictions as a function of the polarity of inputs, in English. |
David Kletz; Pascal Amsili; Marie Candito; | arxiv-cs.CL | 2024-08-21 |
676 | Clinical Context-aware Radiology Report Generation from Medical Images Using Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the use of the transformer model for radiology report generation from chest X-rays. |
Sonit Singh; | arxiv-cs.CL | 2024-08-21 |
677 | Hypformer: Exploring Efficient Transformer Fully in Hyperbolic Space Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This stems primarily from: (i) the absence of well-defined modules in hyperbolic space, including linear transformation layers, LayerNorm layers, activation functions, dropout operations, etc.; (ii) the quadratic time complexity of the existing hyperbolic self-attention module w.r.t. the number of input tokens, which hinders its scalability. To address these challenges, we propose Hypformer, a novel hyperbolic Transformer based on the Lorentz model of hyperbolic geometry. |
MENGLIN YANG et. al. | kdd | 2024-08-21 |
678 | BURExtract-Llama: An LLM for Clinical Concept Extraction in Breast Ultrasound Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents a pipeline for developing an in-house LLM to extract clinical information from radiology reports. |
YUXUAN CHEN et. al. | arxiv-cs.CL | 2024-08-21 |
679 | Mixed Sparsity Training: Achieving 4$\times$ FLOP Reduction for Transformer Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) have made significant strides in complex tasks, yet their widespread adoption is impeded by substantial computational demands. |
Pihe Hu; Shaolong Li; Longbo Huang; | arxiv-cs.LG | 2024-08-21 |
680 | The MERSA Dataset and A Transformer-Based Approach for Speech Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the Multimodal Emotion Recognition and Sentiment Analysis (MERSA) dataset, which includes both natural and scripted speech recordings, transcribed text, physiological data, and self-reported emotional surveys from 150 participants collected over a two-week period. |
Enshi Zhang; Rafael Trujillo; Christian Poellabauer; | acl | 2024-08-20 |
681 | Advancing Parameter Efficiency in Fine-tuning Via Representation Editing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the promising performance of current PEFT methods, they present challenges in hyperparameter selection, such as determining the rank of LoRA or Adapter, or specifying the length of soft prompts. In addressing these challenges, we propose a novel approach to fine-tuning neural models, termed Representation EDiting (RED), which scales and biases the representation produced at each layer. |
MULING WU et. al. | acl | 2024-08-20 |
682 | CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the absence of a comprehensive benchmark impedes progress in this field. To bridge this gap, we introduce CharacterEval, a Chinese benchmark for comprehensive RPCA assessment, complemented by a tailored high-quality dataset. |
QUAN TU et. al. | acl | 2024-08-20 |
683 | D2LLM: Decomposed and Distilled Large Language Models for Semantic Search Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present D2LLMs (Decomposed and Distilled LLMs for semantic search) that combines the best of both worlds. |
Zihan Liao; Hang Yu; Jianguo Li; Jun Wang; Wei Zhang; | acl | 2024-08-20 |
684 | Language Modeling on Tabular Data: A Survey of Foundations, Techniques and Evolution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Subsequently, methods leveraging pre-trained language models like BERT have been developed, which require less data and yield enhanced performance. |
YUCHENG RUAN et. al. | arxiv-cs.CL | 2024-08-20 |
685 | Selene: Pioneering Automated Proof in Software Verification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Selene in this paper, which is the first project-level automated proof benchmark constructed based on the real-world industrial-level operating system microkernel, seL4. |
Lichen Zhang; Shuai Lu; Nan Duan; | acl | 2024-08-20 |
686 | MELA: Multilingual Evaluation of Linguistic Acceptability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present the largest benchmark to date on linguistic acceptability: Multilingual Evaluation of Linguistic Acceptability (MELA), with 46K samples covering 10 languages from a diverse set of language families. |
ZIYIN ZHANG et. al. | acl | 2024-08-20 |
687 | ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose ChiMed-GPT, a new benchmark LLM designed explicitly for the Chinese medical domain, which undergoes a comprehensive training regime with pre-training, SFT, and RLHF. |
Yuanhe Tian; Ruyi Gan; Yan Song; Jiaxing Zhang; Yongdong Zhang; | acl | 2024-08-20 |
688 | Self-Evolving GPT: A Lifelong Autonomous Experiential Learner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they rely on manual efforts to acquire and apply such experience for each task, which is not feasible for the growing demand for LLMs and the variety of user questions. To address this issue, we design a lifelong autonomous experiential learning framework based on LLMs to explore whether LLMs can imitate human ability for learning and utilizing experience. |
JINGLONG GAO et. al. | acl | 2024-08-20 |
689 | GPT Is Not An Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores whether an LLM (specifically, GPT-3. |
Virginia Felkner; Jennifer Thompson; Jonathan May; | acl | 2024-08-20 |
690 | Automated Detection of Algorithm Debt in Deep Learning Frameworks: An Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Aim: Our goal is to improve AD detection performance of various ML/DL models. |
Emmanuel Iko-Ojo Simon; Chirath Hettiarachchi; Alex Potanin; Hanna Suominen; Fatemeh Fard; | arxiv-cs.SE | 2024-08-20 |
691 | Language Models Can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates whether LLMs can generalize from labeled examples of predefined tasks to novel tasks. |
Anwoy Chatterjee; Eshaan Tanwar; Subhabrata Dutta; Tanmoy Chakraborty; | acl | 2024-08-20 |
692 | CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Incorrect initial angles between Q and K can cause misestimation in modeling rotary position embedding of the closest tokens. To address this issue, we propose Collinear Constrained Attention mechanism, namely CoCA. |
SHIYI ZHU et. al. | acl | 2024-08-20 |
693 | Acquiring Clean Language Models from Backdoor Poisoned Datasets By Downscaling Frequency Space Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate the learning mechanisms of backdoor LMs in the frequency space by Fourier analysis. |
Zongru Wu; Zhuosheng Zhang; Pengzhou Cheng; Gongshen Liu; | acl | 2024-08-20 |
694 | Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs By Sampling with People Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by methods from cognitive science, we propose an iterative method for simultaneously eliciting conversational tones and sentences, where participants alternate between two tasks: (1) one participant identifies the tone of a given sentence and (2) a different participant generates a sentence based on that tone. |
Dun-Ming Huang; Pol Van Rijn; Ilia Sucholutsky; Raja Marjieh; Nori Jacoby; | acl | 2024-08-20 |
695 | MultiLegalPile: A 689GB Multilingual Legal Corpus IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, so far, few datasets are available for specialized critical domains such as law and the available ones are often small and only in English. To fill this gap, we curate and release MultiLegalPile, a 689GB corpus in 24 languages from 17 jurisdictions. |
Joel Niklaus; Veton Matoshi; Matthias Stürmer; Ilias Chalkidis; Daniel Ho; | acl | 2024-08-20 |
696 | Revisiting VerilogEval: A Year of Improvements in Large-Language Models for Hardware Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate new commercial and open models released since VerilogEval’s original release, including GPT-4o, GPT-4 Turbo, Llama3.1 (8B/70B/405B), Llama3 70B, Mistral Large, DeepSeek Coder (33B and 6.7B), CodeGemma 7B, and RTL-Coder, against an improved VerilogEval benchmark suite. |
Nathaniel Pinckney; Christopher Batten; Mingjie Liu; Haoxing Ren; Brucek Khailany; | arxiv-cs.AR | 2024-08-20 |
697 | MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a novel **map**-guided **GPT**-based agent, dubbed **MapGPT**, which introduces an online linguistic-formed map to encourage the global exploration. |
JIAQI CHEN et. al. | acl | 2024-08-20 |
698 | An Empirical Analysis on Large Language Models in Debate Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we investigate the capabilities and inherent biases of advanced large language models (LLMs) such as GPT-3. |
Xinyi Liu; Pinxin Liu; Hangfeng He; | acl | 2024-08-20 |
699 | Your Transformer Is Secretly Linear Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper reveals a novel linear characteristic exclusive to transformer decoders, including models like GPT, LLaMA, OPT, BLOOM and others. |
ANTON RAZZHIGAEV et. al. | acl | 2024-08-20 |
700 | Dependency Transformer Grammars: Integrating Dependency Structures Into Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Syntactic Transformer language models aim to achieve better generalization through simultaneously modeling syntax trees and sentences. |
Yida Zhao; Chao Lou; Kewei Tu; | acl | 2024-08-20 |
701 | Linear Transformers with Learnable Kernel Functions Are Better In-Context Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Mirroring the Transformer’s in-context adeptness, it became a strong contender in the field. In our work, we present a singular, elegant alteration to the Based kernel that amplifies its In-Context Learning abilities evaluated with the Multi-Query Associative Recall task and overall language modeling process, as demonstrated on the Pile dataset. |
YAROSLAV AKSENOV et. al. | acl | 2024-08-20 |
702 | Crafting Tomorrow’s Headlines: Neural News Generation and Detection in English, Turkish, Hungarian, and Persian Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we take a significant step by introducing a benchmark dataset designed for neural news detection in four languages: English, Turkish, Hungarian, and Persian. |
CEM ÜYÜK et. al. | arxiv-cs.CL | 2024-08-20 |
703 | Rhyme-aware Chinese Lyric Generator Based on GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enhance the rhyming quality of generated lyrics, we incorporate integrated rhyme information into our model, thereby improving lyric generation performance. |
YIXIAO YUAN et. al. | arxiv-cs.CL | 2024-08-19 |
704 | Demystifying The Communication Characteristics for Distributed Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use GPT-based language models as a case study of the transformer architecture due to their ubiquity. |
QUENTIN ANTHONY et. al. | arxiv-cs.DC | 2024-08-19 |
705 | Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present a method that is able to distill a pretrained Transformer architecture into alternative architectures such as state space models (SSMs). |
Aviv Bick; Kevin Y. Li; Eric P. Xing; J. Zico Kolter; Albert Gu; | arxiv-cs.LG | 2024-08-19 |
706 | How Well Do Large Language Models Serve As End-to-End Secure Code Producers? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a systematic investigation into LLMs’ inherent potential to generate code with fewer vulnerabilities. |
JIANIAN GONG et. al. | arxiv-cs.SE | 2024-08-19 |
707 | GPT-based Textile Pilling Classification Using 3D Point Cloud Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on PointGPT, a GPT-like large model for point cloud analysis, we incorporate the global features of the input point cloud, extracted by a non-parametric network, into it, thus proposing the PointGPT+NN model. |
Yu Lu; YuYu Chen; Gang Zhou; Zhenghua Lan; | arxiv-cs.CV | 2024-08-19 |
708 | LLMSmartSec: Smart Contract Security Auditing with LLM and Annotated Control Flow Graph Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Historically, the complexity of identifying vulnerabilities in smart contracts required human-intensive audits to supplement imprecise automated code scans. The growing smart … |
Viraaji Mothukuri; R. Parizi; James L. Massa; | 2024 IEEE International Conference on Blockchain … | 2024-08-19 |
709 | GARLIC: GPT-Augmented Reinforcement Learning with Intelligent Control for Vehicle Dispatching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These challenges have resulted in travel difficulties for passengers in certain areas, while many drivers in other areas are unable to secure orders, leading to a decline in the overall quality of urban transportation services. To address these issues, this paper introduces GARLIC: a framework of GPT-Augmented Reinforcement Learning with Intelligent Control for vehicle dispatching. |
XIAO HAN et. al. | arxiv-cs.LG | 2024-08-19 |
710 | Active Learning for Identifying Disaster-Related Tweets: A Comparison with Keyword Filtering and Generic Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We compare a keyword filtering approach, a RoBERTa model fine-tuned with generic data from CrisisLex, a base RoBERTa model trained with AL and a fine-tuned RoBERTa model trained with AL regarding classification performance. |
David Hanny; Sebastian Schmidt; Bernd Resch; | arxiv-cs.CL | 2024-08-19 |
711 | A Strategy to Combine 1stGen Transformers and Open LLMs for Automatic Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares three 1stTRs (BERT, RoBERTa, and BART) with two open LLMs (Llama 2 and Bloom) across 11 sentiment analysis datasets. |
CLAUDIO M. V. DE ANDRADE et. al. | arxiv-cs.CL | 2024-08-18 |
712 | AI Based Multiagent Approach for Requirements Elicitation and Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Requirements Engineering (RE) plays a pivotal role in software development, encompassing tasks such as requirements elicitation, analysis, specification, and change management. … |
MALIK ABDUL SAMI et. al. | ArXiv | 2024-08-18 |
713 | Attention Is A Smoothed Cubic Spline Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We highlight a perhaps important but hitherto unobserved insight: The attention module in a transformer is a smoothed cubic spline. |
Zehua Lai; Lek-Heng Lim; Yucong Liu; | arxiv-cs.AI | 2024-08-18 |
714 | From Specifications to Prompts: On The Future of Generative LLMs in Requirements Engineering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative LLMs, such as GPT, have the potential to revolutionize Requirements Engineering (RE) by automating tasks in new ways. This column explores the novelties and introduces … |
Andreas Vogelsang; | arxiv-cs.SE | 2024-08-17 |
715 | See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Designing tasks and finding LLMs’ limitations are becoming increasingly important. In this paper, we investigate the question of whether an LLM can discover its own limitations from the errors it makes. |
YULONG CHEN et. al. | arxiv-cs.CL | 2024-08-16 |
716 | MAT-SED: A Masked Audio Transformer with Masked-Reconstruction Based Pre-training for Sound Event Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a pure Transformer-based SED model with masked-reconstruction based pre-training, termed MAT-SED. |
Pengfei Cai; Yan Song; Kang Li; Haoyu Song; Ian McLoughlin; | arxiv-cs.SD | 2024-08-16 |
717 | Retail-GPT: Leveraging Retrieval Augmented Generation (RAG) for Building E-commerce Chat Assistants Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: This work presents Retail-GPT, an open-source RAG-based chatbot designed to enhance user engagement in retail e-commerce by guiding users through product recommendations and … |
Bruno Amaral Teixeira de Freitas; R. Lotufo; | ArXiv | 2024-08-15 |
718 | Extracting Sentence Embeddings from Pretrained Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Methods: Given 110M parameters BERT’s hidden representations from multiple layers and multiple tokens we tried various ways to extract optimal sentence representations. |
Lukas Stankevičius; Mantas Lukoševičius; | arxiv-cs.CL | 2024-08-15 |
719 | Leveraging Web-Crawled Data for High-Quality Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We argue that although the web-crawled data often has formatting errors causing semantic inaccuracies, it can still serve as a valuable source for high-quality supervised fine-tuning in specific domains without relying on advanced models like GPT-4. |
Jing Zhou; Chenglin Jiang; Wei Shen; Xiao Zhou; Xiaonan He; | arxiv-cs.CL | 2024-08-15 |
720 | Exploring Transformer Models for Sentiment Classification: A Comparison of BERT, RoBERTa, ALBERT, DistilBERT, and XLNet Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transfer learning models have proven superior to classical machine learning approaches in various text classification tasks, such as sentiment analysis, question answering, news … |
Ali Areshey; H. Mathkour; | Expert Syst. J. Knowl. Eng. | 2024-08-14 |
721 | Transformers and Large Language Models for Efficient Intrusion Detection Systems: A Comprehensive Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This survey paper provides a comprehensive analysis of the utilization of Transformers and LLMs in cyber-threat detection systems. |
Hamza Kheddar; | arxiv-cs.CR | 2024-08-14 |
722 | CodeMirage: Hallucinations in Code Generated By Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) have shown promising potentials in program generation and no-code automation. However, LLMs are prone to generate hallucinations, i.e., they generate … |
Vibhor Agarwal; Yulong Pei; Salwa Alamir; Xiaomo Liu; | ArXiv | 2024-08-14 |
723 | MultiSurf-GPT: Facilitating Context-Aware Reasoning with Large-Scale Language Models for Multimodal Surface Sensing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose MultiSurf-GPT, which utilizes the advanced capabilities of GPT-4o to process and interpret diverse modalities (radar, microscope and multispectral data) uniformly based on prompting strategies (zero-shot and few-shot prompting). |
YONGQUAN HU et. al. | arxiv-cs.HC | 2024-08-14 |
724 | Generative AI for Automatic Topic Labelling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes to assess the reliability of three LLMs, namely flan, GPT-4o, and GPT-4 mini for topic labelling. |
Diego Kozlowski; Carolina Pradier; Pierre Benz; | arxiv-cs.CL | 2024-08-13 |
725 | Evaluating Cultural Adaptability of A Large Language Model Via Simulation of Synthetic Personas Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our analysis shows that specifying a person’s country of residence improves GPT-3.5’s alignment with their responses. |
Louis Kwok; Michal Bravansky; Lewis D. Griffin; | arxiv-cs.CL | 2024-08-13 |
726 | MGH Radiology Llama: A Llama 3 70B Model for Radiology Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In recent years, the field of radiology has increasingly harnessed the power of artificial intelligence (AI) to enhance diagnostic accuracy, streamline workflows, and improve … |
YUCHENG SHI et. al. | ArXiv | 2024-08-13 |
727 | Sumotosima: A Framework and Dataset for Classifying and Summarizing Otoscopic Images Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel resource efficient deep learning and transformer based framework, Sumotosima (Summarizer for otoscopic images), an end-to-end pipeline for classification followed by summarization. |
Eram Anwarul Khan; Anas Anwarul Haq Khan; | arxiv-cs.CV | 2024-08-13 |
728 | Pragmatic Inference of Scalar Implicature By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates how Large Language Models (LLMs), particularly BERT (Devlin et al., 2019) and GPT-2 (Radford et al., 2019), engage in pragmatic inference of scalar implicature, such as some. |
Ye-eun Cho; Seong mook Kim; | arxiv-cs.CL | 2024-08-13 |
729 | A RAG-Based Question-Answering Solution for Cyber-Attack Investigation and Attribution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the constantly evolving field of cybersecurity, it is imperative for analysts to stay abreast of the latest attack trends and pertinent information that aids in the investigation and attribution of cyber-attacks. In this work, we introduce the first question-answering (QA) model and its application that provides information to the cybersecurity experts about cyber-attacks investigations and attribution. |
Sampath Rajapaksha; Ruby Rani; Erisa Karafili; | arxiv-cs.CR | 2024-08-12 |
730 | Body Transformer: Leveraging Robot Embodiment for Policy Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite notable evidence of successful deployment of this architecture in the context of robot learning, we claim that vanilla transformers do not fully exploit the structure of the robot learning problem. Therefore, we propose Body Transformer (BoT), an architecture that leverages the robot embodiment by providing an inductive bias that guides the learning process. |
Carmelo Sferrazza; Dun-Ming Huang; Fangchen Liu; Jongmin Lee; Pieter Abbeel; | arxiv-cs.RO | 2024-08-12 |
731 | A Perspective on Large Language Models, Intelligent Machines, and Knowledge Acquisition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there is a huge gap between LLM’s and human capabilities for understanding abstract concepts and reasoning. We discuss these issues in a larger philosophical context of human knowledge acquisition and the Turing test. |
Vladimir Cherkassky; Eng Hock Lee; | arxiv-cs.CL | 2024-08-12 |
732 | Large Language Models for Secure Code Assessment: A Multi-Language Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we evaluate the effectiveness of LLMs in detecting and classifying Common Weakness Enumerations (CWE) using different prompt and role strategies. |
Kohei Dozono; Tiago Espinha Gasiba; Andrea Stocco; | arxiv-cs.SE | 2024-08-12 |
733 | MGH Radiology Llama: A Llama 3 70B Model for Radiology Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents an advanced radiology-focused large language model: MGH Radiology Llama. |
YUCHENG SHI et. al. | arxiv-cs.CL | 2024-08-12 |
734 | The Language of Trauma: Modeling Traumatic Event Descriptions Across Domains with Explainable AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, studies traditionally focus on a single aspect of trauma, often neglecting the transferability of findings across different scenarios. We address this gap by training language models with progressing complexity on trauma-related datasets, including genocide-related court data, a Reddit dataset on post-traumatic stress disorder (PTSD), counseling conversations, and Incel forum posts. |
Miriam Schirmer; Tobias Leemann; Gjergji Kasneci; Jürgen Pfeffer; David Jurgens; | arxiv-cs.CL | 2024-08-12 |
735 | Spacetime $E(n)$-Transformer: Equivariant Attention for Spatio-temporal Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an $E(n)$-equivariant Transformer architecture for spatio-temporal graph data. |
Sergio G. Charles; | arxiv-cs.LG | 2024-08-12 |
736 | Is It A Work or Leisure Travel? Applying Text Classification to Identify Work-related Travel on Social Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a model to predict whether a trip is leisure or work-related, utilizing state-of-the-art Automatic Text Classification (ATC) models such as BERT, RoBERTa, and BART to enhance the understanding of user travel purposes and improve recommendation accuracy in specific travel scenarios. |
Lucas Félix; Washington Cunha; Jussara Almeida; | arxiv-cs.SI | 2024-08-12 |
737 | Cross-Lingual Conversational Speech Summarization with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We build a baseline cascade-based system using open-source speech recognition and machine translation models. |
Max Nelson; Shannon Wotherspoon; Francis Keith; William Hartmann; Matthew Snover; | arxiv-cs.CL | 2024-08-12 |
738 | GPT-4 Emulates Average-Human Emotional Cognition from A Third-Person Perspective Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper extends recent investigations on the emotional reasoning abilities of Large Language Models (LLMs). Current research on LLMs has not directly evaluated the distinction … |
Ala Nekouvaght Tak; Jonathan Gratch; | ArXiv | 2024-08-11 |
739 | Chain of Condition: Construct, Verify and Solve Conditions for Conditional Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing approaches struggle with CQA due to two challenges: (1) precisely identifying necessary conditions and the logical relationship, and (2) verifying conditions to detect any that are missing. In this paper, we propose a novel prompting approach, Chain of condition, by first identifying all conditions and constructing their logical relationships explicitly according to the document, then verifying whether these conditions are satisfied, finally solving the logical expression to indicate any missing conditions and generating the answer accordingly. |
Jiuheng Lin; Yuxuan Lai; Yansong Feng; | arxiv-cs.CL | 2024-08-10 |
740 | From Text to Insight: Leveraging Large Language Models for Performance Evaluation in Management Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through comparative analyses across two studies, including various task performance outputs, we demonstrate that LLMs can serve as a reliable and even superior alternative to human raters in evaluating knowledge-based performance outputs, which are a key contribution of knowledge workers. |
Ning Li; Huaikang Zhou; Mingze Xu; | arxiv-cs.CL | 2024-08-09 |
741 | Evaluating The Capability of Large Language Models to Personalize Science Texts for Diverse Middle-school-age Learners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, GPT-4 was used to profile student learning preferences based on choices made during a training session. |
Michael Vaccaro Jr; Mikayla Friday; Arash Zaghi; | arxiv-cs.HC | 2024-08-09 |
742 | Retrieval-augmented Code Completion for Local Projects Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on using LLMs with around 160 million parameters that are suitable for local execution and augmentation with retrieval from local projects. |
Marko Hostnik; Marko Robnik-Šikonja; | arxiv-cs.SE | 2024-08-09 |
743 | Transformer Explainer: Interactive Learning of Text-Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Transformer Explainer, an interactive visualization tool designed for non-experts to learn about Transformers through the GPT-2 model. |
AEREE CHO et. al. | arxiv-cs.LG | 2024-08-08 |
744 | Multi-Class Intrusion Detection Based on Transformer for IoT Networks Using CIC-IoT-2023 Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study uses deep learning methods to explore the Internet of Things (IoT) network intrusion detection method based on the CIC-IoT-2023 dataset. This dataset contains extensive … |
Shu-Ming Tseng; Yan-Qi Wang; Yung-Chung Wang; | Future Internet | 2024-08-08 |
745 | Enhancing Journalism with AI: A Study of Contextualized Image Captioning for News Articles Using LLMs and LMMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores how LLMs and LMMs can assist journalistic practice by generating contextualised captions for images accompanying news articles. |
Aliki Anagnostopoulou; Thiago Gouvea; Daniel Sonntag; | arxiv-cs.CL | 2024-08-08 |
746 | Towards Explainable Network Intrusion Detection Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current state-of-the-art NIDS rely on artificial benchmarking datasets, resulting in skewed performance when applied to real-world networking environments. Therefore, we compare the GPT-4 and LLama3 models against traditional architectures and transformer-based models to assess their ability to detect malicious NetFlows without depending on artificially skewed datasets, but solely on their vast pre-trained acquired knowledge. |
Paul R. B. Houssel; Priyanka Singh; Siamak Layeghy; Marius Portmann; | arxiv-cs.CR | 2024-08-08 |
747 | Image-to-LaTeX Converter for Mathematical Formulas and Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this project, we train a vision encoder-decoder model to generate LaTeX code from images of mathematical formulas and text. |
Daniil Gurgurov; Aleksey Morshnev; | arxiv-cs.CL | 2024-08-07 |
748 | SocFedGPT: Federated GPT-based Adaptive Content Filtering System Leveraging User Interactions in Social Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Our study presents a multifaceted approach to enhancing user interaction and content relevance in social media platforms through a federated learning framework. We introduce … |
Sai Puppala; Ismail Hossain; Md Jahangir Alam; Sajedul Talukder; | ArXiv | 2024-08-07 |
749 | Is Child-Directed Speech Effective Training Data for Language Models? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: What are the features of the data they receive, and how do these features support language modeling objectives? To investigate this question, we train GPT-2 and RoBERTa models on 29M words of English child-directed speech and a new matched, synthetic dataset (TinyDialogues), comparing to OpenSubtitles, Wikipedia, and a heterogeneous blend of datasets from the BabyLM challenge. |
Steven Y. Feng; Noah D. Goodman; Michael C. Frank; | arxiv-cs.CL | 2024-08-07 |
750 | A Comparison of LLM Finetuning Methods & Evaluation Metrics with Travel Chatbot Use Case Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We used two pretrained LLMs for fine-tuning research: LLaMa 2 7B and Mistral 7B. |
Sonia Meyer; Shreya Singh; Bertha Tam; Christopher Ton; Angel Ren; | arxiv-cs.CL | 2024-08-07 |
751 | Could ChatGPT Get An Engineering Degree? Evaluating Higher Education Vulnerability to AI Assistants Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conceptualize these challenges through the lens of vulnerability, the potential for university assessments and learning outcomes to be impacted by student use of generative AI. We investigate the potential scale of this vulnerability by measuring the degree to which AI assistants can complete assessment questions in standard university-level STEM courses. |
BEATRIZ BORGES et. al. | arxiv-cs.CY | 2024-08-07 |
752 | Evaluating Source Code Quality with Large Language Models: A Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper aims to describe and analyze the results obtained using LLMs as a static analysis tool, evaluating the overall quality of code. |
Igor Regis da Silva Simões; Elaine Venson; | arxiv-cs.SE | 2024-08-07 |
753 | FLASH: Federated Learning-Based LLMs for Advanced Query Processing in Social Networks Through RAG Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Our paper introduces a novel approach to social network information retrieval and user engagement through a personalized chatbot system empowered by Federated Learning GPT. The … |
Sai Puppala; Ismail Hossain; Md Jahangir Alam; Sajedul Talukder; | ArXiv | 2024-08-06 |
754 | Training LLMs to Recognize Hedges in Spontaneous Narratives Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: After an error analysis on the top performing approaches, we used an LLM-in-the-Loop approach to improve the gold standard coding, as well as to highlight cases in which hedges are ambiguous in linguistically interesting ways that will guide future research. |
Amie J. Paige; Adil Soubki; John Murzaku; Owen Rambow; Susan E. Brennan; | arxiv-cs.CL | 2024-08-06 |
755 | Evaluating The Translation Performance of Large Language Models Based on Euas-20 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the significant progress in translation performance achieved by large language models, machine translation still faces many challenges. Therefore, in this paper, we construct the Euas-20 dataset to evaluate, for researchers and developers, the translation performance of large language models, their translation ability across different languages, and the effect of pre-training data on their translation ability. |
Yan Huang; Wei Liu; | arxiv-cs.CL | 2024-08-06 |
756 | Accuracy and Consistency of LLMs in The Registered Dietitian Exam: The Impact of Prompt Engineering and Knowledge Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose to employ the Registered Dietitian (RD) exam to conduct a standard and comprehensive evaluation of state-of-the-art LLMs, GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro, assessing both accuracy and consistency in nutrition queries. |
Iman Azimi; Mohan Qi; Li Wang; Amir M. Rahmani; Youlin Li; | arxiv-cs.CL | 2024-08-06 |
757 | HeTraX: Energy Efficient 3D Heterogeneous Manycore Architecture for Transformer Acceleration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose the design of a three-dimensional heterogeneous architecture referred to as HeTraX specifically optimized to accelerate end-to-end transformer models. |
Pratyush Dhingra; Janardhan Rao Doppa; Partha Pratim Pande; | arxiv-cs.AR | 2024-08-06 |
758 | PTM4Tag+: Tag Recommendation of Stack Overflow Posts with Pre-trained Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the recent success of pre-trained models (PTMs) in natural language processing (NLP), we present PTM4Tag+, a tag recommendation framework for Stack Overflow posts that utilizes PTMs in language modeling. |
JUNDA HE et. al. | arxiv-cs.SE | 2024-08-05 |
759 | Context Conquers Parameters: Outperforming Proprietary LLM in Commit Message Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the use of proprietary LLMs like GPT-4 in coding tasks raises privacy and sustainability concerns, which may hinder their industrial adoption. Considering that open-source LLMs have achieved competitive performance in developer tasks such as compiler validation, this study investigates whether they can be used to generate commit messages that are comparable with OMG. |
Aaron Imani; Iftekhar Ahmed; Mohammad Moshirpour; | arxiv-cs.SE | 2024-08-05 |
760 | Evaluating The Performance of Large Language Models for SDG Mapping (Technical Report) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we compare the performance of various language models on the Sustainable Development Goal (SDG) mapping task, using the output of GPT-4o as the baseline. |
Hui Yin; Amir Aryani; Nakul Nambiar; | arxiv-cs.LG | 2024-08-04 |
761 | X-Light: Cross-City Traffic Signal Control Using Transformer on Transformer As Meta Multi-Agent Reinforcement Learner Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, a persisting issue remains: how can a multi-agent traffic signal control algorithm with remarkable transferability across diverse cities be obtained? In this paper, we propose a Transformer on Transformer (TonT) model for cross-city meta multi-agent traffic signal control, named X-Light: we input the full Markov Decision Process trajectories; the Lower Transformer aggregates the states, actions, and rewards among the target intersection and its neighbors within a city, and the Upper Transformer learns general decision trajectories across different cities. |
HAOYUAN JIANG et. al. | ijcai | 2024-08-03 |
762 | QFormer: An Efficient Quaternion Transformer for Image Denoising Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Secondly, the DCNNs or Transformer-based image denoising models usually have a large number of parameters, high computational complexity, and slow inference speed. To resolve these issues, this paper proposes a highly-efficient Quaternion Transformer (QFormer) for image denoising. |
Bo Jiang; Yao Lu; Guangming Lu; Bob Zhang; | ijcai | 2024-08-03 |
763 | Class-consistent Contrastive Learning Driven Cross-dimensional Transformer for 3D Medical Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer emerges as an active research topic in medical image analysis. Yet, three substantial challenges limit the effectiveness of both 2D and 3D Transformers in 3D medical … |
Qikui Zhu; Chuan Fu; Shuo Li; | ijcai | 2024-08-03 |
764 | MiniCPM-V: A GPT-4V Level MLLM on Your Phone IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present MiniCPM-V, a series of efficient MLLMs deployable on end-side devices. |
YUAN YAO et. al. | arxiv-cs.CV | 2024-08-03 |
765 | Cross-Problem Learning for Solving Vehicle Routing Problems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes the cross-problem learning to assist heuristics training for different downstream VRP variants. |
ZHUOYI LIN et. al. | ijcai | 2024-08-03 |
766 | TCR-GPT: Integrating Autoregressive Model and Reinforcement Learning for T-Cell Receptor Repertoires Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce TCR-GPT, a probabilistic model built on a decoder-only transformer architecture, designed to uncover and replicate sequence patterns in TCR repertoires. |
Yicheng Lin; Dandan Zhang; Yun Liu; | arxiv-cs.LG | 2024-08-02 |
767 | Reconsidering Degeneration of Token Embeddings with Definitions for Encoder-based Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: On the basis of this analysis, we propose DefinitionEMB, a method that utilizes definitions to re-construct isotropically distributed and semantics-related token embeddings for encoder-based PLMs while maintaining original robustness during fine-tuning. |
Ying Zhang; Dongyuan Li; Manabu Okumura; | arxiv-cs.CL | 2024-08-02 |
768 | Toward Automatic Relevance Judgment Using Vision-Language Models for Image-Text Retrieval Evaluation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Vision–Language Models (VLMs) have demonstrated success across diverse applications, yet their potential to assist in relevance judgments remains uncertain. This paper assesses … |
Jheng-Hong Yang; Jimmy Lin; | ArXiv | 2024-08-02 |
769 | Efficacy of Large Language Models in Systematic Reviews Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the effectiveness of Large Language Models (LLMs) in interpreting existing literature through a systematic review of the relationship between Environmental, Social, and Governance (ESG) factors and financial performance. |
Aaditya Shah; Shridhar Mehendale; Siddha Kanthi; | arxiv-cs.CL | 2024-08-02 |
770 | Toward Automatic Relevance Judgment Using Vision–Language Models for Image–Text Retrieval Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Vision–Language Models (VLMs) have demonstrated success across diverse applications, yet their potential to assist in relevance judgments remains uncertain. |
Jheng-Hong Yang; Jimmy Lin; | arxiv-cs.IR | 2024-08-02 |
771 | High-Throughput Phenotyping of Clinical Text Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct a performance comparison of GPT-4 and GPT-3.5-Turbo. |
Daniel B. Hier; S. Ilyas Munzir; Anne Stahlfeld; Tayo Obafemi-Ajayi; Michael D. Carrithers; | arxiv-cs.CL | 2024-08-02 |
772 | Advancing Mental Health Pre-Screening: A New Custom GPT for Psychological Distress Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces ‘Psycho Analyst’, a custom GPT model based on OpenAI’s GPT-4, optimized for pre-screening mental health disorders. |
Jinwen Tang; Yi Shang; | arxiv-cs.CY | 2024-08-02 |
773 | Leveraging Large Language Models (LLMs) for Traffic Management at Urban Intersections: The Case of Mixed Traffic Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the ability of a Large Language Model (LLM), specifically, GPT-4o-mini to improve traffic management at urban intersections. |
Sari Masri; Huthaifa I. Ashqar; Mohammed Elhenawy; | arxiv-cs.CL | 2024-08-01 |
774 | Granting GPT-4 License and Opportunity: Enhancing Accuracy and Confidence Estimation for Few-Shot Event Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The present effort explores methods for effective confidence estimation with GPT-4, using few-shot learning for event detection in the BETTER ontology as a vehicle. |
Steven Fincke; Adrien Bibal; Elizabeth Boschee; | arxiv-cs.AI | 2024-08-01 |
775 | MtArtGPT: A Multi-Task Art Generation System With Pre-Trained Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Instruction tuning large language models are making rapid advances in the field of artificial intelligence where GPT-4 models have exhibited impressive multi-modal perception … |
CONG JIN et. al. | IEEE Transactions on Circuits and Systems for Video … | 2024-08-01 |
776 | TR-TransGAN: Temporal Recurrent Transformer Generative Adversarial Network for Longitudinal MRI Dataset Expansion Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Longitudinal magnetic resonance imaging (MRI) datasets have important implications for the study of degenerative diseases because such datasets have data from multiple points in … |
CHEN-CHEN FAN et. al. | IEEE Transactions on Cognitive and Developmental Systems | 2024-08-01 |
777 | Bilateral Transformer 3D Planar Recovery Related Papers Related Patents Related Grants Related Venues Related Experts View |
Fei Ren; Chunhua Liao; Zhina Xie; | Graph. Model. | 2024-08-01 |
778 | Bidirectional Interaction of CNN and Transformer Feature for Visual Tracking Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Empowered by the sophisticated long-range dependency modeling ability of Transformer, tracking performance has seen a dynamic increase in recent years. Approaches in this vein … |
Baozhen Sun; Zhenhua Wang; Shilei Wang; Yongkang Cheng; Jifeng Ning; | IEEE Transactions on Circuits and Systems for Video … | 2024-08-01 |
779 | MAE-EEG-Transformer: A Transformer-based Approach Combining Masked Autoencoder and Cross-individual Data Augmentation Pre-training for EEG Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Miao Cai; Yu Zeng; | Biomed. Signal Process. Control. | 2024-08-01 |
780 | Unmasking Large Language Models By Means of OpenAI GPT-4 and Google AI: A Deep Instruction-based Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View |
IDREES A. ZAHID et. al. | Intell. Syst. Appl. | 2024-08-01 |
781 | Performance of Recent Large Language Models for A Low-Resourced Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) have shown significant advances in the past year. |
Ravindu Jayakody; Gihan Dias; | arxiv-cs.CL | 2024-07-31 |
782 | The Llama 3 Herd of Models IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a new set of foundation models, called Llama 3. |
AARON GRATTAFIORI et. al. | arxiv-cs.AI | 2024-07-31 |
783 | OmniParser for Pure Vision Based GUI Agent Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, we argue that the power of multimodal models like GPT-4V as general agents on multiple operating systems across different applications is largely underestimated due to the lack of a robust screen parsing technique capable of: 1) reliably identifying interactable icons within the user interface, and 2) understanding the semantics of various elements in a screenshot and accurately associating the intended action with the corresponding region on the screen. To fill these gaps, we introduce OmniParser, a comprehensive method for parsing user interface screenshots into structured elements, which significantly enhances the ability of GPT-4V to generate actions that can be accurately grounded in the corresponding regions of the interface. |
Yadong Lu; Jianwei Yang; Yelong Shen; Ahmed Awadallah; | arxiv-cs.CV | 2024-07-31 |
784 | Generative Expressive Conversational Speech Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In addition, due to the limitations of small-scale datasets containing scripted recording styles, they often fail to simulate real natural conversational styles. To address the above issues, we propose a novel generative expressive CSS system, termed GPT-Talker. |
Rui Liu; Yifan Hu; Yi Ren; Xiang Yin; Haizhou Li; | arxiv-cs.CL | 2024-07-31 |
785 | Automated Software Vulnerability Static Code Analysis Using Generative Pre-Trained Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Ultimately, we find that the GPT models that we evaluated are not suitable for fully automated vulnerability scanning because the false positive and false negative rates are too high to likely be useful in practice. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CR | 2024-07-31 |
786 | Enhancing Agricultural Machinery Management Through Advanced LLM Integration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel approach that leverages large language models (LLMs), particularly GPT-4, combined with multi-round prompt engineering to enhance decision-making processes in agricultural machinery management. |
Emily Johnson; Noah Wilson; | arxiv-cs.CL | 2024-07-30 |
787 | Robust Load Prediction of Power Network Clusters Based on Cloud-Model-Improved Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As an innovative approach, the Cloud Model Improved Transformer (CMIT) method integrates the Transformer model with the cloud model using the particle swarm optimization algorithm, with the aim of achieving robust and precise power load predictions. |
Cheng Jiang; Gang Lu; Xue Ma; Di Wu; | arxiv-cs.LG | 2024-07-30 |
788 | Interpretable Pre-Trained Transformers for Heart Time-Series Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we employ this framework to the analysis of clinical heart time-series data, to create two pre-trained general purpose cardiac models, termed PPG-PT and ECG-PT. |
Harry J. Davies; James Monsen; Danilo P. Mandic; | arxiv-cs.LG | 2024-07-30 |
789 | Legal Minds, Algorithmic Decisions: How LLMs Apply Constitutional Principles in Complex Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conduct an empirical analysis of how large language models (LLMs), specifically GPT-4, interpret constitutional principles in complex decision-making scenarios. |
Camilla Bignotti; Carolina Camassa; | arxiv-cs.CL | 2024-07-29 |
790 | Sentiment Analysis of Lithuanian Online Reviews Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address sentiment analysis of Lithuanian five-star-based online reviews from multiple domains that we collect and clean. |
Brigita Vileikytė; Mantas Lukoševičius; Lukas Stankevičius; | arxiv-cs.CL | 2024-07-29 |
791 | DuA: Dual Attentive Transformer in Long-Term Continuous EEG Emotion Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these methods encounter significant challenges in real-life scenarios where emotional states evolve over extended periods. To address this issue, we propose a Dual Attentive (DuA) transformer framework for long-term continuous EEG emotion analysis. |
YUE PAN et. al. | arxiv-cs.HC | 2024-07-29 |
792 | MM-Transformer: A Transformer-Based Knowledge Graph Link Prediction Model That Fuses Multimodal Features Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Multimodal knowledge graph completion necessitates the integration of information from multiple modalities (such as images and text) into the structural representation of entities … |
DONGSHENG WANG et. al. | Symmetry | 2024-07-29 |
793 | AutoScale: Automatic Prediction of Compute-optimal Data Composition for Training LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We derive an analytical model for the dependence of optimal weights on data scale and introduce *AutoScale*, a novel, practical approach for optimizing data compositions at potentially large training data scales. |
FEIYANG KANG et. al. | arxiv-cs.LG | 2024-07-29 |
794 | Motamot: A Dataset for Revealing The Supremacy of Large Language Models Over Transformer Models in Bengali Political Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we investigate political sentiment analysis during Bangladeshi elections, specifically examining how effectively Pre-trained Language Models (PLMs) and Large Language Models (LLMs) capture complex sentiment characteristics. |
FATEMA TUJ JOHORA FARIA et. al. | arxiv-cs.CL | 2024-07-28 |
795 | The Impact of LoRA Adapters for LLMs on Clinical NLP Classification Under Data Limitations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning Large Language Models (LLMs) for clinical Natural Language Processing (NLP) poses significant challenges due to the domain gap and limited data availability. |
Thanh-Dung Le; Ti Ti Nguyen; Vu Nguyen Ha; | arxiv-cs.CL | 2024-07-27 |
796 | FarSSiBERT: A Novel Transformer-based Model for Semantic Similarity Measurement of Persian Social Networks Informal Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a new transformer-based model to measure semantic similarity between Persian informal short texts from social networks. |
Seyed Mojtaba Sadjadi; Zeinab Rajabi; Leila Rabiei; Mohammad-Shahram Moin; | arxiv-cs.CL | 2024-07-27 |
797 | QT-TDM: Planning With Transformer Dynamics Model and Autoregressive Q-Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the success of the Transformer architecture in natural language processing and computer vision, we investigate the use of Transformers in Reinforcement Learning (RL), specifically in modeling the environment’s dynamics using Transformer Dynamics Models (TDMs). |
Mostafa Kotb; Cornelius Weber; Muhammad Burhan Hafez; Stefan Wermter; | arxiv-cs.LG | 2024-07-26 |
798 | GPT Deciphering Fedspeak: Quantifying Dissent Among Hawks and Doves Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We use GPT-4 to quantify dissent among members on the topic of inflation. |
DENIS PESKOFF et. al. | arxiv-cs.AI | 2024-07-26 |
799 | Is Larger Always Better? Evaluating and Prompting Large Language Models for Non-generative Medical Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study benchmarks various models, including GPT-based LLMs, BERT-based models, and traditional clinical predictive models, for non-generative medical tasks utilizing renowned datasets. |
YINGHAO ZHU et. al. | arxiv-cs.CL | 2024-07-26 |
800 | Using GPT-4 to Guide Causal Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we are interested in the ability of LLMs to identify causal relationships. |
Anthony C. Constantinou; Neville K. Kitson; Alessio Zanga; | arxiv-cs.AI | 2024-07-26 |
801 | Automatic Detection of Moral Values in Music Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moral values play a fundamental role in how we evaluate information, make decisions, and form judgements around important social issues. |
Vjosa Preniqi; Iacopo Ghinassi; Julia Ive; Kyriaki Kalimeri; Charalampos Saitis; | arxiv-cs.CY | 2024-07-26 |
802 | Gene Regulatory Network Inference from Pre-trained Single-Cell Transcriptomics Transformer with Joint Graph Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel joint graph learning approach that combines the rich contextual representations learned by pre-trained single-cell language models with the structured knowledge encoded in GRNs using graph neural networks (GNNs). |
Sindhura Kommu; Yizhi Wang; Yue Wang; Xuan Wang; | arxiv-cs.LG | 2024-07-25 |
803 | The Power of Combining Data and Knowledge: GPT-4o Is An Effective Interpreter of Machine Learning Models in Predicting Lymph Node Metastasis of Lung Cancer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel ensemble method that combines the medical knowledge acquired by LLMs with the latent patterns identified by machine learning models to enhance LNM prediction performance. |
Danqing Hu; Bing Liu; Xiaofeng Zhu; Nan Wu; | arxiv-cs.CL | 2024-07-25 |
804 | HDL-GPT: High-Quality HDL Is All You Need Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents Hardware Description Language Generative Pre-trained Transformers (HDL-GPT), a novel approach that leverages the vast repository of open-source Hardware Description Language (HDL) code to train superior-quality large code models. |
BHUVNESH KUMAR et. al. | arxiv-cs.LG | 2024-07-25 |
805 | My Ontologist: Evaluating BFO-Based AI for Definition Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through iterative development of a specialized GPT model named My Ontologist, we aimed to generate BFO-conformant ontologies. |
Carter Benson; Alec Sculley; Austin Liebers; John Beverley; | arxiv-cs.DB | 2024-07-24 |
806 | Testing Large Language Models on Driving Theory Knowledge and Skills for Connected Autonomous Vehicles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate a new approach of applying remote or edge LLMs to support autonomous driving. |
Zuoyin Tang; Jianhua He; Dashuai Pei; Kezhong Liu; Tao Gao; | arxiv-cs.AI | 2024-07-24 |
807 | Bailicai: A Domain-Optimized Retrieval-Augmented Generation Framework for Medical Applications Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) have exhibited remarkable proficiency in natural language understanding, prompting extensive exploration of their potential applications across … |
Cui Long; Yongbin Liu; Chunping Ouyang; Ying Yu; | ArXiv | 2024-07-24 |
808 | SDoH-GPT: Using Large Language Models to Extract Social Determinants of Health (SDoH) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study we introduced SDoH-GPT, a simple and effective few-shot Large Language Model (LLM) method leveraging contrastive examples and concise instructions to extract SDoH without relying on extensive medical annotations or costly human intervention. |
BERNARDO CONSOLI et. al. | arxiv-cs.CL | 2024-07-24 |
809 | Cost-effective Instruction Learning for Pathology Vision and Language Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here we propose a cost-effective instruction learning framework for conversational pathology named as CLOVER. |
KAITAO CHEN et. al. | arxiv-cs.AI | 2024-07-24 |
810 | Artificial Intelligence in Extracting Diagnostic Data from Dental Records Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research addresses the issue of missing structured data in dental records by extracting diagnostic information from unstructured text. |
YAO-SHUN CHUANG et. al. | arxiv-cs.CL | 2024-07-23 |
811 | Can Large Language Models Automatically Jailbreak GPT-4V? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our study, we introduce AutoJailbreak, an innovative automatic jailbreak technique inspired by prompt optimization. |
YUANWEI WU et. al. | arxiv-cs.CL | 2024-07-23 |
812 | OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While open-source LLMs offer solutions to these concerns, they typically underperform commercial models in RTL code generation tasks, primarily due to the scarcity of high-quality open-source RTL datasets. To address this challenge, we introduce OriGen, a fully open-source framework that incorporates self-reflection capabilities and a novel dataset augmentation methodology for generating high-quality, large-scale RTL code. |
FAN CUI et. al. | arxiv-cs.AR | 2024-07-23 |
813 | RadioRAG: Factual Large Language Models for Enhanced Diagnostics in Radiology Using Dynamic Retrieval Augmented Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) have advanced the field of artificial intelligence (AI) in medicine. However, LLMs often generate outdated or inaccurate information based on static … |
SOROOSH TAYEBI et. al. | ArXiv | 2024-07-22 |
814 | KWT-Tiny: RISC-V Accelerated, Embedded Keyword Spotting Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the adaptation of Transformer-based models for edge devices through the quantisation and hardware acceleration of the ARM Keyword Transformer (KWT) model on a RISC-V platform. |
Aness Al-Qawlaq; Ajay Kumar M; Deepu John; | arxiv-cs.AR | 2024-07-22 |
815 | Inverted Activations: Reducing Memory Footprint in Neural Network Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a modification to the handling of activation tensors in pointwise nonlinearity layers. |
Georgii Novikov; Ivan Oseledets; | arxiv-cs.LG | 2024-07-22 |
816 | Dissecting Multiplication in Transformers: Insights Into LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on observation and analysis, we infer that the reason for transformers' deficiencies in multiplication tasks lies in their difficulty in calculating successive carryovers and caching intermediate results, and we confirmed this inference through experiments. Guided by these findings, we propose improvements to enhance transformer performance on multiplication tasks. |
Luyu Qiu; Jianing Li; Chi Su; Chen Jason Zhang; Lei Chen; | arxiv-cs.CL | 2024-07-22 |
817 | Can GPT-4 Learn to Analyse Moves in Research Article Abstracts? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we employ the affordances of GPT-4 to automate the annotation process by using natural language prompts. |
Danni Yu; Marina Bondi; Ken Hyland; | arxiv-cs.CL | 2024-07-22 |
818 | Efficient Visual Transformer By Learnable Token Merging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel and compact transformer block, Transformer with Learnable Token Merging (LTM), or LTM-Transformer. |
Yancheng Wang; Yingzhen Yang; | arxiv-cs.CV | 2024-07-21 |
819 | LLMs Left, Right, and Center: Assessing GPT’s Capabilities to Label Political Bias from Web Domains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given the subjective nature of political labels, third-party bias ratings like those from Ad Fontes Media, AllSides, and Media Bias/Fact Check (MBFC) are often used in research to analyze news source diversity. This study aims to determine if GPT-4 can replicate these human ratings on a seven-degree scale (far-left to far-right). |
Raphael Hernandes; Giulio Corsi; | arxiv-cs.CL | 2024-07-19 |
820 | Unipa-GPT: Large Language Models for University-oriented QA in Italian Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our experiments we adopted both the Retrieval Augmented Generation (RAG) approach and fine-tuning to develop the system. |
Irene Siragusa; Roberto Pirrone; | arxiv-cs.CL | 2024-07-19 |
821 | Can Open-Source LLMs Compete with Commercial Models? Exploring The Few-Shot Performance of Current GPT Models in Biomedical Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We participated in the 12th BioASQ challenge, which is a retrieval augmented generation (RAG) setting, and explored the performance of current GPT models Claude 3 Opus, GPT-3.5-turbo and Mixtral 8x7b with in-context learning (zero-shot, few-shot) and QLoRa fine-tuning. |
Samy Ateia; Udo Kruschwitz; | arxiv-cs.CL | 2024-07-18 |
822 | Evaluating Large Language Models for Anxiety and Depression Classification Using Counseling and Psychotherapy Transcripts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We aim to evaluate the efficacy of traditional machine learning and large language models (LLMs) in classifying anxiety and depression from long conversational transcripts. |
Junwei Sun; Siqi Ma; Yiran Fan; Peter Washington; | arxiv-cs.CL | 2024-07-18 |
823 | Sharif-STR at SemEval-2024 Task 1: Transformer As A Regression Model for Fine-Grained Scoring of Textual Semantic Relations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we delve into the investigation of sentence-level STR within Track A (Supervised) by leveraging fine-tuning techniques on the RoBERTa transformer. |
SEYEDEH FATEMEH EBRAHIMI et. al. | arxiv-cs.CL | 2024-07-17 |
824 | LLMs-in-the-loop Part-1: Expert Small AI Models for Bio-Medical Text Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel LLMs-in-the-loop approach to develop supervised neural machine translation models optimized specifically for medical texts. |
Bunyamin Keles; Murat Gunay; Serdar I. Caglar; | arxiv-cs.CL | 2024-07-16 |
825 | Review-Feedback-Reason (ReFeR): A Novel Framework for NLG Evaluation and Reasoning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Assessing the quality of Natural Language Generation (NLG) outputs, such as those produced by large language models (LLMs), poses significant challenges. Traditional approaches … |
Yaswanth Narsupalli; Abhranil Chandra; Sreevatsa Muppirala; Manish Gupta; Pawan Goyal; | ArXiv | 2024-07-16 |
826 | Large Language Models As Misleading Assistants in Conversation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate the ability of LLMs to be deceptive in the context of providing assistance on a reading comprehension task, using LLMs as proxies for human users. |
BETTY LI HOU et. al. | arxiv-cs.CL | 2024-07-16 |
827 | Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we explore the effectiveness of fine-tuning a RoBERTa-base transformer, a powerful neural architecture, to address MGT detection as a binary classification task. |
SEYEDEH FATEMEH EBRAHIMI et. al. | arxiv-cs.CL | 2024-07-16 |
828 | Educational Personalized Learning Path Planning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite its potential, traditional PLPP systems often lack adaptability, interactivity, and transparency. This paper proposes a novel approach integrating Large Language Models (LLMs) with prompt engineering to address these challenges. |
Chee Ng; Yuen Fung; | arxiv-cs.CL | 2024-07-16 |
829 | GPT Assisted Annotation of Rhetorical and Linguistic Features for Interpretable Propaganda Technique Detection in News Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our contribution is a set of features, their properties, definitions, and examples in a machine-readable format, along with the code for RhetAnn and the GPT prompts and fine-tuning procedures for advancing state-of-the-art interpretable propaganda technique detection. |
Kyle Hamilton; Luca Longo; Bojan Bozic; | arxiv-cs.CL | 2024-07-16 |
830 | Does Refusal Training in LLMs Generalize to The Past Tense? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We systematically evaluate this method on Llama-3 8B, Claude-3.5 Sonnet, GPT-3.5 Turbo, Gemma-2 9B, Phi-3-Mini, GPT-4o mini, GPT-4o, o1-mini, o1-preview, and R2D2 models using GPT-3.5 Turbo as a reformulation model. |
Maksym Andriushchenko; Nicolas Flammarion; | arxiv-cs.CL | 2024-07-16 |
831 | GPT-4V Cannot Generate Radiology Reports Yet Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we perform a systematic evaluation of GPT-4V in generating radiology reports on two chest X-ray report datasets: MIMIC-CXR and IU X-Ray. |
Yuyang Jiang; Chacha Chen; Dang Nguyen; Benjamin M. Mervak; Chenhao Tan; | arxiv-cs.CY | 2024-07-16 |
832 | ECoh: Turn-level Coherence Evaluation for Multilingual Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by the need for lightweight, open source, and multilingual dialogue evaluators, this paper introduces GenResCoh (Generated Responses targeting Coherence). |
John Mendonça; Isabel Trancoso; Alon Lavie; | arxiv-cs.CL | 2024-07-16 |
833 | R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, rigorous insights are provided into the influence of jamming LLM word embeddings in SFL by deriving an expression for the ML training loss divergence and showing that it is upper-bounded by the mean squared error (MSE). |
ALADIN DJUHERA et. al. | arxiv-cs.LG | 2024-07-16 |
834 | A Transformer-based Approach for Augmenting Software Engineering Chatbots Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, previous studies show that creating a high-quality training dataset for software engineering chatbots is expensive in terms of both resources and time. Aims: Therefore, in this paper, we present an automated transformer-based approach to augment software engineering chatbot datasets. |
Ahmad Abdellatif; Khaled Badran; Diego Elias Costa; Emad Shihab; | arxiv-cs.SE | 2024-07-16 |
835 | Rethinking Transformer-based Multi-document Summarization: An Empirical Investigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To thoroughly examine the behaviours of Transformer-based MDS models, this paper presents five empirical studies on (1) measuring the impact of document boundary separators quantitatively; (2) exploring the effectiveness of different mainstream Transformer structures; (3) examining the sensitivity of the encoder and decoder; (4) discussing different training strategies; and (5) discovering repetition in summary generation. |
Congbo Ma; Wei Emma Zhang; Dileepa Pitawela; Haojie Zhuang; Yanfeng Shu; | arxiv-cs.CL | 2024-07-16 |
836 | Scientific QA System with Verifiable Answers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce the VerifAI project, a pioneering open-source scientific question-answering system, designed to provide answers that are not only referenced but also automatically vetted and verifiable. |
ADELA LJAJIĆ et. al. | arxiv-cs.CL | 2024-07-16 |
837 | GPT Sonography: Hand Gesture Decoding from Forearm Ultrasound Images Via VLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that GPT-4o can decode hand gestures from forearm ultrasound data even with no fine-tuning, and improves with few-shot, in-context learning. |
Keshav Bimbraw; Ye Wang; Jing Liu; Toshiaki Koike-Akino; | arxiv-cs.CV | 2024-07-15 |
838 | Leveraging LLM-Respondents for Item Evaluation: A Psychometric Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, item calibration is time-consuming and costly, requiring a sufficient number of respondents for the response process. We explore using six different LLMs (GPT-3.5, GPT-4, Llama 2, Llama 3, Gemini-Pro, and Cohere Command R Plus) and various combinations of them using sampling methods to produce responses with psychometric properties similar to human answers. |
Yunting Liu; Shreya Bhandari; Zachary A. Pardos; | arxiv-cs.CY | 2024-07-15 |
839 | Beyond Binary: Multiclass Paraphasia Detection with Generative Pretrained Transformers and End-to-End Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present novel approaches that use a generative pretrained transformer (GPT) to identify paraphasias from transcripts as well as two end-to-end approaches that focus on modeling both automatic speech recognition (ASR) and paraphasia classification as multiple sequences vs. a single sequence. |
Matthew Perez; Aneesha Sampath; Minxue Niu; Emily Mower Provost; | arxiv-cs.CL | 2024-07-15 |
840 | Hierarchical Local Temporal Feature Enhancing for Transformer-Based 3D Human Pose Estimation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent advancements in transformer-based methods have yielded substantial success in 2D-to-3D human pose estimation. Transformer-based estimators have their inherent advantages … |
Xin Yan; Chi-Man Pun; Haolun Li; Mengqi Liu; Hao Gao; | 2024 IEEE International Conference on Multimedia and Expo … | 2024-07-15 |
841 | DistillSeq: A Framework for Safety Alignment Testing in Large Language Models Using Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Subsequently, we deploy two distinct strategies for generating malicious queries: one based on a syntax tree approach, and the other leveraging an LLM-based method. |
Mingke Yang; Yuqi Chen; Yi Liu; Ling Shi; | arxiv-cs.SE | 2024-07-14 |
842 | ChatGPT-3.5 and -4.0 and Mechanical Engineering: Examining Performance on The FE Mechanical Engineering and Undergraduate Exams Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The launch of Generative Pretrained Transformer (ChatGPT) at the end of 2022 generated large interest in possible applications of artificial intelligence (AI) in science, … |
Matthew Frenkel; Hebah Emara; | Comput. Appl. Eng. Educ. | 2024-07-14 |
843 | Drop Your Decoder: Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The usage of additional Transformer-based decoders also incurs significant computational costs. In this study, we aim to shed light on this issue by revealing that masked auto-encoder (MAE) pre-training with enhanced decoding significantly improves the term coverage of input tokens in dense representations, compared to vanilla BERT checkpoints. |
Guangyuan Ma; Xing Wu; Zijia Lin; Songlin Hu; | sigir | 2024-07-14 |
844 | Enhancing Emotion Prediction in News Headlines: Insights from ChatGPT and Seq2Seq Models for Free-Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The free-text explanations also contain more sentimental context than the news headlines alone and can serve as a better input to emotion classification models. Therefore, in this work we explored generating emotion explanations from headlines by training a sequence-to-sequence transformer model and by using a pretrained large language model, ChatGPT (GPT-4). |
GE GAO et. al. | arxiv-cs.CL | 2024-07-14 |
845 | Beyond Accuracy: Investigating Error Types in GPT-4 Responses to USMLE Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, current works use GPT-4 to only predict the correct option without providing any explanation and thus do not provide any insight into the thinking process and reasoning used by GPT-4 or other LLMs. Therefore, we introduce a new domain-specific error taxonomy derived from collaboration with medical students. |
SOUMYADEEP ROY et. al. | sigir | 2024-07-14 |
846 | Reflections on The Coding Ability of LLMs for Analyzing Market Research Surveys Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present the first systematic study of applying large language models (in our case, GPT-3.5 and GPT-4) for the automatic coding (multi-class classification) problem in market research. |
Shi Zong; Santosh Kolagati; Amit Chaudhary; Josh Seltzer; Jimmy Lin; | sigir | 2024-07-14 |
847 | Legal Statute Identification: A Case Study Using State-of-the-Art Datasets and Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we reproduce several LSI models on two popular LSI datasets and study the effect of the above-mentioned challenges. |
Shounak Paul; Rajas Bhatt; Pawan Goyal; Saptarshi Ghosh; | sigir | 2024-07-14 |
848 | Generalizable Tip-of-the-Tongue Retrieval with LLM Re-ranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper studies the generalization capabilities of existing retrieval methods with ToT queries in multiple domains. |
Luís Borges; Rohan Jha; Jamie Callan; Bruno Martins; | sigir | 2024-07-14 |
849 | Surprising Efficacy of Fine-Tuned Transformers for Fact-Checking Over Larger Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the challenges associated with establishing an end-to-end fact-checking pipeline in a real-world context, covering over 90 languages. |
Vinay Setty; | sigir | 2024-07-14 |
850 | CodeV: Empowering LLMs for Verilog Generation Through Multi-Level Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: (2) LLMs like GPT-3.5 excel in summarizing Verilog code rather than generating it. Based on these observations, this paper introduces CodeV, a series of open-source instruction-tuned Verilog generation LLMs. |
YANG ZHAO et. al. | arxiv-cs.PL | 2024-07-14 |
851 | Causality Extraction from Medical Text Using Large Language Models (LLMs) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the potential of natural language models, including large language models, to extract causal relations from medical texts, specifically from Clinical Practice Guidelines (CPGs). |
Seethalakshmi Gopalakrishnan; Luciana Garbayo; Wlodek Zadrozny; | arxiv-cs.CL | 2024-07-13 |
852 | Document-level Clinical Entity and Relation Extraction Via Knowledge Base-Guided Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Generative pre-trained transformer (GPT) models have shown promise in clinical entity and relation extraction tasks because of their precise extraction and contextual understanding capability. |
Kriti Bhattarai; Inez Y. Oh; Zachary B. Abrams; Albert M. Lai; | arxiv-cs.CL | 2024-07-13 |
853 | Robustness of LLMs to Perturbations in Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To that end, we artificially introduce varying levels of noise into a diverse set of datasets and systematically evaluate LLMs’ robustness against the corrupt variations of the original text. |
Ayush Singh; Navpreet Singh; Shubham Vatsal; | arxiv-cs.CL | 2024-07-12 |
854 | EVOLVE: Predicting User Evolution and Network Dynamics in Social Media Using Fine-Tuned GPT-like Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we propose a predictive method to understand how a user evolves on social media throughout their life and to forecast the next stage of their evolution. |
Ismail Hossain; Md Jahangir Alam; Sai Puppala; Sajedul Talukder; | arxiv-cs.SI | 2024-07-12 |
855 | On Exact Bit-level Reversible Transformers Without Changing Architectures Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we present the BDIA-transformer, which is an exact bit-level reversible transformer that uses an unchanged standard architecture for inference. |
Guoqiang Zhang; J. P. Lewis; W. B. Kleijn; | arxiv-cs.LG | 2024-07-12 |
856 | Show, Don’t Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To probe generalization, we introduce three new games: LEGO Connect Language (LCL) for spatial logic, a shape recognition game, and Guess-the-SMILES (GtS), an advanced spatial logic benchmark in chemistry. |
Gonçalo Hora de Carvalho; Oscar Knap; Robert Pollice; | arxiv-cs.AI | 2024-07-12 |
857 | Movie Recommendation with Poster Attention Via Multi-modal Transformer Feature Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a multi-modal movie recommendation system by extracting features from the well-designed posters for each movie and the narrative text description of the movie. |
Linhan Xia; Yicheng Yang; Ziou Chen; Zheng Yang; Shengxin Zhu; | arxiv-cs.IR | 2024-07-12 |
858 | The Two Sides of The Coin: Hallucination Generation and Detection with LLMs As Evaluators for LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents our participation in the CLEF ELOQUENT HalluciGen shared task, where the goal is to develop evaluators for both generating and detecting hallucinated content. |
ANH THU MARIA BUI et. al. | arxiv-cs.AI | 2024-07-12 |
859 | A Survey on Symbolic Knowledge Distillation of Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This survey paper delves into the emerging and critical area of symbolic knowledge distillation in Large Language Models (LLMs). As LLMs like Generative Pre-trained Transformer-3 … |
Kamal Acharya; Alvaro Velasquez; H. Song; | ArXiv | 2024-07-12 |
860 | Detect Llama — Finding Vulnerabilities in Smart Contracts Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we test the hypothesis that although OpenAI’s GPT-4 performs well generally, we can fine-tune open-source models to outperform GPT-4 in smart contract vulnerability detection. |
Peter Ince; Xiapu Luo; Jiangshan Yu; Joseph K. Liu; Xiaoning Du; | arxiv-cs.CR | 2024-07-11 |
861 | GPT-4 Is Judged More Human Than Humans in Displaced and Inverted Turing Tests Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We found that both AI and displaced human judges were less accurate than interactive interrogators, with below chance accuracy overall. |
Ishika Rathi; Sydney Taylor; Benjamin K. Bergen; Cameron R. Jones; | arxiv-cs.HC | 2024-07-11 |
862 | LLMs’ Morphological Analyses of Complex FST-generated Finnish Words Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work aims to shed light on the issue by evaluating state-of-the-art LLMs in a task of morphological analysis of complex Finnish noun forms. |
Anssi Moisio; Mathias Creutz; Mikko Kurimo; | arxiv-cs.CL | 2024-07-11 |
863 | Teaching Transformers Causal Reasoning Through Axiomatic Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since interventional data is costly to generate, we study to what extent an agent can learn causal reasoning from passive data. |
Aniket Vashishtha; Abhinav Kumar; Abbavaram Gowtham Reddy; Vineeth N Balasubramanian; Amit Sharma; | arxiv-cs.LG | 2024-07-10 |
864 | FsPONER: Few-shot Prompt Optimization for Named Entity Recognition in Domain-specific Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, none of the previous approaches has investigated the efficiency of LLM-based few-shot learning in domain-specific scenarios. To address this gap, we introduce FsPONER, a novel approach for optimizing few-shot prompts, and evaluate its performance on domain-specific NER datasets, with a focus on industrial manufacturing and maintenance, while using multiple LLMs — GPT-4-32K, GPT-3.5-Turbo, LLaMA 2-chat, and Vicuna. |
Yongjian Tang; Rakebul Hasan; Thomas Runkler; | arxiv-cs.CL | 2024-07-10 |
865 | Transformer Neural Networks with Spatiotemporal Attention for Predictive Control and Optimization of Industrial Processes Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the context of real-time optimization and model predictive control of industrial systems, machine learning, and neural networks represent cutting-edge tools that hold promise … |
Ethan R. Gallup; Jacob F. Tuttle; Jake Immonen; Blake W. Billings; Kody M. Powell; | 2024 American Control Conference (ACC) | 2024-07-10 |
866 | ROSA: Random Subspace Adaptation for Efficient Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we propose Random Subspace Adaptation (ROSA), a method that outperforms previous PEFT methods by a significant margin, while maintaining a zero latency overhead during inference time. |
Marawan Gamal Abdel Hameed; Aristides Milios; Siva Reddy; Guillaume Rabusseau; | arxiv-cs.LG | 2024-07-10 |
867 | Short Answer Scoring with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View |
Lan Jiang; Nigel Bosch; | ACM Conference on Learning @ Scale | 2024-07-09 |
868 | Mixture-of-Modules: Reinventing Transformers As Dynamic Assemblies of Modules Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that MoM provides not only a unified framework for Transformers and their numerous variants but also a flexible and learnable approach for reducing redundancy in Transformer parameterization. |
ZHUOCHENG GONG et. al. | arxiv-cs.CL | 2024-07-09 |
869 | A Comparison of Vulnerability Feature Extraction Methods from Textual Attack Patterns Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we examine five feature extraction methods (TF-IDF, LSI, BERT, MiniLM, RoBERTa) and find that Term Frequency-Inverse Document Frequency (TF-IDF) outperforms the other four methods with a precision of 75% and an F1 score of 64%. |
Refat Othman; Bruno Rossi; Russo Barbara; | arxiv-cs.CR | 2024-07-09 |
870 | Prompting Techniques for Secure Code Generation: A Systematic Investigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: OBJECTIVE: In this study, we investigate the impact of different prompting techniques on the security of code generated from NL instructions by LLMs. |
Catherine Tony; Nicolás E. Díaz Ferreyra; Markus Mutas; Salem Dhiff; Riccardo Scandariato; | arxiv-cs.SE | 2024-07-09 |
871 | Multilingual Blending: LLM Safety Alignment Evaluation with Language Mixture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce Multilingual Blending, a mixed-language query-response scheme designed to evaluate the safety alignment of various state-of-the-art LLMs (e.g., GPT-4o, GPT-3.5, Llama3) under sophisticated, multilingual conditions. |
Jiayang Song; Yuheng Huang; Zhehua Zhou; Lei Ma; | arxiv-cs.CL | 2024-07-09 |
872 | Using Large Language Models for Generating Smart Contracts for Health Insurance from Textual Policies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To assess the LLM output, we propose completeness, soundness, clarity, syntax, and functioning code as metrics. |
Inwon Kang; William Van Woensel; Oshani Seneviratne; | arxiv-cs.CL | 2024-07-09 |
873 | PEER: Expertizing Domain-Specific Tasks with A Multi-Agent Framework and Tuning Methods Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: High performance requires sophisticated processing techniques, yet managing multiple agents within a complex workflow often proves costly and challenging. To address this, we introduce the PEER (Plan, Execute, Express, Review) multi-agent framework. |
YIYING WANG et. al. | arxiv-cs.AI | 2024-07-09 |
874 | Cross-domain Few-shot In-context Learning for Enhancing Traffic Sign Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a cross-domain few-shot in-context learning method based on the MLLM for enhancing traffic sign recognition (TSR). |
YAOZONG GAN et. al. | arxiv-cs.CV | 2024-07-08 |
875 | Intent Aware Data Augmentation By Leveraging Generative AI for Stress Detection in Social Media Texts Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Stress is a major issue in modern society. Researchers focus on identifying stress in individuals, linking language with mental health, and often utilizing social media posts. … |
Minhah Saleem; Jihie Kim; | PeerJ Comput. Sci. | 2024-07-08 |
876 | Surprising Gender Biases in GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present seven experiments exploring gender biases in GPT. |
Raluca Alexandra Fulgu; Valerio Capraro; | arxiv-cs.CY | 2024-07-08 |
877 | On The Power of Convolution Augmented Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, recent architectural recipes, such as state-space models, have bridged the performance gap. Motivated by this, we examine the benefits of Convolution-Augmented Transformer (CAT) for recall, copying, and length generalization tasks. |
Mingchen Li; Xuechen Zhang; Yixiao Huang; Samet Oymak; | arxiv-cs.LG | 2024-07-08 |
878 | Enhancing Computer Programming Education with LLMs: A Study on Effective Prompt Engineering for Python Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study underscores the crucial role of prompt engineering in maximizing the educational benefits of LLMs. By systematically categorizing and testing these strategies, we provide a comprehensive framework for both educators and students to optimize LLM-based learning experiences. |
Tianyu Wang; Nianjun Zhou; Zhixiong Chen; | arxiv-cs.AI | 2024-07-07 |
879 | Flood Simulation: Integrating UAS Imagery and Ai-Generated Data With Diffusion Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The primary goal of early disaster impact assessments is to gather georeferenced data about affected areas. Floods, a major natural calamity, pose challenges in data collection … |
Xiyang Hu; Maryam Rahnemoonfar; | IGARSS 2024 – 2024 IEEE International Geoscience and Remote … | 2024-07-07 |
880 | A Novel Automated Urban Building Analysis Framework Based on GPT and SAM Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Rapid urban development necessitates advanced methodologies for efficiently acquiring and analyzing detailed building information. This study proposes an automated framework, … |
Yuchao Sun; Xianping Ma; Yizhen Yan; Man-On Pun; Bo Huang; | IGARSS 2024 – 2024 IEEE International Geoscience and Remote … | 2024-07-07 |
881 | Image-Conditional Diffusion Transformer for Underwater Image Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by the recent advance in generative models, we propose a novel UIE method based on image-conditional diffusion transformer (ICDT). |
XINGYANG NIE et. al. | arxiv-cs.CV | 2024-07-07 |
882 | MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The rapid advancement of Large Language Models (LLMs) and Large Multimodal Models (LMMs) has heightened the demand for AI-based scientific assistants capable of understanding … |
ZEKUN LI et. al. | ArXiv | 2024-07-06 |
883 | Associative Recurrent Memory Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper addresses the challenge of creating a neural architecture for very long sequences that requires constant time for processing new information at each time step. |
Ivan Rodkin; Yuri Kuratov; Aydar Bulatov; Mikhail Burtsev; | arxiv-cs.CL | 2024-07-05 |
884 | Are Large Language Models Strategic Decision Makers? A Study of Performance and Bias in Two-Player Non-Zero-Sum Games Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) have been increasingly used in real-world settings, yet their strategic decision-making abilities remain largely unexplored. |
Nathan Herr; Fernando Acero; Roberta Raileanu; María Pérez-Ortiz; Zhibin Li; | arxiv-cs.AI | 2024-07-05 |
885 | Using LLMs to Label Medical Papers According to The CIViC Evidence Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the sequence classification problem CIViC Evidence to the field of medical NLP. |
Markus Hisch; Xing David Wang; | arxiv-cs.CL | 2024-07-05 |
886 | Generalists Vs. Specialists: Evaluating Large Language Models for Urdu Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare general-purpose models, GPT-4-Turbo and Llama-3-8b, with special-purpose models–XLM-Roberta-large, mT5-large, and Llama-3-8b–that have been fine-tuned on specific tasks. |
Samee Arif; Abdul Hameed Azeemi; Agha Ali Raza; Awais Athar; | arxiv-cs.CL | 2024-07-05 |
887 | Enhancing Multi-Agent Communication Collaboration Through GPT-Based Semantic Information Extraction and Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xinfeng Deng; Li Zhou; Dezun Dong; Jibo Wei; | ACM Turing Award Celebration Conference 2024 | 2024-07-05 |
888 | Towards Automating Text Annotation: A Case Study on Semantic Proximity Annotation Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main aim of this paper is to reuse human annotation guidelines along with some annotated data to design automatic prompts for LLMs, focusing on the semantic proximity annotation task. |
Sachin Yadav; Tejaswi Choppa; Dominik Schlechtweg; | arxiv-cs.CL | 2024-07-04 |
889 | HYBRINFOX at CheckThat! 2024 – Task 2: Enriching BERT Models with The Expert System VAGO for Subjectivity Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents the HYBRINFOX method used to solve Task 2 of Subjectivity detection of the CLEF 2024 CheckThat! competition. The specificity of the method is to use a hybrid … |
MORGANE CASANOVA et. al. | Conference and Labs of the Evaluation Forum | 2024-07-04 |
890 | GPT-4 Vs. Human Translators: A Comprehensive Evaluation of Translation Quality Across Languages, Domains, and Expertise Levels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study comprehensively evaluates the translation quality of Large Language Models (LLMs), specifically GPT-4, against human translators of varying expertise levels across multiple language pairs and domains. |
JIANHAO YAN et. al. | arxiv-cs.CL | 2024-07-04 |
891 | TrackPGD: Efficient Adversarial Attack Using Object Binary Masks Against Robust Transformer Trackers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce TrackPGD, a novel white-box attack that utilizes predicted object binary masks to target robust transformer trackers. |
Fatemeh Nourilenjan Nokabadi; Yann Batiste Pequignot; Jean-Francois Lalonde; Christian Gagné; | arxiv-cs.CV | 2024-07-04 |
892 | From Data to Commonsense Reasoning: The Use of Large Language Models for Explainable AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the effectiveness of large language models (LLMs) on different QA tasks with a focus on their abilities in reasoning and explainability. |
Stefanie Krause; Frieder Stolzenburg; | arxiv-cs.AI | 2024-07-04 |
893 | HYBRINFOX at CheckThat! 2024 — Task 2: Enriching BERT Models with The Expert System VAGO for Subjectivity Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the HYBRINFOX method used to solve Task 2 of Subjectivity detection of the CLEF 2024 CheckThat! |
MORGANE CASANOVA et. al. | arxiv-cs.CL | 2024-07-04 |
894 | CATT: Character-based Arabic Tashkeel Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces a new approach to training ATD models. |
Faris Alasmary; Orjuwan Zaafarani; Ahmad Ghannam; | arxiv-cs.CL | 2024-07-03 |
895 | Large Language Models As Evaluators for Scientific Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study explores how well the state-of-the-art Large Language Models (LLMs), like GPT-4 and Mistral, can assess the quality of scientific summaries or, more fittingly, scientific syntheses, comparing their evaluations to those of human annotators. |
Julia Evans; Jennifer D’Souza; Sören Auer; | arxiv-cs.CL | 2024-07-03 |
896 | Mast Kalandar at SemEval-2024 Task 8: On The Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: SemEval 2024 introduces the task of Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, aiming to develop automated systems for identifying machine-generated text and detecting potential misuse. In this paper, we i) propose a RoBERTa-BiLSTM based classifier designed to classify text into two categories: AI-generated or human, and ii) conduct a comparative study of our model with baseline approaches to evaluate its effectiveness. |
Jainit Sushil Bafna; Hardik Mittal; Suyash Sethia; Manish Shrivastava; Radhika Mamidi; | arxiv-cs.CL | 2024-07-03 |
897 | GPT Prompt Engineering for Scheduling Appliances Usage for Energy Cost Optimization Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, we propose a novel approach that makes use of a GPT model and of prompt engineering to build a proper input to GPT, given a domestic energy dataset. Specifically, … |
Marco Siino; I. Tinnirello; | 2024 IEEE International Symposium on Measurements & … | 2024-07-02 |
898 | RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel instruction fine-tuning framework RankRAG, which instruction-tunes a single LLM for the dual purpose of context ranking and answer generation in RAG. |
YUE YU et. al. | arxiv-cs.CL | 2024-07-02 |
899 | Assessing The Code Clone Detection Capability of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to assess the performance of two advanced Large Language Models (LLMs), GPT-3.5 and GPT-4, in the task of code clone detection. |
Zixian Zhang; Takfarinas Saber; | arxiv-cs.SE | 2024-07-02 |
900 | Hypformer: Exploring Efficient Hyperbolic Transformer Fully in Hyperbolic Space Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This stems primarily from: (i) the absence of well-defined modules in hyperbolic space, including linear transformation layers, LayerNorm layers, activation functions, dropout operations, etc. (ii) the quadratic time complexity of the existing hyperbolic self-attention module w.r.t. the number of input tokens, which hinders its scalability. To address these challenges, we propose Hypformer, a novel hyperbolic Transformer based on the Lorentz model of hyperbolic geometry. |
MENGLIN YANG et. al. | arxiv-cs.LG | 2024-07-01 |
901 | DC Bias Content Extraction of Power Transformer Under AC and DC Environment and Its Suppression Measures Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The phenomenon of transformer dc bias (TDB) will saturate the transformer core, resulting in the local overheating, accelerating the ageing of insulating material, and even … |
ZHIWEI CHEN et. al. | IEEE Transactions on Industrial Electronics | 2024-07-01 |
902 | GRASP: A Grid-Based Benchmark for Evaluating Commonsense Spatial Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we construct a large-scale benchmark called GRASP, which consists of 16,000 grid-based environments where the agent is tasked with an energy collection problem. |
Zhisheng Tang; Mayank Kejriwal; | arxiv-cs.AI | 2024-07-01 |
903 | Beyond Numeric Awards: In-Context Dueling Bandits with LLM Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: LLMs struggle to converge and consistently exploit even when explicitly prompted to do so, and are sensitive to prompt variations. To bridge this gap, we propose an agentic flow framework: LLM with Enhanced Algorithmic Dueling (LEAD), which integrates off-the-shelf DB algorithms with LLM agents through fine-grained adaptive interplay. |
Fanzeng Xia; Hao Liu; Yisong Yue; Tongxin Li; | arxiv-cs.LG | 2024-07-01 |
904 | Image-to-Text Logic Jailbreak: Your Imagination Can Help You Do Anything Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With the integration of visual and text inputs in VLMs, new security issues emerge, as malicious attackers can exploit multiple modalities to achieve their objectives. |
Xiaotian Zou; Ke Li; Yongkang Chen; | arxiv-cs.CR | 2024-07-01 |
905 | Transformer Autoencoder for K-means Efficient Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Wenhao Wu; Weiwei Wang; Xixi Jia; Xiangchu Feng; | Eng. Appl. Artif. Intell. | 2024-07-01 |
906 | FATFusion: A Functional-anatomical Transformer for Medical Image Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View |
Wei Tang; Fazhi He; | Inf. Process. Manag. | 2024-07-01 |
907 | MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work presents MMLongBench-Doc, a long-context, multi-modal benchmark comprising 1,062 expert-annotated questions. |
YUBO MA et. al. | arxiv-cs.CV | 2024-07-01 |
908 | TextCheater: A Query-Efficient Textual Adversarial Attack in The Hard-Label Setting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Designing a query-efficient attack strategy to generate high-quality adversarial examples under the hard-label black-box setting is a fundamental yet challenging problem, … |
HAO PENG et. al. | IEEE Transactions on Dependable and Secure Computing | 2024-07-01 |
909 | Raptor-T: A Fused and Memory-Efficient Sparse Transformer for Long and Variable-Length Sequences Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer-based models have made significant advancements across various domains, largely due to the self-attention mechanism’s ability to capture contextual relationships in … |
HULIN WANG et. al. | IEEE Transactions on Computers | 2024-07-01 |
910 | Prompting GPT-4 to Support Automatic Safety Case Generation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Mithila Sivakumar; A. B. Belle; Jinjun Shan; K. K. Shahandashti; | Expert Syst. Appl. | 2024-07-01 |
911 | Token-disentangling Mutual Transformer for Multimodal Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
GUANGHAO YIN et. al. | Eng. Appl. Artif. Intell. | 2024-07-01 |
912 | Adaptive Masked Autoencoder Transformer for Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
XIANGRU CHEN et. al. | Appl. Soft Comput. | 2024-07-01 |
913 | Analyzing Persuasive Strategies in Meme Texts: A Fusion of Language Models with Paraphrase Enrichment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes our approach to hierarchical multi-label detection of persuasion techniques in meme texts. |
Kota Shamanth Ramanath Nayak; Leila Kosseim; | arxiv-cs.CL | 2024-07-01 |
914 | RoBERTa, ResNeXt and BiLSTM with Self-attention: The Ultimate Trio for Customer Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View |
Amir Jabbary Lak; Reza Boostani; Farhan A. Alenizi; Amin Salih Mohammed; S. M. Fakhrahmad; | Appl. Soft Comput. | 2024-07-01 |
915 | Multi-Turn Hidden Backdoor in Large Language Model-powered Chatbot Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Model (LLM)-powered chatbot services like GPTs, simulating human-to-human conversation via machine-generated text, are used in numerous fields. They are enhanced by … |
Bocheng Chen; Nikolay Ivanov; Guangjing Wang; Qiben Yan; | Proceedings of the 19th ACM Asia Conference on Computer and … | 2024-07-01 |
916 | A Study on The Effectiveness of GPT-4V in Classifying Driver Behavior Captured on Video Using Just A Few Frames Per Video Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper introduces an innovative study that evaluates the effectiveness of GPT-4V vision processing technology in identifying risk events within driving scenarios. These … |
JOAO FELIPE GOBETI CALENZANI et. al. | 2024 International Joint Conference on Neural Networks … | 2024-06-30 |
917 | LegalTurk Optimized BERT for Multi-Label Text Classification and NER Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce our innovative modified pre-training approach by combining diverse masking strategies. |
Farnaz Zeidi; Mehmet Fatih Amasyali; Çiğdem Erol; | arxiv-cs.CL | 2024-06-30 |
918 | WallFacer: Harnessing Multi-dimensional Ring Parallelism for Efficient Long Sequence Model Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current methods are either constrained by the number of attention heads or excessive communication overheads. To address this problem, we propose WallFacer, a multi-dimensional distributed training system for long sequences, fostering an efficient communication paradigm and providing additional tuning flexibility for communication arrangements. |
ZIMING LIU et. al. | arxiv-cs.DC | 2024-06-30 |
919 | A Method for Tibetan Offensive Language Detection Based on Prompt Learning and Information Theory Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Offensive language on social media is a serious social problem, which affects people’s mental health and social harmony. However, there is a lack of effective detection methods … |
HANG REN et. al. | 2024 International Joint Conference on Neural Networks … | 2024-06-30 |
920 | Ensemble Learning with Pre-Trained Transformers for Crash Severity Classification: A Deep NLP Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transfer learning has gained significant traction in natural language processing due to the emergence of state-of-the-art pre-trained language models (P.L.M.s). Unlike traditional … |
Shadi Jaradat; Richi Nayak; Alexander Paz; Mohammed Elhenawy; | Algorithms | 2024-06-30 |
921 | LLM-Generated Natural Language Meets Scaling Laws: New Explorations and Data Augmentation Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nonetheless, prior research harbors two primary concerns: firstly, a lack of contemplation regarding whether the natural language generated by LLM (LLMNL) truly aligns with human natural language (HNL), a critical foundational question; secondly, an oversight that augmented data is randomly generated by LLM, implying that not all data may possess equal training value, which could impede the performance of classifiers. To address these challenges, we introduce the scaling laws to intrinsically calculate LLMNL and HNL. |
Zhenhua Wang; Guang Xu; Ming Ren; | arxiv-cs.CL | 2024-06-29 |
922 | Optimizing Uyghur Speech Synthesis By Combining Pretrained Cross-Lingual Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: End-to-end speech synthesis methodologies have exhibited considerable advancements for languages with abundant corpus resources. Nevertheless, such achievements are yet to be … |
Kexin Lu; Zhihua Huang; Mingming Yin; Ke Chen; | ACM Transactions on Asian and Low-Resource Language … | 2024-06-28 |
923 | Machine Learning Predictors for Min-Entropy Estimation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Utilizing data from Generalized Binary Autoregressive Models, a subset of Markov processes, we demonstrate that machine learning models (including a hybrid of convolutional and recurrent Long Short-Term Memory layers and the transformer-based GPT-2 model) outperform traditional NIST SP 800-90B predictors in certain scenarios. |
Javier Blanco-Romero; Vicente Lorenzo; Florina Almenares Mendoza; Daniel Díaz-Sánchez; | arxiv-cs.LG | 2024-06-28 |
924 | Can GPT-4 Help Detect Quit Vaping Intentions? An Exploration of Automatic Data Annotation Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we extracted a sample dataset from one vaping sub-community on Reddit to analyze users’ quit-vaping intentions. |
SAI KRISHNA REVANTH VURUMA et. al. | arxiv-cs.CL | 2024-06-28 |
925 | Fine-tuned Network Relies on Generic Representation to Solve Unseen Cognitive Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning pretrained language models has shown promising results on a wide range of tasks, but when encountering a novel task, do they rely more on generic pretrained representation, or develop brand new task-specific solutions? |
Dongyan Lin; | arxiv-cs.LG | 2024-06-27 |
926 | The Model Arena for Cross-lingual Sentiment Analysis: A Comparative Study in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Sentiment analysis serves as a pivotal component in Natural Language Processing (NLP). |
Xiliang Zhu; Shayna Gardiner; Tere Roldán; David Rossouw; | arxiv-cs.CL | 2024-06-27 |
927 | FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose FRED, a wafer-scale interconnect that is tailored for the high-BW requirements of wafer-scale networks and can efficiently execute communication patterns of different parallelization strategies. |
Saeed Rashidi; William Won; Sudarshan Srinivasan; Puneet Gupta; Tushar Krishna; | arxiv-cs.AR | 2024-06-27 |
928 | BADGE: BADminton Report Generation and Evaluation with LLM Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel framework named BADGE, designed for this purpose using LLM. |
Shang-Hsuan Chiang; Lin-Wei Chao; Kuang-Da Wang; Chih-Chuan Wang; Wen-Chih Peng; | arxiv-cs.CL | 2024-06-26 |
929 | Automating Clinical Trial Eligibility Screening: Quantitative Analysis of GPT Models Versus Human Expertise Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Objective: This study quantitatively assesses the performance of the GPT model in classifying patient eligibility for clinical trials, aiming to minimize the need for expert clinical … |
ARTI DEVI et. al. | Proceedings of the 17th International Conference on … | 2024-06-26 |
930 | A Pyramid Gaussian Pooling Based CNN and Transformer Hybrid Network for Smoke Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Visual smoke semantic segmentation is a challenging task due to semi‐transparency, variable shapes, and complex textures of smoke. To improve segmentation performance, a … |
Guiqian Wang; Feiniu Yuan; Hongdi Li; Zhijun Fang; | IET Image Process. | 2024-06-26 |
931 | Autonomous Prompt Engineering in Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Prompt engineering is a crucial yet challenging task for optimizing the performance of large language models (LLMs) on customized tasks. This pioneering research introduces the … |
Daan Kepel; Konstantina Valogianni; | ArXiv | 2024-06-25 |
932 | SetBERT: Enhancing Retrieval Performance for Boolean Logic and Set Operation Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce SetBERT, a fine-tuned BERT-based model designed to enhance query embeddings for set operations and Boolean logic queries, such as Intersection (AND), Difference (NOT), and Union (OR). |
Quan Mai; Susan Gauch; Douglas Adams; | arxiv-cs.CL | 2024-06-25 |
933 | This Paper Had The Smartest Reviewers — Flattery Detection Utilising An Audio-Textual Transformer-Based Approach Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Its automatic detection can thus enhance the naturalness of human-AI interactions. To meet this need, we present a novel audio-textual dataset comprising 20 hours of speech and train machine learning models for automatic flattery detection. |
LUKAS CHRIST et. al. | arxiv-cs.SD | 2024-06-25 |
934 | Improving Entity Recognition Using Ensembles of Deep Learning and Fine-tuned Large Language Models: A Case Study on Adverse Event Extraction from Multiple Sources Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we utilized reports and posts from the VAERS (n=621), Twitter (n=9,133), and Reddit (n=131) as our corpora. |
YIMING LI et. al. | arxiv-cs.CL | 2024-06-25 |
935 | CTBench: A Comprehensive Benchmark for Evaluating Language Model Capabilities in Clinical Trial Design Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: CTBench is introduced as a benchmark to assess language models (LMs) in aiding clinical study design. |
NAFIS NEEHAL et. al. | arxiv-cs.CL | 2024-06-25 |
936 | This Paper Had The Smartest Reviewers – Flattery Detection Utilising An Audio-Textual Transformer-Based Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Flattery is an important aspect of human communication that facilitates social bonding, shapes perceptions, and influences behavior through strategic compliments and praise, … |
LUKAS CHRIST et. al. | ArXiv | 2024-06-25 |
937 | Unambiguous Recognition Should Not Rely Solely on Natural Language Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This bias stems from the inherent characteristics of the dataset. To mitigate this bias, we propose a LaTeX printed text recognition model trained on a mixed dataset of pseudo-formulas and pseudo-text. |
Renqing Luo; Yuhan Xu; | arxiv-cs.CV | 2024-06-24 |
938 | Exploring The Capability of Mamba in Speech Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compared Mamba with state-of-the-art Transformer variants for various speech applications, including ASR, text-to-speech, spoken language understanding, and speech summarization. |
Koichi Miyazaki; Yoshiki Masuyama; Masato Murata; | arxiv-cs.SD | 2024-06-24 |
939 | GPT-4V Explorations: Mining Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the application of the GPT-4V(ision) large visual language model to autonomous driving in mining environments, where traditional systems often falter in understanding intentions and making accurate decisions during emergencies. |
Zixuan Li; | arxiv-cs.CV | 2024-06-24 |
940 | Exploring Factual Entailment with NLI: A News Media Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the relationship between factuality and Natural Language Inference (NLI) by introducing FactRel, a novel annotation scheme that models factual rather than textual entailment, and use it to annotate a dataset of naturally occurring sentences from news articles. |
Guy Mor-Lan; Effi Levi; | arxiv-cs.CL | 2024-06-24 |
941 | DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present DreamBench++, a human-aligned benchmark automated by advanced multimodal GPT models. |
YUANG PENG et. al. | arxiv-cs.CV | 2024-06-24 |
942 | Using GPT-4 Turbo to Automatically Identify Defeaters in Assurance Cases Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Assurance cases (ACs) are convincing arguments, supported by a body of evidence and aiming at demonstrating that a system will function as intended. Producers of systems can rely … |
K. K. SHAHANDASHTI et. al. | 2024 IEEE 32nd International Requirements Engineering … | 2024-06-24 |
943 | Exploring The Capabilities of Large Language Models for The Generation of Safety Cases: The Case of GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The emergence of large language models (LLMs) and conversational interfaces, exemplified by ChatGPT, is nothing short of revolutionary. While their potential is undeniable across … |
Mithila Sivakumar; A. B. Belle; Jinjun Shan; K. K. Shahandashti; | 2024 IEEE 32nd International Requirements Engineering … | 2024-06-24 |
944 | The GPT-WritingPrompts Dataset: A Comparative Analysis of Character Portrayal in Short Stories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We quantify and compare the emotional and descriptive features of storytelling from both generative processes, human and machine, along a set of six dimensions. |
Xi Yu Huang; Krishnapriya Vishnubhotla; Frank Rudzicz; | arxiv-cs.CL | 2024-06-24 |
945 | OlympicArena Medal Ranks: Who Is The Most Intelligent AI So Far? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this report, we pose the following question: Who is the most intelligent AI model to date, as measured by the OlympicArena (an Olympic-level, multi-discipline, multi-modal benchmark for superintelligent AI)? |
Zhen Huang; Zengzhi Wang; Shijie Xia; Pengfei Liu; | arxiv-cs.CL | 2024-06-24 |
946 | Multi-Scale Temporal Difference Transformer for Video-Text Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they commonly neglect the transformer's inferior ability to model local temporal information. To tackle this problem, we propose a transformer variant named Multi-Scale Temporal Difference Transformer (MSTDT). |
Ni Wang; Dongliang Liao; Xing Xu; | arxiv-cs.CV | 2024-06-23 |
947 | GraphEval2000: Benchmarking and Improving Large Language Models on Graph Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, recent studies have identified limitations in LLMs’ ability to reason about graph-structured data. To address this gap, we introduce GraphEval2000, the first comprehensive graph dataset, comprising 40 graph data structure problems along with 2000 test cases. |
Qiming Wu; Zichen Chen; Will Corcoran; Misha Sra; Ambuj K. Singh; | arxiv-cs.AI | 2024-06-23 |
948 | Beyond Individual Facts: Investigating Categorical Knowledge Locality of Taxonomy and Meronomy Concepts in GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate a broader view of knowledge location, that of concepts or clusters of related information, instead of disparate individual facts. |
Christopher Burger; Yifan Hu; Thai Le; | arxiv-cs.LG | 2024-06-22 |
949 | The Role of Generative AI in Qualitative Research: GPT-4’s Contributions to A Grounded Theory Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We present reflections on our experience using a generative AI model in qualitative research, to illuminate the AI’s contributions to our analytic process. Our analytic focus was … |
Ravi Sinha; Idris Solola; Ha Nguyen; H. Swanson; LuEttaMae Lawrence; | Proceedings of the 2024 Symposium on Learning, Design and … | 2024-06-21 |
950 | How Effective Is GPT-4 Turbo in Generating School-Level Questions from Textbooks Based on Bloom’s Revised Taxonomy? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate the effectiveness of GPT-4 Turbo in generating educational questions from NCERT textbooks in zero-shot mode. |
Subhankar Maity; Aniket Deroy; Sudeshna Sarkar; | arxiv-cs.CL | 2024-06-21 |
951 | Metacognitive Prompting Improves Understanding in Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce Metacognitive Prompting (MP), a strategy inspired by human introspective reasoning processes. |
Yuqing Wang; Yun Zhao; | naacl | 2024-06-20 |
952 | VertAttack: Taking Advantage of Text Classifiers' Horizontal Vision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Vertically written words will not be recognized by a classifier. In contrast, humans are easily able to recognize and read words written both horizontally and vertically. Hence, a human adversary could write problematic words vertically and the meaning would still be preserved to other humans. We simulate such an attack, VertAttack. |
Jonathan Rusert; | naacl | 2024-06-20 |
953 | Unveiling Divergent Inductive Biases of LLMs on Temporal Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the adeptness of Large Language Models (LLMs) in discerning patterns and relationships from data, their inherent comprehension of temporal dynamics remains a formidable challenge. This research meticulously explores these intrinsic challenges within LLMs, with a specific emphasis on evaluating the performance of GPT-3. |
Sindhu Kishore; Hangfeng He; | naacl | 2024-06-20 |
954 | MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to perform a thorough evaluation of the non-English capabilities of SoTA LLMs (GPT-3. |
SANCHIT AHUJA et. al. | naacl | 2024-06-20 |
955 | Does GPT Really Get It? A Hierarchical Scale to Quantify Human Vs AI’s Understanding of Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use the hierarchy to design and conduct a study with human subjects (undergraduate and graduate students) as well as large language models (generations of GPT), revealing interesting similarities and differences. |
Mirabel Reid; Santosh S. Vempala; | arxiv-cs.AI | 2024-06-20 |
956 | CryptoGPT: A 7B Model Rivaling GPT-4 in The Task of Analyzing and Classifying Real-time Financial News Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: CryptoGPT: a 7B model competing with GPT-4 in a specific task: The Impact of Automatic Annotation and Strategic Fine-Tuning via QLoRA. In this article, we present a method aimed at refining a dedicated LLM of reasonable quality with limited resources in an industrial setting via CryptoGPT. |
Ying Zhang; Matthieu Petit Guillaume; Aurélien Krauth; Manel Labidi; | arxiv-cs.AI | 2024-06-20 |
957 | VLM Agents Generate Their Own Memories: Distilling Experience Into Embodied Programs of Thought Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce In-Context Abstraction Learning (ICAL), which iteratively refines suboptimal trajectories into high-quality data with optimized actions and detailed reasoning. |
GABRIEL SARCH et. al. | arxiv-cs.CV | 2024-06-20 |
958 | Removing RLHF Protections in GPT-4 Via Fine-Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show the contrary: fine-tuning allows attackers to remove RLHF protections with as few as 340 examples and a 95% success rate. |
QIUSI ZHAN et. al. | naacl | 2024-06-20 |
959 | Struc-Bench: Are Large Language Models Good at Generating Complex Structured Tabular Data? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study assesses LLMs' proficiency in structuring tables and introduces a novel fine-tuning method, cognizant of data structures, to bolster their performance. |
XIANGRU TANG et. al. | naacl | 2024-06-20 |
960 | A Symbolic Framework for Evaluating Mathematical Reasoning and Generalisation with Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a methodology for generating and perturbing detailed derivations of equations at scale, aided by a symbolic engine, to evaluate the generalisability of Transformers to out-of-distribution mathematical reasoning problems. |
Jordan Meadows; Marco Valentino; Damien Teney; Andre Freitas; | naacl | 2024-06-20 |
961 | A Continued Pretrained LLM Approach for Automatic Medical Note Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce HEAL, the first continuously trained 13B LLaMA2-based LLM that is purpose-built for medical conversations and measured on automated scribing. |
DONG YUAN et. al. | naacl | 2024-06-20 |
962 | Toward Informal Language Processing: Knowledge of Slang in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Using movie subtitles, we construct a dataset that supports evaluation on a diverse set of tasks pertaining to automatic processing of slang. |
Zhewei Sun; Qian Hu; Rahul Gupta; Richard Zemel; Yang Xu; | naacl | 2024-06-20 |
963 | Communication-Efficient Byzantine-Resilient Federated Zero-Order Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CYBER-0, the first zero-order optimization algorithm for memory-and-communication efficient Federated Learning, resilient to Byzantine faults. |
Afonso de Sá Delgado Neto; Maximilian Egger; Mayank Bakshi; Rawad Bitar; | arxiv-cs.LG | 2024-06-20 |
964 | CPopQA: Ranking Cultural Concept Popularity By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the extent to which an LLM effectively captures corpus-level statistical trends of concepts for reasoning, especially long-tail ones, is largely underexplored. In this study, we introduce a novel few-shot question-answering task (CPopQA) that examines LLMs’ statistical ranking abilities for long-tail cultural concepts (e.g., holidays), particularly focusing on these concepts’ popularity in the United States and the United Kingdom, respectively. |
Ming Jiang; Mansi Joshi; | naacl | 2024-06-20 |
965 | Transformers Can Represent N-gram Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the relationship between transformer LMs and n-gram LMs, a simple and historically relevant class of language models. |
Anej Svete; Ryan Cotterell; | naacl | 2024-06-20 |
966 | Does GPT-4 Pass The Turing Test? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: AI models with the ability to masquerade as humans could have widespread societal consequences, and we analyse the effectiveness of different strategies and criteria for judging humanlikeness. |
Cameron Jones; Ben Bergen; | naacl | 2024-06-20 |
967 | ChatGPT As Research Scientist: Probing GPT’s Capabilities As A Research Librarian, Research Ethicist, Data Generator and Data Predictor IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: How good a research scientist is ChatGPT? We systematically probed the capabilities of GPT-3.5 and GPT-4 across four central components of the scientific process: as a Research … |
Steven A. Lehr; Aylin Caliskan; Suneragiri Liyanage; Mahzarin R. Banaji; | arxiv-cs.AI | 2024-06-20 |
968 | SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using Training Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these models are extremely memory intensive during their fine-tuning process, making them difficult to deploy on GPUs with limited memory resources. To address this issue, we introduce a new tool called SlimFit that reduces the memory requirements of these models by dynamically analyzing their training dynamics and freezing less-contributory layers during fine-tuning. |
ARASH ARDAKANI et. al. | naacl | 2024-06-20 |
969 | On Retrieval Augmentation and The Limitations of Language Model Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we rule out one previously posited possibility: the “softmax bottleneck.” |
TING-RUI CHIANG et. al. | naacl | 2024-06-20 |
970 | Revisiting Zero-Shot Abstractive Summarization in The Era of Large Language Models from The Perspective of Position Bias IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We characterize and study zero-shot abstractive summarization in Large Language Models (LLMs) by measuring position bias, which we propose as a general formulation of the more restrictive lead bias phenomenon studied previously in the literature. |
Anshuman Chhabra; Hadi Askari; Prasant Mohapatra; | naacl | 2024-06-20 |
971 | SemRoDe: Macro Adversarial Training to Learn Representations That Are Robust to Word-Level Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial Training strategy to enhance the robustness of LMs. |
Brian Formento; Wenjie Feng; Chuan-Sheng Foo; Anh Tuan Luu; See-Kiong Ng; | naacl | 2024-06-20 |
972 | A Decision-Making GPT Model Augmented with Entropy Regularization for Autonomous Vehicles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, the decision-making challenges associated with autonomous vehicles are conceptualized through the framework of the Constrained Markov Decision Process (CMDP) and approached as a sequence modeling problem. |
JIAQI LIU et. al. | arxiv-cs.RO | 2024-06-19 |
973 | Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates how LLMs, specifically GPT-3.5 and GPT-4, can develop tailored questions for Grade 9 math, aligning with active learning principles. |
Hamdireza Rouzegar; Masoud Makrehchi; | arxiv-cs.CL | 2024-06-19 |
974 | Putting GPT-4o to The Sword: A Comprehensive Evaluation of Language, Vision, Speech, and Multimodal Proficiency IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: As large language models (LLMs) continue to advance, evaluating their comprehensive capabilities becomes significant for their application in various fields. This research study … |
SAKIB SHAHRIAR et. al. | ArXiv | 2024-06-19 |
975 | Fine-Tuning BERTs for Definition Extraction from Mathematical Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we fine-tuned three pre-trained BERT models on the task of definition extraction from mathematical English written in LaTeX. |
Lucy Horowitz; Ryan Hathaway; | arxiv-cs.CL | 2024-06-19 |
976 | Generating Educational Materials with Different Levels of Readability Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the leveled-text generation task, aiming to rewrite educational materials to specific readability levels while preserving meaning. |
Chieh-Yang Huang; Jing Wei; Ting-Hao ‘Kenneth’ Huang; | arxiv-cs.CL | 2024-06-18 |
977 | ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce ChatGLM, an evolving family of large language models that we have been developing over time. |
TEAM GLM et. al. | arxiv-cs.CL | 2024-06-18 |
978 | SwinStyleformer Is A Favorable Choice for Image Inversion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the first pure Transformer structure inversion network called SwinStyleformer, which can compensate for the shortcomings of the CNN inversion framework by handling long-range dependencies and learning the global structure of objects. |
Jiawei Mao; Guangyi Zhao; Xuesong Yin; Yuanqi Chang; | arxiv-cs.CV | 2024-06-18 |
979 | Reality Check: Assessing GPT-4 in Fixing Real-World Software Vulnerabilities Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Discovering and mitigating software vulnerabilities is a challenging task. These vulnerabilities are often caused by simple, otherwise (and in other contexts) harmless code … |
ZOLTÁN SÁGODI et. al. | Proceedings of the 28th International Conference on … | 2024-06-18 |
980 | What Makes Two Language Models Think Alike? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Do architectural differences significantly affect the way models represent and process language? We propose a new approach, based on metric-learning encoding models (MLEMs), as a first step to answer this question. |
Jeanne Salle; Louis Jalouzot; Nur Lan; Emmanuel Chemla; Yair Lakretz; | arxiv-cs.CL | 2024-06-18 |
981 | ChatGPT: Perspectives from Human–computer Interaction and Psychology Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The release of GPT-4 has garnered widespread attention across various fields, signaling the impending widespread adoption and application of Large Language Models (LLMs). However, … |
Jiaxi Liu; | Frontiers in Artificial Intelligence | 2024-06-18 |
982 | Vernacular? I Barely Know Her: Challenges with Style Control and Stereotyping Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a thorough analysis and discussion of the results. |
ANKIT AICH et. al. | arxiv-cs.CL | 2024-06-18 |
983 | Minimal Self in Humanoid Robot Alter3 Driven By Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Alter3, a humanoid robot that demonstrates spontaneous motion generation through the integration of GPT-4, a Large Language Model (LLM). |
Takahide Yoshida; Suzune Baba; Atsushi Masumori; Takashi Ikegami; | arxiv-cs.RO | 2024-06-17 |
984 | Problematic Tokens: Tokenizer Bias in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This misrepresentation results in the propagation of under-trained or untrained tokens, which perpetuate biases and pose serious concerns related to data security and ethical standards. We aim to dissect the tokenization mechanics of GPT-4o, illustrating how its simplified token-handling methods amplify these risks and offer strategic solutions to mitigate associated security and ethical issues. |
Jin Yang; Zhiqiang Wang; Yanbin Lin; Zunduo Zhao; | arxiv-cs.CL | 2024-06-17 |
985 | A Two-dimensional Zero-shot Dialogue State Tracking Evaluation Method Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a two-dimensional zero-shot evaluation method for DST using GPT-4, which divides the evaluation into two dimensions: accuracy and completeness. |
Ming Gu; Yan Yang; | arxiv-cs.CL | 2024-06-17 |
986 | Cultural Conditioning or Placebo? On The Effectiveness of Socio-Demographic Prompting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we systematically probe four LLMs (Llama 3, Mistral v0.2, GPT-3.5 Turbo and GPT-4) with prompts that are conditioned on culturally sensitive and non-sensitive cues, on datasets that are supposed to be culturally sensitive (EtiCor and CALI) or neutral (MMLU and ETHICS). |
SAGNIK MUKHERJEE et. al. | arxiv-cs.CL | 2024-06-17 |
987 | Significant Productivity Gains Through Programming with Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models like GPT and Codex drastically alter many daily tasks, including programming, where they can rapidly generate code from natural language or informal … |
Thomas Weber; Maximilian Brandmaier; Albrecht Schmidt; Sven Mayer; | Proceedings of the ACM on Human-Computer Interaction | 2024-06-17 |
988 | Look Further Ahead: Testing The Limits of GPT-4 in Path Planning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, they still face challenges with long-horizon planning. To study this, we propose path planning tasks as a platform to evaluate LLMs’ ability to navigate long trajectories under geometric constraints. |
Mohamed Aghzal; Erion Plaku; Ziyu Yao; | arxiv-cs.AI | 2024-06-17 |
989 | Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we identify a pitfall of vanilla iterative DPO – improved response quality can lead to increased verbosity. |
JIE LIU et. al. | arxiv-cs.CL | 2024-06-17 |
990 | GPT-Powered Elicitation Interview Script Generator for Requirements Engineering Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We employ a prompt chaining approach to mitigate the output length constraint of GPT to be able to generate thorough and detailed interview scripts. |
Binnur Görer; Fatma Başak Aydemir; | arxiv-cs.SE | 2024-06-17 |
991 | DB-GPT-Hub: Towards Open Benchmarking Text-to-SQL Empowered By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present DB-GPT-Hub, an open benchmark suite for LLM-empowered text-to-SQL, which primarily focuses on tuning LLMs at large scales. |
FAN ZHOU et. al. | arxiv-cs.DB | 2024-06-17 |
992 | WellDunn: On The Robustness and Explainability of Language Models and Large Language Models in Identifying Wellness Dimensions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Language Models (LMs) are being proposed for mental health applications where the heightened risk of adverse outcomes means predictive performance may not be a sufficient litmus test of a model’s utility in clinical practice. |
SEYEDALI MOHAMMADI et. al. | arxiv-cs.AI | 2024-06-17 |
993 | ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Consequently, we present Video Diffusion GPT (ViD-GPT). |
Kaifeng Gao; Jiaxin Shi; Hanwang Zhang; Chunping Wang; Jun Xiao; | arxiv-cs.CV | 2024-06-16 |
994 | Dyamond: A 1T1C DRAM In-memory Computing Accelerator with Compact MAC-SIMD and Adaptive Column Addition Dataflow Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We propose Dyamond, a 1T1C DRAM in-memory computing accelerator with column addition (CA) dataflow, for high density and energy efficiency. LSB-CA minimizes ADC readouts to … |
SEONGYON HONG et. al. | 2024 IEEE Symposium on VLSI Technology and Circuits (VLSI … | 2024-06-16 |
995 | KGPA: Robustness Evaluation for Large Language Models Via Cross-Domain Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a framework that systematically evaluates the robustness of LLMs under adversarial attack scenarios by leveraging knowledge graphs (KGs). |
AIHUA PEI et. al. | arxiv-cs.CL | 2024-06-16 |
996 | Exposing The Achilles’ Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel dataset MWP-MISTAKE, incorporating MWPs with both correct and incorrect reasoning steps generated through rule-based methods and smaller language models. |
Joykirat Singh; Akshay Nambi; Vibhav Vineet; | arxiv-cs.CL | 2024-06-16 |
997 | Large Language Models for Automatic Milestone Detection in Group Discussions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate an LLM’s performance on recordings of a group oral communication task in which utterances are often truncated or not well-formed. |
ZHUOXU DUAN et. al. | arxiv-cs.CL | 2024-06-16 |
998 | Generating Tables from The Parametric Knowledge of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For evaluation, we introduce a novel benchmark, WikiTabGen which contains 100 curated Wikipedia tables. |
Yevgeni Berkovitch; Oren Glickman; Amit Somech; Tomer Wolfson; | arxiv-cs.CL | 2024-06-16 |
999 | Breaking Boundaries: Investigating The Effects of Model Editing on Cross-linguistic Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper strategically identifies the need for linguistic equity by examining several knowledge editing techniques in multilingual contexts. |
SOMNATH BANERJEE et. al. | arxiv-cs.CL | 2024-06-16 |
1000 | Distilling Opinions at Scale: Incremental Opinion Summarization Using XL-OPSUMM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To mitigate this, we propose a scalable framework called Xl-OpSumm that generates summaries incrementally. |
SRI RAGHAVA MUDDU et. al. | arxiv-cs.CL | 2024-06-16 |
1001 | ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning Via Shared Low-Rank Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces an approach to optimize Parameter Efficient Fine Tuning (PEFT) for Pretrained Language Models (PLMs) by implementing a Shared Low Rank Adaptation (ShareLoRA). |
Yurun Song; Junchen Zhao; Ian G. Harris; Sangeetha Abdu Jyothi; | arxiv-cs.CL | 2024-06-15 |
1002 | Multilingual Large Language Models and Curse of Multilinguality Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Multilingual Large Language Models (LLMs) have gained wide popularity among Natural Language Processing (NLP) researchers and practitioners. These models, trained on huge datasets, show proficiency across various languages and demonstrate effectiveness in numerous downstream tasks. |
Daniil Gurgurov; Tanja Bäumel; Tatiana Anikina; | arxiv-cs.CL | 2024-06-15 |
1003 | GPT-Fabric: Folding and Smoothing Fabric By Leveraging Pre-Trained Foundation Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Fabric manipulation has applications in folding blankets, handling patient clothing, and protecting items with covers. It is challenging for robots to perform fabric manipulation … |
Vedant Raval; Enyu Zhao; Hejia Zhang; S. Nikolaidis; Daniel Seita; | ArXiv | 2024-06-14 |
1004 | GPT-4o: Visual Perception Performance of Multimodal Large Language Models in Piglet Activity Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The initial evaluation experiments in this study validate the potential of multimodal large language models in livestock scene video understanding and provide new directions and references for future research on animal behavior video understanding. |
Yiqi Wu; Xiaodan Hu; Ziming Fu; Siling Zhou; Jiangong Li; | arxiv-cs.CV | 2024-06-14 |
1005 | Optimizing Layer-Fused Scheduling of Transformer Networks on Multi-accelerator Platforms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work enables extensive hardware/mapping exploration by extending the DSE framework Stream towards support for transformers across a wide variety of hardware architectures and different execution schedules. |
Steven Colleman; Arne Symons; Victor J. B. Jung; Marian Verhelst; | arxiv-cs.AR | 2024-06-14 |
1006 | Mean-Shift Feature Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Transformer models developed in NLP make a great impact on computer vision fields, producing promising performance on various tasks. |
Takumi Kobayashi; | cvpr | 2024-06-13 |
1007 | General Point Model Pretraining with Autoencoding and Autoregressive Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the General Language Model, we propose a General Point Model (GPM) that seamlessly integrates autoencoding and autoregressive tasks in a point cloud transformer. |
ZHE LI et. al. | cvpr | 2024-06-13 |
1008 | GPT-ology, Computational Models, Silicon Sampling: How Should We Think About LLMs in Cognitive Science? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models have taken the cognitive science world by storm. It is perhaps timely now to take stock of the various research paradigms that have been used to make … |
Desmond C. Ong; | ArXiv | 2024-06-13 |
1009 | SDPose: Tokenized Pose Estimation Via Circulation-Guide Self-Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Those transformer-based models that require fewer resources are prone to under-fitting due to their smaller scale and thus perform notably worse than their larger counterparts. Given this conundrum, we introduce SDPose, a new self-distillation method for improving the performance of small transformer-based models. |
SICHEN CHEN et. al. | cvpr | 2024-06-13 |
1010 | GPT-Fabric: Smoothing and Folding Fabric By Leveraging Pre-Trained Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose GPT-Fabric for the canonical tasks of fabric smoothing and folding, where GPT directly outputs an action informing a robot where to grasp and pull a fabric. |
Vedant Raval; Enyu Zhao; Hejia Zhang; Stefanos Nikolaidis; Daniel Seita; | arxiv-cs.RO | 2024-06-13 |
1011 | Talking Heads: Understanding Inter-layer Communication in Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We analyze a mechanism used in two LMs to selectively inhibit items in a context in one task, and find that it underlies a commonly used abstraction across many context-retrieval behaviors. |
Jack Merullo; Carsten Eickhoff; Ellie Pavlick; | arxiv-cs.CL | 2024-06-13 |
1012 | MoMask: Generative Masked Modeling of 3D Human Motions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce MoMask, a novel masked modeling framework for text-driven 3D human motion generation. |
Chuan Guo; Yuxuan Mu; Muhammad Gohar Javed; Sen Wang; Li Cheng; | cvpr | 2024-06-13 |
1013 | MoST: Motion Style Transformer Between Diverse Action Contents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This challenge lies in the lack of clear separation between content and style of a motion. To tackle this challenge, we propose a novel motion style transformer that effectively disentangles style from content and generates a plausible motion with transferred style from a source motion. |
Boeun Kim; Jungho Kim; Hyung Jin Chang; Jin Young Choi; | cvpr | 2024-06-13 |
1014 | Condition-Aware Neural Network for Controlled Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. |
Han Cai; Muyang Li; Qinsheng Zhang; Ming-Yu Liu; Song Han; | cvpr | 2024-06-13 |
1015 | OmniMotionGPT: Animal Motion Generation with Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our paper aims to generate diverse and realistic animal motion sequences from textual descriptions without a large-scale animal text-motion dataset. |
ZHANGSIHAO YANG et. al. | cvpr | 2024-06-13 |
1016 | Permutation Equivariance of Transformers and Its Applications Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose our definition of permutation equivariance, a broader concept covering both inter- and intra-token permutation in the forward and backward propagation of neural networks. |
HENGYUAN XU et. al. | cvpr | 2024-06-13 |
1017 | GPT-4V(ision) Is A Human-Aligned Evaluator for Text-to-3D Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents an automatic, versatile, and human-aligned evaluation metric for text-to-3D generative models. |
TONG WU et. al. | cvpr | 2024-06-13 |
1018 | Complex Image-Generative Diffusion Transformer for Audio Denoising Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to enhance audio denoising performance, this paper introduces a complex image-generative diffusion transformer that captures more information from the complex Fourier domain. |
Junhui Li; Pu Wang; Jialu Li; Youshan Zhang; | arxiv-cs.SD | 2024-06-13 |
1019 | Visual Transformer with Differentiable Channel Selection: An Information Bottleneck Inspired Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel and compact transformer block, Transformer with Differentiable Channel Selection, or DCS-Transformer. |
Yancheng Wang; Ping Li; Yingzhen Yang; | icml | 2024-06-12 |
1020 | Fine-Tuned ‘Small’ LLMs (Still) Significantly Outperform Zero-Shot Generative AI Models in Text Classification IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that smaller, fine-tuned LLMs (still) consistently and significantly outperform larger, zero-shot prompted models in text classification. |
Martin Juan José Bucher; Marco Martini; | arxiv-cs.CL | 2024-06-12 |
1021 | Rethinking Generative Large Language Model Evaluation for Semantic Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we introduce an RWQ-Elo rating system, engaging 24 LLMs such as GPT-4, GPT-3.5, Google-Gemini-Pro and LLaMA-1/-2, in a two-player competitive format, with GPT-4 serving as the judge. |
Fangyun Wei; Xi Chen; Lin Luo; | icml | 2024-06-12 |
1022 | Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these challenges, we first introduce LoCoV1, a 12 task benchmark constructed to measure long-context retrieval where chunking is not possible or not effective. We next present the M2-BERT retrieval encoder, an 80M parameter state-space encoder model built from the Monarch Mixer architecture, capable of scaling to documents up to 32K tokens long. |
Jon Saad-Falcon; Daniel Y Fu; Simran Arora; Neel Guha; Christopher Re; | icml | 2024-06-12 |
1023 | Label-aware Hard Negative Sampling Strategies with Momentum Contrastive Learning for Implicit Hate Speech Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Label-aware Hard Negative sampling strategies (LAHN) that encourage the model to learn detailed features from hard negative samples, instead of naive negative samples in random batch, using momentum-integrated contrastive learning. |
Jaehoon Kim; Seungwan Jin; Sohyun Park; Someen Park; Kyungsik Han; | arxiv-cs.CL | 2024-06-12 |
1024 | Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by previous theoretical studies of the static version of the attention multiplication problem [Zandieh, Han, Daliri, and Karbasi ICML 2023, Alman and Song NeurIPS 2023], we formally define a dynamic version of the attention matrix multiplication problem. |
Jan van den Brand; Zhao Song; Tianyi Zhou; | icml | 2024-06-12 |
1025 | Stealing Part of A Production Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI’s ChatGPT or Google’s PaLM-2. |
NICHOLAS CARLINI et. al. | icml | 2024-06-12 |
1026 | What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the capabilities of the transformer architecture with varying depth. |
Xingwu Chen; Difan Zou; | icml | 2024-06-12 |
1027 | AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, achieving faithful attributions for the entirety of a black-box transformer model and maintaining computational efficiency is an unsolved challenge. By extending the Layer-wise Relevance Propagation attribution method to handle attention layers, we address these challenges effectively. |
REDUAN ACHTIBAT et. al. | icml | 2024-06-12 |
1028 | Long Is More for Alignment: A Simple But Tough-to-Beat Baseline for Instruction Fine-Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: LIMA (NeurIPS 2023) and AlpaGasus (ICLR 2024) are state-of-the-art methods for selecting such high-quality examples, either via manual curation or using GPT-3.5-Turbo as a quality scorer. We show that the extremely simple baseline of selecting the 1,000 instructions with longest responses—that intuitively contain more learnable information and are harder to overfit—from standard datasets can consistently outperform these sophisticated methods according to GPT-4 and PaLM-2 as judges, while remaining competitive on the Open LLM benchmarks that test factual knowledge. |
Hao Zhao; Maksym Andriushchenko; Francesco Croce; Nicolas Flammarion; | icml | 2024-06-12 |
1029 | How Language Model Hallucinations Can Snowball IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: A major risk of using language models in practical applications is their tendency to hallucinate incorrect statements. To do this, we construct three question-answering datasets where LMs often state an incorrect answer which is followed by an explanation with at least one incorrect claim. |
Muru Zhang; Ofir Press; William Merrill; Alisa Liu; Noah A. Smith; | icml | 2024-06-12 |
1030 | PolySketchFormer: Fast Transformers Via Sketching Polynomial Kernels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent theoretical results indicate the intractability of sub-quadratic softmax attention approximation under reasonable complexity assumptions. This paper addresses this challenge by first demonstrating that polynomial attention with high degree can effectively replace softmax without sacrificing model quality. |
Praneeth Kacham; Vahab Mirrokni; Peilin Zhong; | icml | 2024-06-12 |
1031 | Trainable Transformer in Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a new efficient construction, Transformer in Transformer (in short, TINT), that allows a transformer to simulate and fine-tune more complex models during inference (e.g., pre-trained language models). |
Abhishek Panigrahi; Sadhika Malladi; Mengzhou Xia; Sanjeev Arora; | icml | 2024-06-12 |
1032 | Ditto: Quantization-aware Secure Inference of Transformers Upon MPC Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the integration of quantization widely used in plaintext inference into the MPC domain remains unclear. To bridge this gap, we propose the framework named Ditto to enable more efficient quantization-aware secure Transformer inference. |
HAOQI WU et. al. | icml | 2024-06-12 |
1033 | InstructRetro: Instruction Tuning Post Retrieval-Augmented Pretraining IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Retro 48B, the largest LLM pretrained with retrieval. |
BOXIN WANG et. al. | icml | 2024-06-12 |
1034 | Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, future superhuman models will behave in complex ways too difficult for humans to reliably evaluate; humans will only be able to *weakly supervise* superhuman models. We study an analogy to this problem: can weak model supervision elicit the full capabilities of a much stronger model? |
COLLIN BURNS et. al. | icml | 2024-06-12 |
1035 | Timer: Generative Pre-trained Transformers Are Large Time Series Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To change the status quo of training scenario-specific small models from scratch, this paper aims at the early development of large time series models (LTSM). |
YONG LIU et. al. | icml | 2024-06-12 |
1036 | GPT-4V(ision) Is A Generalist Web Agent, If Grounded IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we explore the potential of LMMs like GPT-4V as a generalist web agent that can follow natural language instructions to complete tasks on any given website. |
Boyuan Zheng; Boyu Gou; Jihyung Kil; Huan Sun; Yu Su; | icml | 2024-06-12 |
1037 | In-Context Principle Learning from Mistakes IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Nonetheless, all ICL-based approaches only learn from correct input-output pairs. In this paper, we revisit this paradigm, by learning more from the few given input-output examples. |
TIANJUN ZHANG et. al. | icml | 2024-06-12 |
1038 | Prodigy: An Expeditiously Adaptive Parameter-Free Learner IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Prodigy, an algorithm that provably estimates the distance to the solution $D$, which is needed to set the learning rate optimally. |
Konstantin Mishchenko; Aaron Defazio; | icml | 2024-06-12 |
1039 | Asymmetry in Low-Rank Adapters of Foundation Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by an effort to investigate the different roles of LoRA matrices during fine-tuning, this paper characterizes and leverages unexpected asymmetry in the importance of low-rank adapter matrices. |
JIACHENG ZHU et. al. | icml | 2024-06-12 |
1040 | Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that, for linearized transformer networks, ICL can be made explicit and permanent through the inclusion of bias terms. |
Brian K Chen; Tianyang Hu; Hui Jin; Hwee Kuan Lee; Kenji Kawaguchi; | icml | 2024-06-12 |
1041 | Position: On The Possibilities of AI-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce guidelines on the required text data quantity, either through sample size or sequence length, for reliable AI text detection, through derivations of sample complexity bounds. |
SOURADIP CHAKRABORTY et. al. | icml | 2024-06-12 |
1042 | Prototypical Transformer As Unified Motion Learners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Prototypical Transformer (ProtoFormer), a general and unified framework that approaches various motion tasks from a prototype perspective. |
CHENG HAN et. al. | icml | 2024-06-12 |
1043 | Outlier-Efficient Hopfield Layers for Large Transformer-Based Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an Outlier-Efficient Modern Hopfield Model (termed `OutEffHop`) and use it to address the outlier inefficiency problem of training gigantic transformer-based models. |
JERRY YAO-CHIEH HU et. al. | icml | 2024-06-12 |
1044 | SpikeZIP-TF: Conversion Is All You Need for Transformer-based SNN Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce a novel ANN-to-SNN conversion method called SpikeZIP-TF, where ANN and SNN are exactly equivalent, thus incurring no accuracy degradation. |
KANG YOU et. al. | icml | 2024-06-12 |
1045 | Bilingual Sexism Classification: Fine-Tuned XLM-RoBERTa and GPT-3.5 Few-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to improve sexism identification in bilingual contexts (English and Spanish) by leveraging natural language processing models. |
AmirMohammad Azadi; Baktash Ansari; Sina Zamani; Sauleh Eetemadi; | arxiv-cs.CL | 2024-06-11 |
1046 | Anomaly Detection on Unstable Logs with GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we report on an experimental comparison of a fine-tuned LLM and alternative models for anomaly detection on unstable logs. |
Fatemeh Hadadi; Qinghua Xu; Domenico Bianculli; Lionel Briand; | arxiv-cs.SE | 2024-06-11 |
1047 | LT4SG@SMM4H24: Tweets Classification for Digital Epidemiology of Childhood Health Outcomes Using Pre-Trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents our approaches for the SMM4H24 Shared Task 5 on the binary classification of English tweets reporting children’s medical disorders. |
Dasun Athukoralage; Thushari Atapattu; Menasha Thilakaratne; Katrina Falkner; | arxiv-cs.CL | 2024-06-11 |
1048 | LLM-Powered Multimodal AI Conversations for Diabetes Prevention Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The global prevalence of diabetes remains high despite rising life expectancy with improved quality and access to healthcare services. The significant burden that diabetes imposes … |
Dung Dao; Jun Yi Claire Teo; Wenru Wang; Hoang D. Nguyen; | Proceedings of the 1st ACM Workshop on AI-Powered Q&A … | 2024-06-10 |
1049 | Unveiling The Safety of GPT-4o: An Empirical Study Using Jailbreak Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, this paper adopts a series of multi-modal and uni-modal jailbreak attacks on 4 commonly used benchmarks encompassing three modalities (i.e., text, speech, and image), which involves the optimization of over 4,000 initial text queries and the analysis and statistical evaluation of nearly 8,000 responses on GPT-4o. |
Zonghao Ying; Aishan Liu; Xianglong Liu; Dacheng Tao; | arxiv-cs.CR | 2024-06-10 |
1050 | LLM-dCache: Improving Tool-Augmented LLMs with GPT-Driven Localized Data Caching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce LLM-dCache to optimize data accesses by treating cache operations as callable API functions exposed to the tool-augmented agent. |
SIMRANJIT SINGH et. al. | arxiv-cs.DC | 2024-06-10 |
1051 | SecureNet: A Comparative Study of DeBERTa and Large Language Models for Phishing Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Phishing, whether through email, SMS, or malicious websites, poses a major threat to organizations by using social engineering to trick users into revealing sensitive information. … |
Sakshi Mahendru; Tejul Pandit; | 2024 IEEE 7th International Conference on Big Data and … | 2024-06-10 |
1052 | Improving ROUGE-1 By 6%: A Novel Multilingual Transformer for Abstractive News Summarization Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Natural language processing (NLP) has undergone a significant transformation, evolving from manually crafted rules to powerful deep learning techniques such as transformers. These … |
Sandeep Kumar; Arun Solanki; | Concurr. Comput. Pract. Exp. | 2024-06-10 |
1053 | Validating LLM-Generated Programs with Metamorphic Prompt Testing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Research is required to comprehensively explore these critical concerns surrounding LLM-generated code. In this paper, we propose a novel solution called metamorphic prompt testing to address these challenges. |
Xiaoyin Wang; Dakai Zhu; | arxiv-cs.SE | 2024-06-10 |
1054 | In-Context Learning and Fine-Tuning GPT for Argument Mining Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an ICL strategy for ATC combining kNN-based examples selection and majority vote ensembling. |
Jérémie Cabessa; Hugo Hernault; Umer Mushtaq; | arxiv-cs.CL | 2024-06-10 |
1055 | Symmetric Dot-Product Attention for Efficient Training of BERT Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose an alternative compatibility function for the self-attention mechanism introduced by the Transformer architecture. |
Martin Courtois; Malte Ostendorff; Leonhard Hennig; Georg Rehm; | arxiv-cs.CL | 2024-06-10 |
1056 | Detection of Malicious Smart Contracts By Fine-tuning GPT-3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper introduces a comprehensive framework for the detection and identification of malicious smart contracts, emphasizing their vulnerabilities. The framework leverages the … |
Msvpj Sathvik; Hirak Mazumdar; | Secur. Priv. | 2024-06-09 |
1057 | Hidden Holes: Topological Aspects of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The methods developed in this paper are novel in the field and based on mathematical apparatus that might be unfamiliar to the target audience. |
Stephen Fitz; Peter Romero; Jiyan Jonas Schneider; | arxiv-cs.CL | 2024-06-09 |
1058 | Multi-attribute Auction-based Resource Allocation for Twins Migration in Vehicular Metaverses: A GPT-based DRL Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, resource-intensive VTs updating and high mobility of vehicles require intensive computation, communication, and storage resources, especially for their migration among RSUs with limited coverage. To address these issues, we propose an attribute-aware auction-based mechanism to optimize resource allocation during VTs migration by considering both price and non-monetary attributes, e.g., location and reputation. |
YONGJU TONG et. al. | arxiv-cs.AI | 2024-06-08 |
1059 | MaTableGPT: GPT-based Table Data Extractor from Materials Science Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the tables reported in materials science papers exist in highly diverse forms; thus, rule-based extractions are an ineffective approach. To overcome this challenge, we present MaTableGPT, which is a GPT-based table data extractor from the materials science literature. |
GYEONG HOON YI et. al. | arxiv-cs.CL | 2024-06-08 |
1060 | G-Transformer: Counterfactual Outcome Prediction Under Dynamic and Time-varying Treatment Regimes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present G-Transformer for counterfactual outcome prediction under dynamic and time-varying treatment strategies. |
Hong Xiong; Feng Wu; Leon Deng; Megan Su; Li-wei H Lehman; | arxiv-cs.LG | 2024-06-08 |
1061 | SelfDefend: LLMs Can Defend Themselves Against Jailbreaking in A Practical Manner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by how the traditional security concept of shadow stacks defends against memory overflow attacks, this paper introduces a generic LLM jailbreak defense framework called SelfDefend, which establishes a shadow LLM as a defense instance (in detection state) to concurrently protect the target LLM instance (in normal answering state) in the normal stack and collaborate with it for checkpoint-based access control. |
XUNGUANG WANG et. al. | arxiv-cs.CR | 2024-06-08 |
1062 | Automata Extraction from Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an automata extraction algorithm specifically designed for Transformer models. |
Yihao Zhang; Zeming Wei; Meng Sun; | arxiv-cs.LG | 2024-06-08 |
1063 | Do LLMs Recognize Me, When I Is Not Me: Assessment of LLMs Understanding of Turkish Indexical Pronouns in Indexical Shift Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first study examining indexical shift in any language, releasing a Turkish dataset specifically designed for this purpose. |
Metehan Oğuz; Yusuf Umut Ciftci; Yavuz Faruk Bakman; | arxiv-cs.CL | 2024-06-08 |
1064 | VTrans: Accelerating Transformer Compression with Variational Information Bottleneck Based Pruning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, they require extensive compression time with large datasets to maintain performance in pruned models. To address these challenges, we propose VTrans, an iterative pruning framework guided by the Variational Information Bottleneck (VIB) principle. |
Oshin Dutta; Ritvik Gupta; Sumeet Agarwal; | arxiv-cs.LG | 2024-06-07 |
1065 | BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper outlines our approach to SemEval 2024 Task 9, BRAINTEASER: A Novel Task Defying Common Sense. |
Baktash Ansari; Mohammadmostafa Rostamkhani; Sauleh Eetemadi; | arxiv-cs.CL | 2024-06-07 |
1066 | Transformer Conformal Prediction for Time Series Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a conformal prediction method for time series using the Transformer architecture to capture long-memory and long-range dependencies. |
Junghwan Lee; Chen Xu; Yao Xie; | arxiv-cs.LG | 2024-06-07 |
1067 | Low-Resource Cross-Lingual Summarization Through Few-Shot Learning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the few-shot XLS performance of various models, including Mistral-7B-Instruct-v0.2, GPT-3.5, and GPT-4. |
Gyutae Park; Seojin Hwang; Hwanhee Lee; | arxiv-cs.CL | 2024-06-07 |
1068 | Logic Synthesis with Generative Deep Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a logic synthesis rewriting operator based on the Circuit Transformer model, named ctrw (Circuit Transformer Rewriting), which incorporates the following techniques: (1) a two-stage training scheme for the Circuit Transformer tailored for logic synthesis, with iterative improvement of optimality through self-improvement training; (2) integration of the Circuit Transformer with state-of-the-art rewriting techniques to address scalability issues, allowing for guided DAG-aware rewriting. |
XIHAN LI et. al. | arxiv-cs.LO | 2024-06-07 |
1069 | Can GPT Embeddings Enhance Visual Exploration of Literature Datasets? A Case Study on Isostatic Pressing Research Related Papers Related Patents Related Grants Related Venues Related Experts View |
Hongjiang Lv; Zhibin Niu; Wei Han; Xiang Li; | J. Vis. | 2024-06-07 |
1070 | Mixture-of-Agents Enhances Large Language Model Capabilities IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: With the growing number of LLMs, how to harness the collective expertise of multiple LLMs is an exciting open direction. Toward this goal, we propose a new approach that leverages the collective strengths of multiple LLMs through a Mixture-of-Agents (MoA) methodology. |
Junlin Wang; Jue Wang; Ben Athiwaratkun; Ce Zhang; James Zou; | arxiv-cs.CL | 2024-06-07 |
1071 | GameBench: Evaluating Strategic Reasoning Abilities of LLM Agents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite several demonstrations of using large language models in complex, strategic scenarios, there is no comprehensive framework for evaluating agents’ performance across the various types of reasoning found in games. To address this gap, we introduce GameBench, a cross-domain benchmark for evaluating strategic reasoning abilities of LLM agents. |
ANTHONY COSTARELLI et. al. | arxiv-cs.CL | 2024-06-06 |
1072 | MUSE: Flexible Voiceprint Receptive Fields and Multi-Path Fusion Enhanced Taylor Transformer for U-Net-based Speech Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Multi-path Enhanced Taylor (MET) Transformer based U-net for Speech Enhancement (MUSE), a lightweight speech enhancement network built upon the U-Net architecture. |
Zizhen Lin; Xiaoting Chen; Junyu Wang; | arxiv-cs.SD | 2024-06-06 |
1073 | Exploring The Latest LLMs for Leaderboard Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore three types of contextual inputs to the models: DocTAET (Document Title, Abstract, Experimental Setup, and Tabular Information), DocREC (Results, Experiments, and Conclusions), and DocFULL (entire document). |
Salomon Kabongo; Jennifer D’Souza; Sören Auer; | arxiv-cs.CL | 2024-06-06 |
1074 | Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Interestingly, our study presents conflicting evidence for the role of the quality of KG tuples in generating implicit explanations. |
NEEMESH YADAV et. al. | arxiv-cs.CL | 2024-06-06 |
1075 | From Tarzan to Tolkien: Controlling The Language Proficiency Level of LLMs for Content Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the problem of controlling the difficulty level of text generated by Large Language Models (LLMs) for contexts where end-users are not fully proficient, such as language learners. |
Ali Malik; Stephen Mayhew; Chris Piech; Klinton Bicknell; | arxiv-cs.CL | 2024-06-05 |
1076 | The Good, The Bad, and The Hulk-like GPT: Analyzing Emotional Decisions of Large Language Models in Cooperation and Bargaining Games Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel methodology and the framework to study both, the decision-making of LLMs and their alignment with human behavior under emotional states. |
MIKHAIL MOZIKOV et. al. | arxiv-cs.AI | 2024-06-05 |
1077 | CSI-GPT: Integrating Generative Pre-Trained Transformer with Federated-Tuning to Acquire Downlink Massive MIMO Channels Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, by integrating the generative pre-trained Transformer (GPT) with federated-tuning, we propose a CSI-GPT approach to realize efficient downlink CSI acquisition. |
YE ZENG et. al. | arxiv-cs.IT | 2024-06-05 |
1078 | Global Clipper: Enhancing Safety and Reliability of Transformer-based Object Detection Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the Global Clipper and Global Hybrid Clipper, effective mitigation strategies specifically designed for transformer-based models. |
QUTUB SYED SHA et. al. | arxiv-cs.CV | 2024-06-05 |
1079 | Learning to Grok: Emergence of In-context Learning and Skill Composition in Modular Arithmetic Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we study the emergence of in-context learning and skill composition in a collection of modular arithmetic tasks. |
Tianyu He; Darshil Doshi; Aritra Das; Andrey Gromov; | arxiv-cs.LG | 2024-06-04 |
1080 | Probing The Category of Verbal Aspect in Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate how pretrained language models (PLM) encode the grammatical category of verbal aspect in Russian. |
Anisia Katinskaia; Roman Yangarber; | arxiv-cs.CL | 2024-06-04 |
1081 | Multi-layer Learnable Attention Mask for Multimodal Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Comprehensive experimental validation on various datasets, such as MADv2, QVHighlights, ImageNet 1K, and MSRVTT, demonstrates the efficacy of the LAM, exemplifying its ability to enhance model performance while mitigating redundant computations. This pioneering approach presents a significant advancement in enhancing the understanding of complex scenarios, such as in movie understanding. |
Wayner Barrios; SouYoung Jin; | arxiv-cs.CV | 2024-06-04 |
1082 | A Temporal Kolmogorov-Arnold Transformer for Time Series Forecasting IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Capturing complex temporal patterns and relationships within multivariate data streams is a difficult task. We propose the Temporal Kolmogorov-Arnold Transformer (TKAT), a novel attention-based architecture designed to address this task using Temporal Kolmogorov-Arnold Networks (TKANs). |
Remi Genet; Hugo Inzirillo; | arxiv-cs.LG | 2024-06-04 |
1083 | Randomized Geometric Algebra Methods for Convex Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce randomized algorithms to Clifford’s Geometric Algebra, generalizing randomized linear algebra to hypercomplex vector spaces. |
Yifei Wang; Sungyoon Kim; Paul Chu; Indu Subramaniam; Mert Pilanci; | arxiv-cs.LG | 2024-06-04 |
1084 | Too Big to Fail: Larger Language Models Are Disproportionately Resilient to Induction of Dementia-Related Linguistic Anomalies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Previous findings show that changes in PPL when masking attention layers in pre-trained transformer-based NLMs reflect linguistic anomalies associated with Alzheimer’s disease dementia. Building upon this, we explore a novel bidirectional attention head ablation method that exhibits properties attributed to the concepts of cognitive and brain reserve in human brain studies, which postulate that people with more neurons in the brain and more efficient processing are more resilient to neurodegeneration. |
Changye Li; Zhecheng Sheng; Trevor Cohen; Serguei Pakhomov; | arxiv-cs.CL | 2024-06-04 |
1085 | Eliciting The Priors of Large Language Models Using Iterated In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop a prompt-based workflow for eliciting prior distributions from LLMs. |
Jian-Qiao Zhu; Thomas L. Griffiths; | arxiv-cs.CL | 2024-06-03 |
1086 | Reproducibility Study on Adversarial Attacks Against Robust Transformer Trackers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our empirical study focuses on evaluating adversarial robustness of object trackers based on bounding box versus binary mask predictions, and attack methods at different levels of perturbations. |
Fatemeh Nourilenjan Nokabadi; Jean-François Lalonde; Christian Gagné; | arxiv-cs.CV | 2024-06-03 |
1087 | Mapping Study Variables to Common Data Elements Using GPT for Sheets: Towards Standardized Data Collection and Sharing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Secondary use or reuse of biomedical research data has drawn significant attention and is of growing importance. Non-standardized representation and wide variability of clinical … |
Pritham Ram; Na Hong; Hua Xu; Xiaoqian Jiang; | 2024 IEEE 12th International Conference on Healthcare … | 2024-06-03 |
1088 | SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper aims to bridge the gap between Code LLMs’ reliance on static text data and the need for semantic understanding for complex tasks like debugging and program repair. |
YANGRUIBO DING et. al. | arxiv-cs.CL | 2024-06-03 |
1089 | Performance Evaluation of Multimodal Large Language Models (LLaVA and GPT-4-based ChatGPT) in Medical Image Classification Tasks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) have gained significant attention due to their prospective applications in medicine. Utilizing multimodal LLMs can potentially assist clinicians in … |
Yuhang Guo; Zhiyu Wan; | 2024 IEEE 12th International Conference on Healthcare … | 2024-06-03 |
1090 | Superhuman Performance in Urology Board Questions By An Explainable Large Language Model Enabled for Context Integration of The European Association of Urology Guidelines: The UroBot Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: UroBot was developed using OpenAI’s GPT-3.5, GPT-4, and GPT-4o models, employing retrieval-augmented generation (RAG) and the latest 2023 guidelines from the European Association of Urology (EAU). |
MARTIN J. HETZ et. al. | arxiv-cs.CL | 2024-06-03 |
1091 | Prototypical Transformer As Unified Motion Learners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Prototypical Transformer (ProtoFormer), a general and unified framework that approaches various motion tasks from a prototype perspective. |
CHENG HAN et. al. | arxiv-cs.CV | 2024-06-03 |
1092 | Seeing Beyond Borders: Evaluating LLMs in Multilingual Ophthalmological Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs), such as GPT-3.5 [1] and GPT-4 [2], have significant potential for transforming several aspects of patient care from clinical note summarization to … |
DAVID RESTREPO et. al. | 2024 IEEE 12th International Conference on Healthcare … | 2024-06-03 |
1093 | In-Context Learning of Physical Properties: Few-Shot Adaptation to Out-of-Distribution Molecular Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address the question: can we leverage in-context learning to predict out-of-distribution materials properties? |
GRZEGORZ KASZUBA et. al. | arxiv-cs.LG | 2024-06-03 |
1094 | Annotation Guidelines-Based Knowledge Augmentation: Towards Enhancing Large Language Models for Educational Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose the Annotation Guidelines-based Knowledge Augmentation (AGKA) approach to improve LLMs. |
SHIQI LIU et. al. | arxiv-cs.CL | 2024-06-02 |
1095 | Drive As Veteran: Fine-tuning of An Onboard Large Language Model for Highway Autonomous Driving Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Because network communication conditions limit online calls to GPT, onboard deployment of Large Language Models for autonomous driving is needed. In this … |
YUJIN WANG et. al. | 2024 IEEE Intelligent Vehicles Symposium (IV) | 2024-06-02 |
1096 | RoBERTa-BiLSTM: A Context-Aware Hybrid Model for Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce a novel hybrid deep learning model, RoBERTa-BiLSTM, which combines the Robustly Optimized BERT Pretraining Approach (RoBERTa) with Bidirectional Long Short-Term Memory (BiLSTM) networks. |
Md. Mostafizer Rahman; Ariful Islam Shiplu; Yutaka Watanobe; Md. Ashad Alam; | arxiv-cs.CL | 2024-06-01 |
1097 | Multimodal Metadata Assignment for Cultural Heritage Artifacts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We develop a multimodal classifier for the cultural heritage domain using a late fusion approach and introduce a novel dataset. |
LUIS REI et. al. | arxiv-cs.CV | 2024-06-01 |
1098 | Multi-granularity Cross Transformer Network for Person Re-identification IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yanping Li; Duoqian Miao; Hongyun Zhang; Jie Zhou; Cairong Zhao; | Pattern Recognit. | 2024-06-01 |
1099 | Low-Contrast Medical Image Segmentation Via Transformer and Boundary Perception Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Low-contrast medical image segmentation is a challenging task that requires full use of local details and global context. However, existing convolutional neural networks (CNNs) … |
YINGLIN ZHANG et. al. | IEEE Transactions on Emerging Topics in Computational … | 2024-06-01 |
1100 | Utilizing Passage-level Relevance and Kernel Pooling for Enhancing BERT-based Document Reranking Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The pre-trained language model (PLM) based on the Transformer encoder, namely BERT, has achieved state-of-the-art results in the field of Information Retrieval. Existing … |
MIN PAN et. al. | Computational Intelligence | 2024-06-01 |
1101 | Hyneter: Hybrid Network Transformer for Multiple Computer Vision Tasks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this article, we point out that the essential differences between convolutional neural network (CNN)-based and transformer-based detectors, which cause worse performance of … |
Dong Chen; Duoqian Miao; Xuerong Zhao; | IEEE Transactions on Industrial Informatics | 2024-06-01 |
1102 | EdgeTran: Device-Aware Co-Search of Transformers for Efficient Inference on Mobile Edge Platforms Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Automated design of efficient transformer models has recently attracted significant attention from industry and academia. However, most works only focus on certain metrics while … |
Shikhar Tuli; N. Jha; | IEEE Transactions on Mobile Computing | 2024-06-01 |
1103 | Attribute-Based Injection Transformer for Personalized Sentiment Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Personal attributes have been proven to be useful for sentiment analysis. However, previous models of learning attribute-specific language representations are suboptimal because … |
Youjia Zhang; Jin Wang; Liang-Chih Yu; Dan Xu; Xuejie Zhang; | IEEE Transactions on Emerging Topics in Computational … | 2024-06-01 |
1104 | Bidirectional Interaction of CNN and Transformer for Image Inpainting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jialu Liu; Maoguo Gong; Yuan Gao; Yihe Lu; Hao Li; | Knowl. Based Syst. | 2024-06-01 |
1105 | Hunt-inspired Transformer for Visual Object Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhibin Zhang; Wanli Xue; Yuxi Zhou; Kaihua Zhang; Shengyong Chen; | Pattern Recognit. | 2024-06-01 |
1106 | Beyond Boundaries: A Human-like Approach for Question Answering Over Structured and Unstructured Information Sources Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Answering factual questions from heterogeneous sources, such as graphs and text, is a key capacity of intelligent systems. Current approaches either (i) perform question answering … |
Jens Lehmann; Dhananjay Bhandiwad; Preetam Gattogi; S. Vahdati; | Transactions of the Association for Computational … | 2024-06-01 |
1107 | SwinFG: A Fine-grained Recognition Scheme Based on Swin Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhipeng Ma; Xiaoyu Wu; Anzhuo Chu; Lei Huang; Zhiqiang Wei; | Expert Syst. Appl. | 2024-06-01 |
1108 | FuzzyTP-BERT: Enhancing Extractive Text Summarization with Fuzzy Topic Modeling and Transformer Networks Related Papers Related Patents Related Grants Related Venues Related Experts View |
Aytuğ Onan; Hesham A. Alhumyani; | J. King Saud Univ. Comput. Inf. Sci. | 2024-06-01 |
1109 | Explainable Attention Pruning: A Metalearning-Based Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Pruning, as a technique to reduce the complexity and size of transformer-based models, has gained significant attention in recent years. While various models have been … |
P. Rajapaksha; Noel Crespi; | IEEE Transactions on Artificial Intelligence | 2024-06-01 |
1110 | A Transformer and Convolution-Based Learning Framework for Automatic Modulation Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Automatic modulation classification (AMC) is a typical pattern classification task that is an intermediate process between signal detection and demodulation. Deep learning methods … |
Wenxuan Ma; Zhuoran Cai; Chuan Wang; | IEEE Communications Letters | 2024-06-01 |
1111 | LiteFormer: A Lightweight and Efficient Transformer for Rotating Machine Fault Diagnosis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer has shown impressive performance on global feature modeling in many applications. However, two drawbacks induced by its intrinsic architecture limit its application, … |
WENJUN SUN et. al. | IEEE Transactions on Reliability | 2024-06-01 |
1112 | Beyond Metrics: Evaluating LLMs’ Effectiveness in Culturally Nuanced, Low-Resource Real-World Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our evaluation includes both quantitative analysis using metrics like F1 score and qualitative assessment of LLMs’ explanations for their predictions. We find that, while Mistral-7b and Mixtral-8x7b achieved high F1 scores, they and other LLMs such as GPT-3.5-Turbo, Llama-2-70b, and Gemma-7b struggled to understand linguistic and contextual nuances and lacked transparency in their decision-making process, as observed from their explanations. |
MILLICENT OCHIENG et. al. | arxiv-cs.CL | 2024-06-01 |
1113 | Transformer-based Fall Detection in Videos IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
Adrián Núñez-Marcos; I. Arganda-Carreras; | Eng. Appl. Artif. Intell. | 2024-06-01 |
1114 | How Random Is Random? Evaluating The Randomness and Humanness of LLMs’ Coin Flips Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: One uniquely human trait is our inability to be random. We see and produce patterns where there should not be any and we do so in a predictable way. LLMs are supplied with human … |
K. V. Koevering; Jon Kleinberg; | ArXiv | 2024-05-31 |
1115 | A Comparison of Correspondence Analysis with PMI-based Word Embedding Methods Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we link correspondence analysis (CA) to the factorization of the PMI matrix. |
Qianqian Qi; Ayoub Bagheri; David J. Hessen; Peter G. M. van der Heijden; | arxiv-cs.CL | 2024-05-31 |
1116 | Bi-Directional Transformers Vs. Word2vec: Discovering Vulnerabilities in Lifted Compiled Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Detecting vulnerabilities within compiled binaries is challenging due to lost high-level code structures and other factors such as architectural dependencies, compilers, and optimization options. To address these obstacles, this research explores vulnerability detection using natural language processing (NLP) embedding techniques with word2vec, BERT, and RoBERTa to learn semantics from intermediate representation (LLVM IR) code. |
Gary A. McCully; John D. Hastings; Shengjie Xu; Adam Fortier; | arxiv-cs.CR | 2024-05-30 |
1117 | Learning General Policies for Planning Through GPT Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer-based architectures, such as T5, BERT and GPT, have demonstrated revolutionary capabilities in Natural Language Processing. Several studies showed that deep learning … |
NICHOLAS ROSSETTI et. al. | International Conference on Automated Planning and … | 2024-05-30 |
1118 | The Point of View of A Sentiment: Towards Clinician Bias Detection in Psychiatric Notes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By leveraging large language models, this work aims to identify the sentiment expressed in psychiatric clinical notes based on the reader’s point of view. |
Alissa A. Valentine; Lauren A. Lepow; Alexander W. Charney; Isotta Landi; | arxiv-cs.CL | 2024-05-30 |
1119 | Divide-and-Conquer Meets Consensus: Unleashing The Power of Functions in Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose FunCoder, a code generation framework incorporating the divide-and-conquer strategy with functional consensus. |
JINGCHANG CHEN et. al. | arxiv-cs.CL | 2024-05-30 |
1120 | Automatic Graph Topology-Aware Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes an evolutionary graph Transformer architecture search framework (EGTAS) to automate the construction of strong graph Transformers. |
CHAO WANG et. al. | arxiv-cs.NE | 2024-05-30 |
1121 | Ensemble Model With BERT, RoBERTa and XLNet For Molecular Property Prediction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents a novel approach for predicting molecular properties with high accuracy without the need for extensive pre-training. Employing ensemble learning and supervised … |
Junling Hu; | ArXiv | 2024-05-30 |
1122 | DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the knowledge gap, we propose a new benchmark named DevEval, which has three advances. |
JIA LI et. al. | arxiv-cs.CL | 2024-05-30 |
1123 | Reverse Image Retrieval Cues Parametric Memory in Multimodal LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: RIR robustly improves knowledge-intensive visual question answering (VQA) of GPT-4V by 37-43%, GPT-4 Turbo by 25-27%, and GPT-4o by 18-20% in terms of open-ended VQA evaluation metrics. To our surprise, we discover that RIR helps the model to better access its own world knowledge. |
Jialiang Xu; Michael Moor; Jure Leskovec; | arxiv-cs.CL | 2024-05-29 |
1124 | Beyond Agreement: Diagnosing The Rationale Alignment of Automated Essay Scoring Methods Based on Linguistically-informed Counterfactuals Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our proposed method, using counterfactual intervention assisted by Large Language Models (LLMs), reveals that BERT-like models primarily focus on sentence-level features, whereas LLMs such as GPT-3.5, GPT-4 and Llama-3 are sensitive to conventions & accuracy, language complexity, and organization, indicating a more comprehensive rationale alignment with scoring rubrics. |
Yupei Wang; Renfen Hu; Zhe Zhao; | arxiv-cs.CL | 2024-05-29 |
1125 | Multi-objective Cross-task Learning Via Goal-conditioned GPT-based Decision Transformers for Surgical Robot Task Automation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new learning-based framework by leveraging the strong reasoning capability of the GPT-based architecture to automate surgical robotic tasks. |
Jiawei Fu; Yonghao Long; Kai Chen; Wang Wei; Qi Dou; | arxiv-cs.RO | 2024-05-29 |
1126 | Voice Jailbreak Attacks Against GPT-4o Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present the first systematic measurement of jailbreak attacks against the voice mode of GPT-4o. |
Xinyue Shen; Yixin Wu; Michael Backes; Yang Zhang; | arxiv-cs.CR | 2024-05-29 |
1127 | STAT: Shrinking Transformers After Training Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We present STAT: a simple algorithm to prune transformer models without any fine-tuning. STAT eliminates both attention heads and neurons from the network, while preserving … |
Megan Flynn; Alexander Wang; Dean Edward Alvarez; Christopher De Sa; Anil Damle; | ArXiv | 2024-05-29 |
1128 | Towards Next-Generation Urban Decision Support Systems Through AI-Powered Generation of Scientific Ontology Using Large Language Models – A Case in Optimizing Intermodal Freight Transportation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The incorporation of Artificial Intelligence (AI) models into various optimization systems is on the rise. However, addressing complex urban and environmental management … |
JOSE TUPAYACHI et. al. | ArXiv | 2024-05-29 |
1129 | MDS-ViTNet: Improving Saliency Prediction for Eye-Tracking with Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel methodology we call MDS-ViTNet (Multi Decoder Saliency by Vision Transformer Network) for enhancing visual saliency prediction or eye-tracking. |
Polezhaev Ignat; Goncharenko Igor; Iurina Natalya; | arxiv-cs.CV | 2024-05-29 |
1130 | Table-GPT: Table Fine-tuned GPT for Diverse Table Tasks IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Language models, such as GPT-3 and ChatGPT, demonstrate remarkable abilities to follow diverse human instructions and perform a wide range of tasks, using instruction fine-tuning. … |
PENG LI et. al. | Proc. ACM Manag. Data | 2024-05-29 |
1131 | Evaluating Zero-Shot GPT-4V Performance on 3D Visual Question Answering Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As interest in reformulating the 3D Visual Question Answering (VQA) problem in the context of foundation models grows, it is imperative to assess how these new paradigms influence existing closed-vocabulary datasets. In this case study, we evaluate the zero-shot performance of foundational models (GPT-4 Vision and GPT-4) on well-established 3D VQA benchmarks, namely 3D-VQA and ScanQA. |
Simranjit Singh; Georgios Pavlakos; Dimitrios Stamoulis; | arxiv-cs.CV | 2024-05-29 |
1132 | A Multi-Source Retrieval Question Answering Framework Based on RAG Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing RAG paradigms are inevitably influenced by erroneous retrieval information, thereby reducing the reliability and correctness of generated results. Therefore, to improve the relevance of retrieval information, this study proposes a method that replaces traditional retrievers with GPT-3.5, leveraging its vast corpus knowledge to generate retrieval information. |
RIDONG WU et. al. | arxiv-cs.IR | 2024-05-29 |
1133 | AutoBreach: Universal and Adaptive Jailbreaking with Efficient Wordplay-Guided Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we rethink the approach to jailbreaking LLMs and formally define three essential properties from the attacker’s perspective, which contributes to guiding the design of jailbreak methods. |
JIAWEI CHEN et. al. | arxiv-cs.CV | 2024-05-29 |
1134 | LMO-DP: Optimizing The Randomization Mechanism for Differentially Private Fine-Tuning (Large) Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they rely heavily on the Gaussian mechanism, which may overly perturb the gradients and degrade the accuracy, especially in stronger privacy regimes (e.g., the privacy budget $\epsilon < 3$). To address such limitations, we propose a novel Language Model-based Optimal Differential Privacy (LMO-DP) mechanism, which takes the first step to enable the tight composition of accurately fine-tuning (large) language models with a sub-optimal DP mechanism, even in strong privacy regimes (e.g., $0.1\leq \epsilon<3$). |
QIN YANG et. al. | arxiv-cs.CR | 2024-05-29 |
1135 | Are You Sure? Rank Them Again: Repeated Ranking For Better Preference Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose the Repeat Ranking method – where we evaluate the same responses multiple times and train only on those responses which are consistently ranked. |
Peter Devine; | arxiv-cs.CL | 2024-05-29 |
1136 | Data-Efficient Approach to Humanoid Control Via Fine-Tuning A Pre-Trained GPT on Action Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we train a GPT on a large dataset of noisy expert policy rollout observations from a humanoid motion dataset as a pre-trained model and fine tune that model on a smaller dataset of noisy expert policy rollout observations and actions to autoregressively generate physically plausible motion trajectories. |
Siddharth Padmanabhan; Kazuki Miyazawa; Takato Horii; Takayuki Nagai; | arxiv-cs.RO | 2024-05-28 |
1137 | Can GPT Redefine Medical Understanding? Evaluating GPT on Biomedical Machine Reading Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate GPT on four closed-book biomedical MRC benchmarks. |
Shubham Vatsal; Ayush Singh; | arxiv-cs.CL | 2024-05-28 |
1138 | Notes on Applicability of GPT-4 to Document Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We perform a missing, reproducible evaluation of all publicly available GPT-4 family models concerning the Document Understanding field, where it is frequently required to comprehend text spacial arrangement and visual clues in addition to textual semantics. |
Łukasz Borchmann; | arxiv-cs.CL | 2024-05-28 |
1139 | Delving Into Differentially Private Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such ‘reduction’ is done by first identifying the hardness unique to DP Transformer training: the attention distraction phenomenon and a lack of compatibility with existing techniques for efficient gradient clipping. To deal with these two issues, we propose the Re-Attention Mechanism and Phantom Clipping, respectively. |
YOULONG DING et. al. | arxiv-cs.LG | 2024-05-28 |
1140 | I See You: Teacher Analytics with GPT-4 Vision-Powered Observational Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our approach aims to revolutionize teachers’ assessment of students’ practices by leveraging Generative Artificial Intelligence (GenAI) to offer detailed insights into classroom dynamics. |
UNGGI LEE et. al. | arxiv-cs.HC | 2024-05-28 |
1141 | Look Ahead Text Understanding and LLM Stitching Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: This paper proposes a look ahead text understanding problem with look ahead section identification (LASI) as an example. This problem may appear in generative AI as well as human … |
Junlin Julian Jiang; Xin Li; | International Conference on Web and Social Media | 2024-05-28 |
1142 | Deployment of Large Language Models to Control Mobile Robots at The Edge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the possibility of intuitive human-robot interaction through the application of Natural Language Processing (NLP) and Large Language Models (LLMs) in mobile robotics. This work aims to explore the feasibility of using these technologies for edge-based deployment, where traditional cloud dependencies are eliminated. |
PASCAL SIKORSKI et. al. | arxiv-cs.RO | 2024-05-27 |
1143 | CTranS: A Multi-Resolution Convolution-Transformer Network for Medical Image Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Achieving accurate medical image segmentation requires considering both global contextual information and local regional details. Compared to traditional convolutional neural … |
Zhendi Gong; Andrew P. French; Guoping Qiu; Xin Chen; | 2024 IEEE International Symposium on Biomedical Imaging … | 2024-05-27 |
1144 | How Ready Are Generative Pre-trained Large Language Models for Explaining Bengali Grammatical Errors? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Grammatical error correction (GEC) tools, powered by advanced generative artificial intelligence (AI), competently correct linguistic inaccuracies in user input. However, they … |
Subhankar Maity; Aniket Deroy; Sudeshna Sarkar; | ArXiv | 2024-05-27 |
1145 | Multi-objective Representation for Numbers in Clinical Narratives Using CamemBERT-bio Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research aims to classify numerical values extracted from medical documents across seven distinct physiological categories, employing CamemBERT-bio. |
Boammani Aser Lompo; Thanh-Dung Le; | arxiv-cs.CL | 2024-05-27 |
1146 | Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While previous approaches to 3D human motion generation have achieved notable success, they often rely on extensive training and are limited to specific tasks. To address these challenges, we introduce Motion-Agent, an efficient conversational framework designed for general human motion generation, editing, and understanding. |
QI WU et. al. | arxiv-cs.CV | 2024-05-27 |
1147 | Vision-and-Language Navigation Generative Pretrained Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our proposal, the Vision-and-Language Navigation Generative Pretrained Transformer (VLN-GPT), adopts a transformer decoder model (GPT2) to model trajectory sequence dependencies, bypassing the need for historical encoding modules. |
Wen Hanlin; | arxiv-cs.AI | 2024-05-27 |
1148 | Toward A Holistic Performance Evaluation of Large Language Models Across Diverse AI Accelerators Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Artificial intelligence (AI) methods have become critical in scientific applications to help accelerate scientific discovery. Large language models (LLMs) are being considered a … |
M. EMANI et. al. | 2024 IEEE International Parallel and Distributed Processing … | 2024-05-27 |
1149 | RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce RLAIF-V, a novel framework that aligns MLLMs in a fully open-source paradigm. |
TIANYU YU et. al. | arxiv-cs.CL | 2024-05-27 |
1150 | InversionView: A General-Purpose Method for Reading Information from Neural Activations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The inner workings of neural networks can be better understood if we can fully decipher the information encoded in neural activations. In this paper, we argue that this information is embodied by the subset of inputs that give rise to similar activations. |
Xinting Huang; Madhur Panwar; Navin Goyal; Michael Hahn; | arxiv-cs.LG | 2024-05-27 |
1151 | LoReTrack: Efficient and Accurate Low-Resolution Transformer Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Albeit faster, this considerably hurts tracking accuracy due to information loss in low-resolution tracking. In this paper, we aim to mitigate such information loss to boost the performance of low-resolution Transformer tracking via dual knowledge distillation from a frozen high-resolution (but not larger) Transformer tracker. |
Shaohua Dong; Yunhe Feng; Qing Yang; Yuewei Lin; Heng Fan; | arxiv-cs.CV | 2024-05-27 |
1152 | Assessing LLMs Suitability for Knowledge Graph Completion Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Recent work has shown the capability of Large Language Models (LLMs) to solve tasks related to Knowledge Graphs, such as Knowledge Graph Completion, even in Zero- or Few-Shot … |
Vasile Ionut Remus Iga; Gheorghe Cosmin Silaghi; | arxiv-cs.CL | 2024-05-27 |
1153 | Performance Evaluation of Reddit Comments Using Machine Learning and Natural Language Processing Methods in Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the efficacy of sentiment analysis models is hindered by the lack of expansive and fine-grained emotion datasets. To address this gap, our study leverages the GoEmotions dataset, comprising a diverse range of emotions, to evaluate sentiment analysis methods across a substantial corpus of 58,000 comments. |
Xiaoxia Zhang; Xiuyuan Qi; Zixin Teng; | arxiv-cs.CL | 2024-05-26 |
1154 | Higher-Order Transformer Derivative Estimates for Explicit Pathwise Learning Guarantees Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We consider realistic transformers with multiple (non-linearized) attention heads per block and layer normalization. |
Yannick Limmer; Anastasis Kratsios; Xuwei Yang; Raeid Saqur; Blanka Horvath; | arxiv-cs.LG | 2024-05-26 |
1155 | M3GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents M$^3$GPT, an advanced $\textbf{M}$ultimodal, $\textbf{M}$ultitask framework for $\textbf{M}$otion comprehension and generation. M$^3$GPT operates on three … |
MINGSHUANG LUO et. al. | ArXiv | 2024-05-25 |
1156 | M$^3$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents M$^3$GPT, an advanced $\textbf{M}$ultimodal, $\textbf{M}$ultitask framework for $\textbf{M}$otion comprehension and generation. |
MINGSHUANG LUO et. al. | arxiv-cs.CV | 2024-05-25 |
1157 | Accelerating Transformers with Spectrum-Preserving Token Merging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Prior works have proposed algorithms based on Bipartite Soft Matching (BSM), which divides tokens into distinct sets and merges the top k similar tokens. |
HOAI-CHAU TRAN et. al. | arxiv-cs.LG | 2024-05-25 |
1158 | A Registration Method of Overlap Aware Point Clouds Based on Transformer-to-Transformer Regression Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer has recently become widely adopted in point cloud registration. Nevertheless, Transformer is unsuitable for handling dense point clouds due to resource constraints and … |
YAFEI ZHAO et. al. | Remote. Sens. | 2024-05-25 |
1159 | Activator: GLU Activation Function As The Core Component of A Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Experimental assessments conducted by this research show that both proposed modifications and reductions offer competitive performance in relation to baseline architectures, in support of the aims of this work in establishing a more efficient yet capable alternative to the traditional attention mechanism as the core component in designing transformer architectures. |
Abdullah Nazhat Abdullah; Tarkan Aydin; | arxiv-cs.CV | 2024-05-24 |
1160 | Incremental Comprehension of Garden-Path Sentences By Large Language Models: Semantic Interpretation, Syntactic Re-Analysis, and Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we investigate the processing of garden-path sentences and the fate of lingering misinterpretations using four large language models (LLMs): GPT-2, LLaMA-2, Flan-T5, and RoBERTa. |
ANDREW LI et. al. | arxiv-cs.CL | 2024-05-24 |
1161 | Transformer-based Federated Learning for Multi-Label Remote Sensing Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the capability of state-of-the-art transformer architectures (which are MLP-Mixer, ConvMixer, PoolFormer) to address the challenges related to non-IID training data across various clients in the context of FL for multi-label classification (MLC) problems in remote sensing (RS). |
Barış Büyüktaş; Kenneth Weitzel; Sebastian Völkers; Felix Zailskas; Begüm Demir; | arxiv-cs.CV | 2024-05-24 |
1162 | Enhancing Non-player Characters in Unity 3D Using GPT-3.5 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This case study presents a comprehensive integration process of OpenAI’s GPT-3.5 large language model (LLM) into Unity 3D to enhance non-player characters (NPCs) in video games … |
John Sissler; | ACM Games: Research and Practice | 2024-05-24 |
1163 | PoinTramba: A Hybrid Transformer-Mamba Framework for Point Cloud Analysis IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce PoinTramba, a pioneering hybrid framework that synergizes the analytical power of Transformer with the remarkable computational efficiency of Mamba for enhanced point cloud analysis. |
ZICHENG WANG et. al. | arxiv-cs.CV | 2024-05-24 |
1164 | GPTZoo: A Large-scale Dataset of GPTs for The Research Community Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To support academic research on GPTs, we introduce GPTZoo, a large-scale dataset comprising 730,420 GPT instances. |
Xinyi Hou; Yanjie Zhao; Shenao Wang; Haoyu Wang; | arxiv-cs.SE | 2024-05-24 |
1165 | A Comparative Analysis of Distributed Training Strategies for GPT-2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The rapid advancement in Large Language Models has been met with significant challenges in their training processes, primarily due to their considerable computational and memory demands. This research examines parallelization techniques developed to address these challenges, enabling the efficient and scalable training of Large Language Models. |
Ishan Patwardhan; Shubham Gandhi; Om Khare; Amit Joshi; Suraj Sawant; | arxiv-cs.DC | 2024-05-24 |
1166 | Transformer-XL for Long Sequence Tasks in Robotic Learning from Demonstration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents an innovative application of Transformer-XL for long sequence tasks in robotic learning from demonstrations (LfD). |
Gao Tianci; | arxiv-cs.RO | 2024-05-24 |
1167 | SMART: Scalable Multi-agent Real-time Motion Generation Via Next-token Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Data-driven autonomous driving motion generation tasks are frequently impacted by the limitations of dataset size and the domain gap between datasets, which precludes their extensive application in real-world scenarios. To address this issue, we introduce SMART, a novel autonomous driving motion generation paradigm that models vectorized map and agent trajectory data into discrete sequence tokens. |
Wei Wu; Xiaoxin Feng; Ziyan Gao; Yuheng Kan; | arxiv-cs.RO | 2024-05-24 |
1168 | Comet: A Communication-efficient and Performant Approximation for Private Transformer Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel plug-in method Comet to effectively reduce the communication cost without compromising the inference performance. |
Xiangrui Xu; Qiao Zhang; Rui Ning; Chunsheng Xin; Hongyi Wu; | arxiv-cs.LG | 2024-05-24 |
1169 | The Buffer Mechanism for Multi-Step Information Reasoning in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we constructed a symbolic dataset to investigate the mechanisms by which Transformer models employ vertical thinking strategy based on their inherent structure and horizontal thinking strategy based on Chain of Thought to achieve multi-step reasoning. |
ZHIWEI WANG et. al. | arxiv-cs.AI | 2024-05-24 |
1170 | GPT Is Not An Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores whether an LLM (specifically, GPT-3.5-Turbo) can assist with the task of developing a bias benchmark dataset from responses to an open-ended community survey. |
Virginia K. Felkner; Jennifer A. Thompson; Jonathan May; | arxiv-cs.CL | 2024-05-24 |
1172 | SG-Adapter: Enhancing Text-to-Image Generation with Scene Graph Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce the Scene Graph Adapter(SG-Adapter), leveraging the structured representation of scene graphs to rectify inaccuracies in the original text embeddings. |
GUIBAO SHEN et al. | arxiv-cs.CV | 2024-05-24 |
1173 | AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We introduce AutoCoder, the first Large Language Model to surpass GPT-4 Turbo (April 2024) and GPT-4o in pass@1 on the Human Eval benchmark test ($\mathbf{90.9\%}$ vs. … |
Bin Lei; Yuchen Li; Qiuwu Chen; | ArXiv | 2024-05-23 |
1174 | CulturePark: Boosting Cross-cultural Understanding in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by cognitive theories on social communication, this paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection. |
CHENG LI et al. | arxiv-cs.AI | 2024-05-23 |
1175 | CEEBERT: Cross-Domain Inference in Early Exit BERT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose an online learning algorithm named Cross-Domain Inference in Early Exit BERT (CeeBERT) that dynamically determines early exits of samples based on the level of confidence at each exit point. |
Divya Jyoti Bajpai; Manjesh Kumar Hanawal; | arxiv-cs.CL | 2024-05-23 |
1176 | An Evaluation of Estimative Uncertainty in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares estimative uncertainty in commonly used large language models (LLMs) like GPT-4 and ERNIE-4 to that of humans, and to each other. |
Zhisheng Tang; Ke Shen; Mayank Kejriwal; | arxiv-cs.CL | 2024-05-23 |
1177 | JiuZhang3.0: Efficiently Improving Mathematical Reasoning By Training Small Data Synthesis Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To reduce the cost, based on open-source available texts, we propose an efficient way that trains a small LLM for math problem synthesis, to efficiently generate sufficient high-quality pre-training data. |
KUN ZHOU et al. | arxiv-cs.CL | 2024-05-23 |
1178 | Grokked Transformers Are Implicit Reasoners: A Mechanistic Journey to The Edge of Generalization IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We study whether transformers can learn to implicitly reason over parametric knowledge, a skill that even the most capable language models struggle with. |
Boshi Wang; Xiang Yue; Yu Su; Huan Sun; | arxiv-cs.CL | 2024-05-23 |
1179 | Efficient Point Transformer with Dynamic Token Aggregating for LiDAR Point Cloud Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: They also tend to be slow due to requiring time-consuming point cloud sampling and grouping processes. To address these issues, we propose an efficient point TransFormer with Dynamic Token Aggregating (DTA-Former) for point cloud representation and processing. |
Dening Lu; Jun Zhou; Linlin Xu; Jonathan Li; | arxiv-cs.CV | 2024-05-23 |
1180 | ViHateT5: Enhancing Hate Speech Detection in Vietnamese With A Unified Text-to-Text Transformer Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this research, we introduce ViHateT5, a T5-based model pre-trained on our proposed large-scale domain-specific dataset named VOZ-HSD. |
Luan Thanh Nguyen; | arxiv-cs.CL | 2024-05-22 |
1181 | Transformer in Touch: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This review aims to comprehensively outline the application and development of Transformers in tactile technology. |
Jing Gao; Ning Cheng; Bin Fang; Wenjuan Han; | arxiv-cs.LG | 2024-05-21 |
1182 | Contextualized Word Embeddings Expose Ethnic Biases in News Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The web is a major source for news and information. Yet, news can perpetuate and amplify biases and stereotypes. Prior work has shown that training static word embeddings can … |
Guusje Thijs; D. Trilling; A. Kroon; | Proceedings of the 16th ACM Web Science Conference | 2024-05-21 |
1183 | Towards Authoring Open-Ended Behaviors for Narrative Puzzle Games with Large Language Model Support Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Designing games with branching story lines, object annotations, scene details, and dialog can be challenging due to the intensive authoring required. We investigate the potential … |
Britney Ngaw; Grishma Jena; João Sedoc; Aline Normoyle; | Proceedings of the 19th International Conference on the … | 2024-05-21 |
1184 | How Reliable AI Chatbots Are for Disease Prediction from Patient Complaints? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Artificial Intelligence (AI) chatbots leveraging Large Language Models (LLMs) are gaining traction in healthcare for their potential to automate patient interactions and aid clinical decision-making. |
Ayesha Siddika Nipu; K M Sajjadul Islam; Praveen Madiraju; | arxiv-cs.AI | 2024-05-21 |
1185 | Advancing Web Science Through Foundation Model for Tabular Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: As the landscape of web science expands, handling the vast datasets collected from the Web while preserving computational efficiency and privacy remains a significant challenge. … |
Inwon Kang; | Companion Publication of the 16th ACM Web Science Conference | 2024-05-21 |
1186 | Generative AI in Cybersecurity: A Comprehensive Review of LLM Applications and Vulnerabilities IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper provides a comprehensive review of the future of cybersecurity through Generative AI and Large Language Models (LLMs). |
MOHAMED AMINE FERRAG et al. | arxiv-cs.CR | 2024-05-21 |
1187 | Cardistry: Exploring A GPT Model Workflow As An Adapted Method of Gaminiscing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Cardistry is an application that enables users to create their own playing cards for use in evocative storytelling games. It is driven by OpenAI’s Generative Pre-trained … |
BRANDON LYMAN et al. | Proceedings of the 19th International Conference on the … | 2024-05-21 |
1188 | Exploring The Gap: The Challenge of Achieving Human-like Generalization for Concept-based Translation Instruction Using Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Our study utilizes concept description instructions and few-shot learning examples to examine the effectiveness of a large language model (GPT-4) in generating Chinese-to-English … |
Ming Qian; Chuiqing Kong; | AAAI Spring Symposia | 2024-05-20 |
1189 | Automated Hardware Logic Obfuscation Framework Using GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Obfus-chat, a novel framework leveraging Generative Pre-trained Transformer (GPT) models to automate the obfuscation process. |
BANAFSHEH SABER LATIBARI et al. | arxiv-cs.CR | 2024-05-20 |
1190 | From Chatbots to Phishbots?: Phishing Scam Generation in Commercial Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The advanced capabilities of Large Language Models (LLMs) have made them invaluable across various applications, from conversational agents and content creation to data analysis, … |
PRIYANKA NANAYAKKARA et al. | 2024 IEEE Symposium on Security and Privacy (SP) | 2024-05-19 |
1191 | DaVinci at SemEval-2024 Task 9: Few-shot Prompting GPT-3.5 for Unconventional Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we tackle both types of questions using few-shot prompting on GPT-3.5 and gain insights regarding the difference in the nature of the two types. |
Suyash Vardhan Mathur; Akshett Rai Jindal; Manish Shrivastava; | arxiv-cs.CL | 2024-05-19 |
1192 | Energy Efficient FPGA-Based Binary Transformer Accelerator for Edge Devices Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer-based large language models have gained much attention recently. Due to their superior performance, they are expected to take the place of conventional deep learning … |
Congpeng Du; Seok-Bum Ko; Hao Zhang; | 2024 IEEE International Symposium on Circuits and Systems … | 2024-05-19 |
1193 | Zero-Shot Stance Detection Using Contextual Data Generation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this approach, we aim to fine-tune an existing model at test time. |
Ghazaleh Mahmoudi; Babak Behkamkia; Sauleh Eetemadi; | arxiv-cs.CL | 2024-05-19 |
1194 | Enhancing User Experience in Large Language Models Through Human-centered Design: Integrating Theoretical Insights with An Experimental Study to Meet Diverse Software Learning Needs with A Single Document Knowledge Base Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The experimental results demonstrate the effect of different elements’ forms and organizational methods in the document, as well as GPT’s relevant configurations, on the interaction effectiveness between GPT and software learners. |
Yuchen Wang; Yin-Shan Lin; Ruixin Huang; Jinyin Wang; Sensen Liu; | arxiv-cs.HC | 2024-05-19 |
1195 | Few-Shot API Attack Detection: Overcoming Data Scarcity with GAN-Inspired Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel few-shot detection approach motivated by Natural Language Processing (NLP) and advanced Generative Adversarial Network (GAN)-inspired techniques. |
Udi Aharon; Revital Marbel; Ran Dubin; Amit Dvir; Chen Hajaj; | arxiv-cs.CR | 2024-05-18 |
1196 | Benchmarking Large Language Models on CFLUE – A Chinese Financial Language Understanding Evaluation Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In light of recent breakthroughs in large language models (LLMs) that have revolutionized natural language processing (NLP), there is an urgent need for new benchmarks to keep … |
Jie Zhu; Junhui Li; Yalong Wen; Lifan Guo; | Annual Meeting of the Association for Computational … | 2024-05-17 |
1197 | Benchmarking Large Language Models on CFLUE — A Chinese Financial Language Understanding Evaluation Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose CFLUE, the Chinese Financial Language Understanding Evaluation benchmark, designed to assess the capability of LLMs across various dimensions. |
Jie Zhu; Junhui Li; Yalong Wen; Lifan Guo; | arxiv-cs.CL | 2024-05-17 |
1198 | GPTs Window Shopping: An Analysis of The Landscape of Custom ChatGPT Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Customization comes in the form of prompt-tuning, analysis of reference resources, browsing, and external API interactions, alongside a promise of revenue sharing for created custom GPTs. In this work, we peer into the window of the GPT Store and measure its impact. |
Benjamin Zi Hao Zhao; Muhammad Ikram; Mohamed Ali Kaafar; | arxiv-cs.SI | 2024-05-17 |
1199 | Quantitative Analysis of GPT-4 Model: Optimizing Patient Eligibility Classification for Clinical Trials and Reducing Expert Judgment Dependency Related Papers Related Patents Related Grants Related Venues Related Experts View |
ARTI DEVI et al. | International Conference on Medical and Health Informatics | 2024-05-17 |
1200 | GPT Store Mining and Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our findings aim to enhance understanding of the GPT ecosystem, providing valuable insights for future research, development, and policy-making in generative AI. |
Dongxun Su; Yanjie Zhao; Xinyi Hou; Shenao Wang; Haoyu Wang; | arxiv-cs.LG | 2024-05-16 |
1201 | Leveraging Large Language Models for Automated Web-Form-Test Generation: An Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To the best of our knowledge, no comparative study examining different LLMs has yet been reported for web-form-test generation. |
TAO LI et al. | arxiv-cs.SE | 2024-05-16 |
1202 | HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Identifying optimal model configurations under specific hardware constraints is becoming essential but remains challenging due to the computational load of exhaustive training and evaluation on multiple devices. To address this, we introduce HW-GPT-Bench, a hardware-aware benchmark that utilizes surrogate predictions to approximate various hardware metrics across 13 devices of architectures in the GPT-2 family, with architectures containing up to 1.55B parameters. |
RHEA SANJAY SUKTHANKER et al. | arxiv-cs.LG | 2024-05-16 |
1203 | Optimization Techniques for Sentiment Analysis Based on LLM (GPT-3) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to explore sentiment analysis optimization techniques based on large pre-trained language models such as GPT-3 to improve model performance and further promote the development of natural language processing (NLP). |
Tong Zhan; Chenxi Shi; Yadong Shi; Huixiang Li; Yiyu Lin; | arxiv-cs.CL | 2024-05-15 |
1204 | Comparing The Efficacy of GPT-4 and Chat-GPT in Mental Health Care: A Blind Assessment of Large Language Models for Psychological Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: This study aims to compare the performance of two large language models, GPT-4 and Chat-GPT, in responding to a set of 18 psychological prompts, to assess their potential applicability in mental health care settings. |
Birger Moell; | arxiv-cs.CL | 2024-05-15 |
1205 | GPT-3.5 for Grammatical Error Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the application of GPT-3.5 for Grammatical Error Correction (GEC) in multiple languages in several settings: zero-shot GEC, fine-tuning for GEC, and using GPT-3.5 to re-rank correction hypotheses generated by other GEC models. |
Anisia Katinskaia; Roman Yangarber; | arxiv-cs.CL | 2024-05-14 |
1206 | Evaluating Arabic Emotion Recognition Task Using ChatGPT Models: A Comparative Analysis Between Emotional Stimuli Prompt, Fine-Tuning, and In-Context Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Textual emotion recognition (TER) has significant commercial potential since it can be used as an excellent tool to monitor a brand/business reputation, understand customer … |
E. Nfaoui; Hanane Elfaik; | J. Theor. Appl. Electron. Commer. Res. | 2024-05-14 |
1207 | Towards Robust Audio Deepfake Detection: An Evolving Benchmark for Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Continual learning, which acts as an effective tool for detecting newly emerged deepfake audio while maintaining performance on older types, lacks a well-constructed and user-friendly evaluation framework. To address this gap, we introduce EVDA, a benchmark for evaluating continual learning methods in deepfake audio detection. |
Xiaohui Zhang; Jiangyan Yi; Jianhua Tao; | arxiv-cs.SD | 2024-05-14 |
1208 | Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work describes a concurrent programming framework for quantitatively analyzing the efficiency challenges in serving multiple long-context requests under a limited GPU high-bandwidth memory (HBM) regime. |
Yao Fu; | arxiv-cs.LG | 2024-05-14 |
1209 | Open-vocabulary Auditory Neural Decoding Using FMRI-prompted LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel method, the \textbf{Brain Prompt GPT (BP-GPT)}. |
Xiaoyu Chen; Changde Du; Che Liu; Yizhe Wang; Huiguang He; | arxiv-cs.HC | 2024-05-13 |
1210 | Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their capabilities in turning visual figure to executable code, have not been evaluated thoroughly. To address this, we introduce Plot2Code, a comprehensive visual coding benchmark designed for a fair and in-depth assessment of MLLMs. |
CHENGYUE WU et al. | arxiv-cs.CL | 2024-05-13 |
1211 | Can GNN Be Good Adapter for LLMs? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, they also ignore the zero-shot inference capabilities of LLMs. Therefore, we propose GraphAdapter, which uses a graph neural network (GNN) as an efficient adapter in collaboration with LLMs to tackle TAGs. |
XUANWEN HUANG et al. | www | 2024-05-13 |
1212 | Decoding Memes: A Comprehensive Analysis of Late and Early Fusion Models for Explainable Meme Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Memes are important because they serve as conduits for expressing emotions, opinions, and social commentary online, providing valuable insight into public sentiment, trends, and … |
F. Abdullakutty; Usman Naseem; | Companion Proceedings of the ACM on Web Conference 2024 | 2024-05-13 |
1213 | Relationalizing Tables with Large Language Models: The Promise and Challenges Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Tables in the wild are usually not relationalized, making querying them difficult. To relationalize tables, recent works designed seven transformation operators, and deep neural … |
Zezhou Huang; Eugene Wu; | 2024 IEEE 40th International Conference on Data Engineering … | 2024-05-13 |
1214 | Using ChatGPT for Thematic Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The utilisation of AI-driven tools, notably ChatGPT, within academic research is increasingly debated from several perspectives including ease of implementation, and potential … |
Aleksei Turobov; Diane Coyle; Verity Harding; | ArXiv | 2024-05-13 |
1215 | Large Language Models: Principles and Practice Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The last few years have been marked by several breakthroughs in the domain of generative AI. Large language models such as GPT-4 are able to solve a plethora of tasks, ranging … |
Immanuel Trummer; | 2024 IEEE 40th International Conference on Data Engineering … | 2024-05-13 |
1216 | Decision Mamba Architectures Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce two novel methods, Decision Mamba (DM) and Hierarchical Decision Mamba (HDM), aimed at enhancing the performance of the Transformer models. |
André Correia; Luís A. Alexandre; | arxiv-cs.LG | 2024-05-13 |
1217 | PRECYSE: Predicting Cybersickness Using Transformer for Multimodal Time-Series Sensor Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Cybersickness, a factor that hinders user immersion in VR, has been the subject of ongoing attempts to predict it using AI. Previous studies have used CNN and LSTM for prediction … |
Dayoung Jeong; Kyungsik Han; | Proceedings of the ACM on Interactive, Mobile, Wearable and … | 2024-05-13 |
1218 | Transformer Models for Brazilian Portuguese Question Generation: An Experimental Study Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Unlike tasks such as translation or summarization, generating meaningful questions necessitates a profound understanding of context, semantics, and syntax. This complexity arises … |
Julia da Rocha Junqueira; U. Corrêa; Larissa A. de Freitas; | The International FLAIRS Conference Proceedings | 2024-05-13 |
1219 | The Personality Dimensions GPT-3 Expresses During Human-Chatbot Interactions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models such as GPT-3 and ChatGPT can mimic human-to-human conversation with unprecedented fidelity, which enables many applications such as conversational agents … |
N. Kovačević; Christian Holz; Markus Gross; Rafael Wampfler; | Proceedings of the ACM on Interactive, Mobile, Wearable and … | 2024-05-13 |
1220 | COLA: Cross-city Mobility Transformer for Human Trajectory Simulation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we are motivated to explore the intriguing problem of mobility transfer across cities, grasping the universal patterns of human trajectories to augment the powerful Transformer with external mobility data. |
Yu Wang; Tongya Zheng; Yuxuan Liang; Shunyu Liu; Mingli Song; | www | 2024-05-13 |
1221 | Coding Historical Causes of Death Data with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates the feasibility of using pre-trained generative Large Language Models (LLMs) to automate the assignment of ICD-10 codes to historical causes of death. |
BJØRN PEDERSEN et al. | arxiv-cs.LG | 2024-05-13 |
1222 | ExplainableDetector: Exploring Transformer-based Language Modeling Approach for SMS Spam Detection with Explainability Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: SMS, or short messaging service, is a widely used and cost-effective communication medium that has sadly turned into a haven for unwanted messages, commonly known as SMS spam. … |
Mohammad Amaz Uddin; Muhammad Nazrul Islam; L. Maglaras; Helge Janicke; Iqbal H. Sarker; | ArXiv | 2024-05-12 |
1223 | L(u)PIN: LLM-based Political Ideology Nowcasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a method to analyze ideological positions of individual parliamentary representatives by leveraging the latent knowledge of LLMs. |
Ken Kato; Annabelle Purnomo; Christopher Cochrane; Raeid Saqur; | arxiv-cs.CL | 2024-05-12 |
1224 | Can Language Models Explain Their Own Classification Behavior? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates whether LLMs can give faithful high-level explanations of their own internal processes. To explore this, we introduce a dataset, ArticulateRules, of few-shot text-based classification tasks generated by simple rules. |
Dane Sherburn; Bilal Chughtai; Owain Evans; | arxiv-cs.LG | 2024-05-12 |
1225 | Limited Ability of LLMs to Simulate Human Psychological Behaviours: A Psychometric Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we prompt OpenAI’s flagship models, GPT-3.5 and GPT-4, to assume different personas and respond to a range of standardized measures of personality constructs. |
Nikolay B Petrov; Gregory Serapio-García; Jason Rentfrow; | arxiv-cs.CL | 2024-05-12 |
1226 | GPTs in Mafia-like Game Simulation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this research, we explore the potential of Generative AI models, focusing on their application in role-playing simulations through Spyfall, a renowned mafia-style game. By … |
Munyeong Kim; | Extended Abstracts of the CHI Conference on Human Factors … | 2024-05-11 |
1227 | RETTA: Retrieval-Enhanced Test-Time Adaptation for Zero-Shot Video Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel zero-shot video captioning framework named Retrieval-Enhanced Test-Time Adaptation (RETTA), which takes advantage of existing pretrained large-scale vision and language models to directly generate captions with test-time adaptation. |
YUNCHUAN MA et. al. | arxiv-cs.CV | 2024-05-11 |
1228 | ClassMeta: Designing Interactive Virtual Classmate to Promote VR Classroom Participation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Peer influence plays a crucial role in promoting classroom participation, where behaviors from active students can contribute to a collective classroom learning experience. … |
ZIYI LIU et al. | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
1229 | Integrating Expertise in LLMs: Crafting A Customized Nutrition Assistant with Refined Template Instructions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) have the potential to contribute to the fields of nutrition and dietetics in generating food product explanations that facilitate informed food … |
Annalisa Szymanski; Brianna L Wimer; Oghenemaro Anuyah; H. Eicher-Miller; Ronald A Metoyer; | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
1230 | An Autoethnographic Reflection of Prompting A Custom GPT Based on Oneself Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: What if you could have a chat with yourself? OpenAI’s introduction of custom GPTs in November 2023 provides an opportunity for non-technical users to create specialized generative … |
Priscilla Y. Lo; | Extended Abstracts of the CHI Conference on Human Factors … | 2024-05-11 |
1231 | ChatGPTest: Opportunities and Cautionary Tales of Utilizing AI for Questionnaire Pretesting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article explores the use of GPT models as a useful tool for pretesting survey questionnaires, particularly in the early stages of survey design. |
Francisco Olivos; Minhui Liu; | arxiv-cs.CY | 2024-05-10 |
1232 | Residual-based Attention Physics-informed Neural Networks for Spatio-Temporal Ageing Assessment of Transformers Operated in Renewable Power Plants Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article introduces a spatio-temporal model for transformer winding temperature and ageing estimation, which leverages physics-based partial differential equations (PDEs) with data-driven Neural Networks (NN) in a Physics Informed Neural Networks (PINNs) configuration to improve prediction accuracy and acquire spatio-temporal resolution. |
IBAI RAMIREZ et al. | arxiv-cs.LG | 2024-05-10 |
1233 | Data-Driven Strategies for Complex System Forecasts: The Role of Textual Big Data and State-Space Transformers in Decision Support Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this research, an innovative state space-based Transformer model is proposed to address the challenges of complex system prediction tasks. By integrating state space theory, … |
HUAIRONG HUO et al. | Syst. | 2024-05-10 |
1234 | TacoERE: Cluster-aware Compression for Event Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing work mainly focuses on directly modeling the entire document, which cannot effectively handle long-range dependencies and information redundancy. To address these issues, we propose a cluster-aware compression method for improving event relation extraction (TacoERE), which explores a compression-then-extraction paradigm. |
YONG GUAN et al. | arxiv-cs.CL | 2024-05-10 |
1235 | Multimodal LLMs Struggle with Basic Visual Network Analysis: A VNA Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that while GPT-4 consistently outperforms LLaVa, both models struggle with every visual network analysis task we propose. |
Evan M. Williams; Kathleen M. Carley; | arxiv-cs.CV | 2024-05-10 |
1236 | A Lightweight Sparse Focus Transformer for Remote Sensing Image Change Captioning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing transformer-based RSICC methods face challenges, e.g., large parameter counts and high computational complexity caused by the self-attention operation in the transformer encoder component. To alleviate these issues, this paper proposes a Sparse Focus Transformer (SFT) for the RSICC task. |
Dongwei Sun; Yajie Bao; Junmin Liu; Xiangyong Cao; | arxiv-cs.CV | 2024-05-10 |
1237 | Reddit-Impacts: A Named Entity Recognition Dataset for Analyzing Clinical and Social Effects of Substance Use Derived from Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Reddit-Impacts, a challenging Named Entity Recognition (NER) dataset curated from subreddits dedicated to discussions on prescription and illicit opioids, as well as medications for opioid use disorder. |
YAO GE et al. | arxiv-cs.CL | 2024-05-09 |
1238 | People Cannot Distinguish GPT-4 from A Human in A Turing Test IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We evaluated 3 systems (ELIZA, GPT-3.5 and GPT-4) in a randomized, controlled, and preregistered Turing test. Human participants had a 5 minute conversation with either a human or … |
Cameron R. Jones; Benjamin K. Bergen; | ArXiv | 2024-05-09 |
1239 | Optimizing Software Vulnerability Detection Using RoBERTa and Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View |
Cho Xuan Do; Nguyen Trong Luu; Phuong Thi Lan Nguyen; | Autom. Softw. Eng. | 2024-05-08 |
1240 | Leveraging GenAI for An Intelligent Tutoring System for R: A Quantitative Evaluation of Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The tremendous advances in Artificial Intelligence (AI) open new opportunities for education, with Intelligent Tutoring Systems (ITS) powered by Generative Artificial Intelligence … |
LUKAS FRANK et. al. | 2024 IEEE Global Engineering Education Conference (EDUCON) | 2024-05-08 |
1241 | Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we leverage a decision transformer architecture for topic recommendation in counseling conversations between patients and mental health professionals. |
Aylin Gunal; Baihan Lin; Djallel Bouneffouf; | arxiv-cs.CL | 2024-05-08 |
1242 | Ditto: Quantization-aware Secure Inference of Transformers Upon MPC Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the integration of quantization widely used in plaintext inference into the MPC domain remains unclear. To bridge this gap, we propose the framework named Ditto to enable more efficient quantization-aware secure Transformer inference. |
HAOQI WU et. al. | arxiv-cs.CR | 2024-05-08 |
1243 | Integrating Pepper Robot and GPT for Neuromyth Educational Conversation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The emergence of neuromyths, or false beliefs about brain function and learning, has been a significant challenge in the field of education. These myths often hinder the learning … |
Abdelhadi Hireche; Abdelkader Nasreddine Belkacem; | 2024 IEEE Global Engineering Education Conference (EDUCON) | 2024-05-08 |
1244 | Open Source Language Models Can Provide Feedback: Evaluating LLMs’ Ability to Help Students Using GPT-4-As-A-Judge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by recent work that has utilised very powerful LLMs, such as GPT-4, to evaluate the outputs produced by less powerful models, we conduct an automated analysis of the quality of the feedback produced by several open source models using a dataset from an introductory programming course. |
CHARLES KOUTCHEME et. al. | arxiv-cs.CL | 2024-05-08 |
1245 | A Transformer with Stack Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite underpinning remarkably capable large language models, transformers are unable to model many context-free language tasks. In an attempt to address this limitation in the modeling power of transformer-based language models, we propose augmenting them with a differentiable, stack-based attention mechanism. |
Jiaoda Li; Jennifer C. White; Mrinmaya Sachan; Ryan Cotterell; | arxiv-cs.CL | 2024-05-07 |
1246 | The Silicon Ceiling: Auditing GPT’s Race and Gender Biases in Hiring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) are increasingly being introduced in workplace settings, with the goals of improving efficiency and fairness. |
Lena Armstrong; Abbey Liu; Stephen MacNeil; Danaë Metaxa; | arxiv-cs.CY | 2024-05-07 |
1247 | Evaluating Text Summaries Generated By Large Language Models Using OpenAI’s GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research examines the effectiveness of OpenAI’s GPT models as independent evaluators of text summaries generated by six transformer-based models from Hugging Face: DistilBART, BERT, ProphetNet, T5, BART, and PEGASUS. |
Hassan Shakil; Atqiya Munawara Mahi; Phuoc Nguyen; Zeydy Ortiz; Mamoun T. Mardini; | arxiv-cs.CL | 2024-05-07 |
1248 | Utilizing GPT to Enhance Text Summarization: A Strategy to Minimize Hallucinations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we use the DistilBERT model to generate extractive summaries and the T5 model to generate abstractive summaries. |
Hassan Shakil; Zeydy Ortiz; Grant C. Forbes; | arxiv-cs.CL | 2024-05-07 |
1249 | GPT-Enabled Cybersecurity Training: A Tailored Approach for Effective Awareness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the limitations of traditional Cybersecurity Awareness and Training (CSAT) programs and proposes an innovative solution using Generative Pre-Trained Transformers (GPT) to address these shortcomings. |
Nabil Al-Dhamari; Nathan Clarke; | arxiv-cs.CR | 2024-05-07 |
1250 | How Does GPT-2 Predict Acronyms? Extracting and Understanding A Circuit Via Mechanistic Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on understanding how GPT-2 Small performs the task of predicting three-letter acronyms. |
Jorge García-Carrasco; Alejandro Maté; Juan Trujillo; | arxiv-cs.LG | 2024-05-07 |
1251 | Structured Click Control in Transformer-based Interactive Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To improve the robustness of the response, we propose a structured click intent model based on graph neural networks, which adaptively obtains graph nodes via the global similarity of user-clicked Transformer tokens. |
Long Xu; Yongquan Chen; Rui Huang; Feng Wu; Shiwu Lai; | arxiv-cs.CV | 2024-05-07 |
1252 | Anchored Answers: Unravelling Positional Bias in GPT-2’s Multiple-Choice Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This anchored bias challenges the integrity of GPT-2’s decision-making process, as it skews performance based on the position rather than the content of the choices in MCQs. In this study, we utilise the mechanistic interpretability approach to identify the internal modules within GPT-2 models responsible for this bias. |
Ruizhe Li; Yanjun Gao; | arxiv-cs.CL | 2024-05-06 |
1253 | Hire Me or Not? Examining Language Model’s Behavior with Occupation Attributes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: With the impressive performance in various downstream tasks, large language models (LLMs) have been widely integrated into production pipelines, like recruitment and recommendation systems. |
Damin Zhang; Yi Zhang; Geetanjali Bihani; Julia Rayz; | arxiv-cs.CL | 2024-05-06 |
1254 | Addressing Data Scarcity in The Medical Domain: A GPT-Based Approach for Synthetic Data Generation and Feature Extraction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This research confronts the persistent challenge of data scarcity in medical machine learning by introducing a pioneering methodology that harnesses the capabilities of Generative … |
F. Sufi; | Inf. | 2024-05-06 |
1255 | Towards A Human-in-the-Loop LLM Approach to Collaborative Discourse Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, LLMs have not yet been used to characterize synergistic learning in students’ collaborative discourse. In this exploratory work, we take a first step towards adopting a human-in-the-loop prompt engineering approach with GPT-4-Turbo to summarize and categorize students’ synergistic learning during collaborative discourse. |
Clayton Cohn; Caitlin Snyder; Justin Montenegro; Gautam Biswas; | arxiv-cs.CL | 2024-05-06 |
1256 | Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite their widespread occurrence and potential impacts, our understanding of influence campaigns is limited by manual analysis of messages and subjective interpretation of their observable behavior. In this paper, we explore whether these limitations can be mitigated with large language models (LLMs), using GPT-3.5 as a case-study for coordinated campaign annotation. |
Keith Burghardt; Kai Chen; Kristina Lerman; | arxiv-cs.CL | 2024-05-06 |
1257 | Detecting Anti-Semitic Hate Speech Using Transformer-based Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we developed a new data labeling technique and established a proof of concept targeting anti-Semitic hate speech, utilizing a variety of transformer models such as BERT (arXiv:1810.04805), DistillBERT (arXiv:1910.01108), RoBERTa (arXiv:1907.11692), and LLaMA-2 (arXiv:2307.09288), complemented by the LoRA fine-tuning approach (arXiv:2106.09685). |
Dengyi Liu; Minghao Wang; Andrew G. Catlin; | arxiv-cs.CL | 2024-05-06 |
1258 | Traffic Performance GPT (TP-GPT): Real-Time Data Informed Intelligent ChatBot for Transportation Surveillance and Management Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, real-time traffic data access is typically limited due to privacy concerns. To bridge this gap, the integration of Large Language Models (LLMs) into the domain of traffic management presents a transformative approach to addressing the complexities and challenges inherent in modern transportation systems. |
Bingzhang Wang; Muhammad Monjurul Karim; Chenxi Liu; Yinhai Wang; | arxiv-cs.MA | 2024-05-05 |
1259 | Can Large Language Models Make The Grade? An Empirical Study Evaluating LLMs Ability to Mark Short Answer Questions in K-12 Education Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper reports on a series of experiments with a novel dataset evaluating how well Large Language Models (LLMs) can mark (i.e. grade) open text responses to short answer questions. Specifically, we explore how well different combinations of GPT version and prompt engineering strategies performed at marking real student answers to short answer questions across different domain areas (Science and History) and grade levels (spanning ages 5-16), using a new, never-used-before dataset from Carousel, a quizzing platform. |
Owen Henkel; Adam Boxer; Libby Hills; Bill Roberts; | arxiv-cs.CL | 2024-05-05 |
1260 | Leveraging Lecture Content for Improved Feedback: Explorations with GPT-4 and Retrieval Augmented Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents the use of Retrieval Augmented Generation (RAG) to improve the feedback generated by Large Language Models for programming tasks. For this purpose, … |
Sven Jacobs; Steffen Jaschke; | 2024 36th International Conference on Software Engineering … | 2024-05-05 |
1261 | Unraveling The Dominance of Large Language Models Over Transformer Models for Bangla Natural Language Inference: A Comprehensive Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study addresses the underexplored area of evaluating LLMs in low-resourced languages such as Bengali. |
Fatema Tuj Johora Faria; Mukaffi Bin Moin; Asif Iftekher Fahim; Pronay Debnath; Faisal Muhammad Shah; | arxiv-cs.CL | 2024-05-05 |
1262 | U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on self-attention with downsampled tokens, we propose a series of U-shaped DiTs (U-DiTs) in the paper and conduct extensive experiments to demonstrate the extraordinary performance of U-DiT models. |
YUCHUAN TIAN et. al. | arxiv-cs.CV | 2024-05-04 |
1263 | SCATT: Transformer Tracking with Symmetric Cross-attention Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jianming Zhang; Wentao Chen; Jiangxin Dai; Jin Zhang; | Appl. Intell. | 2024-05-04 |
1264 | A Combination of BERT and Transformer for Vietnamese Spelling Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, to our knowledge, there is no implementation in Vietnamese yet. Therefore, in this study, a combination of the Transformer architecture (state-of-the-art for Encoder-Decoder models) and BERT was proposed to deal with Vietnamese spelling correction. |
Hieu Ngo Trung; Duong Tran Ham; Tin Huynh; Kiem Hoang; | arxiv-cs.CL | 2024-05-04 |
1265 | REASONS: A Benchmark for REtrieval and Automated CitationS Of ScieNtific Sentences Using Public and Proprietary LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we investigate whether large language models (LLMs) are capable of generating references based on two forms of sentence queries: (a) Direct Queries, in which LLMs are asked to provide the author names of a given research article, and (b) Indirect Queries, in which LLMs are asked to provide the title of a mentioned article when given a sentence from a different article. |
DEEPA TILWANI et. al. | arxiv-cs.CL | 2024-05-03 |
1266 | Structural Pruning of Pre-trained Language Models Via Neural Architecture Search Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Unlike traditional pruning methods with fixed thresholds, we propose to adopt a multi-objective approach that identifies the Pareto optimal set of sub-networks, allowing for a more flexible and automated compression process. |
Aaron Klein; Jacek Golebiowski; Xingchen Ma; Valerio Perrone; Cedric Archambeau; | arxiv-cs.LG | 2024-05-03 |
1267 | Exploring Combinatorial Problem Solving with Large Language Models: A Case Study on The Travelling Salesman Problem Using GPT-3.5 Turbo Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we investigate the potential of LLMs to solve the Travelling Salesman Problem (TSP). |
Mahmoud Masoud; Ahmed Abdelhay; Mohammed Elhenawy; | arxiv-cs.CL | 2024-05-03 |
1268 | The Role of Model Architecture and Scale in Predicting Molecular Properties: Insights from Fine-Tuning RoBERTa, BART, and LLaMA Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: This study introduces a systematic framework to compare the efficacy of Large Language Models (LLMs) for fine-tuning across various cheminformatics tasks. Employing a uniform … |
Youngmin Lee; Andrew S. I. D. Lang; Duoduo Cai; Wheat R. Stephen; | ArXiv | 2024-05-02 |
1269 | UQA: Corpus for Urdu Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces UQA, a novel dataset for question answering and text comprehension in Urdu, a low-resource language with over 70 million native speakers. |
Samee Arif; Sualeha Farid; Awais Athar; Agha Ali Raza; | arxiv-cs.CL | 2024-05-02 |
1270 | Investigating Wit, Creativity, and Detectability of Large Language Models in Domain-Specific Writing Style Adaptation of Reddit’s Showerthoughts Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent Large Language Models (LLMs) have shown the ability to generate content that is difficult or impossible to distinguish from human writing. We investigate the ability of … |
TOLGA BUZ et. al. | STARSEM | 2024-05-02 |
1271 | The Effectiveness of LLMs As Annotators: A Comparative Overview and Empirical Analysis of Direct Representation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper provides a comparative overview of twelve studies investigating the potential of LLMs in labelling data. |
Maja Pavlovic; Massimo Poesio; | arxiv-cs.CL | 2024-05-02 |
1272 | Investigating Wit, Creativity, and Detectability of Large Language Models in Domain-Specific Writing Style Adaptation of Reddit’s Showerthoughts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recent Large Language Models (LLMs) have shown the ability to generate content that is difficult or impossible to distinguish from human writing. |
TOLGA BUZ et. al. | arxiv-cs.CL | 2024-05-02 |
1273 | Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, they do not possess the ability to evaluate based on custom evaluation criteria, focusing instead on general attributes like helpfulness and harmlessness. To address these issues, we introduce Prometheus 2, a more powerful evaluator LM than its predecessor that closely mirrors human and GPT-4 judgements. |
SEUNGONE KIM et. al. | arxiv-cs.CL | 2024-05-02 |
1274 | Unveiling The Inherent Needs: GPT Builder As Participatory Design Tool for Exploring Needs and Expectation of AI with Middle-Aged Users Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A generative session that directly involves users in the design process is an effective way to design user-centered experiences by uncovering intrinsic needs. However, engaging … |
Huisung Kwon; Y. J. Choi; Sunok Lee; Sangsu Lee; | Extended Abstracts of the CHI Conference on Human Factors … | 2024-05-02 |
1275 | Empowering IoT with Generative AI: Applications, Case Studies, and Limitations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The rise of the Generative Pre-Trained Transformer (GPT) language model, more commonly known as ChatGPT, has brought a spotlight on the ever-developing field of Generative AI (GAI). … |
Siva Sai; Mizaan Kanadia; V. Chamola; | IEEE Internet of Things Magazine | 2024-05-01 |
1276 | A Named Entity Recognition and Topic Modeling-based Solution for Locating and Better Assessment of Natural Disasters in Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fully explore the potential of social media content in disaster informatics, access to relevant content and the correct geo-location information is critical. In this paper, we propose a three-step solution to tackle these challenges. |
Ayaz Mehmood; Muhammad Tayyab Zamir; Muhammad Asif Ayub; Nasir Ahmad; Kashif Ahmad; | arxiv-cs.CL | 2024-05-01 |
1277 | Transformer Dense Center Network for Liver Tumor Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
JINLIN MA et. al. | Biomed. Signal Process. Control. | 2024-05-01 |
1278 | Chat-GPT; Validating Technology Acceptance Model (TAM) in Education Sector Via Ubiquitous Learning Mechanism IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
N. SAIF et. al. | Comput. Hum. Behav. | 2024-05-01 |
1279 | FedViT: Federated Continual Learning of Vision Transformer at Edge Related Papers Related Patents Related Grants Related Venues Related Experts View |
XIAOJIANG ZUO et. al. | Future Gener. Comput. Syst. | 2024-05-01 |
1280 | Collaborative Compensative Transformer Network for Salient Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jun Chen; Heye Zhang; Mingming Gong; Zhifan Gao; | Pattern Recognit. | 2024-05-01 |
1281 | Font Transformer for Few-shot Font Generation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xu Chen; Lei Wu; Yongliang Su; Lei Meng; Xiangxu Meng; | Comput. Vis. Image Underst. | 2024-05-01 |
1282 | How Can I Get It Right? Using GPT to Rephrase Incorrect Trainee Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the recent advancements of large language models (LLMs), our study employed the GPT-4 model to build an explanatory feedback system. |
JIONGHAO LIN et. al. | arxiv-cs.CL | 2024-05-01 |
1283 | Vision Transformer: To Discover The Four Secrets of Image Patches Related Papers Related Patents Related Grants Related Venues Related Experts View |
TAO ZHOU et. al. | Inf. Fusion | 2024-05-01 |
1284 | Semantic Perceptive Infrared and Visible Image Fusion Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
XIN YANG et. al. | Pattern Recognit. | 2024-05-01 |
1285 | Energy-informed Graph Transformer Model for Solid Mechanical Analyses Related Papers Related Patents Related Grants Related Venues Related Experts View |
Bo Feng; Xiaoping Zhou; | Commun. Nonlinear Sci. Numer. Simul. | 2024-05-01 |
1286 | Do Large Language Models Understand Conversational Implicature — A Case Study with A Chinese Sitcom Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce SwordsmanImp, the first Chinese multi-turn-dialogue-based dataset aimed at conversational implicature, sourced from dialogues in the Chinese sitcom $\textit{My Own Swordsman}$. |
Shisen Yue; Siyuan Song; Xinyuan Cheng; Hai Hu; | arxiv-cs.CL | 2024-04-30 |
1287 | Harmonic LLMs Are Trustworthy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an intuitive method to test the robustness (stability and explainability) of any black-box LLM in real-time via its local deviation from harmonicity, denoted as $\gamma$. |
Nicholas S. Kersting; Mohammad Rahman; Suchismitha Vedala; Yang Wang; | arxiv-cs.LG | 2024-04-30 |
1288 | Do Large Language Models Understand Conversational Implicature – A Case Study with A Chinese Sitcom Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Understanding the non-literal meaning of an utterance is critical for large language models (LLMs) to become human-like social communicators. In this work, we introduce … |
Shisen Yue; Siyuan Song; Xinyuan Cheng; Hai Hu; | ArXiv | 2024-04-30 |
1289 | How Can I Improve? Using GPT to Highlight The Desired and Undesired Parts of Open-ended Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our aim is to equip tutors with actionable, explanatory feedback during online training lessons. |
JIONGHAO LIN et. al. | arxiv-cs.CL | 2024-04-30 |
1290 | Ethical Reasoning and Moral Value Alignment of LLMs Depend on The Language We Prompt Them in Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, moral values are not universal, but rather influenced by language and culture. This paper explores how three prominent LLMs — GPT-4, ChatGPT, and Llama2-70B-Chat — perform ethical reasoning in different languages and whether their moral judgement depends on the language in which they are prompted. |
Utkarsh Agarwal; Kumar Tanmay; Aditi Khandelwal; Monojit Choudhury; | arxiv-cs.CL | 2024-04-29 |
1291 | RSCaMa: Remote Sensing Image Change Captioning with State Space Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the progress of previous methods in spatial change perception, weaknesses remain in joint spatial-temporal modeling. To address this, this paper proposes a novel RSCaMa model, which achieves efficient joint spatial-temporal modeling through multiple CaMa layers, enabling iterative refinement of bi-temporal features. |
CHENYANG LIU et. al. | arxiv-cs.CV | 2024-04-29 |
1292 | Can GPT-4 Do L2 Analytic Assessment? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we perform a series of experiments using GPT-4 in a zero-shot fashion on a publicly available dataset annotated with holistic scores based on the Common European Framework of Reference and aim to extract detailed information about their underlying analytic components. |
Stefano Bannò; Hari Krishna Vydana; Kate M. Knill; Mark J. F. Gales; | arxiv-cs.CL | 2024-04-29 |
1293 | Structured Named Entity Recognition (NER) in Biomedical Texts Using Pre-Trained Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The field of Natural Language Processing (NLP) has witnessed remarkable progress in recent years, particularly in the domain of biomedical text analysis. Named Entity Recognition … |
Pinar Savci; Bihter Das; | 2024 12th International Symposium on Digital Forensics and … | 2024-04-29 |
1294 | GPT-4 Passes Most of The 297 Written Polish Board Certification Examinations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Methods: We developed a software program to download and process PES exams and tested the performance of GPT models using OpenAI Application Programming Interface. |
Jakub Pokrywka; Jeremi Kaczmarek; Edward Gorzelańczyk; | arxiv-cs.CL | 2024-04-29 |
1295 | Time Machine GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new approach: a series of point-in-time LLMs called Time Machine GPT (TiMaGPT), specifically designed to be nonprognosticative. |
Felix Drinkall; Eghbal Rahimikia; Janet B. Pierrehumbert; Stefan Zohren; | arxiv-cs.CL | 2024-04-29 |
1296 | Comparative Analysis of Generic and Fine-Tuned Large Language Models for Conversational Agent Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the rapidly evolving domain of conversational agents, the integration of Large Language Models (LLMs) into Chatbot Development Platforms (CDPs) is a significant innovation. … |
LAURA VILLA et. al. | Robotics | 2024-04-29 |
1297 | Normalization of Arabic Dialects Into Modern Standard Arabic Using BERT and GPT-2 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We present an encoder-decoder based model for normalization of Arabic dialects using both BERT and GPT-2 based models. Arabic is a language of many dialects that not only differ … |
Khalid Alnajjar; Mika Hämäläinen; | J. Data Min. Digit. Humanit. | 2024-04-29 |
1298 | PatentGPT: A Large Language Model for Intellectual Property Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the application of large language models in the Intellectual Property (IP) domain is challenging due to the strong need for specialized knowledge, privacy protection, and processing of extremely long text in this field. In this technical report, we present for the first time a low-cost, standardized procedure for training IP-oriented LLMs, meeting the unique requirements of the IP domain. |
ZILONG BAI et. al. | arxiv-cs.CL | 2024-04-28 |
1299 | Can Perplexity Predict Fine-Tuning Performance? An Investigation of Tokenization Effects on Sequential Language Models for Nepali Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, studies of how subwording affects the understanding capacity of language models have been few, and limited to a handful of languages. To reduce this gap, we used 6 different tokenization schemes to pretrain relatively small language models in Nepali and used the representations learned to finetune on several downstream tasks. |
Nishant Luitel; Nirajan Bekoju; Anand Kumar Sah; Subarna Shakya; | arxiv-cs.CL | 2024-04-28 |
1300 | GPT for Games: A Scoping Review (2020-2023) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a scoping review of 55 articles to explore GPT’s potential for games, offering researchers a comprehensive understanding of the current applications and identifying both emerging trends and unexplored areas. |
Daijin Yang; Erica Kleinman; Casper Harteveld; | arxiv-cs.HC | 2024-04-27 |
1301 | MRScore: Evaluating Radiology Report Generation with LLM-based Reward System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces MRScore, an automatic evaluation metric tailored for radiology report generation by leveraging Large Language Models (LLMs). |
YUNYI LIU et. al. | arxiv-cs.CL | 2024-04-27 |
1302 | Detection of Conspiracy Theories Beyond Keyword Bias in German-Language Telegram Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our work addresses the task of detecting conspiracy theories in German Telegram messages. |
Milena Pustet; Elisabeth Steffen; Helena Mihaljević; | arxiv-cs.CL | 2024-04-27 |
1303 | 8-bit Transformer Inference and Fine-tuning for Edge Accelerators Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer models achieve state-of-the-art accuracy on natural language processing (NLP) and vision tasks, but demand significant computation and memory resources, which makes it … |
JEFFREY YU et. al. | Proceedings of the 29th ACM International Conference on … | 2024-04-27 |
1304 | CRISPR-GPT: An LLM Agent for Automated Design of Gene-Editing Experiments IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce CRISPR-GPT, an LLM agent augmented with domain knowledge and external tools to automate and enhance the design process of CRISPR-based gene-editing experiments. |
KAIXUAN HUANG et. al. | arxiv-cs.AI | 2024-04-27 |
1305 | UMass-BioNLP at MEDIQA-M3G 2024: DermPrompt – A Systematic Exploration of Prompt Engineering with GPT-4V for Dermatological Diagnosis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents our team’s participation in the MEDIQA-ClinicalNLP 2024 shared task B. We present a novel approach to diagnosing clinical dermatology cases by integrating … |
PARTH VASHISHT et. al. | ArXiv | 2024-04-27 |
1306 | ChatGPT Is Here to Help, Not to Replace Anybody — An Evaluation of Students’ Opinions On Integrating ChatGPT In CS Courses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, 52 first-year CS students were surveyed in order to assess their views on technologies with code-generation capabilities, both from academic and professional perspectives. |
Bruno Pereira Cipriano; Pedro Alves; | arxiv-cs.ET | 2024-04-26 |
1307 | Enhancing Legal Compliance and Regulation Analysis with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With Industry 4.0 revolutionizing the food industry and with the General Data Protection Regulation (GDPR) reshaping privacy policies and data processing agreements, there is a growing gap between regulatory analysis and recent technological advancements. This study aims to bridge this gap by leveraging LLMs, namely BERT and GPT models, to accurately classify legal provisions and automate compliance checks. |
Shabnam Hassani; | arxiv-cs.SE | 2024-04-26 |
1308 | UMass-BioNLP at MEDIQA-M3G 2024: DermPrompt — A Systematic Exploration of Prompt Engineering with GPT-4V for Dermatological Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a novel approach to diagnosing clinical dermatology cases by integrating large multimodal models, specifically leveraging the capabilities of GPT-4V under a retriever and a re-ranker framework. |
PARTH VASHISHT et. al. | arxiv-cs.AI | 2024-04-26 |
1309 | ChatGPT Is Here to Help, Not to Replace Anybody – An Evaluation of Students’ Opinions On Integrating ChatGPT In CS Courses Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) like GPT and Bard are capable of producing code based on textual descriptions, with remarkable efficacy. Such technology will have profound … |
Bruno Pereira Cipriano; P. Alves; | ArXiv | 2024-04-26 |
1310 | Prompting Towards Alleviating Code-Switched Data Scarcity in Under-Resourced Languages with GPT As A Pivot Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: There is therefore a notable opportunity to refine prompting guidelines to yield sentences suitable for the fine-tuning of language models. We propose a framework for augmenting the diversity of synthetically generated code-switched data using GPT and propose leveraging this technology to mitigate data scarcity in low-resourced languages, underscoring the essential role of native speakers in this process. |
Michelle Terblanche; Kayode Olaleye; Vukosi Marivate; | arxiv-cs.CL | 2024-04-26 |
1311 | Influence of Solution Efficiency and Valence of Instruction on Additive and Subtractive Solution Strategies in Humans and GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative artificial intelligences, particularly large language models (LLMs), play an increasingly prominent role in human decision-making contexts, necessitating transparency … |
Lydia Uhler; Verena Jordan; Jürgen Buder; Markus Huff; Frank Papenmeier; | arxiv-cs.CL | 2024-04-25 |
1312 | Exploring Internal Numeracy in Language Models: A Case Study on ALBERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It has been found that Transformer-based language models have the ability to perform basic quantitative reasoning. In this paper, we propose a method for studying how these models internally represent numerical data, and use our proposal to analyze the ALBERT family of language models. |
Ulme Wennberg; Gustav Eje Henter; | arxiv-cs.CL | 2024-04-25 |
1313 | Player-Driven Emergence in LLM-Driven Game Narrative Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore how interaction with large language models (LLMs) can give rise to emergent behaviors, empowering players to participate in the evolution of game narratives. |
XIANGYU PENG et. al. | arxiv-cs.CL | 2024-04-25 |
1314 | TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present TinyChart, an efficient MLLM for chart understanding with only 3B parameters. |
LIANG ZHANG et. al. | arxiv-cs.CV | 2024-04-25 |
1315 | Towards Efficient Patient Recruitment for Clinical Trials: Application of A Prompt-Based Learning Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we aimed to evaluate the performance of a prompt-based large language model for the cohort selection task from unstructured medical notes collected in the EHR. |
Mojdeh Rahmanian; Seyed Mostafa Fakhrahmad; Seyedeh Zahra Mousavi; | arxiv-cs.CL | 2024-04-24 |
1316 | GeckOpt: LLM System Efficiency Via Intent-Based Tool Selection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this preliminary study, we investigate a GPT-driven intent-based reasoning approach to streamline tool selection for large language models (LLMs) aimed at system efficiency. By … |
Michael Fore; Simranjit Singh; Dimitrios Stamoulis; | Proceedings of the Great Lakes Symposium on VLSI 2024 | 2024-04-24 |
1317 | A Comprehensive Survey on Evaluating Large Language Model Applications in The Medical Industry IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since the inception of the Transformer architecture in 2017, Large Language Models (LLMs) such as GPT and BERT have evolved significantly, impacting various industries with their advanced capabilities in language understanding and generation. These models have shown potential to transform the medical field, highlighting the necessity for specialized evaluation frameworks to ensure their effective and ethical deployment. |
Yining Huang; Keke Tang; Meilian Chen; Boyuan Wang; | arxiv-cs.CL | 2024-04-24 |
1318 | An Automated Learning Model for Twitter Sentiment Analysis Using Ranger AdaBelief Optimizer Based Bidirectional Long Short Term Memory Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Sentiment analysis is an automated approach which is utilized in process of analysing textual data to describe public opinion. The sentiment analysis has major role in creating … |
Sasirekha Natarajan; Smitha Kurian; P. Divakarachari; Przemysław Falkowski‐Gilski; | Expert Syst. J. Knowl. Eng. | 2024-04-24 |
1319 | BERT Vs GPT for Financial Engineering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The paper benchmarks several Transformer models [4], to show how these models can judge sentiment from a news event. This signal can then be used for downstream modelling and … |
Edward Sharkey; Philip C. Treleaven; | ArXiv | 2024-04-24 |
1320 | The Promise and Challenges of Using LLMs to Accelerate The Screening Process of Systematic Reviews Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our objective is to investigate if Large Language Models (LLMs) can accelerate title-abstract screening by simplifying abstracts for human screeners, and automating title-abstract screening. |
Aleksi Huotala; Miikka Kuutila; Paul Ralph; Mika Mäntylä; | arxiv-cs.CL | 2024-04-24 |
1321 | Automated Creation of Source Code Variants of A Cryptographic Hash Function Implementation Using Generative Pre-Trained Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study the ability of GPT models to generate novel and correct versions, and notably very insecure versions, of implementations of the cryptographic hash function SHA-1 is examined. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CR | 2024-04-24 |
1322 | From Complexity to Clarity: How AI Enhances Perceptions of Scientists and The Public’s Understanding of Science Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper evaluated the effectiveness of using generative AI to simplify science communication and enhance the public’s understanding of science. |
David M. Markowitz; | arxiv-cs.CL | 2024-04-23 |
1323 | Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on a specific use case, pharmaceutical manufacturing investigations, and propose that leveraging historical records of manufacturing incidents and deviations in an organization can be beneficial for addressing and closing new cases, or de-risking new manufacturing campaigns. |
Hossein Salami; Brandye Smith-Goettler; Vijay Yadav; | arxiv-cs.CL | 2024-04-23 |
1324 | Transformers Can Represent $n$-gram Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the relationship between transformer LMs and $n$-gram LMs, a simple and historically relevant class of language models. |
Anej Svete; Ryan Cotterell; | arxiv-cs.CL | 2024-04-23 |
1325 | Pyramid Hierarchical Transformer for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The traditional Transformer model encounters challenges with variable-length input sequences, particularly in Hyperspectral Image Classification (HSIC), leading to efficiency and scalability concerns. To overcome this, we propose a pyramid-based hierarchical transformer (PyFormer). |
Muhammad Ahmad; Muhammad Hassaan Farooq Butt; Manuel Mazzara; Salvatore Distefano; | arxiv-cs.CV | 2024-04-23 |
1326 | PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present the first, end-to-end large-scale empirical evaluation of clinical trial matching using real-world EHRs. |
SHASHI KANT GUPTA et. al. | arxiv-cs.CL | 2024-04-23 |
1327 | Combining Retrieval and Classification: Balancing Efficiency and Accuracy in Duplicate Bug Report Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this context, complex models, despite their high accuracy potential, can be time-consuming, while more efficient models may compromise on accuracy. To address this issue, we propose a transformer-based system designed to strike a balance between time efficiency and accuracy performance. |
Qianru Meng; Xiao Zhang; Guus Ramackers; Joost Visser; | arxiv-cs.SE | 2024-04-23 |
1328 | Pixels and Predictions: Potential of GPT-4V in Meteorological Imagery Analysis and Forecast Communication Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates GPT-4V’s ability to interpret meteorological charts and communicate weather hazards appropriately to the user, despite challenges of hallucinations, where generative AI delivers coherent, confident, but incorrect responses. We assess GPT-4V’s competence via its web interface ChatGPT in two tasks: (1) generating a severe-weather outlook from weather-chart analysis and conducting self-evaluation, revealing an outlook that corresponds well with a Storm Prediction Center human-issued forecast; and (2) producing hazard summaries in Spanish and English from weather charts. |
JOHN R. LAWSON et. al. | arxiv-cs.CL | 2024-04-22 |
1329 | Automated Multi-Language to English Machine Translation Using Generative Pre-Trained Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we examine using local Generative Pretrained Transformer (GPT) models to perform automated zero-shot, black-box, sentence-wise multi-natural-language translation into English text. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CL | 2024-04-22 |
1330 | Marking: Visual Grading with Highlighting Errors and Annotating Missing Bits Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Marking, a novel grading task that enhances automated grading systems by performing an in-depth analysis of student responses and providing students with visual highlights. |
Shashank Sonkar; Naiming Liu; Debshila B. Mallick; Richard G. Baraniuk; | arxiv-cs.CL | 2024-04-22 |
1331 | How Well Can LLMs Echo Us? Evaluating AI Chatbots’ Role-Play Ability with ECHO Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Such an oversight limits the potential for advancements in digital human clones and non-player characters in video games. To bridge this gap, we introduce ECHO, an evaluative framework inspired by the Turing test. |
MAN TIK NG et. al. | arxiv-cs.CL | 2024-04-22 |
1332 | Pre-Calc: Learning to Use The Calculator Improves Numeracy in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Pre-Calc, a simple pre-finetuning objective of learning to use the calculator for both encoder-only and encoder-decoder architectures, formulated as a discriminative and generative task respectively. |
Vishruth Veerendranath; Vishwa Shah; Kshitish Ghate; | arxiv-cs.CL | 2024-04-22 |
1333 | Assessing GPT-4-Vision’s Capabilities in UML-Based Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The emergence of advanced neural networks has opened up new ways in automated code generation from conceptual models, promising to enhance software development processes. This paper presents a preliminary evaluation of GPT-4-Vision, a state-of-the-art deep learning model, and its capabilities in transforming Unified Modeling Language (UML) class diagrams into fully operating Java class files. |
Gábor Antal; Richárd Vozár; Rudolf Ferenc; | arxiv-cs.SE | 2024-04-22 |
1334 | What Do Transformers Know About Government? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates what insights about linguistic features and what knowledge about the structure of natural language can be obtained from the encodings in transformer language models. In particular, we explore how BERT encodes the government relation between constituents in a sentence. |
JUE HOU et. al. | arxiv-cs.CL | 2024-04-22 |
1335 | Transformer-Driven Resource Allocation for Enhanced Multi-Carrier NOMA Downlink Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents a transformer-driven resource allocation strategy to optimize channel assignment and power allocation in multi-carrier non-orthogonal multiple access (NOMA) … |
Liang Leon Dong; | 2024 IEEE Wireless Communications and Networking Conference … | 2024-04-21 |
1336 | SVGEditBench: A Benchmark Dataset for Quantitative Assessment of LLM’s SVG Editing Capabilities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For quantitative evaluation of LLMs’ ability to edit SVG, we propose SVGEditBench. |
Kunato Nishina; Yusuke Matsui; | arxiv-cs.CV | 2024-04-21 |
1337 | Automated Text Mining of Experimental Methodologies from Biomedical Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes the fine-tuned DistilBERT, a methodology-specific, pre-trained generative classification language model for mining biomedicine texts. |
Ziqing Guo; | arxiv-cs.CL | 2024-04-21 |
1338 | Do English Named Entity Recognizers Work Well on Global Englishes? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As such, it is unclear whether they generalize for analyzing the use of English globally. To test this, we build a newswire dataset, the Worldwide English NER Dataset, to analyze NER model performance on low-resource English variants from around the world. |
Alexander Shan; John Bauer; Riley Carlson; Christopher Manning; | arxiv-cs.CL | 2024-04-20 |
1339 | Toward A New Era of Rapid Development: Assessing GPT-4-Vision’s Capabilities in UML-Based Code Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The emergence of advanced neural networks has opened up new ways in automated code generation from conceptual models, promising to enhance software development processes. This … |
Gábor Antal; Richárd Vozár; Rudolf Ferenc; | 2024 IEEE/ACM International Workshop on Large Language … | 2024-04-20 |
1340 | Evaluating Subword Tokenization: Alien Subword Composition and OOV Generalization Challenge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As a solution, we propose a combined intrinsic-extrinsic evaluation framework for subword tokenization. |
KHUYAGBAATAR BATSUREN et. al. | arxiv-cs.CL | 2024-04-20 |
1341 | Transformer-Based Classification Outcome Prediction for Multimodal Stroke Treatment IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a multi-modal fusion framework Multitrans based on the Transformer architecture and self-attention mechanism. |
Danqing Ma; Meng Wang; Ao Xiang; Zongqing Qi; Qin Yang; | arxiv-cs.CV | 2024-04-19 |
1342 | Enhancing Child Safety in Online Gaming: The Development and Application of Protectbot, An AI-Powered Chatbot Framework Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study introduces Protectbot, an innovative chatbot framework designed to improve safety in children’s online gaming environments. At its core, Protectbot incorporates … |
Anum Faraz; Fardin Ahsan; Jinane Mounsef; Ioannis Karamitsos; A. Kanavos; | Inf. | 2024-04-19 |
1343 | TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present our submission to the unconstrained subtask of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages for morphological annotation, POS-tagging, lemmatization, character- and word-level gap-filling. |
Aleksei Dorkin; Kairit Sirts; | arxiv-cs.CL | 2024-04-19 |
1344 | Enabling Natural Zero-Shot Prompting on Encoder Models Via Statement-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we propose Statement-Tuning, a technique that models discriminative tasks as a set of finite statements and trains an encoder model to discriminate between the potential statements to determine the label. |
Ahmed Elshabrawy; Yongxin Huang; Iryna Gurevych; Alham Fikri Aji; | arxiv-cs.CL | 2024-04-19 |
1345 | Linearly-evolved Transformer for Pan-sharpening Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The standard modeling rules within these promising pan-sharpening methods are to roughly stack the transformer variants in a cascaded manner. Despite the remarkable advancement, their success may come at a huge cost in model parameters and FLOPs, preventing their application on low-resource satellites. To address this trade-off between favorable performance and expensive computation, we tailor an efficient linearly-evolved transformer variant and employ it to construct a lightweight pan-sharpening framework. |
JUNMING HOU et. al. | arxiv-cs.CV | 2024-04-19 |
1346 | Crowdsourcing Public Attitudes Toward Local Services Through The Lens of Google Maps Reviews: An Urban Density-based Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel data source and methodological framework that can be easily adapted to different regions, offering useful insights into public sentiment toward the built environment and shedding light on how planning policies can be designed to handle related challenges. |
Lingyao Li; Songhua Hu; Atiyya Shaw; Libby Hemphill; | arxiv-cs.SI | 2024-04-19 |
1347 | Dubo-SQL: Diverse Retrieval-Augmented Generation and Fine Tuning for Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce two new methods, Dubo-SQL v1 and v2. |
Dayton G. Thorpe; Andrew J. Duberstein; Ian A. Kinsey; | arxiv-cs.CL | 2024-04-18 |
1348 | Evaluation of Different Machine Learning and Deep Learning Techniques for Hate Speech Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Detecting online hate speech is important for creating safer online spaces. In this paper, we evaluate the performance of several machine learning (ML) and deep learning (DL) … |
Nabil Shawkat; Jamil Saquer; Hazim Shatnawi; | Proceedings of the 2024 ACM Southeast Conference | 2024-04-18 |
1349 | MIDGET: Music Conditioned 3D Dance Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a MusIc conditioned 3D Dance GEneraTion model, named MIDGET, based on the Dance motion Vector Quantised Variational AutoEncoder (VQ-VAE) model and the Motion Generative Pre-Training (GPT) model, to generate vibrant and high-quality dances that match the music rhythm. |
Jinwu Wang; Wei Mao; Miaomiao Liu; | arxiv-cs.SD | 2024-04-18 |
1350 | Transformer Tricks: Removing Weights for Skipless Transformers Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: He and Hofmann (arXiv:2311.01906) detailed a skipless transformer without the V and P (post-attention projection) linear layers, which reduces the total number of weights. … |
Nils Graef; | arxiv-cs.LG | 2024-04-18 |
1351 | Large Language Models in Targeted Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we investigate the use of decoder-based generative transformers for extracting sentiment towards the named entities in Russian news articles. |
Nicolay Rusnachenko; Anton Golubev; Natalia Loukachevitch; | arxiv-cs.CL | 2024-04-18 |
1352 | EmrQA-msquad: A Medical Dataset Structured with The SQuAD V2.0 Framework, Enriched with EmrQA Medical Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: One key solution involves integrating specialized medical datasets and creating dedicated datasets. This strategic approach enhances the accuracy of QAS, contributing to advancements in clinical decision-making and medical research. |
Jimenez Eladio; Hao Wu; | arxiv-cs.CL | 2024-04-18 |
1353 | Augmenting Emotion Features in Irony Detection with Large Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel method for irony detection, applying Large Language Models (LLMs) with prompt-based learning to facilitate emotion-centric text augmentation. |
Yucheng Lin; Yuhan Xia; Yunfei Long; | arxiv-cs.CL | 2024-04-18 |
1354 | CAUS: A Dataset for Question Generation Based on Human Cognition Leveraging Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Curious About Uncertain Scene (CAUS) dataset, designed to enable Large Language Models, specifically GPT-4, to emulate human cognitive processes for resolving uncertainties. |
Minjung Shin; Donghyun Kim; Jeh-Kwang Ryu; | arxiv-cs.AI | 2024-04-17 |
1355 | Octopus V3: Technical Report for On-device Sub-billion Multimodal AI Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a multimodal model that incorporates the concept of functional token specifically designed for AI agent applications. |
Wei Chen; Zhiyuan Li; | arxiv-cs.CL | 2024-04-17 |
1356 | CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an attribution-oriented Chain-of-Thought reasoning method to enhance the accuracy of attributions. |
Moshe Berchansky; Daniel Fleischer; Moshe Wasserblat; Peter Izsak; | arxiv-cs.CL | 2024-04-16 |
1357 | Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present COME-robot, the first closed-loop framework utilizing the GPT-4V vision-language foundation model for open-ended reasoning and adaptive planning in real-world scenarios. |
PEIYUAN ZHI et. al. | arxiv-cs.RO | 2024-04-15 |
1358 | AIGeN: An Adversarial Approach for Instruction Generation in VLN Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose AIGeN, a novel architecture inspired by Generative Adversarial Networks (GANs) that produces meaningful and well-formed synthetic instructions to improve navigation agents’ performance. |
Niyati Rawal; Roberto Bigazzi; Lorenzo Baraldi; Rita Cucchiara; | arxiv-cs.CV | 2024-04-15 |
1359 | Transformers, Contextualism, and Polysemy Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, I argue that we can extract from the way the transformer architecture works a theory of the relationship between context and meaning. |
Jumbly Grindrod; | arxiv-cs.CL | 2024-04-15 |
1360 | Spatial–Temporal Graph Attention Gated Recurrent Transformer Network for Traffic Flow Forecasting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: With the significant increase in the number of motor vehicles, road-related issues, such as traffic congestion and accidents, have also escalated. The development of an accurate … |
Di Wu; Kai Peng; Shangguang Wang; Victor C. M. Leung; | IEEE Internet of Things Journal | 2024-04-15 |
1361 | Zero-shot Building Age Classification from Facade Image Using GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A building’s age of construction is crucial for supporting many geospatial applications. Much current research focuses on estimating building age from facade images … |
ZICHAO ZENG et. al. | ArXiv | 2024-04-15 |
1362 | Benchmarking Llama2, Mistral, Gemma and GPT for Factuality, Toxicity, Bias and Propensity for Hallucinations Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: This paper introduces fourteen novel datasets for the evaluation of Large Language Models’ safety in the context of enterprise tasks. A method was devised to evaluate a model’s … |
David Nadeau; Mike Kroutikov; Karen McNeil; Simon Baribeau; | ArXiv | 2024-04-15 |
1363 | Harnessing GPT-4V(ision) for Insurance: A Preliminary Exploration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore GPT-4V’s capabilities in the insurance domain. |
Chenwei Lin; Hanjia Lyu; Jiebo Luo; Xian Xu; | arxiv-cs.CV | 2024-04-15 |
1364 | Demonstration of DB-GPT: Next Generation Data Interaction System Empowered By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present DB-GPT, a revolutionary and product-ready Python library that integrates LLMs into traditional data interaction tasks to enhance user experience and accessibility. |
SIQIAO XUE et. al. | arxiv-cs.AI | 2024-04-15 |
1365 | Leveraging GPT-like LLMs to Automate Issue Labeling Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Issue labeling is a crucial task for the effective management of software projects. To date, several approaches have been put forth for the automatic assignment of labels to issue … |
Giuseppe Colavito; F. Lanubile; Nicole Novielli; L. Quaranta; | 2024 IEEE/ACM 21st International Conference on Mining … | 2024-04-15 |
1366 | Few-shot Name Entity Recognition on StackOverflow IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: StackOverflow, with its vast question repository and limited labeled examples, raises an annotation challenge for us. We address this gap by proposing RoBERTa+MAML, a few-shot named entity recognition (NER) method leveraging meta-learning. |
Xinwei Chen; Kun Li; Tianyou Song; Jiangjian Guo; | arxiv-cs.CL | 2024-04-14 |
1367 | Improving Domain Generalization in Speech Emotion Recognition with Whisper Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformers have been used successfully in a variety of settings, including Speech Emotion Recognition (SER). However, use of the latest transformer base models in domain … |
Erik Goron; Lena Asai; Elias Rut; Martin Dinov; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1368 | Planning to Guide LLM for Code Coverage Prediction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Code coverage serves as a crucial metric to assess testing effectiveness, measuring the degree to which a test suite exercises different facets of the code, such as statements, … |
Hridya Dhulipala; Aashish Yadavally; Tien N. Nguyen; | 2024 IEEE/ACM First International Conference on AI … | 2024-04-14 |
1369 | GPT-4 Driven Cinematic Music Generation Through Text Processing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents Herrmann-11, a multimodal framework to generate background music tailored to movie scenes, by integrating state-of-the-art vision, language, music, and speech … |
Muhammad Taimoor Haseeb; Ahmad Hammoudeh; Gus G. Xia; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1370 | A Scalable Sparse Transformer Model for Singing Melody Extraction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Extracting the melody of a singing voice is an essential task within the realm of music information retrieval (MIR). Recently, transformer based models have drawn great attention … |
Shuai Yu; Jun Liu; Yi Yu; Wei Li; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1371 | Hybrid Convolution-Transformer for Lightweight Single Image Super-Resolution Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The rapid development of deep learning has driven the breakthrough in performance of single image super-resolution (SISR). However, many existing works deepen the network to … |
Jiuqiang Li; Yutong Ke; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1372 | TD-GPT: Target Protein-Specific Drug Molecule Generation GPT Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Drug discovery faces challenges due to the vast chemical space and complex drug-target interactions. This paper proposes a novel deep learning framework TD-GPT for targeted drug … |
ZHENGDA HE et. al. | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1373 | The Impact of Knowledge Distillation on The Energy Consumption and Runtime Efficiency of NLP Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Context. While models like BERT and GPT are powerful, they require substantial resources. Knowledge distillation can be employed as a technique to enhance their efficiency. Yet, … |
YE YUAN et. al. | 2024 IEEE/ACM 3rd International Conference on AI … | 2024-04-14 |
1374 | Fine Tuning Large Language Model for Secure Code Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: AI pair programmers, such as GitHub’s Copilot, have shown great success in automatic code generation. However, such large language model-based code generation techniques face the … |
Junjie Li; Aseem Sangalay; Cheng Cheng; Yuan Tian; Jinqiu Yang; | 2024 IEEE/ACM First International Conference on AI … | 2024-04-14 |
1375 | Assessing The Impact of GPT-4 Turbo in Generating Defeaters for Assurance Cases Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Assurance cases (ACs) are structured arguments that allow verifying the correct implementation of the created systems’ non-functional requirements (e.g., safety, security). This … |
KIMYA KHAKZAD SHAHANDASHTI et. al. | 2024 IEEE/ACM First International Conference on AI … | 2024-04-14 |
1376 | OpenTE: Open-Structure Table Extraction From Text Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents an Open-Structure Table Extraction (OpenTE) task, which aims to extract a table with intrinsic semantic, calculational, and hierarchical structure from … |
Haoyu Dong; Mengkang Hu; Qinyu Xu; Haocheng Wang; Yue Hu; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1377 | Inducing Inductive Bias in Vision Transformer for EEG Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Human brain signals are highly complex and dynamic in nature. Electroencephalogram (EEG) devices capture some of this complexity, both in space and in time, with a certain … |
Rabindra Khadka; Pedro G. Lind; G. Mello; M. Riegler; Anis Yazidi; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1378 | A Hybrid CNN-Transformer for Focal Liver Lesion Classification Abstract: The early diagnosis of focal liver lesions (FLLs) plays a key role in the successful treatment of liver cancer. To effectively diagnose focal liver lesions, we used … |
LING ZHAO et. al. | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1379 | LLET: Lightweight Lexicon-Enhanced Transformer for Chinese NER Abstract: The Flat-LAttice Transformer (FLAT) has achieved notable success in Chinese named entity recognition (NER) by integrating lexical information into the widely-used Transformer … |
Zongcheng Ji; Yinlong Xiao; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1380 | A Lightweight Transformer-based Neural Network for Large-scale Masonry Arch Bridge Point Cloud Segmentation IF:3 Abstract: Transformer architecture based on the attention mechanism achieves impressive results in natural language processing (NLP) tasks. This paper transfers the successful experience to … |
Yixiong Jing; Brian Sheil; S. Acikgoz; | Comput. Aided Civ. Infrastructure Eng. | 2024-04-14 |
1381 | Is Next Token Prediction Sufficient for GPT? Exploration on Code Logic Comprehension Highlight: In order to prove it, we introduce a new task, Logically Equivalent Code Selection, which necessitates the selection of logically equivalent code from a candidate set, given a query code. |
MENGNAN QI et. al. | arxiv-cs.PL | 2024-04-12 |
1382 | CreativEval: Evaluating Creativity of LLM-Based Hardware Code Generation Highlight: To address this research gap, we present CreativeEval, a framework for evaluating the creativity of LLMs within the context of generating hardware designs. |
Matthew DeLorenzo; Vasudev Gohil; Jeyavijayan Rajendran; | arxiv-cs.CL | 2024-04-12 |
1383 | Constrained C-Test Generation Via Mixed-Integer Programming Highlight: This work proposes a novel method to generate C-Tests; a deviated form of cloze tests (a gap filling exercise) where only the last part of a word is turned into a gap. |
Ji-Ung Lee; Marc E. Pfetsch; Iryna Gurevych; | arxiv-cs.CL | 2024-04-12 |
1384 | Can Deep Learning Large Language Models Be Used to Unravel Knowledge Graph Creation? Abstract: This research focuses on advancing RE methodologies by employing and comparing various NLP models for analyzing medical relationships, particularly concerning Gastroesophageal … |
Sydney Anuyah; Sunandan Chakraborty; | Proceedings of the International Conference on Computing, … | 2024-04-12 |
1385 | Small Models Are (Still) Effective Cross-Domain Argument Extractors Highlight: However, detailed explorations of these techniques’ ability to actually enable this transfer are lacking. In this work, we provide such a study, exploring zero-shot transfer using both techniques on six major EAE datasets at both the sentence and document levels. |
William Gantt; Aaron Steven White; | arxiv-cs.CL | 2024-04-12 |
1386 | Inheritune: Training Smaller Yet More Attentive Language Models Highlight: Layers in this state are unable to learn anything meaningful and mostly redundant; we refer to these as lazy layers. The goal of this paper is to train smaller models by eliminating this structural inefficiency without compromising performance. |
Sunny Sanyal; Ravid Shwartz-Ziv; Alexandros G. Dimakis; Sujay Sanghavi; | arxiv-cs.CL | 2024-04-12 |
1387 | Measuring Geographic Diversity of Foundation Models with A Natural Language-based Geo-guessing Experiment on GPT-4 Abstract: Generative AI based on foundation models provides a first glimpse into the world represented by machines trained on vast amounts of multimodal data ingested by these … |
Zilong Liu; Krzysztof Janowicz; Kitty Currier; Meilin Shi; | ArXiv | 2024-04-11 |
1388 | Map Reading and Analysis with GPT-4V(ision) Abstract: In late 2023, the image-reading capability added to a Generative Pre-trained Transformer (GPT) framework provided the opportunity to potentially revolutionize the way we view and … |
Jinwen Xu; Ran Tao; | ISPRS Int. J. Geo Inf. | 2024-04-11 |
1389 | LLM Agents Can Autonomously Exploit One-day Vulnerabilities IF:3 Highlight: In this work, we show that LLM agents can autonomously exploit one-day vulnerabilities in real-world systems. |
Richard Fang; Rohan Bindu; Akul Gupta; Daniel Kang; | arxiv-cs.CR | 2024-04-11 |
1390 | Measuring Geographic Diversity of Foundation Models with A Natural Language–based Geo-guessing Experiment on GPT-4 Highlight: If we consider the resulting models as knowledge bases in their own right, this may open up new avenues for understanding places through the lens of machines. In this work, we adopt this thinking and select GPT-4, a state-of-the-art representative in the family of multimodal large language models, to study its geographic diversity regarding how well geographic features are represented. |
Zilong Liu; Krzysztof Janowicz; Kitty Currier; Meilin Shi; | arxiv-cs.CY | 2024-04-11 |
1391 | Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models Highlight: In this paper, we explore how large language models (LLMs) can be used to generate and evaluate multiple-choice reading comprehension items. |
Andreas Säuberli; Simon Clematide; | arxiv-cs.CL | 2024-04-11 |
1392 | Reflectance Estimation for Proximity Sensing By Vision-Language Models: Utilizing Distributional Semantics for Low-Level Cognition in Robotics Highlight: In this paper, we verify that 1) LLMs such as GPT-3.5 and GPT-4 can estimate an object’s reflectance using only text as input; and 2) VLMs such as CLIP can increase their generalization capabilities in reflectance estimation from images. |
Masashi Osada; Gustavo A. Garcia Ricardez; Yosuke Suzuki; Tadahiro Taniguchi; | arxiv-cs.RO | 2024-04-11 |
1393 | From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples IF:3 Highlight: We analyze how well pre-trained large language models (e.g., Llama2, GPT-4, Claude 3, etc.) can do linear and non-linear regression when given in-context examples, without any additional training or gradient updates. |
Robert Vacareanu; Vlad-Andrei Negru; Vasile Suciu; Mihai Surdeanu; | arxiv-cs.CL | 2024-04-11 |
1394 | Simpler Becomes Harder: Do LLMs Exhibit A Coherent Behavior on Simplified Corpora? Highlight: Text simplification seeks to improve readability while retaining the original content and meaning. Our study investigates whether pre-trained classifiers also maintain such coherence by comparing their predictions on both original and simplified inputs. |
Miriam Anschütz; Edoardo Mosca; Georg Groh; | arxiv-cs.CL | 2024-04-10 |
1395 | Automated Mapping of Common Vulnerabilities and Exposures to MITRE ATT&CK Tactics Abstract: Effectively understanding and categorizing vulnerabilities is vital in the ever-evolving cybersecurity landscape, since only one exposure can have a devastating effect on the … |
Ioana Branescu; Octavian Grigorescu; Mihai Dascălu; | Inf. | 2024-04-10 |
1396 | InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD IF:3 Highlight: Specifically, this research advances the patch division paradigm by introducing a novel extension: dynamic resolution with automatic patch configuration. |
XIAOYI DONG et. al. | arxiv-cs.CV | 2024-04-09 |
1397 | Characterizing Multimodal Long-form Summarization: A Case Study on Financial Reports Highlight: We propose a computational framework for characterizing multimodal long-form summarization and investigate the behavior of Claude 2.0/2.1, GPT-4/3.5, and Cohere. |
Tianyu Cao; Natraj Raman; Danial Dervovic; Chenhao Tan; | arxiv-cs.CL | 2024-04-09 |
1398 | Generative Pre-Trained Transformer for Symbolic Regression Base In-Context Reinforcement Learning Highlight: In this paper, we propose FormulaGPT, which trains a GPT using massive sparse reward learning histories of reinforcement learning-based SR algorithms as training data. |
YANJIE LI et. al. | arxiv-cs.LG | 2024-04-09 |
1399 | PetKaz at SemEval-2024 Task 8: Can Linguistics Capture The Specifics of LLM-generated Text? Highlight: In this paper, we present our submission to the SemEval-2024 Task 8 Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, focusing on the detection of machine-generated texts (MGTs) in English. |
Kseniia Petukhova; Roman Kazakov; Ekaterina Kochmar; | arxiv-cs.CL | 2024-04-08 |
1400 | OPSD: An Offensive Persian Social Media Dataset and Its Baseline Evaluations Highlight: While numerous datasets in the English language exist in this domain, few equivalent resources are available for the Persian language. To address this gap, this paper introduces two offensive datasets. |
MEHRAN SAFAYANI et. al. | arxiv-cs.CL | 2024-04-08 |
1401 | VulnHunt-GPT: A Smart Contract Vulnerabilities Detector Based on OpenAI ChatGPT Abstract: Smart contracts are self-executing programs that can run on a blockchain. Because they are immutable after their deployment on the blockchain, it is crucial to ensure their … |
Biagio Boi; Christian Esposito; Sokjoon Lee; | Proceedings of the 39th ACM/SIGAPP Symposium on Applied … | 2024-04-08 |
1402 | Use of A Structured Knowledge Base Enhances Metadata Curation By Large Language Models Highlight: This paper investigates the potential of large language models (LLMs), specifically GPT-4, to improve adherence to metadata standards. |
SOWMYA S. SUNDARAM et. al. | arxiv-cs.AI | 2024-04-08 |
1403 | Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding Highlight: While these models demonstrate remarkable performance on general datasets, they can struggle in specialized domains such as medicine, where unique domain-specific terminologies, abbreviations, and varying document structures are common. This paper explores strategies for adapting these models to domain-specific requirements, primarily through continuous pre-training on domain-specific data. |
AHMAD IDRISSI-YAGHIR et. al. | arxiv-cs.CL | 2024-04-08 |
1404 | Relation Extraction Using Large Language Models: A Case Study on Acupuncture Point Locations Highlight: This study aims to compare the performance of GPT with traditional deep learning models (Long Short-Term Memory (LSTM) and Bidirectional Encoder Representations from Transformers for Biomedical Text Mining (BioBERT)) in extracting acupoint-related location relations and to assess the impact of pretraining and fine-tuning on GPT’s performance. |
YIMING LI et. al. | arxiv-cs.CL | 2024-04-08 |
1405 | Clinical Trials Protocol Authoring Using LLMs Highlight: This report embarks on a mission to revolutionize clinical trial protocol development through the integration of advanced AI technologies. |
Morteza Maleki; SeyedAli Ghahari; | arxiv-cs.CE | 2024-04-07 |
1406 | PagPassGPT: Pattern Guided Password Guessing Via Generative Pretrained Transformer Highlight: Amidst the surge in deep learning-based password guessing models, the challenges of generating high-quality passwords and reducing duplicate passwords persist. To address these challenges, we present PagPassGPT, a password guessing model built on the Generative Pretrained Transformer (GPT). |
XINGYU SU et. al. | arxiv-cs.CR | 2024-04-07 |
1407 | Initial Exploration of Zero-Shot Privacy Utility Tradeoffs in Tabular Data Using GPT-4 Abstract: We investigate the application of large language models (LLMs), specifically GPT-4, to scenarios involving the tradeoff between privacy and utility in tabular data. Our approach … |
Bishwas Mandal; G. Amariucai; Shuangqing Wei; | 2024 International Joint Conference on Neural Networks … | 2024-04-07 |
1408 | RecGPT: Generative Personalized Prompts for Sequential Recommendation Via ChatGPT Training Paradigm Abstract: ChatGPT has achieved remarkable success in natural language understanding. Considering that recommendation is indeed a conversation between users and the system with items as … |
YABIN ZHANG et. al. | ArXiv | 2024-04-06 |
1409 | Joint Visual and Text Prompting for Improved Object-Centric Perception with Multimodal Large Language Models Highlight: In this paper, we present a novel approach, Joint Visual and Text Prompting (VTPrompt), that employs fine-grained visual information to enhance the capability of MLLMs in VQA, especially for object-oriented perception. |
SONGTAO JIANG et. al. | arxiv-cs.CL | 2024-04-06 |
1410 | Scope Ambiguities in Large Language Models Highlight: Despite this, there has been little research into how modern large language models treat them. In this paper, we investigate how different versions of certain autoregressive language models (GPT-2, GPT-3/3.5, Llama 2, and GPT-4) treat scope-ambiguous sentences, and compare this with human judgments. |
Gaurav Kamath; Sebastian Schuster; Sowmya Vajjala; Siva Reddy; | arxiv-cs.CL | 2024-04-05 |
1411 | Outlier-Efficient Hopfield Layers for Large Transformer-Based Models IF:3 Highlight: We introduce an Outlier-Efficient Modern Hopfield Model (termed OutEffHop) and use it to address the outlier inefficiency problem of training gigantic transformer-based models. |
JERRY YAO-CHIEH HU et. al. | arxiv-cs.LG | 2024-04-04 |
1412 | Hierarchical Patch Aggregation Transformer for Motion Deblurring |
Yujie Wu; Lei Liang; Siyao Ling; Zhisheng Gao; | Neural Process. Lett. | 2024-04-04 |
1413 | Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks? IF:3 Abstract: Various jailbreak attacks have been proposed to red-team Large Language Models (LLMs) and revealed the vulnerable safeguards of LLMs. Besides, some methods are not limited to the … |
SHUO CHEN et. al. | arxiv-cs.LG | 2024-04-04 |
1414 | NLP at UC Santa Cruz at SemEval-2024 Task 5: Legal Answer Validation Using Few-Shot Multi-Choice QA Highlight: We present two approaches to solving the task of legal answer validation, given an introduction to the case, a question and an answer candidate. |
Anish Pahilajani; Samyak Rajesh Jain; Devasha Trivedi; | arxiv-cs.CL | 2024-04-03 |
1415 | UTeBC-NLP at SemEval-2024 Task 9: Can LLMs Be Lateral Thinkers? Highlight: Through participating in SemEval-2024 Task 9, Sentence Puzzle sub-task, we explore prompt engineering methods: chain-of-thought (CoT) and direct prompting, enhanced with informative descriptions, and employing contextualizing prompts using a retrieval-augmented generation (RAG) pipeline. |
Pouya Sadeghi; Amirhossein Abaskohi; Yadollah Yaghoobzadeh; | arxiv-cs.CL | 2024-04-03 |
1416 | Task Agnostic Architecture for Algorithm Induction Via Implicit Composition Highlight: Taking this trend of generalization to the extreme suggests the possibility of a single deep network architecture capable of solving all tasks. This position paper aims to explore developing such a unified architecture and proposes a theoretical framework for how it could be constructed. |
Sahil J. Sindhi; Ignas Budvytis; | arxiv-cs.LG | 2024-04-03 |
1417 | BCAmirs at SemEval-2024 Task 4: Beyond Words: A Multimodal and Multilingual Exploration of Persuasion in Memes Highlight: Motivated by this, our team engaged in SemEval-2024 Task 4, a hierarchical multi-label classification task designed to identify rhetorical and psychological persuasion techniques embedded within memes. To tackle this problem, we introduced a caption generation step to assess the modality gap and the impact of additional semantic information from images, which improved our results. |
Amirhossein Abaskohi; Amirhossein Dabiriaghdam; Lele Wang; Giuseppe Carenini; | arxiv-cs.CL | 2024-04-03 |
1418 | GPT-DETOX: An In-Context Learning-Based Paraphraser for Text Detoxification Highlight: In this paper, we propose GPT-DETOX as a framework for prompt-based in-context learning for text detoxification using GPT-3.5 Turbo. |
Ali Pesaranghader; Nikhil Verma; Manasa Bharadwaj; | arxiv-cs.CL | 2024-04-03 |
1419 | FGeo-TP: A Language Model-Enhanced Solver for Euclidean Geometry Problems Abstract: The application of contemporary artificial intelligence techniques to address geometric problems and automated deductive proofs has always been a grand challenge to the … |
Yiming He; Jia Zou; Xiaokai Zhang; Na Zhu; Tuo Leng; | Symmetry | 2024-04-03 |
1420 | METAL: Towards Multilingual Meta-Evaluation Highlight: In this study, we propose a framework for an end-to-end assessment of LLMs as evaluators in multilingual scenarios. |
Rishav Hada; Varun Gumma; Mohamed Ahmed; Kalika Bali; Sunayana Sitaram; | arxiv-cs.CL | 2024-04-02 |
1421 | Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks IF:3 Highlight: In this way, we achieve a 100% attack success rate (according to GPT-4 as a judge) on Vicuna-13B, Mistral-7B, Phi-3-Mini, Nemotron-4-340B, Llama-2-Chat-7B/13B/70B, Llama-3-Instruct-8B, Gemma-7B, GPT-3.5, GPT-4o, and R2D2 from HarmBench, which was adversarially trained against the GCG attack. |
Maksym Andriushchenko; Francesco Croce; Nicolas Flammarion; | arxiv-cs.CR | 2024-04-02 |
1422 | SGSH: Stimulate Large Language Models with Skeleton Heuristics for Knowledge Base Question Generation Highlight: In this work, we propose SGSH, a simple and effective framework to Stimulate GPT-3.5 with Skeleton Heuristics to enhance KBQG. |
SHASHA GUO et. al. | arxiv-cs.CL | 2024-04-02 |
1423 | Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT Highlight: We present the first comprehensive benchmarking study of LLMs across diverse Persian language tasks. |
AMIRHOSSEIN ABASKOHI et. al. | arxiv-cs.CL | 2024-04-02 |
1424 | Release of Pre-Trained Models for The Japanese Language Highlight: AI democratization aims to create a world in which the average person can utilize AI techniques. |
KEI SAWADA et. al. | arxiv-cs.CL | 2024-04-02 |
1425 | SwinSOD: Salient Object Detection Using Swin-transformer |
Shuang Wu; Guangjian Zhang; Xuefeng Liu; | Image Vis. Comput. | 2024-04-01 |
1426 | GPT-COPE: A Graph-Guided Point Transformer for Category-Level Object Pose Estimation Abstract: Category-level object pose estimation aims to predict the 6D pose and 3D metric size of objects from given categories. Due to significant intra-class shape variations among … |
Lu Zou; Zhangjin Huang; Naijie Gu; Guoping Wang; | IEEE Transactions on Circuits and Systems for Video … | 2024-04-01 |
1427 | Condition-Aware Neural Network for Controlled Image Generation Highlight: We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. |
HAN CAI et. al. | arxiv-cs.CV | 2024-04-01 |
1428 | Syntactic Robustness for LLM-based Code Generation Highlight: In this paper, we focus on prompts that ask for code that generates solutions to variables in an equation, when given coefficients of the equation as input. |
Laboni Sarker; Mara Downing; Achintya Desai; Tevfik Bultan; | arxiv-cs.SE | 2024-04-01 |
1429 | Vision Transformer Models for Mobile/edge Devices: A Survey |
SEUNG IL LEE et. al. | Multim. Syst. | 2024-04-01 |
1430 | RDTN: Residual Densely Transformer Network for Hyperspectral Image Classification |
Yan Li; Xiaofei Yang; Dong Tang; Zheng-yang Zhou; | Expert Syst. Appl. | 2024-04-01 |
1431 | ScopeViT: Scale-Aware Vision Transformer |
XUESONG NIE et. al. | Pattern Recognit. | 2024-04-01 |
1432 | Large Language Model Evaluation Via Multi AI Agents: Preliminary Results Highlight: Despite extensive efforts to examine LLMs from various perspectives, there is a noticeable lack of multi-agent AI models specifically designed to evaluate the performance of different LLMs. To address this gap, we introduce a novel multi-agent AI model that aims to assess and compare the performance of various LLMs. |
Zeeshan Rasheed; Muhammad Waseem; Kari Systä; Pekka Abrahamsson; | arxiv-cs.SE | 2024-04-01 |
1433 | Time Domain Speech Enhancement with CNN and Time-attention Transformer |
N. Saleem; T. S. Gunawan; Sami Dhahbi; Sami Bourouis; | Digit. Signal Process. | 2024-04-01 |
1434 | BERT-Enhanced Retrieval Tool for Homework Plagiarism Detection System Highlight: In existing research, detection of high-level plagiarism is still a challenge due to the lack of high-quality datasets. In this paper, we propose a plagiarized text data generation method based on GPT-3.5, which produces 32,927 pairs of text plagiarism detection datasets covering a wide range of plagiarism methods, bridging the gap in this part of the research. |
Jiarong Xian; Jibao Yuan; Peiwei Zheng; Dexian Chen; Nie yuntao; | arxiv-cs.CL | 2024-04-01 |
1435 | LLM-RadJudge: Achieving Radiologist-Level Evaluation for X-Ray Report Generation Highlight: This study proposes a novel evaluation framework using large language models (LLMs) to compare radiology reports for assessment. |
Zilong Wang; Xufang Luo; Xinyang Jiang; Dongsheng Li; Lili Qiu; | arxiv-cs.CL | 2024-04-01 |
1436 | Leveraging YOLO-World and GPT-4V LMMs for Zero-Shot Person Detection and Action Recognition in Drone Imagery Highlight: In this article, we explore the potential of zero-shot Large Multimodal Models (LMMs) in the domain of drone perception. |
Christian Limberg; Artur Gonçalves; Bastien Rigault; Helmut Prendinger; | arxiv-cs.CV | 2024-04-01 |
1437 | Advancing AI with Integrity: Ethical Challenges and Solutions in Neural Machine Translation Highlight: We identify and address ethical issues through empirical studies. |
Richard Kimera; Yun-Seon Kim; Heeyoul Choi; | arxiv-cs.CL | 2024-04-01 |
1438 | An Innovative GPT-based Open-source Intelligence Using Historical Cyber Incident Reports |
F. Sufi; | Nat. Lang. Process. J. | 2024-04-01 |
1439 | TWIN-GPT: Digital Twins for Clinical Trials Via Large Language Model IF:3 Highlight: In this paper, we propose a large language model-based digital twin creation approach, called TWIN-GPT. |
YUE WANG et. al. | arxiv-cs.LG | 2024-04-01 |
1440 | EvoCodeBench: An Evolving Code Generation Benchmark Aligned with Real-World Code Repositories IF:3 Highlight: This paper proposes a new benchmark, EvoCodeBench, to address the preceding problems, which has three primary advances. |
Jia Li; Ge Li; Xuanming Zhang; Yihong Dong; Zhi Jin; | arxiv-cs.CL | 2024-03-31 |
1441 | Edinburgh Clinical NLP at SemEval-2024 Task 2: Fine-tune Your Model Unless You Have Access to GPT-4 Highlight: We propose a PEFT method to improve the consistency of LLMs by merging adapters that were fine-tuned separately using triplet and language modelling objectives. |
Aryo Pradipta Gema; Giwon Hong; Pasquale Minervini; Luke Daines; Beatrice Alex; | arxiv-cs.CL | 2024-03-30 |
1442 | Cross-lingual Named Entity Corpus for Slavic Languages Highlight: This paper presents a corpus manually annotated with named entities for six Slavic languages: Bulgarian, Czech, Polish, Slovenian, Russian, and Ukrainian. |
Jakub Piskorski; Michał Marcińczuk; Roman Yangarber; | arxiv-cs.CL | 2024-03-30 |
1443 | Jetsons at FinNLP 2024: Towards Understanding The ESG Impact of A News Article Using Transformer-based Models Highlight: In this paper, we describe the different approaches explored by the Jetsons team for the Multi-Lingual ESG Impact Duration Inference (ML-ESG-3) shared task. |
PARAG PRAVIN DAKLE et. al. | arxiv-cs.CL | 2024-03-30 |
1444 | Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks IF:3 Highlight: Despite efforts to create open-source models, their limited parameters often result in insufficient multi-step reasoning capabilities required for solving complex medical problems. To address this, we introduce Meerkat, a new family of medical AI systems ranging from 7 to 70 billion parameters. |
HYUNJAE KIM et. al. | arxiv-cs.CL | 2024-03-30 |
1445 | A Hybrid Transformer and Attention Based Recurrent Neural Network for Robust and Interpretable Sentiment Analysis of Tweets Highlight: We propose TRABSA, a hybrid framework integrating transformer-based architectures, attention mechanisms, and BiLSTM networks to address this. |
Md Abrar Jahin; Md Sakib Hossain Shovon; M. F. Mridha; Md Rashedul Islam; Yutaka Watanobe; | arxiv-cs.CL | 2024-03-30 |
1446 | A Comprehensive Study on NLP Data Augmentation for Hate Speech Detection: Legacy Methods, BERT, and LLMs Highlight: In pursuit of suitable data augmentation methods, this study explores both established legacy approaches and contemporary practices such as Large Language Models (LLMs), including GPT, in hate speech detection. |
Md Saroar Jahan; Mourad Oussalah; Djamila Romaissa Beddia; Jhuma kabir Mim; Nabil Arhab; | arxiv-cs.CL | 2024-03-30 |
1447 | Transformer Based Pluralistic Image Completion with Reduced Information Loss Highlight: The indices of quantized pixels are used as tokens for the inputs and prediction targets of the transformer. To mitigate these issues, we propose a new transformer based framework called PUT. |
QIANKUN LIU et. al. | arxiv-cs.CV | 2024-03-30 |
1448 | Shallow Cross-Encoders for Low-Latency Retrieval Highlight: However, keeping search latencies low is important for user satisfaction and energy usage. In this paper, we show that weaker shallow transformer models (i.e., transformers with a limited number of layers) actually perform better than full-scale models when constrained to these practical low-latency settings since they can estimate the relevance of more documents in the same time budget. |
Aleksandr V. Petrov; Sean MacAvaney; Craig Macdonald; | arxiv-cs.IR | 2024-03-29 |
1449 | ReALM: Reference Resolution As Language Modeling Highlight: This paper demonstrates how LLMs can be used to create an extremely effective system to resolve references of various types, by showing how reference resolution can be converted into a language modeling problem, despite involving forms of entities like those on screen that are not traditionally conducive to being reduced to a text-only modality. |
JOEL RUBEN ANTONY MONIZ et. al. | arxiv-cs.CL | 2024-03-29 |
1450 | ChatGPT V.s. Media Bias: A Comparative Study of GPT-3.5 and Fine-tuned Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Can ChatGPT detect media bias? This study seeks to answer this question by leveraging the Media Bias Identification Benchmark (MBIB) to assess ChatGPT’s competency in distinguishing six categories of media bias, juxtaposed against fine-tuned models such as BART, ConvBERT, and GPT-2. |
Zehao Wen; Rabih Younes; | arxiv-cs.CL | 2024-03-29 |
1451 | Classifying Conspiratorial Narratives At Scale: False Alarms and Erroneous Connections Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first large-scale classification study using posts from the most active conspiracy-related Reddit forums and find that only one-third of the posts are classified as positive. |
Ahmad Diab; Rr. Nefriana; Yu-Ru Lin; | arxiv-cs.CL | 2024-03-29 |
1452 | Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we delve into several mechanisms employed by Transformer-based language models (LMs) for factual recall tasks. |
ANG LV et. al. | arxiv-cs.CL | 2024-03-28 |
1453 | A Review of Multi-Modal Large Language and Vision Models IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) have recently emerged as a focal point of research and application, driven by their unprecedented ability to understand and generate text with … |
Kilian Carolan; Laura Fennelly; A. Smeaton; | ArXiv | 2024-03-28 |
1454 | Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that off-the-shelf text-based Transformers, with no additional training, can perform few-shot in-context visual imitation learning, mapping visual observations to action sequences that emulate the demonstrator’s behaviour. We achieve this by transforming visual observations (inputs) and trajectories of actions (outputs) into sequences of tokens that a text-pretrained Transformer (GPT-4 Turbo) can ingest and generate, via a framework we call Keypoint Action Tokens (KAT). |
Norman Di Palo; Edward Johns; | arxiv-cs.RO | 2024-03-28 |
1455 | TOMGPT: Reliable Text-Only Training Approach for Cost-Effective Multi-modal Large Language Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Multi-modal large language models (MLLMs), such as GPT-4, exhibit great comprehension capabilities on human instruction, as well as zero-shot ability on new downstream multi-modal … |
YUNKAI CHEN et. al. | ACM Transactions on Knowledge Discovery from Data | 2024-03-28 |
1456 | Decision Mamba: Reinforcement Learning Via Sequence Modeling with Selective State Spaces IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work contributes to the advancement of sequential decision-making models, suggesting that the architecture and training methodology of neural networks can significantly impact their performance in complex tasks, and highlighting the potential of Mamba as a valuable tool for improving the efficacy of Transformer-based models in reinforcement learning scenarios. |
Toshihiro Ota; | arxiv-cs.LG | 2024-03-28 |
1457 | Reshaping Free-Text Radiology Notes Into Structured Reports With Generative Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a pipeline to extract information from free-text radiology reports, that fits with the items of the reference SR registry proposed by a national society of interventional and medical radiology, focusing on CT staging of patients with lymphoma. |
LAURA BERGOMI et. al. | arxiv-cs.CL | 2024-03-27 |
1458 | Evaluating Large Language Models for Health-Related Text Classification Tasks with Public Social Media Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We developed three approaches for leveraging LLMs for text classification: employing LLMs as zero-shot classifiers, using LLMs as annotators to annotate training data for supervised classifiers, and utilizing LLMs with few-shot examples for augmentation of manually annotated data. |
Yuting Guo; Anthony Ovadje; Mohammed Ali Al-Garadi; Abeed Sarker; | arxiv-cs.CL | 2024-03-27 |
1459 | 3P-LLM: Probabilistic Path Planning Using Large Language Model for Autonomous Robot Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research assesses the feasibility of using LLM (GPT-3.5-turbo chatbot by OpenAI) for robotic path planning. |
Ehsan Latif; | arxiv-cs.RO | 2024-03-27 |
1460 | AcTED: Automatic Acquisition of Typical Event Duration for Semi-supervised Temporal Commonsense QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a voting-driven semi-supervised approach to automatically acquire the typical duration of an event and use it as pseudo-labeled data. |
FELIX VIRGO et. al. | arxiv-cs.CL | 2024-03-27 |
1461 | PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a multimodal interactive robot (PhysicsAssistant) built on YOLOv8 object detection, cameras, speech recognition, and an LLM-based chatbot to assist students in physics labs. |
Ehsan Latif; Ramviyas Parasuraman; Xiaoming Zhai; | arxiv-cs.RO | 2024-03-27 |
1462 | The Topos of Transformer Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a theoretical analysis of the expressivity of the transformer architecture through the lens of topos theory. |
Mattia Jacopo Villani; Peter McBurney; | arxiv-cs.LG | 2024-03-27 |
1463 | SemRoDe: Macro Adversarial Training to Learn Representations That Are Robust to Word-Level Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial Training strategy to enhance the robustness of LMs. |
Brian Formento; Wenjie Feng; Chuan Sheng Foo; Luu Anh Tuan; See-Kiong Ng; | arxiv-cs.CL | 2024-03-27 |
1464 | RankMamba: Benchmarking Mamba’s Document Ranking Performance in The Era of Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we examine Mamba’s efficacy through the lens of a classical IR task: document ranking. |
Zhichao Xu; | arxiv-cs.IR | 2024-03-27 |
1465 | A Survey on Large Language Models from Concept to Implementation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent advancements in Large Language Models (LLMs), particularly those built on Transformer architectures, have significantly broadened the scope of natural language processing (NLP) applications, transcending their initial use in chatbot technology. This paper investigates the multifaceted applications of these models, with an emphasis on the GPT series. |
Chen Wang; Jin Zhao; Jiaqi Gong; | arxiv-cs.CL | 2024-03-27 |
1466 | From Traditional Recommender Systems to GPT-Based Chatbots: A Survey of Recent Developments and Future Directions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recommender systems are a key technology for many applications, such as e-commerce, streaming media, and social media. Traditional recommender systems rely on collaborative … |
TAMIM M. AL-HASAN et. al. | Big Data Cogn. Comput. | 2024-03-27 |
1467 | Evaluating The Efficacy of Prompt-Engineered Large Multimodal Models Versus Fine-Tuned Vision Transformers in Image-Based Security Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The success of Large Language Models (LLMs) has led to a parallel rise in the development of Large Multimodal Models (LMMs), which have begun to transform a variety of applications. These sophisticated multimodal models are designed to interpret and analyze complex data by integrating multiple modalities such as text and images, thereby opening new avenues for a range of applications. |
Fouad Trad; Ali Chehab; | arxiv-cs.AI | 2024-03-26 |
1468 | Enhancing Legal Document Retrieval: A Multi-Phase Approach with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research focuses on maximizing the potential of prompting by placing it as the final phase of the retrieval system, preceded by the support of two phases: BM25 Pre-ranking and BERT-based Re-ranking. |
HAI-LONG NGUYEN et. al. | arxiv-cs.CL | 2024-03-26 |
1469 | Verbing Weirds Language (Models): Evaluation of English Zero-Derivation in Five LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We design a task for testing lexical-syntactic flexibility — the degree to which models can generalize over words in a construction with a non-prototypical part of speech. |
David R. Mortensen; Valentina Izrailevitch; Yunze Xiao; Hinrich Schütze; Leonie Weissweiler; | arxiv-cs.CL | 2024-03-26 |
1470 | Decoding Probing: Revealing Internal Linguistic Structures in Neural Language Models Using Minimal Pairs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by cognitive neuroscience studies, we introduce a novel ‘decoding probing’ method that uses a minimal pairs benchmark (BLiMP) to probe internal linguistic characteristics in neural language models layer by layer. |
Linyang He; Peili Chen; Ercong Nie; Yuanning Li; Jonathan R. Brennan; | arxiv-cs.CL | 2024-03-25 |
1471 | State Space Models As Foundation Models: A Control Theoretic Overview Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: In recent years, there has been a growing interest in integrating linear state-space models (SSM) in deep neural network architectures of foundation models. This is exemplified by … |
Carmen Amo Alonso; Jerome Sieber; M. Zeilinger; | ArXiv | 2024-03-25 |
1472 | Grammatical Vs Spelling Error Correction: An Investigation Into The Responsiveness of Transformer-based Language Models Using BART and MarianMT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This project aims at analyzing the different kinds of errors that occur in text documents. |
Rohit Raju; Peeta Basa Pati; SA Gandheesh; Gayatri Sanjana Sannala; Suriya KS; | arxiv-cs.CL | 2024-03-25 |
1473 | CYGENT: A Cybersecurity Conversational Agent with Log Summarization Powered By GPT-3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response to the escalating cyber-attacks in the modern IT and IoT landscape, we developed CYGENT, a conversational agent framework powered by the GPT-3.5 Turbo model, designed to aid system administrators in ensuring optimal performance and uninterrupted resource availability. |
Prasasthy Balasubramanian; Justin Seby; Panos Kostakos; | arxiv-cs.CR | 2024-03-25 |
1474 | Towards Algorithmic Fidelity: Mental Health Representation Across Demographics in Synthetic Vs. Human-generated Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Synthetic data generation has the potential to impact applications and domains with scarce data. However, before such data is used for sensitive tasks such as mental health, we … |
Shinka Mori; Oana Ignat; Andrew Lee; Rada Mihalcea; | International Conference on Language Resources and … | 2024-03-25 |
1475 | LLM-Guided Formal Verification Coupled with Mutation Testing IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The increasing complexity of modern hardware designs poses significant challenges for design verification, particularly defining and verifying properties and invariants manually. … |
Muhammad Hassan; Sallar Ahmadi-Pour; Khushboo Qayyum; C. Jha; Rolf Drechsler; | 2024 Design, Automation & Test in Europe Conference & … | 2024-03-25 |
1476 | Automatic Short Answer Grading for Finnish with ChatGPT Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Automatic short answer grading (ASAG) seeks to mitigate the burden on teachers by leveraging computational methods to evaluate student-constructed text responses. Large language … |
Li-Hsin Chang; Filip Ginter; | AAAI Conference on Artificial Intelligence | 2024-03-24 |
1477 | Reflective Microresonator Based Microwave Photonic Sensor Assisted By Sparse Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We demonstrate a sparse transformer assisted microwave photonic sensor using a microring cascaded with an inverse designed reflector. Even with a small dataset, the … |
XIAOYI TIAN et. al. | 2024 Optical Fiber Communications Conference and Exhibition … | 2024-03-24 |
1478 | GPT-Enabled Digital Twin Assistant for Multi-task Cooperative Management in Autonomous Optical Network Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A GPT-enabled digital twin (DT) assistant is implemented with the capabilities of intention understanding, analysis, reasoning, and complex multi-task collaboration, which … |
YAO ZHANG et. al. | 2024 Optical Fiber Communications Conference and Exhibition … | 2024-03-24 |
1479 | Can Language Models Pretend Solvers? Logic Code Simulation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Subsequently, we introduce a pioneering LLM-based code simulation technique, Dual Chains of Logic (DCoL). |
MINYU CHEN et. al. | arxiv-cs.AI | 2024-03-24 |
1480 | Anomaly Detection and Localization in Optical Networks Using Vision Transformer and SOP Monitoring Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We introduce an innovative vision transformer approach to identify and precisely locate high-risk events, including fiber cut precursors, in state-of-polarization derived … |
K. ABDELLI et. al. | 2024 Optical Fiber Communications Conference and Exhibition … | 2024-03-24 |
1481 | A Transformer Approach for Electricity Price Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a novel approach to electricity price forecasting (EPF) using a pure Transformer model. |
Oscar Llorente; Jose Portela; | arxiv-cs.LG | 2024-03-24 |
1482 | Optical Transport Networks Converging Edge Compute and Central Cloud: An Enabler For 6G Services Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Intent-based networking that automates the operation of converged edge compute and central cloud 6G infrastructures through optical transport networks is proposed. GPT AI is used … |
A. Tzanakaki; M. Anastasopoulos; Victoria-Maria Alevizaki; | 2024 Optical Fiber Communications Conference and Exhibition … | 2024-03-24 |
1483 | LlamBERT: Large-scale Low-cost Data Annotation in NLP Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present LlamBERT, a hybrid approach that leverages LLMs to annotate a small subset of large, unlabeled databases and uses the results for fine-tuning transformer encoders like BERT and RoBERTa. |
Bálint Csanády; Lajos Muzsai; Péter Vedres; Zoltán Nádasdy; András Lukács; | arxiv-cs.CL | 2024-03-23 |
1484 | Using Large Language Models for OntoClean-based Ontology Refinement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the integration of Large Language Models (LLMs) such as GPT-3.5 and GPT-4 into the ontology refinement process, specifically focusing on the OntoClean methodology. |
Yihang Zhao; Neil Vetter; Kaveh Aryan; | arxiv-cs.AI | 2024-03-23 |
1485 | VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To support future research, CafeBERT is made publicly available. |
Phong Nguyen-Thuan Do; Son Quoc Tran; Phu Gia Hoang; Kiet Van Nguyen; Ngan Luu-Thuy Nguyen; | arxiv-cs.CL | 2024-03-23 |
1486 | Evaluating GPT-4’s Proficiency in Addressing Cryptography Examinations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the rapidly advancing domain of artificial intelligence, ChatGPT, powered by the GPT-4 model, has emerged as a state-of-the-art interactive agent, exhibiting substantial … |
Vasily Mikhalev; Nils Kopal; B. Esslinger; | IACR Cryptol. ePrint Arch. | 2024-03-23 |
1487 | Can Large Language Models Explore In-context? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making. |
Akshay Krishnamurthy; Keegan Harris; Dylan J. Foster; Cyril Zhang; Aleksandrs Slivkins; | arxiv-cs.LG | 2024-03-22 |
1488 | Technical Report: Masked Skeleton Sequence Modeling for Learning Larval Zebrafish Behavior Latent Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this report, we introduce a novel self-supervised learning method for extracting latent embeddings from behaviors of larval zebrafish. |
Lanxin Xu; Shuo Wang; | arxiv-cs.CV | 2024-03-22 |
1489 | GPT-Connect: Interaction Between Text-Driven Human Motion Generator and 3D Scenes in A Training-free Manner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, training a separate scene-aware motion generator in a supervised way would require a large number of motion samples to be laboriously collected and annotated across many different 3D scenes. To handle this task in a more convenient manner, in this paper, we propose a novel GPT-Connect framework. |
Haoxuan Qu; Ziyan Guo; Jun Liu; | arxiv-cs.CV | 2024-03-22 |
1490 | On Zero-Shot Counterspeech Generation By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Counterspeech generation is one such key task where efforts are made to develop generative models by fine-tuning LLMs with hate speech-counterspeech pairs, but none of these attempts explores the intrinsic properties of large language models in zero-shot settings. In this work, we present a comprehensive analysis of the performances of four LLMs, namely GPT-2, DialoGPT, ChatGPT and FlanT5, in zero-shot settings for counterspeech generation, which is the first of its kind. |
Punyajoy Saha; Aalok Agrawal; Abhik Jana; Chris Biemann; Animesh Mukherjee; | arxiv-cs.CL | 2024-03-22 |
1491 | MasonTigers at SemEval-2024 Task 8: Performance Analysis of Transformer-based Models on Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the MasonTigers entry to the SemEval-2024 Task 8 – Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection. |
SADIYA SAYARA CHOWDHURY PUSPO et. al. | arxiv-cs.CL | 2024-03-22 |
1492 | Comprehensive Evaluation and Insights Into The Use of Large Language Models in The Automation of Behavior-Driven Development Acceptance Test Formulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this manuscript, we propose a novel approach to enhance BDD practices using large language models (LLMs) to automate acceptance test generation. |
SHANTHI KARPURAPU et. al. | arxiv-cs.SE | 2024-03-22 |
1493 | LLM-based Extraction of Contradictions from Patents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper goes one step further, as it presents a method to extract TRIZ contradictions from patent texts based on Prompt Engineering using a generative Large Language Model (LLM), namely OpenAI’s GPT-4. |
Stefan Trapp; Joachim Warschat; | arxiv-cs.CL | 2024-03-21 |
1494 | K-Act2Emo: Korean Commonsense Knowledge Graph for Indirect Emotional Expression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In many literary texts, emotions are indirectly conveyed through descriptions of actions, facial expressions, and appearances, necessitating emotion inference for narrative understanding. In this paper, we introduce K-Act2Emo, a Korean commonsense knowledge graph (CSKG) comprising 1,900 indirect emotional expressions and the emotions inferable from them. |
Kyuhee Kim; Surin Lee; Sangah Lee; | arxiv-cs.CL | 2024-03-21 |
1495 | Ax-to-Grind Urdu: Benchmark Dataset for Urdu Fake News Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we curate and contribute the first and largest publicly available dataset for Urdu FND, Ax-to-Grind Urdu, to bridge the identified gaps and limitations of existing Urdu datasets in the literature. |
Sheetal Harris; Jinshuo Liu; Hassan Jalil Hadi; Yue Cao; | arxiv-cs.CL | 2024-03-20 |
1496 | Extracting Emotion Phrases from Tweets Using BART Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we applied an approach to sentiment analysis based on a question-answering framework. |
Mahdi Rezapour; | arxiv-cs.CL | 2024-03-20 |
1497 | Open Access NAO (OAN): A ROS2-based Software Framework for HRI Applications with The NAO Robot Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new software framework for HRI experimentation with the sixth version of the common NAO robot produced by the United Robotics Group. |
Antonio Bono; Kenji Brameld; Luigi D’Alfonso; Giuseppe Fedele; | arxiv-cs.RO | 2024-03-20 |
1498 | Retina Vision Transformer (RetinaViT): Introducing Scaled Patches Into Vision Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Humans see low and high spatial frequency components at the same time, and combine the information from both to form a visual scene. Drawing on this neuroscientific inspiration, we propose an altered Vision Transformer architecture where patches from scaled down versions of the input image are added to the input of the first Transformer Encoder layer. |
Yuyang Shu; Michael E. Bain; | arxiv-cs.CV | 2024-03-20 |
1499 | Evaluate Chat-GPT’s Programming Capability in Swift Through Real University Exam Questions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this study, we evaluate the programming capabilities of OpenAI’s GPT-3.5 and GPT-4 models using Swift-based exam questions from a third-year university course. The results … |
Zizhuo Zhang; Lian Wen; Yanfei Jiang; Yongli Liu; | Softw. Pract. Exp. | 2024-03-20 |
1500 | Generating Automatic Feedback on UI Mockups with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the potential of using large language models for automatic feedback. |
Peitong Duan; Jeremy Warner; Yang Li; Bjoern Hartmann; | arxiv-cs.HC | 2024-03-19 |