Paper Digest: Recent Papers on Transformer
Paper Digest Team extracted all recent Transformer (NLP) related papers on our radar, and generated highlight sentences for them. The results are sorted by relevance & date. In addition to this ‘static’ page, we also provide a real-time version of this article, which has more coverage and is updated in real time with the most recent papers on this topic.
This list is created by the Paper Digest Team. Experience the cutting-edge capabilities of Paper Digest, an innovative AI-powered research platform that empowers you to write, review, get answers and more. Try us today and unlock the full potential of our services for free!
TABLE 1: Paper Digest: Recent Papers on Transformer
# | Paper | Author(s) | Source | Date
---|---|---|---|---
1 | Using Language Models to Disambiguate Lexical Choices in Translation. Highlight: Providing weaker models with high-quality lexical rules improves accuracy substantially, in some cases reaching or outperforming GPT-4. | Josh Barua; Sanjay Subramanian; Kayo Yin; Alane Suhr; | arxiv-cs.CL | 2024-11-08
2 | Evaluating GPT-4 at Grading Handwritten Solutions in Math Exams. Highlight: In this work, we leverage state-of-the-art multi-modal AI models, in particular GPT-4o, to automatically grade handwritten responses to college-level math exams. | Adriana Caraeni; Alexander Scarlatos; Andrew Lan; | arxiv-cs.CY | 2024-11-07
3 | Adversarial Robustness of In-Context Learning in Transformers for Linear Regression. Highlight: This work investigates the vulnerability of in-context learning in transformers to hijacking attacks, focusing on the setting of linear regression tasks. | Usman Anwar; Johannes Von Oswald; Louis Kirsch; David Krueger; Spencer Frei; | arxiv-cs.LG | 2024-11-07
4 | GPT Semantic Cache: Reducing LLM Costs and Latency Via Semantic Embedding Caching. Highlight: In this paper, we introduce GPT Semantic Cache, a method that leverages semantic caching of query embeddings in in-memory storage (Redis). | Sajal Regmi; Chetan Phakami Pun; | arxiv-cs.LG | 2024-11-07
5 | High Entropy Alloy Property Predictions Using Transformer-based Language Model. Highlight: This study introduces a language transformer-based machine learning model to predict key mechanical properties of high-entropy alloys (HEAs), addressing the challenges due to their complex, multi-principal element compositions and limited experimental data. | Spyros Kamnis; Konstantinos Delibasis; | arxiv-cs.CE | 2024-11-07
6 | Understanding The Effects of Human-written Paraphrases in LLM-generated Text Detection. Highlight: In this study, we devise a new data collection strategy to collect Human & LLM Paraphrase Collection (HLPC), a first-of-its-kind dataset that incorporates human-written texts and paraphrases, as well as LLM-generated texts and paraphrases. | Hiu Ting Lau; Arkaitz Zubiaga; | arxiv-cs.CL | 2024-11-06
7 | A Comparative Study of Recent Large Language Models on Generating Hospital Discharge Summaries for Lung Cancer Patients. Highlight: This research aims to explore how LLMs can alleviate the burden of manual summarization, streamline workflow efficiencies, and support informed decision-making in healthcare settings. | YIMING LI et al. | arxiv-cs.CL | 2024-11-06
8 | Can Custom Models Learn In-Context? An Exploration of Hybrid Architecture Performance on In-Context Learning Tasks. Highlight: We examine implications of architectural differences between GPT-2 and LLaMa as well as LLaMa and Mamba. | RYAN CAMPBELL et al. | arxiv-cs.LG | 2024-11-06
9 | TATAA: Programmable Mixed-Precision Transformer Acceleration with A Transformable Arithmetic Architecture. Highlight: Apart from the vast amount of linear operations needed due to their sizes, modern transformer models are increasingly reliant on precise non-linear computations that make traditional low-bitwidth quantization methods and fixed-dataflow matrix accelerators ineffective for end-to-end acceleration. To address this need to accelerate both linear and non-linear operations in a unified and programmable framework, this paper introduces TATAA. | JIAJUN WU et al. | arxiv-cs.AR | 2024-11-06
10 | Rethinking Decoders for Transformer-based Semantic Segmentation: Compression Is All You Need. Highlight: In this paper, we argue that there are fundamental connections between semantic segmentation and compression, especially between the Transformer decoders and Principal Component Analysis (PCA). | Qishuai Wen; Chun-Guang Li; | arxiv-cs.CV | 2024-11-05
11 | Towards Scalable Automated Grading: Leveraging Large Language Models for Conceptual Question Evaluation in Engineering. Highlight: This study explores the feasibility of using large language models (LLMs), specifically GPT-4o (ChatGPT), for automated grading of conceptual questions in an undergraduate Mechanical Engineering course. | RUJUN GAO et al. | arxiv-cs.CY | 2024-11-05
12 | Enhancing Transformer Training Efficiency with Dynamic Dropout. Highlight: We introduce Dynamic Dropout, a novel regularization technique designed to enhance the training efficiency of Transformer models by dynamically adjusting the dropout rate based on training epochs or validation loss improvements. | Hanrui Yan; Dan Shao; | arxiv-cs.LG | 2024-11-05
13 | From Medprompt to O1: Exploration of Run-Time Strategies for Medical Challenge Problems and Beyond. Highlight: Following on the Medprompt study with GPT-4, we systematically evaluate the o1-preview model across various medical benchmarks. | HARSHA NORI et al. | arxiv-cs.CL | 2024-11-05
14 | Automatic Generation of Question Hints for Mathematics Problems Using Large Language Models in Educational Technology. Highlight: We present here a study of several dimensions: 1) identifying error patterns made by simulated students on secondary-level math exercises; 2) developing various prompts for GPT-4o as a teacher and evaluating their effectiveness in generating hints that enable simulated students to self-correct; and 3) testing the best-performing prompts, based on their ability to produce relevant hints and facilitate error correction, with Llama-3-8B-Instruct as the teacher, allowing for a performance comparison with GPT-4o. | Junior Cedric Tonga; Benjamin Clement; Pierre-Yves Oudeyer; | arxiv-cs.CL | 2024-11-05
15 | Seq-VCR: Preventing Collapse in Intermediate Transformer Representations for Enhanced Reasoning. Highlight: In this work, we identify representation collapse in the model’s intermediate layers as a key factor limiting their reasoning capabilities. | MD RIFAT AREFIN et al. | arxiv-cs.LG | 2024-11-04
16 | Wave Network: An Ultra-Small Language Model. Highlight: We propose an innovative token representation and update method in a new ultra-small language model: the Wave network. | Xin Zhang; Victor S. Sheng; | arxiv-cs.CL | 2024-11-04
17 | Ask, and It Shall Be Given: Turing Completeness of Prompting. Highlight: Here, we present, to the best of our knowledge, the first theoretical study of the LLM prompting paradigm. In this work, we show that prompting is in fact Turing-complete: there exists a finite-size Transformer such that for any computable function, there exists a corresponding prompt following which the Transformer computes the function. | Ruizhong Qiu; Zhe Xu; Wenxuan Bao; Hanghang Tong; | arxiv-cs.LG | 2024-11-04
18 | Advancements and Limitations of LLMs in Replicating Human Color-word Associations. Highlight: We compared multiple generations of LLMs (from GPT-3 to GPT-4o) against human color-word associations using data collected from over 10,000 Japanese participants, involving 17 colors and words from eight categories in Japanese. | Makoto Fukushima; Shusuke Eshita; Hiroshige Fukuhara; | arxiv-cs.CL | 2024-11-04
19 | Evaluating The Ability of Large Language Models to Generate Verifiable Specifications in VeriFast. Highlight: However, prior work has not explored how well LLMs can perform specification generation for specifications based in an ownership logic, such as separation logic. To address this gap, this paper explores the effectiveness of large language models (LLMs), specifically OpenAI’s GPT models, in generating fully correct specifications based on separation logic for static verification of human-written programs in VeriFast. | MARILYN REGO et al. | arxiv-cs.SE | 2024-11-04
20 | Enriching Tabular Data with Contextual LLM Embeddings: A Comprehensive Ablation Study for Ensemble Classifiers. Highlight: Leveraging advancements in natural language processing, this study presents a systematic approach to enrich tabular datasets with features derived from large language model embeddings. | Gjergji Kasneci; Enkelejda Kasneci; | arxiv-cs.LG | 2024-11-03
21 | Can Large Language Model Predict Employee Attrition? Highlight: Machine learning (ML) advancements offer more scalable and accurate solutions, but large language models (LLMs) introduce new potential in human resource management by interpreting nuanced employee communication and detecting subtle turnover cues. | Xiaoye Ma; Weiheng Liu; Changyi Zhao; Liliya R. Tukhvatulina; | arxiv-cs.LG | 2024-11-02
22 | Enhancing Neural Network Interpretability with Feature-Aligned Sparse Autoencoders. Highlight: We propose Mutual Feature Regularization (MFR), a regularization technique for improving feature learning by encouraging SAEs trained in parallel to learn similar features. | Luke Marks; Alasdair Paren; David Krueger; Fazl Barez; | arxiv-cs.LG | 2024-11-02
23 | LLMs: A Game-Changer for Software Engineers? Highlight: Through a critical analysis of technical strengths, limitations, real-world case studies, and future research directions, this paper argues that LLMs are not just reshaping how software is developed but are redefining the role of developers. | Md Asraful Haque; | arxiv-cs.SE | 2024-11-01
24 | GameGen-X: Interactive Open-world Game Video Generation. Highlight: We introduce GameGen-X, the first diffusion transformer model specifically designed for both generating and interactively controlling open-world game videos. | Haoxuan Che; Xuanhua He; Quande Liu; Cheng Jin; Hao Chen; | arxiv-cs.CV | 2024-11-01
25 | Infant Agent: A Tool-Integrated, Logic-Driven Agent with Cost-Effective API Usage. Highlight: II: They remain challenged in reasoning through complex logic problems. To address these challenges, we developed the Infant Agent, integrating task-aware functions, operators, a hierarchical management system, and a memory retrieval mechanism. | BIN LEI et al. | arxiv-cs.AI | 2024-11-01
26 | Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement. Highlight: Consequently, we introduce the Lingma SWE-GPT series, comprising Lingma SWE-GPT 7B and 72B. | YINGWEI MA et al. | arxiv-cs.SE | 2024-11-01
27 | GPT or BERT: Why Not Both? Highlight: We present a simple way to merge masked language modeling with causal language modeling. | Lucas Georges Gabriel Charpentier; David Samuel; | arxiv-cs.CL | 2024-10-31
28 | Handwriting Recognition in Historical Documents with Multimodal LLM. Highlight: In this paper, I evaluate the accuracy of handwritten document transcriptions generated by Gemini against current state-of-the-art Transformer-based methods. | Lucian Li; | arxiv-cs.CV | 2024-10-31
29 | Desert Camels and Oil Sheikhs: Arab-Centric Red Teaming of Frontier LLMs. Highlight: Large language models (LLMs) are widely used but raise ethical concerns due to embedded social biases. | MUHAMMED SAEED et al. | arxiv-cs.CL | 2024-10-31
30 | Exploring Vision Language Models for Facial Attribute Recognition: Emotion, Race, Gender, and Age. Highlight: In this paper, we propose to utilize vision language models (VLMs) such as generative pre-trained transformer (GPT), GEMINI, large language and vision assistant (LLAVA), PaliGemma, and Microsoft Florence2 to recognize facial attributes such as race, gender, age, and emotion from images with human faces. | Nouar AlDahoul; Myles Joshua Toledo Tan; Harishwar Reddy Kasireddy; Yasir Zaki; | arxiv-cs.CV | 2024-10-31
31 | GPT for Games: An Updated Scoping Review (2020-2024). Highlight: This review aims to illustrate the state of the art in innovative GPT applications in games, offering a foundation to enrich game development and enhance player experiences through cutting-edge AI innovations. | Daijin Yang; Erica Kleinman; Casper Harteveld; | arxiv-cs.AI | 2024-10-31
32 | Aerial Flood Scene Classification Using Fine-Tuned Attention-based Architecture for Flood-Prone Countries in South Asia. Highlight: For the classification, we propose a fine-tuned Compact Convolutional Transformer (CCT) based approach and some other cutting-edge transformer-based and Convolutional Neural Network (CNN)-based architectures. | IBNE HASSAN et al. | arxiv-cs.CV | 2024-10-31
33 | EDT: An Efficient Diffusion Transformer Framework Inspired By Human-like Sketching. Highlight: To reduce the computation budget of transformer-based DPMs, this work proposes the Efficient Diffusion Transformer (EDT) framework. | Xinwang Chen; Ning Liu; Yichen Zhu; Feifei Feng; Jian Tang; | arxiv-cs.CV | 2024-10-31
34 | An Empirical Analysis of GPT-4V’s Performance on Fashion Aesthetic Evaluation. Highlight: Fashion aesthetic evaluation is the task of estimating how well the outfits worn by individuals in images suit them. In this work, we examine the zero-shot performance of GPT-4V on this task for the first time. | YUKI HIRAKAWA et al. | arxiv-cs.CV | 2024-10-31
35 | IO Transformer: Evaluating SwinV2-Based Reward Models for Computer Vision. Highlight: This paper examines SwinV2-based reward models, called the Input-Output Transformer (IO Transformer) and the Output Transformer. | Maxwell Meyer; Jack Spruyt; | arxiv-cs.CV | 2024-10-31
36 | LoFLAT: Local Feature Matching Using Focused Linear Attention Transformer. Highlight: In order to enhance representations of attention mechanisms while preserving low computational complexity, we propose LoFLAT, a novel Local Feature matching approach using a Focused Linear Attention Transformer. | Naijian Cao; Renjie He; Yuchao Dai; Mingyi He; | arxiv-cs.CV | 2024-10-30
37 | EvoCodeBench: An Evolving Code Generation Benchmark with Domain-Specific Evaluations. Highlight: The former hurts the fairness of benchmarks, and the latter hinders practitioners from selecting superior LLMs for specific programming domains. To address these two limitations, we propose a new benchmark – EvoCodeBench, which has the following advances: (1) Evolving data. | JIA LI et al. | arxiv-cs.CL | 2024-10-30
38 | ETO: Efficient Transformer-based Local Feature Matching By Organizing Multiple Homography Hypotheses. Highlight: We propose an efficient transformer-based network architecture for local feature matching. | JUNJIE NI et al. | arxiv-cs.CV | 2024-10-30
39 | Automated Personnel Selection for Software Engineers Using LLM-Based Profile Evaluation. Highlight: This work presents a fresh dataset and technique, and shows how transformer models could improve recruiting procedures. | Ahmed Akib Jawad Karim; Shahria Hoque; Md. Golam Rabiul Alam; Md. Zia Uddin; | arxiv-cs.SE | 2024-10-30
40 | ProTransformer: Robustify Transformers Via Plug-and-Play Paradigm. Highlight: In this paper, we introduce a novel robust attention mechanism designed to enhance the resilience of transformer-based architectures. | Zhichao Hou; Weizhi Gao; Yuchen Shen; Feiyi Wang; Xiaorui Liu; | arxiv-cs.LG | 2024-10-30
41 | Attention Speaks Volumes: Localizing and Mitigating Bias in Language Models. Highlight: We explore the internal mechanisms of how bias emerges in large language models (LLMs) when provided with ambiguous comparative prompts: inputs that compare or enforce choosing between two or more entities without providing clear context for preference. | Rishabh Adiga; Besmira Nushi; Varun Chandrasekaran; | arxiv-cs.CL | 2024-10-29
42 | GPT-4o Reads The Mind in The Eyes. Highlight: Using two versions of a widely used theory of mind test, the Reading the Mind in the Eyes Test and the Multiracial Reading the Mind in the Eyes Test, we found that GPT-4o outperformed humans in interpreting mental states from upright faces but underperformed humans when faces were inverted. | JAMES W. A. STRACHAN et al. | arxiv-cs.HC | 2024-10-29
43 | AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts. Highlight: Recent work, AmpleGCG (Liao et al., 2024), demonstrates that a generative model can quickly produce numerous customizable gibberish adversarial suffixes for any harmful query, exposing a range of alignment gaps in out-of-distribution (OOD) language spaces. To bring more attention to this area, we introduce AmpleGCG-Plus, an enhanced version that achieves better performance in fewer attempts. | Vishal Kumar; Zeyi Liao; Jaylen Jones; Huan Sun; | arxiv-cs.CL | 2024-10-29
44 | Benchmarking OpenAI O1 in Cyber Security. Highlight: We evaluate OpenAI’s o1-preview and o1-mini models, benchmarking their performance against the earlier GPT-4o model. | Dan Ristea; Vasilios Mavroudis; Chris Hicks; | arxiv-cs.CR | 2024-10-29
45 | KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation. Highlight: In this work, we present KD-LoRA, a novel fine-tuning method that combines LoRA with KD. | Rambod Azimi; Rishav Rishav; Marek Teichmann; Samira Ebrahimi Kahou; | arxiv-cs.CL | 2024-10-28
46 | Deep Learning for Medical Text Processing: BERT Model Fine-Tuning and Comparative Study. Highlight: This paper proposes a medical literature summary generation method based on the BERT model to address the challenges brought by the current explosion of medical information. | JIACHENG HU et al. | arxiv-cs.CL | 2024-10-28
47 | UOttawa at LegalLens-2024: Transformer-based Classification Experiments. Highlight: This paper presents the methods used for the LegalLens-2024 shared task, which focused on detecting legal violations within unstructured textual data and associating these violations with potentially affected individuals. | Nima Meghdadi; Diana Inkpen; | arxiv-cs.CL | 2024-10-28
48 | SepMamba: State-space Models for Speaker Separation Using Mamba. Highlight: We propose SepMamba, a U-Net-based architecture composed primarily of bidirectional Mamba layers. | THOR HØJHUS AVENSTRUP et al. | arxiv-cs.SD | 2024-10-28
49 | Gender Bias in LLM-generated Interview Responses. Highlight: Our findings reveal that gender bias is consistent, and closely aligned with gender stereotypes and the dominance of jobs. Overall, this study contributes to the systematic examination of gender bias in LLM-generated interview responses, highlighting the need for a mindful approach to mitigate such biases in related applications. | Haein Kong; Yongsu Ahn; Sangyub Lee; Yunho Maeng; | arxiv-cs.CL | 2024-10-28
50 | Sequential Choice in Ordered Bundles. Highlight: We evaluate several predictive models, including two custom Transformers using decoder-only and encoder-decoder architectures, fine-tuned GPT-3, a custom LSTM model, a reinforcement learning model, two Markov models, and a zero-order model. | Rajeev Kohli; Kriste Krstovski; Hengyu Kuang; Hengxu Lin; | arxiv-cs.LG | 2024-10-28
51 | A Simple Yet Effective Corpus Construction Framework for Indonesian Grammatical Error Correction. Highlight: How to efficiently construct high-quality evaluation corpora for GEC in low-resource languages has become a significant challenge. To fill this gap, in this paper, we present a framework for constructing GEC corpora. | NANKAI LIN et al. | arxiv-cs.CL | 2024-10-28
52 | Is GPT-4 Less Politically Biased Than GPT-3.5? A Renewed Investigation of ChatGPT’s Political Biases. Highlight: This work investigates the political biases and personality traits of ChatGPT, specifically comparing GPT-3.5 to GPT-4. | Erik Weber; Jérôme Rutinowski; Niklas Jost; Markus Pauly; | arxiv-cs.CL | 2024-10-28
53 | Fine-tuned Large Language Models (LLMs): Improved Prompt Injection Attacks Detection. Highlight: This project explores the security vulnerabilities in relation to prompt injection attacks. | Md Abdur Rahman; Fan Wu; Alfredo Cuzzocrea; Sheikh Iqbal Ahamed; | arxiv-cs.CL | 2024-10-27
54 | SeisGPT: A Physics-Informed Data-Driven Large Model for Real-Time Seismic Response Prediction. Highlight: Traditional methods, which rely on complex finite element models, often struggle with balancing computational efficiency and accuracy. To address this challenge, we introduce SeisGPT, a data-driven, large physics-informed model that leverages deep neural networks based on the Generative Pre-trained Transformer (GPT) architecture. | SHIQIAO MENG et al. | arxiv-cs.CE | 2024-10-26
55 | Sequential Large Language Model-Based Hyper-Parameter Optimization. Highlight: This study introduces SLLMBO, an innovative framework that leverages Large Language Models (LLMs) for hyperparameter optimization (HPO), incorporating dynamic search space adaptability, enhanced parameter landscape exploitation, and a hybrid, novel LLM-Tree-structured Parzen Estimator (LLM-TPE) sampler. | Kanan Mahammadli; | arxiv-cs.LG | 2024-10-26
56 | Notes on The Mathematical Structure of GPT LLM Architectures. Abstract: An exposition of the mathematics underpinning the neural network architecture of a GPT-3-style LLM. … | Spencer Becker-Kahn; | arxiv-cs.LG | 2024-10-25
57 | Integrating Large Language Models with Internet of Things Applications. Highlight: This paper identifies and analyzes applications in which Large Language Models (LLMs) can make Internet of Things (IoT) networks more intelligent and responsive through three case studies from critical topics: DDoS attack detection, macroprogramming over IoT systems, and sensor data processing. | Mingyu Zong; Arvin Hekmati; Michael Guastalla; Yiyi Li; Bhaskar Krishnamachari; | arxiv-cs.AI | 2024-10-24
58 | No Argument Left Behind: Overlapping Chunks for Faster Processing of Arbitrarily Long Legal Texts. Highlight: We introduce uBERT, a hybrid model that combines Transformer and Recurrent Neural Network architectures to effectively handle long legal texts. | ISRAEL FAMA et al. | arxiv-cs.CL | 2024-10-24
59 | GPT-Signal: Generative AI for Semi-automated Feature Engineering in The Alpha Research Process. Highlight: With the recent development of Generative Artificial Intelligence (Gen AI) and Large Language Models (LLMs), we present a novel way of leveraging GPT-4 to generate new return-predictive formulaic alphas, making alpha mining a semi-automated process, and saving time and energy for investors and traders. | Yining Wang; Jinman Zhao; Yuri Lawryshyn; | arxiv-cs.CE | 2024-10-24
60 | Scaling Up Masked Diffusion Models on Text. Highlight: Fully leveraging the probabilistic formulation of MDMs, we propose a simple yet effective unsupervised classifier-free guidance that effectively exploits large-scale unpaired data, boosting performance for conditional inference. | SHEN NIE et al. | arxiv-cs.AI | 2024-10-24
61 | Probing Ranking LLMs: Mechanistic Interpretability in Information Retrieval. Highlight: In this study, we delve into the mechanistic workings of state-of-the-art, fine-tuning-based passage-reranking transformer networks. | Tanya Chowdhury; James Allan; | arxiv-cs.IR | 2024-10-24
62 | Lightweight Neural App Control. Highlight: This paper introduces a novel mobile phone control architecture, termed “app agents”, for efficient interactions and controls across various Android apps. | FILIPPOS CHRISTIANOS et al. | arxiv-cs.AI | 2024-10-23
63 | Striking A New Chord: Neural Networks in Music Information Dynamics. Highlight: Specifically, we compare LSTM, Transformer, and GPT models against a widely used Markov model to predict a chord event following a sequence of chords. | Farshad Jafari; Claire Arthur; | arxiv-cs.IT | 2024-10-23
64 | Locating Information in Large Language Models Via Random Matrix Theory. Highlight: In this work, we analyze the weight matrices of pretrained transformer models — specifically BERT and Llama — using random matrix theory (RMT) as a zero-information hypothesis. | Max Staats; Matthias Thamm; Bernd Rosenow; | arxiv-cs.LG | 2024-10-23
65 | Interpreting Affine Recurrence Learning in GPT-style Transformers. Highlight: In-context learning allows transformers to generalize during inference without modifying their weights, yet the precise operations driving this capability remain largely opaque. This paper presents an investigation into the mechanistic interpretability of these transformers, focusing specifically on their ability to learn and predict affine recurrences as an ICL task. | Samarth Bhargav; Alexander Gu; | arxiv-cs.LG | 2024-10-22
66 | GeoCode-GPT: A Large Language Model for Geospatial Code Generation Tasks. Highlight: Although large language models (LLMs) have demonstrated potential in code generation tasks, they often encounter issues such as refusal to code or hallucination in geospatial code generation due to a lack of domain-specific knowledge and code corpora. To address these challenges, this paper presents and open-sources the GeoCode-PT and GeoCode-SFT corpora, along with the GeoCode-Eval evaluation dataset. | SHUYANG HOU et al. | arxiv-cs.SE | 2024-10-22
67 | In Context Learning and Reasoning for Symbolic Regression with Large Language Models. Highlight: Here, we explore the potential of LLMs to perform symbolic regression — a machine-learning method for finding simple and accurate equations from datasets. | Samiha Sharlin; Tyler R. Josephson; | arxiv-cs.CL | 2024-10-22
68 | Mechanisms of Symbol Processing for In-Context Learning in Transformer Networks. Highlight: Large Language Models (LLMs) have demonstrated impressive abilities in symbol processing through in-context learning (ICL). | Paul Smolensky; Roland Fernandez; Zhenghao Herbert Zhou; Mattia Opper; Jianfeng Gao; | arxiv-cs.AI | 2024-10-22
69 | Representation Shattering in Transformers: A Synthetic Study with Knowledge Editing. Highlight: Through evaluations of edited models and analysis of extracted representations, we show that KE inadvertently affects representations of entities beyond the targeted one, distorting relevant structures that allow a model to infer unseen knowledge about an entity. | KENTO NISHI et al. | arxiv-cs.LG | 2024-10-22
70 | An Eye for An AI: Evaluating GPT-4o’s Visual Perception Skills and Geometric Reasoning Skills Using Computer Graphics Questions. Highlight: We find that although GPT-4o exhibits great potential in solving questions with visual information independently, major limitations remain in the accuracy and quality of the generated results. We propose several novel approaches for CG educators to incorporate GenAI into CG teaching despite these limitations. | Tony Haoran Feng; Paul Denny; Burkhard C. Wünsche; Andrew Luxton-Reilly; Jacqueline Whalley; | arxiv-cs.AI | 2024-10-22
71 | Graph Transformers Dream of Electric Flow. Highlight: The input to the Transformer is simply the graph incidence matrix; no other explicit positional encoding information is provided. We present explicit weight configurations for implementing each such graph algorithm, and we bound the errors of the constructed Transformers by the errors of the underlying algorithms. | Xiang Cheng; Lawrence Carin; Suvrit Sra; | arxiv-cs.LG | 2024-10-22
72 | Using GPT Models for Qualitative and Quantitative News Analytics in The 2024 US Presidential Election Process. Highlight: The paper considers an approach of using the Google Search API and the GPT-4o model for qualitative and quantitative analyses of news through retrieval-augmented generation (RAG). | Bohdan M. Pavlyshenko; | arxiv-cs.CL | 2024-10-21
73 | Exploring Pretraining Via Active Forgetting for Improving Cross Lingual Transfer for Decoder Language Models. Highlight: In this work, we propose a pretraining strategy that uses active forgetting to achieve similar cross-lingual transfer in decoder-only LLMs. | Divyanshu Aggarwal; Ashutosh Sathe; Sunayana Sitaram; | arxiv-cs.CL | 2024-10-21
74 | Diffusion Transformer Policy. Highlight: Recent large visual-language action models pretrained on diverse robot datasets have demonstrated the potential for generalizing to new environments with a few in-domain data. | ZHI HOU et al. | arxiv-cs.RO | 2024-10-21
75 | BART-based Hierarchical Attentional Network for Sentence Ordering. Highlight: In this paper, we introduce a novel BART-based Hierarchical Attentional Ordering Network (BHAONet), aiming to address the coherence modeling challenge within paragraphs, which stands as a cornerstone in comprehension, generation, and reasoning tasks. | Yiping Yang; Baiyun Cui; Yingming Li; | cikm | 2024-10-21
76 | Learning to Differentiate Pairwise-Argument Representations for Implicit Discourse Relation Recognition. Highlight: To enable encoders to produce clearly distinguishable representations, we propose a joint learning framework. | ZHIPANG WANG et al. | cikm | 2024-10-21
77 | Comparative Study of Multilingual Idioms and Similes in Large Language Models. Highlight: This study addresses the gap in the literature concerning the comparative performance of LLMs in interpreting different types of figurative language across multiple languages. | PARIA KHOSHTAB et al. | arxiv-cs.CL | 2024-10-21
78 | Application of Large Language Models in Chemistry Reaction Data Extraction and Cleaning. Highlight: We propose a paradigm that leverages prompt-tuning, fine-tuning techniques, and a verifier to check the extracted information. | XIAOBAO HUANG et al. | cikm | 2024-10-21
79 | A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration. Highlight: In this work, we theoretically show that, compared to Stepwise ICL, the transformer gains better error correction ability and more accurate predictions if the reasoning from earlier steps (Coherent CoT) is integrated. | YINGQIAN CUI et al. | arxiv-cs.CL | 2024-10-21
80 | Inferring Visualization Intent from Conversation. Highlight: We consider a conversational approach to visualization, where users specify their needs at each step in natural language, with a visualization being returned in turn. | Haotian Li; Nithin Chalapathi; Huamin Qu; Alvin Cheung; Aditya G. Parameswaran; | cikm | 2024-10-21
81 | Improving Neuron-level Interpretability with White-box Language Models. Highlight: In our study, we introduce a white-box transformer-like architecture named Coding RAte TransformEr (CRATE), explicitly engineered to capture sparse, low-dimensional structures within data distributions. | Hao Bai; Yi Ma; | arxiv-cs.CL | 2024-10-21
82 | Mamba Retriever: Utilizing Mamba for Effective and Efficient Dense Retrieval. Highlight: In the information retrieval (IR) area, dense retrieval (DR) models use deep learning techniques to encode queries and passages into embedding space to compute their semantic relations. | Hanqi Zhang; Chong Chen; Lang Mei; Qi Liu; Jiaxin Mao; | cikm | 2024-10-21
83 | Does ChatGPT Have A Poetic Style? Highlight: We find that the GPT models, especially GPT-4, can successfully produce poems in a range of both common and uncommon English-language forms in superficial yet noteworthy ways, such as by producing poems of appropriate lengths for sonnets (14 lines), villanelles (19 lines), and sestinas (39 lines). | Melanie Walsh; Anna Preus; Elizabeth Gronski; | arxiv-cs.CL | 2024-10-20
84 | Exploring Social Desirability Response Bias in Large Language Models: Evidence from GPT-4 Simulations. Highlight: Large language models (LLMs) are employed to simulate human-like responses in social surveys, yet it remains unclear if they develop biases like social desirability response (SDR) bias. | Sanguk Lee; Kai-Qi Yang; Tai-Quan Peng; Ruth Heo; Hui Liu; | arxiv-cs.AI | 2024-10-20
85 | BERTtime Stories: Investigating The Role of Synthetic Story Data in Language Pre-training. Highlight: We describe our contribution to the Strict and Strict-Small tracks of the 2nd iteration of the BabyLM Challenge. | Nikitas Theodoropoulos; Giorgos Filandrianos; Vassilis Lyberatos; Maria Lymperaiou; Giorgos Stamou; | arxiv-cs.CL | 2024-10-20
86 | IANUS: Integrated Accelerator Based on NPU-PIM Unified Memory System. Highlight: To address the unique challenges of accelerating end-to-end inference, we propose IANUS — Integrated Accelerator based on NPU-PIM Unified Memory System. | MINSEOK SEO et al. | arxiv-cs.AR | 2024-10-19
87 | DTPPO: Dual-Transformer Encoder-based Proximal Policy Optimization for Multi-UAV Navigation in Unseen Complex Environments. Highlight: Existing multi-agent deep reinforcement learning (MADRL) methods for multi-UAV navigation face challenges in generalization, particularly when applied to unseen complex environments. To address these limitations, we propose a Dual-Transformer Encoder-based Proximal Policy Optimization (DTPPO) method. | Anning Wei; Jintao Liang; Kaiyuan Lin; Ziyue Li; Rui Zhao; | arxiv-cs.MA | 2024-10-19
88 | Bias Amplification: Language Models As Increasingly Biased Media. Highlight: In this work, we address the gap in understanding the bias amplification of LLMs with four main contributions. Firstly, we propose a theoretical framework, defining the necessary and sufficient conditions for its occurrence, and emphasizing that it occurs independently of model collapse. | ZE WANG et al. | arxiv-cs.AI | 2024-10-19
89 | Medical-GAT: Cancer Document Classification Leveraging Graph-Based Residual Network for Scenarios with Limited Data. Highlight: This scarcity of annotated data impedes the development of effective machine learning models for cancer document classification. To address this challenge, we present a curated dataset of 1,874 biomedical abstracts, categorized into thyroid cancer, colon cancer, lung cancer, and generic topics. | ELIAS HOSSAIN et al. | arxiv-cs.AI | 2024-10-19
90 | Automated Genre-Aware Article Scoring and Feedback Using Large Language Models. Highlight: This paper focuses on the development of an advanced intelligent article scoring system that not only assesses the overall quality of written work but also offers detailed feature-based scoring tailored to various article genres. | CHIHANG WANG et al. | arxiv-cs.CL | 2024-10-18
91 | Harmony: A Home Agent for Responsive Management and Action Optimization with A Locally Deployed Large Language Model. Highlight: In order to optimize the privacy and economy of data processing while maintaining the powerful functions of LLMs, we propose Harmony, a smart home assistant framework that uses a locally deployable small-scale LLM. | Ziqi Yin; Mingxin Zhang; Daisuke Kawahara; | arxiv-cs.HC | 2024-10-18
92 | From Solitary Directives to Interactive Encouragement! LLM Secure Code Generation By Natural Language Prompting. Highlight: This work introduces SecCode, a framework that leverages an innovative interactive encouragement prompting (EP) technique for secure code generation with only NL prompts. | SHIGANG LIU et al. | arxiv-cs.CR | 2024-10-18
93 | Paths-over-Graph: Knowledge Graph Empowered Large Language Model Reasoning. Highlight: However, existing KG-based LLM reasoning methods face challenges like handling multi-hop reasoning, multi-entity questions, and effectively utilizing graph structures. To address these issues, we propose Paths-over-Graph (PoG), a novel method that enhances LLM reasoning by integrating knowledge reasoning paths from KGs, improving the interpretability and faithfulness of LLM outputs. | XINGYU TAN et al. | arxiv-cs.CL | 2024-10-18
94 | SBI-RAG: Enhancing Math Word Problem Solving for Students Through Schema-Based Instruction and Retrieval-Augmented Generation. Highlight: Many students struggle with math word problems (MWPs), often finding it difficult to identify key information and select the appropriate mathematical operations. Schema-based instruction (SBI) is an evidence-based strategy that helps students categorize problems based on their structure, improving problem-solving accuracy. Building on this, we propose a Schema-Based Instruction Retrieval-Augmented Generation (SBI-RAG) framework that incorporates a large language model (LLM). | Prakhar Dixit; Tim Oates; | arxiv-cs.LG | 2024-10-17
95 | Metacognitive Monitoring: A Human Ability Beyond Generative Artificial Intelligence. Highlight: We employed a cross-agent prediction model to compare the metacognitive performance of humans and ChatGPT in a language-based memory task involving garden-path sentences preceded by either fitting or unfitting context sentences. | Markus Huff; Elanur Ulakçı; | arxiv-cs.CL | 2024-10-17
96 | Linguistically Grounded Analysis of Language Models Using Shapley Head Values. Highlight: In this paper, we investigate the processing of morphosyntactic phenomena by leveraging a recently proposed method for probing language models via Shapley Head Values (SHVs). | Marcell Fekete; Johannes Bjerva; | arxiv-cs.CL | 2024-10-17
97 | Transfer Learning on Transformers for Building Energy Consumption Forecasting — A Comparative Study. Highlight: This study investigates the application of Transfer Learning (TL) on Transformer architectures to enhance building energy consumption forecasting. | Robert Spencer; Surangika Ranathunga; Mikael Boulic; van Heerden; Teo Susnjak; | arxiv-cs.LG | 2024-10-17
98 | FaithBench: A Diverse Hallucination Benchmark for Summarization By Modern LLMs. Highlight: This paper introduces FaithBench, a summarization hallucination benchmark comprising challenging hallucinations made by 10 modern LLMs from 8 different families, with ground truth annotations by human experts. | FORREST SHENG BAO et al. | arxiv-cs.CL | 2024-10-17
99 | Measuring and Modifying The Readability of English Texts with GPT-4. Highlight: Then, in a pre-registered human experiment (N = 59), we ask whether Turbo can reliably make text easier or harder to read. We find evidence to support this hypothesis, though considerable variance in human judgments remains unexplained. | Sean Trott; Pamela D. Rivière; | arxiv-cs.CL | 2024-10-17
100 | Transformer-Based Approaches for Sensor-Based Human Activity Recognition: Opportunities and Challenges. Highlight: We observe that transformer-based solutions pose higher computational demands, consistently yield inferior performance, and experience significant performance degradation when quantized to accommodate resource-constrained devices. | Clayton Souza Leite; Henry Mauranen; Aziza Zhanabatyrova; Yu Xiao; | arxiv-cs.LG | 2024-10-17
101 | Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens. Highlight: Scaling up autoregressive models in vision has not proven as beneficial as in large language models. In this work, we investigate this scaling problem in the context of text-to-image generation, focusing on two critical factors: whether models use discrete or continuous tokens, and whether tokens are generated in a random or fixed raster order using BERT- or GPT-like transformer architectures. | LIJIE FAN et al. | arxiv-cs.CV | 2024-10-17
102 | Detecting AI-Generated Texts in Cross-Domains. Highlight: Existing tools to detect text generated by a large language model (LLM) have met with certain success, but their performance can drop when dealing with texts in new domains. To tackle this issue, we train a ranking classifier called RoBERTa-Ranker, a modified version of RoBERTa, as a baseline model using a dataset we constructed that includes a wider variety of texts written by humans and generated by various LLMs. | You Zhou; Jie Wang; | arxiv-cs.CL | 2024-10-17
103 | Context-Scaling Versus Task-Scaling in In-Context Learning. Highlight: Transformers exhibit In-Context Learning (ICL), where these models solve new tasks by using examples in the prompt without additional training. | Amirhesam Abedsoltan; Adityanarayanan Radhakrishnan; Jingfeng Wu; Mikhail Belkin; | arxiv-cs.LG | 2024-10-16
104 | When Not to Answer: Evaluating Prompts on GPT Models for Effective Abstention in Unanswerable Math Word Problems. Highlight: In this paper, we investigate whether GPTs can appropriately respond to unanswerable math word problems by applying prompts typically used in solvable mathematical scenarios. | Asir Saadat; Tasmia Binte Sogir; Md Taukir Azam Chowdhury; Syem Aziz; | arxiv-cs.CL | 2024-10-16
105 | Reconstruction of Differentially Private Text Sanitization Via Large Language Models. Highlight: We propose two attacks (black-box and white-box) based on the accessibility to LLMs and show that LLMs could connect the pair of DP-sanitized text and the corresponding private training data of LLMs by giving sample text pairs as instructions (in the black-box attacks) or fine-tuning data (in the white-box attacks). | SHUCHAO PANG et al. | arxiv-cs.CR | 2024-10-16
106 | SELF-BART: A Transformer-based Molecular Representation Model Using SELFIES. Highlight: In this study, we develop an encoder-decoder model based on BART that is capable of learning molecular representations and generating new molecules. | INDRA PRIYADARSINI et al. | arxiv-cs.CE | 2024-10-16
107 | Unifying Economic and Language Models for Enhanced Sentiment Analysis of The Oil Market. Highlight: However, these LMs often have difficulty with domain-specific terminology, limiting their effectiveness in the crude oil sector. Addressing this gap, we introduce CrudeBERT, a fine-tuned LM specifically for the crude oil market. | Himmet Kaplan; Ralf-Peter Mundani; Heiko Rölke; Albert Weichselbraun; Martin Tschudy; | arxiv-cs.IR | 2024-10-16
108 | Stabilize The Latent Space for Image Autoregressive Modeling: A Unified Perspective. Highlight: This finding contrasts sharply with the field of NLP, where the autoregressive model GPT has established a commanding presence. To address this discrepancy, we introduce a unified perspective on the relationship between latent space and generative models, emphasizing the stability of latent space in image generative modeling. | YONGXIN ZHU et al. | arxiv-cs.CV | 2024-10-16
109 | Better to Ask in English: Evaluation of Large Language Models on English, Low-resource and Cross-Lingual Settings. Highlight: Recently, a few multi-lingual LLMs have emerged, but their performance in low-resource languages, especially the most spoken languages in South Asia, is less explored. To address this gap, in this study, we evaluate LLMs such as GPT-4, Llama 2, and Gemini to analyze their effectiveness in English compared to other low-resource languages from South Asia (e.g., Bangla, Hindi, and Urdu). | Krishno Dey; Prerona Tarannum; Md. Arid Hasan; Imran Razzak; Usman Naseem; | arxiv-cs.CL | 2024-10-16
110 | With A Grain of SALT: Are LLMs Fair Across Social Dimensions? Highlight: This paper presents an analysis of biases in open-source Large Language Models (LLMs) across various genders, religions, and races. | Samee Arif; Zohaib Khan; Agha Ali Raza; Awais Athar; | arxiv-cs.CL | 2024-10-16
111 | Jigsaw Puzzles: Splitting Harmful Questions to Jailbreak Large Language Models. Highlight: In this work, we propose Jigsaw Puzzles (JSP), a straightforward yet effective multi-turn jailbreak strategy against advanced LLMs. | Hao Yang; Lizhen Qu; Ehsan Shareghi; Gholamreza Haffari; | arxiv-cs.CL | 2024-10-15
112 | De-jargonizing Science for Journalists with GPT-4: A Pilot Study. Highlight: This study offers an initial evaluation of a human-in-the-loop system leveraging GPT-4 (a large language model or LLM) and Retrieval-Augmented Generation (RAG) to identify and define jargon terms in scientific abstracts, based on readers’ self-reported knowledge. | Sachita Nishal; Eric Lee; Nicholas Diakopoulos; | arxiv-cs.CL | 2024-10-15
113 | Table-LLM-Specialist: Language Model Specialists for Tables Using Iterative Generator-Validator Fine-tuning. Highlight: In this work, we propose Table-LLM-Specialist, or Table-Specialist for short, as a new self-trained fine-tuning paradigm specifically designed for table tasks. | JUNJIE XING et al. | arxiv-cs.CL | 2024-10-15
114 | In-Context Learning for Long-Context Sentiment Analysis on Infrastructure Project Opinions. Highlight: Large language models (LLMs) have achieved impressive results across various tasks. | Alireza Shamshiri; Kyeong Rok Ryu; June Young Park; | arxiv-cs.CL | 2024-10-15
115 | TraM: Enhancing User Sleep Prediction with Transformer-based Multivariate Time Series Modeling and Machine Learning Ensembles. Highlight: This paper presents a novel approach that leverages a Transformer-based multivariate time series model and Machine Learning Ensembles to predict the quality of human sleep, emotional states, and stress levels. | Jinjae Kim; Minjeong Ma; Eunjee Choi; Keunhee Cho; Chanwoo Lee; | arxiv-cs.LG | 2024-10-15
116 | ED-ViT: Splitting Vision Transformer for Distributed Inference on Edge Devices. Highlight: However, their high computational demands and inference latency pose significant challenges for model deployment on resource-constrained edge devices. To address this issue, we propose a novel Vision Transformer splitting framework, ED-ViT, designed to execute complex models across multiple edge devices efficiently. | XIANG LIU et al. | arxiv-cs.CV | 2024-10-15
117 | SeaDATE: Remedy Dual-Attention Transformer with Semantic Alignment Via Contrast Learning for Multimodal Object Detection. Highlight: In this paper, we introduce an accurate and efficient object detection method named SeaDATE. | SHUHAN DONG et al. | arxiv-cs.CV | 2024-10-15
118 | Domain-Conditioned Transformer for Fully Test-time Adaptation. Highlight: We observe that, when applying a transformer network model to a new domain, the self-attention profiles of image samples in the target domain deviate significantly from those in the source domain, which results in large performance degradation during domain changes. To address this important issue, we propose a new structure for the self-attention modules in the transformer. | Yushun Tang; Shuoshuo Chen; Jiyuan Jia; Yi Zhang; Zhihai He; | arxiv-cs.CV | 2024-10-14
119 | Embedding Self-Correction As An Inherent Ability in Large Language Models for Enhanced Mathematical Reasoning. Highlight: However, LLMs often encounter difficulties in certain aspects of mathematical reasoning, leading to flawed reasoning and erroneous results. To mitigate these issues, we introduce a novel mechanism, the Chain of Self-Correction (CoSC), specifically designed to embed self-correction as an inherent ability in LLMs, enabling them to validate and rectify their own results. | Kuofeng Gao; Huanqia Cai; Qingyao Shuai; Dihong Gong; Zhifeng Li; | arxiv-cs.AI | 2024-10-14
120 | Rethinking Legal Judgement Prediction in A Realistic Scenario in The Era of Large Language Models. Highlight: The LLMs also provide explanations for their predictions. To evaluate the quality of these predictions and explanations, we introduce two human evaluation metrics: Clarity and Linking. | Shubham Kumar Nigam; Aniket Deroy; Subhankar Maity; Arnab Bhattacharya; | arxiv-cs.CL | 2024-10-14
121 | Performance in A Dialectal Profiling Task of LLMs for Varieties of Brazilian Portuguese Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The results offer sociolinguistic contributions toward an equity-fluent NLP technology. |
Raquel Meister Ko Freitag; Túlio Sousa de Gois; | arxiv-cs.CL | 2024-10-14 |
122 | Will LLMs Replace The Encoder-Only Models in Temporal Relation Classification? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate LLMs’ performance and decision process in the Temporal Relation Classification task. |
Gabriel Roccabruna; Massimo Rizzoli; Giuseppe Riccardi; | arxiv-cs.CL | 2024-10-14 |
123 | RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose RoCoFT, a parameter-efficient fine-tuning method for large-scale language models (LMs) based on updating only a few rows and columns of the weight matrices in transformers. |
Md Kowsher; Tara Esmaeilbeig; Chun-Nam Yu; Mojtaba Soltanalian; Niloofar Yousefi; | arxiv-cs.CL | 2024-10-13 |
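A rough way to picture the row-column update idea is to freeze a weight matrix and let gradients flow only through a few selected rows and columns via a gradient mask. The minimal sketch below uses plain PyTorch hooks and is an assumption about mechanics; the paper's actual parameterization and row/column selection may differ.

```python
import torch

def restrict_to_rows_cols(weight: torch.nn.Parameter, rows, cols):
    """Mask gradients so only the chosen rows and columns of `weight`
    are updated; every other entry stays effectively frozen."""
    mask = torch.zeros_like(weight)
    mask[rows, :] = 1.0
    mask[:, cols] = 1.0
    weight.register_hook(lambda grad: grad * mask)

# Toy usage: update only 2 rows and 2 columns of a linear layer's weight.
layer = torch.nn.Linear(16, 16)
restrict_to_rows_cols(layer.weight, rows=[0, 1], cols=[3, 7])

x = torch.randn(4, 16)
layer(x).pow(2).mean().backward()
# Rows 0 and 1 keep full gradients; other rows are nonzero only at cols 3 and 7.
print((layer.weight.grad != 0).sum().item(), "of", layer.weight.numel(), "entries trainable")
```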
124 | Evaluating Gender Bias of LLMs in Making Morality Judgements Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work investigates whether current closed and open-source LLMs possess gender bias, especially when asked to give moral opinions. To evaluate these models, we curate and introduce a new dataset GenMO (Gender-bias in Morality Opinions) comprising parallel short stories featuring male and female characters respectively. |
Divij Bajaj; Yuanyuan Lei; Jonathan Tong; Ruihong Huang; | arxiv-cs.CL | 2024-10-13 |
125 | Transformer-based Language Models for Reasoning in The Description Logic ALCQ Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this way, we systematically investigate the logical reasoning capabilities of a supervised fine-tuned DeBERTa-based model and two large language models (GPT-3.5, GPT-4) with few-shot prompting. |
Angelos Poulis; Eleni Tsalapati; Manolis Koubarakis; | arxiv-cs.CL | 2024-10-12 |
126 | \llinstruct: An Instruction-tuned Model for English Language Proficiency Assessments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present \llinstruct: An 8B instruction-tuned model that is designed to generate content for English Language Proficiency Assessments (ELPA) and related applications. |
Debanjan Ghosh; Sophia Chan; | arxiv-cs.CL | 2024-10-11 |
127 | Improving Legal Entity Recognition Using A Hybrid Transformer Model and Semantic Filtering Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel hybrid model that enhances the accuracy and precision of Legal-BERT, a transformer model fine-tuned for legal text processing, by introducing a semantic similarity-based filtering mechanism. |
Duraimurugan Rajamanickam; | arxiv-cs.CL | 2024-10-11 |
128 | Fine-Tuning In-House Large Language Models to Infer Differential Diagnosis from Radiology Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a pipeline for developing in-house LLMs tailored to identify differential diagnoses from radiology reports. |
LUOYAO CHEN et. al. | arxiv-cs.CL | 2024-10-11 |
129 | AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For instance, attacks tend to be less effective when models pay more attention to system prompts designed to ensure LLM safety alignment. Building on this discovery, we introduce an enhanced method that manipulates models’ attention scores to facilitate LLM jailbreaking, which we term AttnGCG. |
ZIJUN WANG et. al. | arxiv-cs.CL | 2024-10-11 |
130 | Extra Global Attention Designation Using Keyword Detection in Sparse Transformer Architectures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an extension to Longformer Encoder-Decoder, a popular sparse transformer architecture. |
Evan Lucas; Dylan Kangas; Timothy C Havens; | arxiv-cs.CL | 2024-10-11 |
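With Hugging Face's Longformer, granting extra tokens global attention amounts to setting positions in `global_attention_mask`. The sketch below builds such a mask from a naive keyword-id allowlist; that allowlist detector is a stand-in assumption, not the keyword detection method the paper proposes.

```python
import torch

def keyword_global_attention_mask(token_ids: torch.Tensor, keyword_ids) -> torch.Tensor:
    """Build a Longformer-style mask: 1 means the token gets global attention,
    0 means it keeps the default local windowed attention."""
    mask = torch.zeros_like(token_ids)
    for kw in keyword_ids:
        mask |= (token_ids == kw).long()
    mask[:, 0] = 1  # the first token conventionally keeps global attention
    return mask

# Toy usage with made-up vocabulary ids standing in for detected keywords.
ids = torch.tensor([[101, 2054, 7592, 2003, 7592, 102]])
mask = keyword_global_attention_mask(ids, keyword_ids=[7592])
print(mask)  # tensor([[1, 0, 1, 0, 1, 0]])
# With Hugging Face: model(input_ids=ids, global_attention_mask=mask)
```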
131 | Synth-SONAR: Sonar Image Synthesis with Enhanced Diversity and Realism Via Dual Diffusion Models and GPT Prompting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Traditional methods often rely on extensive and costly data collection using sonar sensors, jeopardizing data quality and diversity. To overcome these limitations, this study proposes a new sonar image synthesis framework, Synth-SONAR, leveraging diffusion models and GPT prompting. |
Purushothaman Natarajan; Kamal Basha; Athira Nambiar; | arxiv-cs.CV | 2024-10-11 |
132 | Hypothesis-only Biases in Large Language Model-Elicited Natural Language Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our analysis provides empirical evidence that well-attested biases in NLI can persist in LLM-generated data. |
Grace Proebsting; Adam Poliak; | arxiv-cs.CL | 2024-10-11 |
133 | Fine-Tuning Language Models for Ethical Ambiguity: A Comparative Study of Alignment with Human Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While morally clear scenarios are more discernible to LLMs, greater difficulty is encountered in morally ambiguous contexts. In this investigation, we explored LLM calibration to show that human and LLM judgments are poorly aligned in such scenarios. |
PRANAV SENTHILKUMAR et. al. | arxiv-cs.CL | 2024-10-10 |
134 | HorGait: A Hybrid Model for Accurate Gait Recognition in LiDAR Point Cloud Planar Projections Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a method named HorGait, which utilizes a hybrid model with a Transformer architecture for gait recognition on the planar projection of 3D point clouds from LiDAR. |
JIAXING HAO et. al. | arxiv-cs.CV | 2024-10-10 |
135 | Evaluating Transformer Models for Suicide Risk Detection on Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a study on leveraging state-of-the-art natural language processing solutions for identifying suicide risk in social media posts as a submission for the IEEE BigData 2024 Cup: Detection of Suicide Risk on Social Media conducted by the kubapok team. |
Jakub Pokrywka; Jeremi I. Kaczmarek; Edward J. Gorzelańczyk; | arxiv-cs.CL | 2024-10-10 |
136 | VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce VibeCheck, a system for automatically comparing a pair of LLMs by discovering identifying traits of a model (vibes) that are well-defined, differentiating, and user-aligned. |
Lisa Dunlap; Krishna Mandal; Trevor Darrell; Jacob Steinhardt; Joseph E Gonzalez; | arxiv-cs.CL | 2024-10-10 |
137 | The Rise of AI-Generated Content in Wikipedia Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We use GPTZero, a proprietary AI detector, and Binoculars, an open-source alternative, to establish lower bounds on the presence of AI-generated content in recently created Wikipedia pages. |
Creston Brooks; Samuel Eggert; Denis Peskoff; | arxiv-cs.CL | 2024-10-10 |
138 | Robust AI-Generated Text Detection By Restricted Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on the robustness of classifier-based detectors of AI-generated text, namely their ability to transfer to unseen generators or semantic domains. |
KRISTIAN KUZNETSOV et. al. | arxiv-cs.CL | 2024-10-10 |
139 | SWE-Bench+: Enhanced Coding Benchmark for LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, a systematic evaluation of the quality of SWE-bench remains missing. In this paper, we addressed this gap by presenting an empirical analysis of the SWE-bench dataset. |
REEM ALEITHAN et. al. | arxiv-cs.SE | 2024-10-09 |
140 | Stanceformer: Target-Aware Transformer for Stance Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, these models yield similar performance regardless of whether we utilize or disregard target information, undermining the task’s significance. To address this challenge, we introduce Stanceformer, a target-aware transformer model that incorporates enhanced attention towards the targets during both training and inference. |
Krishna Garg; Cornelia Caragea; | arxiv-cs.CL | 2024-10-09 |
141 | Optimized Spatial Architecture Mapping Flow for Transformer Accelerators Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the design process for existing spatial architectures is predominantly manual, and it often involves time-consuming redesigns for new applications and new problem dimensions, which greatly limits the development of optimally designed accelerators for Transformer models. To address these challenges, we propose SAMT (Spatial Architecture Mapping for Transformers), a comprehensive framework designed to optimize the dataflow mapping of Transformer inference workloads onto spatial accelerators. |
HAOCHENG XU et. al. | arxiv-cs.AR | 2024-10-09 |
142 | SparseGrad: A Selective Method for Efficient Fine-tuning of MLP Layers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a new selective PEFT method, namely SparseGrad, that performs well on MLP blocks. |
VIKTORIIA CHEKALINA et. al. | arxiv-cs.CL | 2024-10-09 |
143 | InAttention: Linear Context Scaling for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we modify the decoder-only transformer, replacing self-attention with InAttention, which scales linearly with context length during inference by having tokens attend only to initial states. |
Joseph Eisner; | arxiv-cs.LG | 2024-10-09 |
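Having tokens attend only to a fixed set of initial states makes the per-token attention cost independent of sequence length, which is where the linear scaling comes from. Below is a minimal single-head sketch under that reading; how InAttention selects and maintains the initial states follows the paper and is not reproduced here.

```python
import torch
import torch.nn.functional as F

def in_attention(q, k, v, num_initial: int):
    """Every query attends only to the first `num_initial` key/value states,
    so cost per token is constant in context length. Shapes: (batch, seq, dim)."""
    k0, v0 = k[:, :num_initial], v[:, :num_initial]
    scores = q @ k0.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return F.softmax(scores, dim=-1) @ v0

q, k, v = (torch.randn(2, 128, 64) for _ in range(3))
print(in_attention(q, k, v, num_initial=16).shape)  # torch.Size([2, 128, 64])
```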
144 | Honesty to Subterfuge: In-Context Reinforcement Learning Can Make Honest Models Reward Hack Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that gpt-4o, gpt-4o-mini, o1-preview, and o1-mini – frontier models trained to be helpful, harmless, and honest – can engage in specification gaming without training on a curriculum of tasks, purely from in-context iterative reflection (which we call in-context reinforcement learning, ICRL). |
Leo McKee-Reid; Christoph Sträter; Maria Angelica Martinez; Joe Needham; Mikita Balesni; | arxiv-cs.AI | 2024-10-08 |
145 | A Comparative Study of Hybrid Models in Health Misinformation Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates the effectiveness of machine learning (ML) and deep learning (DL) models in detecting COVID-19-related misinformation on online social networks (OSNs), aiming to develop more effective tools for countering the spread of health misinformation during the pandemic. |
Mkululi Sikosana; Oluwaseun Ajao; Sean Maudsley-Barton; | arxiv-cs.IR | 2024-10-08 |
146 | A GPT-based Decision Transformer for Multi-Vehicle Coordination at Unsignalized Intersections Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the application of the Decision Transformer, a decision-making algorithm based on the Generative Pre-trained Transformer (GPT) architecture, to multi-vehicle coordination at unsignalized intersections. |
Eunjae Lee; Minhee Kang; Yoojin Choi; Heejin Ahn; | arxiv-cs.RO | 2024-10-08 |
147 | Unveiling Transformer Perception By Exploring Input Manifolds Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a general method for the exploration of equivalence classes in the input space of Transformer models. |
Alessandro Benfenati; Alfio Ferrara; Alessio Marta; Davide Riva; Elisabetta Rocchetti; | arxiv-cs.LG | 2024-10-08 |
148 | Solving Multi-Goal Robotic Tasks with Decision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, no existing methods effectively combine offline training, multi-goal learning, and transformer-based architectures. In this paper, we address these challenges by introducing a novel adaptation of the decision transformer architecture for offline multi-goal reinforcement learning in robotics. |
Paul Gajewski; Dominik Żurek; Marcin Pietroń; Kamil Faber; | arxiv-cs.RO | 2024-10-08 |
149 | SC-Bench: A Large-Scale Dataset for Smart Contract Auditing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present SC-Bench, the first dataset for automated smart-contract auditing research. |
Shihao Xia; Mengting He; Linhai Song; Yiying Zhang; | arxiv-cs.CR | 2024-10-08 |
150 | Make-it-Real: Unleashing Large Multimodal Model for Painting 3D Objects with Realistic Materials Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we exploit advancements in Multimodal Large Language Models (MLLMs), particularly GPT-4V, to present a novel approach, Make-it-Real: 1) We demonstrate that GPT-4V can effectively recognize and describe materials, allowing the construction of a detailed material library. |
YE FANG et. al. | nips | 2024-10-07 |
151 | Timer-XL: Long-Context Transformers for Unified Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Timer-XL, a generative Transformer for unified time series forecasting. |
Yong Liu; Guo Qin; Xiangdong Huang; Jianmin Wang; Mingsheng Long; | arxiv-cs.LG | 2024-10-07 |
152 | RL-GPT: Integrating Reinforcement Learning and Code-as-policy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To seamlessly integrate both modalities, we introduce a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent. |
SHAOTENG LIU et. al. | nips | 2024-10-07 |
153 | APIGen: Automated PIpeline for Generating Verifiable and Diverse Function-Calling Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents APIGen, an automated data generation pipeline designed to produce verifiable high-quality datasets for function-calling applications. |
ZUXIN LIU et. al. | nips | 2024-10-07 |
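The emphasis on verifiability means each generated function call can be checked mechanically before it enters the dataset. Below is a simplified two-stage filter (format check, then execution check) over a toy tool registry; the JSON schema and `TOOLS` registry are illustrative assumptions, and the actual APIGen pipeline adds a semantic-verification stage.

```python
import json

TOOLS = {"add": lambda a, b: a + b}  # toy tool registry (assumption)

def format_check(sample: str):
    """Stage 1: the sample must parse as JSON and name a known tool."""
    try:
        call = json.loads(sample)
    except json.JSONDecodeError:
        return None
    return call if call.get("name") in TOOLS else None

def execution_check(call: dict) -> bool:
    """Stage 2: the call must execute against the registry without raising."""
    try:
        TOOLS[call["name"]](**call.get("arguments", {}))
        return True
    except Exception:
        return False

raw = [
    '{"name": "add", "arguments": {"a": 1, "b": 2}}',  # passes both stages
    '{"name": "add", "arguments": {"a": 1}}',          # fails execution
    "not json at all",                                 # fails format
]
kept = [s for s in raw if (c := format_check(s)) and execution_check(c)]
print(len(kept), "of", len(raw), "samples kept")  # 1 of 3 samples kept
```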
154 | Achieving Efficient Alignment Through Learned Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Aligner, a novel and simple alignment paradigm that learns the correctional residuals between preferred and dispreferred answers using a small model. |
JIAMING JI et. al. | nips | 2024-10-07 |
155 | DeformableTST: Transformer for Time Series Forecasting Without Over-reliance on Patching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: But at the same time, we observe a new problem: recent Transformer-based models are overly reliant on patching to achieve ideal performance, which limits their applicability to forecasting tasks unsuitable for patching. In this paper, we intend to address this emerging issue. |
Donghao Luo; Xue Wang; | nips | 2024-10-07 |
156 | HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Identifying optimal model configurations under specific hardware constraints is becoming essential but remains challenging due to the computational load of exhaustive training and evaluation on multiple devices. To address this, we introduce HW-GPT-Bench, a hardware-aware benchmark that utilizes surrogate predictions to approximate various hardware metrics across 13 devices for architectures in the GPT-2 family, with architectures containing up to 774M parameters. |
RHEA SUKTHANKER et. al. | nips | 2024-10-07 |
157 | Transforming Vision Transformer: Towards Efficient Multi-Task Asynchronous Learner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Efficient Multi-Task Learning (EMTAL), a novel approach that transforms a pre-trained Transformer into an efficient multi-task learner during training, and reparameterizes the knowledge back to the original Transformer for efficient inference. |
Hanwen Zhong; Jiaxin Chen; Yutong Zhang; Di Huang; Yunhong Wang; | nips | 2024-10-07 |
158 | SAND: Smooth Imputation of Sparse and Noisy Functional Data with Transformer Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although the transformer architecture has come to dominate other models for text and image data, its application to irregularly-spaced longitudinal data has been limited. We introduce a variant of the transformer that enables it to more smoothly impute such functional data. |
Ju-Sheng Hong; Junwen Yao; Jonas Mueller; Jane-Ling Wang; | nips | 2024-10-07 |
159 | ETO: Efficient Transformer-based Local Feature Matching By Organizing Multiple Homography Hypotheses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an efficient transformer-based network architecture for local feature matching. |
JUNJIE NI et. al. | nips | 2024-10-07 |
160 | FinBen: An Holistic Financial Benchmark for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce FinBen, the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks, covering seven critical aspects: information extraction (IE), textual analysis, question answering (QA), text generation, risk management, forecasting, and decision-making. |
QIANQIAN XIE et. al. | nips | 2024-10-07 |
161 | Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a method that is able to distill a pre-trained Transformer architecture into alternative architectures such as state space models (SSMs). |
Aviv Bick; Kevin Li; Eric Xing; J. Zico Kolter; Albert Gu; | nips | 2024-10-07 |
162 | LSH-MoE: Communication-efficient MoE Training Via Locality-Sensitive Hashing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose LSH-MoE, a communication-efficient MoE training framework using locality-sensitive hashing (LSH). |
XIAONAN NIE et. al. | nips | 2024-10-07 |
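Locality-sensitive hashing groups similar token representations into buckets so that fewer distinct messages need to cross the network during expert routing. The sketch below uses the classic random-hyperplane LSH family; whether LSH-MoE uses exactly this family and bucketing scheme is an assumption.

```python
import torch

def lsh_buckets(tokens: torch.Tensor, num_planes: int = 8, seed: int = 0):
    """Hash (n, d) token embeddings with random hyperplanes: tokens on the
    same side of every plane share a bucket and could be communicated once."""
    g = torch.Generator().manual_seed(seed)
    planes = torch.randn(tokens.shape[-1], num_planes, generator=g)
    bits = (tokens @ planes > 0).long()      # (n, num_planes) sign bits
    weights = 2 ** torch.arange(num_planes)  # pack sign bits into bucket ids
    return (bits * weights).sum(dim=-1)

tokens = torch.randn(1024, 64)
buckets = lsh_buckets(tokens)
print(buckets.unique().numel(), "buckets for", len(tokens), "tokens")
```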
163 | Transformers Learn Variable-order Markov Chains In-context Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the ICL of VOMC by viewing language modeling as a form of data compression and focus on small alphabets and low-order VOMCs. |
Ruida Zhou; Chao Tian; Suhas Diggavi; | arxiv-cs.LG | 2024-10-07 |
164 | SELF-DISCOVER: Large Language Models Self-Compose Reasoning Structures IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the task-intrinsic reasoning structures to tackle complex reasoning problems that are challenging for typical prompting methods. |
PEI ZHOU et. al. | nips | 2024-10-07 |
165 | The Mamba in The Llama: Distilling and Accelerating Hybrid Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recent research suggests that state-space models (SSMs) like Mamba can be competitive with Transformer models for language modeling with advantageous deployment characteristics. Given the focus and expertise on training large-scale Transformer models, we consider the challenge of converting these pretrained models into SSMs for deployment. |
Junxiong Wang; Daniele Paliotta; Avner May; Alexander Rush; Tri Dao; | nips | 2024-10-07 |
166 | Accelerating Transformers with Spectrum-Preserving Token Merging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Prior work has proposed algorithms based on Bipartite Soft Matching (BSM), which divides tokens into distinct sets and merges the top-k similar tokens. |
CHAU TRAN et. al. | nips | 2024-10-07 |
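Bipartite Soft Matching, the prior work referenced in this highlight, alternately assigns tokens to two sets, matches each token in one set to its most similar token in the other, and merges the most similar pairs by averaging. A compact sketch of that baseline recipe follows (duplicate matches are handled naively here); the paper's spectrum-preserving variant modifies this procedure.

```python
import torch

def bipartite_soft_matching(x: torch.Tensor, r: int) -> torch.Tensor:
    """Merge r token pairs from (n, d) tokens. Even-indexed tokens form set A,
    odd-indexed set B; each A token is matched with its most cosine-similar
    B token, and the r best-matched pairs are merged by averaging."""
    a, b = x[0::2], x[1::2]
    an = a / a.norm(dim=-1, keepdim=True)
    bn = b / b.norm(dim=-1, keepdim=True)
    sim = an @ bn.T                     # (|A|, |B|) cosine similarities
    best_val, best_b = sim.max(dim=-1)  # best partner in B for each A token
    merge_a = best_val.topk(r).indices  # the r most similar A tokens
    keep_a = torch.ones(len(a), dtype=torch.bool)
    keep_a[merge_a] = False
    b = b.clone()
    # Naive merge by mean; real implementations reduce duplicates via scatter.
    b[best_b[merge_a]] = (b[best_b[merge_a]] + a[merge_a]) / 2
    return torch.cat([a[keep_a], b], dim=0)

tokens = torch.randn(196, 768)  # e.g. ViT patch tokens
print(bipartite_soft_matching(tokens, r=16).shape)  # torch.Size([180, 768])
```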
167 | Limits of Transformer Language Models on Learning to Compose Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we evaluate training LLaMA models and prompting GPT-4 and Gemini on four tasks that demand learning a composition of several discrete sub-tasks. |
JONATHAN THOMM et. al. | nips | 2024-10-07 |
168 | Does RoBERTa Perform Better Than BERT in Continual Learning: An Attention Sink Perspective Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we observe that pre-trained models may allocate high attention scores to some ‘sink’ tokens, such as [SEP] tokens, which are ubiquitous across various tasks. |
Xueying Bai; Yifan Sun; Niranjan Balasubramanian; | arxiv-cs.LG | 2024-10-07 |
169 | SemCoder: Training Code Language Models with Comprehensive Semantics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to bridge the gap between Code LLMs’ reliance on static text data and the need for thorough semantic understanding for complex tasks like debugging and program repair. |
YANGRUIBO DING et. al. | nips | 2024-10-07 |
170 | LLaVA Needs More Knowledge: Retrieval Augmented Natural Language Generation with Knowledge Graph for Explaining Thoracic Pathologies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The KG datastore is designed as a plug-and-play module, allowing for seamless integration with various model architectures. We introduce and evaluate three distinct frameworks within this paradigm: KG-LLaVA, which integrates the pre-trained LLaVA model with KG-RAG; Med-XPT, a custom framework combining MedCLIP, a transformer-based projector, and GPT-2; and Bio-LLaVA, which adapts LLaVA by incorporating the Bio-ViT-L vision model. |
Ameer Hamza; Yong Hyun Ahn; Sungyoung Lee; Seong Tae Kim; | arxiv-cs.CV | 2024-10-07 |
171 | OccamLLM: Fast and Exact Language Model Arithmetic in A Single Step Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a framework that enables exact arithmetic in a single autoregressive step, providing faster, more secure, and more interpretable LLM systems with arithmetic capabilities. |
OWEN DUGAN et. al. | nips | 2024-10-07 |
172 | Weak-to-Strong Search: Align Large Language Models Via Searching Over Small Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce weak-to-strong search, framing the alignment of a large language model as a test-time greedy search to maximize the log-likelihood difference between small tuned and untuned models while sampling from the frozen large model. |
ZHANHUI ZHOU et. al. | nips | 2024-10-07 |
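The search objective can be read as reranking candidate continuations from the frozen large model by the log-likelihood gap between a small tuned and a small untuned model. In the sketch below, `propose`, `logp_tuned`, and `logp_untuned` are hypothetical callables standing in for the three models; the toy stand-ins at the end only exercise the control flow.

```python
import random

def weak_to_strong_step(propose, logp_tuned, logp_untuned, prefix, k=8):
    """One greedy search step: the frozen large model proposes k continuations
    of `prefix`; keep the one maximizing log p_tuned - log p_untuned under
    the two small models, which serves as the alignment signal."""
    candidates = [propose(prefix) for _ in range(k)]
    return max(candidates,
               key=lambda c: logp_tuned(prefix, c) - logp_untuned(prefix, c))

# Toy stand-ins just to exercise the control flow: these scores make the
# log-likelihood gap grow with length, so the longest continuation wins.
best = weak_to_strong_step(
    propose=lambda p: p + random.choice([" yes", " maybe", " no"]),
    logp_tuned=lambda p, c: -len(c),
    logp_untuned=lambda p, c: -2.0 * len(c),
    prefix="Answer:",
)
print(best)
```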
173 | Alleviating Distortion in Image Generation Via Multi-Resolution Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents innovative enhancements to diffusion models by integrating a novel multi-resolution network and time-dependent layer normalization. |
QIHAO LIU et. al. | nips | 2024-10-07 |
174 | Unraveling The Gradient Descent Dynamics of Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While the Transformer architecture has achieved remarkable success across various domains, a thorough theoretical foundation explaining its optimization dynamics is yet to be fully developed. In this study, we aim to bridge this understanding gap by answering the following two core questions: (1) Which types of Transformer architectures allow Gradient Descent (GD) to achieve guaranteed convergence? |
Bingqing Song; Boran Han; Shuai Zhang; Jie Ding; Mingyi Hong; | nips | 2024-10-07 |
175 | Finding Transformer Circuits With Edge Pruning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we frame circuit discovery as an optimization problem and propose _Edge Pruning_ as an effective and scalable solution. |
Adithya Bhaskar; Alexander Wettig; Dan Friedman; Danqi Chen; | nips | 2024-10-07 |
176 | In-Context Learning State Vector with Inner and Momentum Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the working mechanisms and optimization of these vectors are yet to be thoroughly explored. In this paper, we address this gap by presenting a comprehensive analysis of these compressed vectors, drawing parallels to the parameters trained with gradient descent, and introducing the concept of state vector. |
DONGFANG LI et. al. | nips | 2024-10-07 |
177 | Permutation-Invariant Autoregressive Diffusion for Graph Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce PARD, a Permutation-invariant Auto Regressive Diffusion model that integrates diffusion models with autoregressive methods. |
Lingxiao Zhao; Xueying Ding; Leman Akoglu; | nips | 2024-10-07 |
178 | VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel approach to reduce vision compute by leveraging redundant vision tokens “skipping layers” rather than decreasing the number of vision tokens. |
SHIWEI WU et. al. | nips | 2024-10-07 |
179 | InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, this research advances the patch division paradigm by introducing a novel extension: dynamic resolution with automatic patch configuration. |
XIAOYI DONG et. al. | nips | 2024-10-07 |
180 | SpikedAttention: Training-Free and Fully Spike-Driven Transformer-to-SNN Conversion with Winner-Oriented Spike Shift for Softmax Operation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel transformer-to-SNN conversion method that outputs an end-to-end spike-based transformer, named SpikedAttention. |
Sangwoo Hwang; Seunghyun Lee; Dahoon Park; Donghun Lee; Jaeha Kung; | nips | 2024-10-07 |
181 | Predicting Scaling Laws with Statistical and Approximation Theory for Transformer Neural Networks on Intrinsically Low-dimensional Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, despite sustained widespread interest, a rigorous understanding of why transformer scaling laws exist is still missing. To answer this question, we establish novel statistical estimation and mathematical approximation theories for transformers when the input data are concentrated on a low-dimensional manifold. |
Alexander Havrilla; Wenjing Liao; | nips | 2024-10-07 |
182 | Can Large Language Models Explore In-context? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making. |
Akshay Krishnamurthy; Keegan Harris; Dylan J Foster; Cyril Zhang; Aleksandrs Slivkins; | nips | 2024-10-07 |
183 | Narrative-of-Thought: Improving Temporal Reasoning of Large Language Models Via Recounted Narratives Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a new prompting technique tailored for temporal reasoning, Narrative-of-Thought (NoT), that first converts the events set to a Python class, then prompts a small model to generate a temporally grounded narrative, guiding the final generation of a temporal graph. |
Xinliang Frederick Zhang; Nick Beauchamp; Lu Wang; | arxiv-cs.CL | 2024-10-07 |
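The first NoT step, rendering the event set as a Python class before prompting, is plain string construction. One plausible rendering is sketched below; the class schema and prompt wording are assumptions, not the paper's templates.

```python
def events_to_class(events):
    """Render an event set as Python class source, the structured form this
    sketch assumes NoT feeds to the model before asking for a narrative."""
    lines = ["class EventSet:"]
    lines += [f"    e{i} = {name!r}" for i, name in enumerate(events)]
    return "\n".join(lines)

def not_prompt(events):
    return (
        events_to_class(events)
        + "\n\nWrite a short narrative recounting these events in a "
        "temporally coherent order, then output the temporal graph as edges "
        "'e_i -> e_j' meaning e_i happens before e_j."
    )

print(not_prompt(["board the plane", "buy a ticket", "land"]))
```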
184 | Understanding Transformers Via N-Gram Statistics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper takes a first step in this direction by considering families of functions (i.e. rules) formed out of simple N-gram based statistics of the training data. By studying how well these rulesets approximate transformer predictions, we obtain a variety of novel discoveries: a simple method to detect overfitting during training without using a holdout set, a quantitative measure of how transformers progress from learning simple to more complex statistical rules over the course of training, a model-variance criterion governing when transformer predictions tend to be described by N-gram rules, and insights into how well transformers can be approximated by N-gram rulesets in the limit where these rulesets become increasingly complex. |
Timothy Nguyen; | nips | 2024-10-07 |
185 | DenseFormer: Enhancing Information Flow in Transformers Via Depth Weighted Averaging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The transformer architecture by Vaswani et al. (2017) is now ubiquitous across application domains, from natural language processing to speech processing and image understanding. We propose DenseFormer, a simple modification to the standard architecture that improves the perplexity of the model without increasing its size—adding a few thousand parameters for large-scale models in the 100B parameters range. |
Matteo Pagliardini; Amirkeivan Mohtashami; François Fleuret; Martin Jaggi; | nips | 2024-10-07 |
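Depth weighted averaging replaces the plain residual stream: after each block, the representation becomes a learned weighted average over the embeddings and every earlier block output. A minimal sketch around arbitrary block modules, initialized so it starts as an ordinary stack; the paper's dilated and grouped weight variants are omitted.

```python
import torch

class DenseFormerStack(torch.nn.Module):
    """Blocks whose inputs are depth-weighted averages (DWA) of the
    embedding and every earlier block output, not just the last state."""
    def __init__(self, blocks):
        super().__init__()
        self.blocks = torch.nn.ModuleList(blocks)
        # alphas[i] weighs the i + 2 states available after block i.
        self.alphas = torch.nn.ParameterList(
            [torch.nn.Parameter(torch.zeros(i + 2)) for i in range(len(blocks))]
        )
        for i, a in enumerate(self.alphas):
            a.data[i + 1] = 1.0  # identity init: behaves like a plain stack

    def forward(self, x):
        states = [x]
        for block, alpha in zip(self.blocks, self.alphas):
            states.append(block(states[-1]))
            stacked = torch.stack(states)  # (num_states, batch, dim)
            w = alpha.view(-1, *([1] * (stacked.dim() - 1)))
            states[-1] = (w * stacked).sum(dim=0)  # DWA feeds the next block
        return states[-1]

stack = DenseFormerStack([torch.nn.Linear(32, 32) for _ in range(4)])
print(stack(torch.randn(2, 32)).shape)  # torch.Size([2, 32])
```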
186 | MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by the empirical findings, we propose MAGIS, a novel LLM-based Multi-Agent framework for GitHub Issue reSolution, consisting of four agents customized for software evolution: Manager, Repository Custodian, Developer, and Quality Assurance Engineer agents. |
WEI TAO et. al. | nips | 2024-10-07 |
187 | Approximation Rate of The Transformer Architecture for Sequence Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the approximation rate for single-layer Transformers with one head. |
Haotian Jiang; Qianxiao Li; | nips | 2024-10-07 |
188 | M³GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents M³GPT, an advanced Multimodal, Multitask framework for Motion comprehension and generation. |
MINGSHUANG LUO et. al. | nips | 2024-10-07 |
189 | A Critical Evaluation of AI Feedback for Aligning Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that the improvements of the RL step are virtually entirely due to the widespread practice of using a weaker teacher model (e.g. GPT-3.5) for SFT data collection than the critic (e.g., GPT-4) used for AI feedback generation. |
ARCHIT SHARMA et. al. | nips | 2024-10-07 |
190 | Differential Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Diff Transformer, which amplifies attention to the relevant context while canceling noise. |
TIANZHU YE et. al. | arxiv-cs.CL | 2024-10-07 |
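The mechanism computes two softmax attention maps and subtracts a scaled copy of one from the other, so attention noise common to both maps cancels. Below is a single-head sketch with a fixed lambda, assuming the formulation softmax(Q1K1^T/sqrt(d)) - lambda * softmax(Q2K2^T/sqrt(d)); the learnable reparameterization of lambda and the multi-head normalization are omitted.

```python
import torch
import torch.nn.functional as F

def diff_attention(x, wq, wk, wv, lam: float = 0.5):
    """Single-head differential attention: queries and keys are split into
    two halves, and the output uses the difference of the two softmax maps."""
    d = wq.shape[1] // 2
    q1, q2 = (x @ wq).split(d, dim=-1)
    k1, k2 = (x @ wk).split(d, dim=-1)
    a1 = F.softmax(q1 @ k1.transpose(-2, -1) / d**0.5, dim=-1)
    a2 = F.softmax(q2 @ k2.transpose(-2, -1) / d**0.5, dim=-1)
    return (a1 - lam * a2) @ (x @ wv)  # common-mode attention noise cancels

dm = 64
x = torch.randn(2, 16, dm)
out = diff_attention(x, torch.randn(dm, dm), torch.randn(dm, dm), torch.randn(dm, dm))
print(out.shape)  # torch.Size([2, 16, 64])
```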
191 | Perception of Knowledge Boundary for Large Language Models Through Semi-open-ended Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we perceive the LLMs’ knowledge boundary with semi-open-ended questions by discovering more ambiguous answers. |
ZHIHUA WEN et. al. | nips | 2024-10-07 |
192 | UrbanKGent: A Unified Large Language Model Agent Framework for Urban Knowledge Graph Construction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents UrbanKGent, a unified large language model agent framework, for urban knowledge graph construction. |
Yansong Ning; Hao Liu; | nips | 2024-10-07 |
193 | JiuZhang3.0: Efficiently Improving Mathematical Reasoning By Training Small Data Synthesis Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To reduce the cost, based on open-source available texts, we propose an efficient way that trains a small LLM for math problem synthesis, to efficiently generate sufficient high-quality pre-training data. |
KUN ZHOU et. al. | nips | 2024-10-07 |
194 | Visual Autoregressive Modeling: Scalable Image Generation Via Next-Scale Prediction IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Visual AutoRegressive modeling (VAR), a new generation paradigm that redefines the autoregressive learning on images as coarse-to-fine next-scale prediction or next-resolution prediction, diverging from the standard raster-scan next-token prediction. |
Keyu Tian; Yi Jiang; Zehuan Yuan; BINGYUE PENG; Liwei Wang; | nips | 2024-10-07 |
195 | Leveraging Free Energy in Pretraining Model Selection for Improved Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a Bayesian model selection criterion, called the downstream free energy, which quantifies a checkpoint’s adaptability by measuring the concentration of nearby favorable parameters for the downstream task. |
Michael Munn; Susan Wei; | arxiv-cs.LG | 2024-10-07 |
196 | Seshat Global History Databank Text Dataset and Benchmark of Large Language Models’ History Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This benchmarking is particularly challenging, given that human knowledge of history is inherently unbalanced, with more information available on Western history and recent periods. To address this challenge, we introduce a curated sample of the Seshat Global History Databank, which provides a structured representation of human historical knowledge, containing 36,000 data points across 600 historical societies and over 600 scholarly references. |
JAKOB HAUSER et. al. | nips | 2024-10-07 |
197 | Query-Based Adversarial Prompt Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We improve on prior work with a query-based attack that leverages API access to a remote language model to construct adversarial examples that cause the model to emit harmful strings with (much) higher probability than with transfer-only attacks. |
Jonathan Hayase; Ema Borevković; Nicholas Carlini; Florian Tramer; Milad Nasr; | nips | 2024-10-07 |
198 | DAMRO: Dive Into The Attention Mechanism of LVLM to Reduce Object Hallucination Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the issue, we propose DAMRO, a novel training-free strategy that Dives into the Attention Mechanism of LVLM to Reduce Object Hallucination. |
Xuan Gong; Tianshi Ming; Xinpeng Wang; Zhihua Wei; | arxiv-cs.CL | 2024-10-06 |
199 | ProtocoLLM: Automatic Evaluation Framework of LLMs on Domain-Specific Scientific Protocol Formulation Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we propose a flexible, automatic framework to evaluate LLM’s capability on SPFT: ProtocoLLM. |
Seungjun Yi; Jaeyoung Lim; Juyong Yoon; | arxiv-cs.CL | 2024-10-06 |
200 | Fundamental Limitations on Subquadratic Alternatives to Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For instance, state space models such as Mamba were designed to replace attention with an almost linear time alternative. In this paper, we prove that any such approach cannot perform important tasks that Transformer is able to perform (assuming a popular conjecture from fine-grained complexity theory). |
Josh Alman; Hantao Yu; | arxiv-cs.LG | 2024-10-05 |
201 | Equivariant Neural Functional Networks for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While NFN have been extensively developed for MLP and CNN, no prior work has addressed their design for transformers, despite the importance of transformers in modern deep learning. This paper aims to address this gap by providing a systematic study of NFN for transformers. |
VIET-HOANG TRAN et. al. | arxiv-cs.LG | 2024-10-05 |
202 | Still Not Quite There! Evaluating Large Language Models for Comorbid Mental Health Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce ANGST, a novel, first-of-its-kind benchmark for depression-anxiety comorbidity classification from social media posts. |
AMEY HENGLE et. al. | arxiv-cs.CL | 2024-10-04 |
203 | Selective Transformer for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing Transformer models face two key challenges when dealing with HSI scenes characterized by diverse land cover types and rich spectral information: (1) fixed receptive field representation overlooks effective contextual information; (2) redundant self-attention feature representation. To address these limitations, we propose a novel Selective Transformer (SFormer) for HSI classification. |
Yichu Xu; Di Wang; Lefei Zhang; Liangpei Zhang; | arxiv-cs.CV | 2024-10-04 |
204 | How Language Models Prioritize Contextual Grammatical Cues? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate how language models handle gender agreement when multiple gender cue words are present, each capable of independently disambiguating a target gender pronoun. |
Hamidreza Amirzadeh; Afra Alishahi; Hosein Mohebbi; | arxiv-cs.CL | 2024-10-04 |
205 | Learning Semantic Structure Through First-Order-Logic Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study whether transformer-based language models can extract predicate argument structure from simple sentences. |
Akshay Chaturvedi; Nicholas Asher; | arxiv-cs.CL | 2024-10-04 |
206 | Dynamic Diffusion Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, we introduce a Timestep-wise Dynamic Width (TDW) approach that adapts model width conditioned on the generation timesteps. |
WANGBO ZHAO et. al. | arxiv-cs.CV | 2024-10-04 |
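Timestep-wise Dynamic Width can be pictured as gating part of the channel dimension as a function of the diffusion timestep. The toy sketch below maps a sinusoidal timestep embedding to soft per-channel gates; the actual method uses structured head and channel selection rather than this soft gating, so treat it purely as an illustration.

```python
import torch

class TimestepWidthGate(torch.nn.Module):
    """Toy timestep-wise dynamic width: a tiny MLP turns the timestep into
    per-channel gates, shrinking effective width at some timesteps."""
    def __init__(self, channels: int, temb_dim: int = 32):
        super().__init__()
        self.temb_dim = temb_dim
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(temb_dim, channels), torch.nn.Sigmoid()
        )

    def forward(self, h: torch.Tensor, t: torch.Tensor):
        # Simple sinusoidal timestep embedding.
        freqs = torch.exp(torch.linspace(0, 4, self.temb_dim // 2))
        temb = torch.cat([torch.sin(t[:, None] * freqs),
                          torch.cos(t[:, None] * freqs)], dim=-1)
        gates = self.mlp(temb)        # (batch, channels) values in (0, 1)
        return h * gates[:, None, :]  # gate each channel of (batch, n, c)

gate = TimestepWidthGate(channels=64)
h = torch.randn(2, 16, 64)
print(gate(h, torch.tensor([10.0, 500.0])).shape)  # torch.Size([2, 16, 64])
```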
207 | Towards Linguistically-Aware and Language-Independent Tokenization for Large Language Models (LLMs) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive study on the tokenization techniques employed by state-of-the-art large language models (LLMs) and their implications on the cost and availability of services across different languages, especially low resource languages. |
Abrar Rahman; Garry Bowlin; Binit Mohanty; Sean McGunigal; | arxiv-cs.CL | 2024-10-04 |
208 | AlphaIntegrator: Transformer Action Search for Symbolic Integration Proofs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first correct-by-construction learning-based system for step-by-step mathematical integration. |
Mert Ünsal; Timon Gehr; Martin Vechev; | arxiv-cs.LG | 2024-10-03 |
209 | CulturalBench: A Robust, Diverse and Challenging Benchmark on Measuring The (Lack Of) Cultural Knowledge of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CulturalBench: a set of 1,227 human-written and human-verified questions for effectively assessing LLMs’ cultural knowledge, covering 45 global regions including the underrepresented ones like Bangladesh, Zimbabwe, and Peru. |
YU YING CHIU et. al. | arxiv-cs.CL | 2024-10-03 |
210 | SIEVE: General Purpose Data Filtering System Matching GPT-4o Accuracy at 1% The Cost Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes SIEVE, a lightweight alternative that matches GPT-4o accuracy at a fraction of the cost. |
Jifan Zhang; Robert Nowak; | arxiv-cs.CL | 2024-10-03 |
211 | IndicSentEval: How Effectively Do Multilingual Transformer Models Encode Linguistic Properties for Indic Languages? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate similar questions regarding encoding capability and robustness for 8 linguistic properties across 13 different perturbations in 6 Indic languages, using 9 multilingual Transformer models (7 universal and 2 Indic-specific). |
Akhilesh Aravapalli; Mounika Marreddy; Subba Reddy Oota; Radhika Mamidi; Manish Gupta; | arxiv-cs.CL | 2024-10-03 |
212 | Intrinsic Evaluation of RAG Systems for Deep-Logic Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the Overall Performance Index (OPI), an intrinsic metric to evaluate retrieval-augmented generation (RAG) mechanisms for applications involving deep-logic queries. |
Junyi Hu; You Zhou; Jie Wang; | arxiv-cs.AI | 2024-10-03 |
213 | AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose AutoDAN-Turbo, a black-box jailbreak method that can automatically discover as many jailbreak strategies as possible from scratch, without any human intervention or predefined scopes (e.g., specified candidate strategies), and use them for red-teaming. |
XIAOGENG LIU et. al. | arxiv-cs.CR | 2024-10-03 |
214 | Coal Mining Question Answering with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel approach to coal mining question answering (QA) using large language models (LLMs) combined with tailored prompt engineering techniques. |
Antonio Carlos Rivera; Anthony Moore; Steven Robinson; | arxiv-cs.CL | 2024-10-03 |
215 | Automatic Deductive Coding in Discourse Analysis: An Application of Large Language Models in Learning Analytics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To evaluate the usefulness of large language models in automatic deductive coding, we employed three classification methods driven by different artificial intelligence technologies: traditional text classification with text feature engineering, a BERT-like pretrained language model, and a GPT-like pretrained large language model (LLM). We applied these methods to two different datasets and explored the potential of GPT and prompt engineering in automatic deductive coding. |
Lishan Zhang; Han Wu; Xiaoshan Huang; Tengfei Duan; Hanxiang Du; | arxiv-cs.CL | 2024-10-02 |
216 | Emotion-Aware Response Generation Using Affect-Enriched Embeddings with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel framework that integrates multiple emotion lexicons, including NRC Emotion Lexicon, VADER, WordNet, and SentiWordNet, with state-of-the-art LLMs such as LLAMA 2, Flan-T5, ChatGPT 3.0, and ChatGPT 4.0. |
Abdur Rasool; Muhammad Irfan Shahzad; Hafsa Aslam; Vincent Chan; | arxiv-cs.CL | 2024-10-02 |
217 | A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work tackles the information loss bottleneck of vector-quantization (VQ) autoregressive image generation by introducing a novel model architecture called the 2-Dimensional Autoregression (DnD) Transformer. |
LIANG CHEN et. al. | arxiv-cs.CV | 2024-10-02 |
218 | ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, even state-of-the-art vision-language models (VLMs), such as GPT-4o, still fall short of human-level performance, particularly in intricate web environments and long-horizon tasks. To address these limitations, we present ExACT, an approach to combine test-time search and self-learning to build o1-like models for agentic applications. |
XIAO YU et. al. | arxiv-cs.CL | 2024-10-02 |
219 | On The Adaptation of Unlimiformer for Decoder-Only Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, its main limitation is incompatibility with decoder-only transformers out of the box. In this work, we explore practical considerations of adapting Unlimiformer to decoder-only transformers and introduce a series of modifications to overcome this limitation. |
KIAN AHRABIAN et. al. | arxiv-cs.CL | 2024-10-02 |
220 | Financial Sentiment Analysis on News and Reports Using Large Language Models and FinBERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the application of LLMs and FinBERT for FSA, comparing their performance on news articles, financial reports and company announcements. |
Yanxin Shen; Pulin Kirin Zhang; | arxiv-cs.IR | 2024-10-02 |
221 | Creative and Context-Aware Translation of East Asian Idioms with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, compiling a dictionary of candidate translations demands much time and creativity even for expert translators. To alleviate such burden, we evaluate if GPT-4 can help generate high-quality translations. |
Kenan Tang; Peiyang Song; Yao Qin; Xifeng Yan; | arxiv-cs.CL | 2024-10-01 |
222 | SIGMA: Secure GPT Inference with Function Secret Sharing IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Secure 2-party computation (2PC) enables secure inference that offers protection for both proprietary machine learning (ML) models and sensitive inputs to them. However, the … |
KANAV GUPTA et. al. | Proc. Priv. Enhancing Technol. | 2024-10-01 |
223 | MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone’s Potential with Masked Autoregressive Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on this analysis, we propose Masked Autoregressive Pretraining (MAP) to pretrain a hybrid Mamba-Transformer vision backbone network. |
Yunze Liu; Li Yi; | arxiv-cs.CV | 2024-10-01 |
224 | GiT: Towards Generalist Vision Transformer Through Universal Language Interface Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a simple, yet effective framework, called GiT, simultaneously applicable for various vision tasks only with a vanilla ViT. Interestingly, GiT builds a new benchmark in generalist performance, and fosters mutual enhancement across tasks, leading to significant improvements compared to isolated training. |
HAIYANG WANG et. al. | eccv | 2024-09-30 |
225 | TaskComplexity: A Dataset for Task Complexity Classification with In-Context Learning, FLAN-T5 and GPT-4o Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper addresses the challenge of classifying and assigning programming tasks to experts, a process that typically requires significant effort, time, and cost. |
Areeg Fahad Rasheed; M. Zarkoosh; Safa F. Abbas; Sana Sabah Al-Azzawi; | arxiv-cs.CL | 2024-09-30 |
226 | AutoEval-Video: An Automatic Benchmark for Assessing Large Vision Language Models in Open-Ended Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel and challenging benchmark, AutoEval-Video, to comprehensively evaluate large vision-language models in open-ended video question answering. |
Weiran Huang; Xiuyuan Chen; Yuan Lin; Yuchen Zhang; | eccv | 2024-09-30 |
227 | ACE: All-round Creator and Editor Following Instructions Via Diffusion Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose ACE, an All-round Creator and Editor, which achieves performance comparable to expert models across a wide range of visual generation tasks. |
ZHEN HAN et. al. | arxiv-cs.CV | 2024-09-30 |
228 | GENIXER: Empowering Multimodal Large Language Models As A Powerful Data Generator Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce GENIXER, a comprehensive data generation pipeline consisting of four key steps: (i) instruction data collection, (ii) instruction template design, (iii) empowering MLLMs, and (iv) data generation and filtering. |
Henry Hengyuan Zhao; Pan Zhou; Mike Zheng Shou; | eccv | 2024-09-30 |
229 | HyTAS: A Hyperspectral Image Transformer Architecture Search Benchmark and Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Transformer architectures have significantly enhanced HSI task performance, while advancements in Transformer Architecture Search (TAS) have improved model discovery. To harness these advancements for HSI classification, we make the following contributions: i) We propose HyTAS, the first benchmark on transformer architecture search for Hyperspectral imaging, ii) We comprehensively evaluate 12 different methods to identify the optimal transformer over 5 different datasets, iii) We perform an extensive factor analysis on the Hyperspectral transformer search performance, greatly motivating future research in this direction. |
Fangqin Zhou; Mert Kilickaya; Joaquin Vanschoren; Ran Piao; | eccv | 2024-09-30 |
230 | Comprehensive Performance Modeling and System Design Insights for Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We analyze performance characteristics of such transformer models and discuss their sensitivity to the transformer type, parallelization strategy, and HPC system features (accelerators and interconnects). We utilize a performance model that allows us to explore this complex design space and highlight its key components. |
SHASHANK SUBRAMANIAN et. al. | arxiv-cs.LG | 2024-09-30 |
231 | OccWorld: Learning A 3D Occupancy World Model for Autonomous Driving IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore a new framework of learning a world model, OccWorld, in the 3D occupancy space to simultaneously predict the movement of the ego car and the evolution of the surrounding scenes. |
WENZHAO ZHENG et. al. | eccv | 2024-09-30 |
232 | MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce MaskMamba, a novel hybrid model that combines Mamba and Transformer architectures, utilizing Masked Image Modeling for non-autoregressive image synthesis. |
Wenchao Chen; Liqiang Niu; Ziyao Lu; Fandong Meng; Jie Zhou; | arxiv-cs.CV | 2024-09-30 |
233 | An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This quadratic increase in computational burden restricts the applicability of visual grounding to more intricate scenes, such as conversation-based reasoning segmentation, which involves lengthy language expressions. In this paper, we propose an efficient and effective multi-task visual grounding (EEVG) framework based on Transformer Decoder to address this issue, which reduces the cost in both language and visual aspects. |
Wei Chen; Long Chen; Yu Wu; | eccv | 2024-09-30 |
234 | Evaluating The Fairness of Task-adaptive Pretraining on Unlabeled Test Data Before Few-shot Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Few-shot learning benchmarks are critical for evaluating modern NLP techniques. |
Kush Dubey; | arxiv-cs.CL | 2024-09-30 |
235 | MART: MultiscAle Relational Transformer Networks for Multi-agent Trajectory Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the hypergraph transformer-based method for trajectory prediction is yet to be explored. Therefore, we present a MultiscAle Relational Transformer (MART) network for multi-agent trajectory prediction. |
Seongju Lee; Junseok Lee; Yeonguk Yu; Taeri Kim; Kyoobin Lee; | eccv | 2024-09-30 |
236 | Analysis-by-Synthesis Transformer for Single-View 3D Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing methods face limitations in both shape reconstruction and texture generation. This paper introduces an innovative Analysis-by-Synthesis Transformer that addresses these limitations in a unified framework by effectively modeling pixel-to-shape and pixel-to-texture relationships. |
DIAN JIA et. al. | eccv | 2024-09-30 |
237 | An Explainable Vision Question Answer Model Via Diffusion Chain-of-Thought Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This means that generating explanations solely for the answer can lead to a semantic discrepancy between the content of the explanation and the question-answering content. To address this, we propose a step-by-step reasoning approach to reduce such semantic discrepancies. |
Chunhao LU; Qiang Lu; Jake Luo; | eccv | 2024-09-30 |
238 | Sparse Attention Decomposition Applied to Circuit Tracing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we seek to isolate and identify the features used to effect communication and coordination among attention heads in GPT-2 small. |
Gabriel Franco; Mark Crovella; | arxiv-cs.LG | 2024-09-30 |
239 | Depression Detection in Social Media Posts Using Transformer-based Models and Auxiliary Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing studies have explored various approaches to this problem but often fall short in terms of accuracy and robustness. To address these limitations, this research proposes a neural network architecture leveraging transformer-based models combined with metadata and linguistic markers. |
Marios Kerasiotis; Loukas Ilias; Dimitris Askounis; | arxiv-cs.CL | 2024-09-30 |
240 | LingoQA: Video Question Answering for Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce LingoQA, a novel dataset and benchmark for visual question answering in autonomous driving. We release our dataset and benchmark as an evaluation platform for vision-language models in autonomous driving. |
ANA-MARIA MARCU et. al. | eccv | 2024-09-30 |
241 | Optimizing Factorized Encoder Models: Time and Memory Reduction for Scalable and Efficient Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the challenges posed by the substantial training time and memory consumption associated with video transformers, focusing on the ViViT (Video Vision Transformer) model, in particular the Factorised Encoder version, as our baseline for action recognition tasks. |
Shreyank N Gowda; Anurag Arnab; Jonathan Huang; | eccv | 2024-09-30 |
242 | Spiking Transformer with Spatial-Temporal Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce Spiking Transformer with Spatial-Temporal Attention (STAtten), a simple and straightforward architecture designed to integrate spatial and temporal information in self-attention with negligible additional computational load. |
Donghyun Lee; Yuhang Li; Youngeun Kim; Shiting Xiao; Priyadarshini Panda; | arxiv-cs.NE | 2024-09-29 |
243 | Multimodal Misinformation Detection By Learning from Synthetic Data with Multimodal LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the generalizability of detectors trained on synthetic data to real-world scenarios remains unclear due to the distribution gap. To address this, we propose learning from synthetic data for detecting real-world multimodal misinformation through two model-agnostic data selection methods that match synthetic and real-world data distributions. |
Fengzhu Zeng; Wenqian Li; Wei Gao; Yan Pang; | arxiv-cs.CL | 2024-09-29 |
244 | 3D-CT-GPT: Generating 3D Radiology Reports Through Integration of Large Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model specifically designed for generating radiology reports from 3D CT scans, particularly chest CTs. |
HAO CHEN et. al. | arxiv-cs.CV | 2024-09-28 |
245 | Efficient Federated Intrusion Detection in 5G Ecosystem Using Optimized BERT-based Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a robust intrusion detection system (IDS) using federated learning and large language models (LLMs). |
Frederic Adjewa; Moez Esseghir; Leila Merghem-Boulahia; | arxiv-cs.CR | 2024-09-28 |
246 | INSIGHTBUDDY-AI: Medication Extraction and Entity Linking Using Large Language Models and Ensemble Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate state-of-the-art LLMs in text mining tasks on medications and their related attributes such as dosage, route, strength, and adverse effects. |
Pablo Romero; Lifeng Han; Goran Nenadic; | arxiv-cs.CL | 2024-09-28 |
247 | INC-Math: Integrating Natural Language and Code for Enhanced Mathematical Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore fundamental questions related to solving mathematical reasoning problems using natural language and code with state-of-the-art LLMs, including GPT-4o-mini and LLama-3.1-8b-Turbo. |
Xuyuan Xiong; Simeng Han; Ziyue Zhou; Arman Cohan; | arxiv-cs.CL | 2024-09-28 |
248 | Suicide Phenotyping from Clinical Notes in Safety-Net Psychiatric Hospital Using Multi-Label Classification with Pre-Trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Pre-trained language models offer promise for identifying suicidality from unstructured clinical narratives. |
ZEHAN LI et. al. | arxiv-cs.CL | 2024-09-27 |
249 | Cottention: Linear Transformers With Cosine Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Cottention, a novel attention mechanism that replaces the softmax operation with cosine similarity. |
Gabriel Mongaras; Trevor Dohm; Eric C. Larson; | arxiv-cs.LG | 2024-09-27 |
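The mechanism named in the highlight above (cosine similarity in place of softmax attention) is simple enough to sketch directly. Below is a minimal illustration of that idea; the paper's actual normalization, scaling, and masking details are not reproduced here, so treat the specifics as assumptions.

```python
import numpy as np

def cosine_attention(Q, K, V, eps=1e-8):
    # Normalize queries and keys to unit length so Q @ K.T yields
    # cosine similarities in [-1, 1]; no softmax is applied.
    Qn = Q / (np.linalg.norm(Q, axis=-1, keepdims=True) + eps)
    Kn = K / (np.linalg.norm(K, axis=-1, keepdims=True) + eps)
    return (Qn @ Kn.T) @ V

# Toy usage: 4 tokens with 8-dimensional heads.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 4, 8))
print(cosine_attention(Q, K, V).shape)  # (4, 8)
```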
250 | FoodMLLM-JP: Leveraging Multimodal Large Language Models for Japanese Recipe Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Research on food image understanding using recipe data has been a long-standing focus due to the diversity and complexity of the data. |
Yuki Imajuku; Yoko Yamakata; Kiyoharu Aizawa; | arxiv-cs.CV | 2024-09-27 |
251 | Experimental Evaluation of Machine Learning Models for Goal-oriented Customer Service Chatbot with Pipeline Architecture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a tailored experimental evaluation approach for goal-oriented customer service chatbots with pipeline architecture, focusing on three key components: Natural Language Understanding (NLU), dialogue management (DM), and Natural Language Generation (NLG). |
Nurul Ain Nabilah Mohd Isa; Siti Nuraishah Agos Jawaddi; Azlan Ismail; | arxiv-cs.AI | 2024-09-27 |
252 | MASSFormer: Mobility-Aware Spectrum Sensing Using Transformer-Driven Tiered Structure Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we develop a novel mobility-aware transformer-driven tiered structure (MASSFormer) based cooperative spectrum sensing method that effectively models the spatio-temporal dynamics of user movements. |
Dimpal Janu; Sandeep Mandia; Kuldeep Singh; Sandeep Kumar; | arxiv-cs.IT | 2024-09-26 |
253 | Retrospective Comparative Analysis of Prostate Cancer In-Basket Messages: Responses from Closed-Domain LLM Vs. Clinical Teams Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, responding to these patients’ inquiries has become a significant burden on healthcare workflows, consuming considerable time for clinical care teams. To address this, we introduce RadOnc-GPT, a specialized Large Language Model (LLM) powered by GPT-4, built with advanced prompt engineering to focus on the radiotherapeutic treatment of prostate cancer and to assist in generating responses. |
YUEXING HAO et. al. | arxiv-cs.AI | 2024-09-26 |
254 | Predicting Anchored Text from Translation Memories for Machine Translation Using Deep Learning Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we show that for a large share of anchored words, other techniques based on machine learning approaches such as Word2Vec can be used. |
Richard Yue; John E. Ortega; | arxiv-cs.CL | 2024-09-26 |
255 | General Compression Framework for Efficient Transformer Object Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, we propose a general model compression framework for efficient transformer object tracking, named CompressTracker, to reduce the size of a pre-trained tracking model into a lightweight tracker with minimal performance degradation. |
LINGYI HONG et. al. | arxiv-cs.CV | 2024-09-26 |
256 | The Application of GPT-4 in Grading Design University Students’ Assignment and Providing Feedback: An Exploratory Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to investigate whether GPT-4 can effectively grade assignments for design university students and provide useful feedback. |
Qian Huang; Thijs Willems; King Wang Poon; | arxiv-cs.AI | 2024-09-26 |
257 | Beyond Turing Test: Can GPT-4 Sway Experts’ Decisions? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the post-Turing era, evaluating large language models (LLMs) involves assessing generated text based on readers’ reactions rather than merely its indistinguishability from human-produced content. |
Takehiro Takayanagi; Hiroya Takamura; Kiyoshi Izumi; Chung-Chi Chen; | arxiv-cs.CE | 2024-09-25 |
258 | Reducing and Exploiting Data Augmentation Noise Through Meta Reweighting Contrastive Learning for Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To boost deep learning models’ performance given augmented data/samples in text classification tasks, we propose a novel framework, which leverages both meta learning and contrastive learning techniques as parts of our design for reweighting the augmented samples and refining their feature representations based on their quality. |
Guanyi Mou; Yichuan Li; Kyumin Lee; | arxiv-cs.CL | 2024-09-25 |
259 | Assessing The Level of Toxicity Against Distinct Groups in Bangla Social Media Comments: A Comprehensive Investigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study focuses on identifying toxic comments in the Bengali language targeting three specific groups: transgender people, indigenous people, and migrant people, from multiple social media sources. |
Mukaffi Bin Moin; Pronay Debnath; Usafa Akther Rifa; Rijeet Bin Anis; | arxiv-cs.CL | 2024-09-25 |
260 | Comparing Unidirectional, Bidirectional, and Word2vec Models for Discovering Vulnerabilities in Compiled Lifted Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using a dataset of LLVM functions, we trained a GPT-2 model to generate embeddings, which were subsequently used to build LSTM neural networks to differentiate between vulnerable and non-vulnerable code. |
Gary A. McCully; John D. Hastings; Shengjie Xu; Adam Fortier; | arxiv-cs.CR | 2024-09-25 |
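The highlight above outlines a two-stage pipeline: a language model produces embeddings for lifted code, and an LSTM classifies the result. Here is a minimal sketch of that pipeline, assuming Hugging Face `transformers` for GPT-2; the layer sizes and the placeholder input are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn
from transformers import GPT2Tokenizer, GPT2Model

tok = GPT2Tokenizer.from_pretrained("gpt2")
gpt2 = GPT2Model.from_pretrained("gpt2").eval()

# Small downstream classifier over the embedding sequence
# (hidden size and head are assumptions, not the paper's setup).
lstm = nn.LSTM(input_size=768, hidden_size=128, batch_first=True)
head = nn.Linear(128, 2)  # vulnerable vs. non-vulnerable

code = "define i32 @f(i32 %x) { ... }"  # placeholder lifted LLVM function
with torch.no_grad():
    emb = gpt2(**tok(code, return_tensors="pt")).last_hidden_state
    _, (h, _) = lstm(emb)
    logits = head(h[-1])  # per-function vulnerability logits
```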
261 | GPT-4 As A Homework Tutor Can Improve Student Engagement and Learning Outcomes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work contributes to the scarce empirical literature on LLM-based interactive homework in real-world educational settings and offers a practical, scalable solution for improving homework in schools. |
Alessandro Vanzo; Sankalan Pal Chowdhury; Mrinmaya Sachan; | arxiv-cs.CY | 2024-09-24 |
262 | Unveiling Language Competence Neurons: A Psycholinguistic Approach to Model Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As large language models (LLMs) become more advanced in their linguistic capacity, understanding how they capture aspects of language competence remains a significant challenge. This study therefore employs psycholinguistic paradigms, which are well-suited for probing deeper cognitive aspects of language processing, to explore neuron-level representations in language models across three tasks: sound-shape association, sound-gender association, and implicit causality. |
Xufeng Duan; Xinyu Zhou; Bei Xiao; Zhenguang G. Cai; | arxiv-cs.CL | 2024-09-24 |
263 | MonoFormer: One Transformer for Both Diffusion and Autoregression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose to study a simple idea: share one transformer for both autoregression and diffusion. |
CHUYANG ZHAO et. al. | arxiv-cs.CV | 2024-09-24 |
264 | SynChart: Synthesizing Charts from Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We construct a large-scale chart dataset, SynChart, which contains approximately 4 million diverse chart images with over 75 million dense annotations, including data tables, code, descriptions, and question-answer sets. We trained a 4.2B chart-expert model using this dataset and achieved near-GPT-4o performance on the ChartQA task, surpassing GPT-4V. |
MENGCHEN LIU et. al. | arxiv-cs.AI | 2024-09-24 |
265 | SDBA: A Stealthy and Long-Lasting Durable Backdoor Attack in Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces SDBA, a novel backdoor attack mechanism designed for NLP tasks in FL environments. |
Minyeong Choe; Cheolhee Park; Changho Seo; Hyunil Kim; | arxiv-cs.LG | 2024-09-23 |
266 | SOFI: Multi-Scale Deformable Transformer for Camera Calibration with Enhanced Line Queries Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce \textit{multi-Scale defOrmable transFormer for camera calibratIon with enhanced line queries}, SOFI. |
Sebastian Janampa; Marios Pattichis; | arxiv-cs.CV | 2024-09-23 |
267 | Improving Academic Skills Assessment with NLP and Ensemble Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study addresses the critical challenges of assessing foundational academic skills by leveraging advancements in natural language processing (NLP). |
Xinyi Huang; Yingyi Wu; Danyang Zhang; Jiacheng Hu; Yujian Long; | arxiv-cs.CL | 2024-09-23 |
268 | Towards A Realistic Long-Term Benchmark for Open-Web Research Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present initial results of a forthcoming benchmark for evaluating LLM agents on white-collar tasks of economic value. |
Peter Mühlbacher; Nikos I. Bosse; Lawrence Phillips; | arxiv-cs.CL | 2024-09-23 |
269 | Evaluating The Quality of Code Comments Generated By Large Language Models for Novice Programmers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) show promise in generating code comments for novice programmers, but their educational effectiveness remains under-evaluated. |
Aysa Xuemo Fan; Arun Balajiee Lekshmi Narayanan; Mohammad Hassany; Jiaze Ke; | arxiv-cs.SE | 2024-09-22 |
270 | Can Pre-trained Language Models Generate Titles for Research Papers? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we fine-tune pre-trained language models to generate titles of papers from their abstracts. |
Tohida Rehman; Debarshi Kumar Sanyal; Samiran Chattopadhyay; | arxiv-cs.CL | 2024-09-22 |
271 | Normalized Narrow Jump To Conclusions: Normalized Narrow Shortcuts for Parameter Efficient Early Exit Transformer Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Narrow Jump to Conclusions (NJTC) and Normalized Narrow Jump to Conclusions (N-NJTC) – parameter-efficient alternatives to standard linear shortcutting that reduce shortcut parameter count by over 97%. |
Amrit Diggavi Seshadri; | arxiv-cs.AI | 2024-09-21 |
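To make the parameter arithmetic concrete, here is a hedged sketch of what a narrow shortcut can look like next to a standard linear one; the bottleneck width `k` and the plain two-matrix factorization are assumptions rather than the paper's exact design.

```python
import torch
import torch.nn as nn

d_model, vocab, k = 768, 50257, 16  # k: hypothetical bottleneck width

# Standard linear shortcut: a full-width projection per early exit.
full_head = nn.Linear(d_model, vocab, bias=False)  # 768*50257 ~ 38.6M params

# Narrow shortcut: factor the projection through a small bottleneck.
narrow_head = nn.Sequential(
    nn.Linear(d_model, k, bias=False),   # 768*16   ~ 12K params
    nn.Linear(k, vocab, bias=False),     # 16*50257 ~ 0.8M params
)                                        # roughly 98% fewer parameters

h = torch.randn(1, d_model)     # hidden state from an intermediate layer
early_logits = narrow_head(h)   # early-exit prediction
```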
272 | Can LLMs Replace Neil DeGrasse Tyson? Evaluating The Reliability of LLMs As Science Communicators Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on evaluating the reliability of current LLMs as science communicators. |
Prasoon Bajpai; Niladri Chatterjee; Subhabrata Dutta; Tanmoy Chakraborty; | arxiv-cs.CL | 2024-09-21 |
273 | The Use of GPT-4o and Other Large Language Models for The Improvement and Design of Self-Assessment Scales for Measurement of Interpersonal Communication Skills Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: OpenAI’s ChatGPT (GPT-4 and GPT-4o) and other Large Language Models (LLMs) like Microsoft’s Copilot, Google’s Gemini 1.5 Pro, and Anthropic’s Claude 3.5 Sonnet can be effectively used in various phases of scientific research. |
Goran Bubaš; | arxiv-cs.AI | 2024-09-21 |
274 | AI Assistants for Spaceflight Procedures: Combining Generative Pre-Trained Transformer and Retrieval-Augmented Generation on Knowledge Graphs With Augmented Reality Cues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes the capabilities and potential of the intelligent personal assistant (IPA) CORE (Checklist Organizer for Research and Exploration), designed to support astronauts during procedures onboard the International Space Station (ISS), the Lunar Gateway station, and beyond. |
OLIVER BENSCH et. al. | arxiv-cs.AI | 2024-09-21 |
275 | Loop Neural Networks for Parameter Sharing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel Loop Neural Network, which achieves better performance by utilizing longer computational time without increasing the model size. |
Kei-Sing Ng; Qingchen Wang; | arxiv-cs.AI | 2024-09-21 |
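The parameter-sharing idea in the highlight above lends itself to a compact sketch: reuse one block several times instead of stacking distinct layers, trading extra compute for capacity at a fixed model size. The block type and loop count below are assumptions.

```python
import torch
import torch.nn as nn

class LoopBlock(nn.Module):
    def __init__(self, d_model=256, n_head=4, n_loops=4):
        super().__init__()
        # One shared layer; looping adds compute, not parameters.
        self.layer = nn.TransformerEncoderLayer(d_model, n_head, batch_first=True)
        self.n_loops = n_loops

    def forward(self, x):
        for _ in range(self.n_loops):
            x = self.layer(x)  # same weights applied at every iteration
        return x

x = torch.randn(2, 10, 256)
print(LoopBlock()(x).shape)  # torch.Size([2, 10, 256])
```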
276 | QMOS: Enhancing LLMs for Telecommunication with Question Masked Loss and Option Shuffling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces QMOS, an innovative approach which uses a Question-Masked loss and Option Shuffling trick to enhance the performance of LLMs in answering Multiple-Choice Questions in the telecommunications domain. |
Blessed Guda; Gabrial Zencha A.; Lawrence Francis; Carlee Joe-Wong; | arxiv-cs.CL | 2024-09-21 |
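The option-shuffling half of the trick named above is simple enough to sketch directly: permute the choices and remap the gold label so the model cannot exploit answer-position bias. The helper below is hypothetical, and the question-masked loss is a training-time change not shown here.

```python
import random

def shuffle_options(options, answer_idx, seed=None):
    """Permute multiple-choice options and remap the gold label."""
    rng = random.Random(seed)
    order = list(range(len(options)))
    rng.shuffle(order)
    shuffled = [options[i] for i in order]
    return shuffled, order.index(answer_idx)

opts = ["2.4 GHz", "5 GHz", "28 GHz", "900 MHz"]
print(shuffle_options(opts, answer_idx=2, seed=0))
# prints the shuffled options plus the remapped answer index
```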
277 | On Importance of Pruning and Distillation for Efficient Low Resource NLP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explore the case of the low-resource Indic language Marathi. |
AISHWARYA MIRASHI et. al. | arxiv-cs.CL | 2024-09-21 |
278 | T2M-X: Learning Expressive Text-to-Motion Generation from Partially Annotated Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose T2M-X, a two-stage method that learns expressive text-to-motion generation from partially annotated data. |
Mingdian Liu; Yilin Liu; Gurunandan Krishnan; Karl S Bayer; Bing Zhou; | arxiv-cs.CV | 2024-09-20 |
279 | Prompting Large Language Models for Supporting The Differential Diagnosis of Anemia Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by clinical guidelines, our study aimed to develop diagnostic pathways similar to those found in such guidelines. |
Elisa Castagnari; Lillian Muyama; Adrien Coulet; | arxiv-cs.CL | 2024-09-20 |
280 | Applying Pre-trained Multilingual BERT in Embeddings for Improved Malicious Prompt Injection Attacks Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) are renowned for their exceptional capabilities and are applied across a wide range of applications. |
Md Abdur Rahman; Hossain Shahriar; Fan Wu; Alfredo Cuzzocrea; | arxiv-cs.CL | 2024-09-20 |
281 | ‘Since Lawyers Are Males..’: Examining Implicit Gender Bias in Hindi Language Generation By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) are increasingly being used to generate text across various languages, for tasks such as translation, customer support, and education. |
Ishika Joshi; Ishita Gupta; Adrita Dey; Tapan Parikh; | arxiv-cs.CL | 2024-09-20 |
282 | Drift to Remember Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We hypothesize that representational drift can alleviate catastrophic forgetting in AI during new task acquisition. To test this, we introduce DriftNet, a network designed to constantly explore various local minima in the loss landscape while dynamically retrieving relevant tasks. |
JIN DU et. al. | arxiv-cs.AI | 2024-09-20 |
283 | HUT: A More Computation Efficient Fine-Tuning Method With Hadamard Updated Transformation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This approach ensures that the correlation between the original and updated parameters is preserved, leveraging the semantic features learned during pre-training. Building on this paradigm, we present the Hadamard Updated Transformation (HUT) method. |
Geyuan Zhang; Xiaofei Zhou; Chuheng Chen; | arxiv-cs.CL | 2024-09-20 |
284 | 3DTopia-XL: Scaling High-quality 3D Asset Generation Via Primitive Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite recent advancements in 3D generative models, existing methods still face challenges with optimization speed, geometric fidelity, and the lack of assets for physically based rendering (PBR). In this paper, we introduce 3DTopia-XL, a scalable native 3D generative model designed to overcome these limitations. |
ZHAOXI CHEN et. al. | arxiv-cs.CV | 2024-09-19 |
285 | $\text{M}^\text{6}(\text{GPT})^\text{3}$: Generating Multitrack Modifiable Multi-Minute MIDI Music from Text Using Genetic Algorithms, Probabilistic Methods and GPT Models in Any Progression and Time Signature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a genetic algorithm for the generation of melodic elements. |
Jakub Poćwiardowski; Mateusz Modrzejewski; Marek S. Tatara; | arxiv-cs.SD | 2024-09-19 |
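For a sense of how a genetic algorithm can generate melodic elements as proposed above, here is a toy sketch with one-point crossover and random mutation; the fitness function (rewarding stepwise motion) and the MIDI pitch range are placeholders, not the paper's objective.

```python
import random

rng = random.Random(0)
SCALE = list(range(60, 72))  # one octave of MIDI pitches (an assumption)

def fitness(melody):
    # Toy objective: reward small melodic intervals. The paper's real
    # objective encodes musical constraints not reproduced here.
    return -sum(abs(a - b) for a, b in zip(melody, melody[1:]))

def evolve(pop_size=30, length=8, generations=50):
    pop = [[rng.choice(SCALE) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]        # keep the fittest half
        children = []
        for _ in range(pop_size - len(survivors)):
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, length)
            child = a[:cut] + b[cut:]           # one-point crossover
            if rng.random() < 0.2:              # occasional mutation
                child[rng.randrange(length)] = rng.choice(SCALE)
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

print(evolve())  # a generated melodic fragment as MIDI pitches
```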
286 | TACO-RL: Task Aware Prompt Compression Optimization with Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing prompt compression techniques either rely on sub-optimal metrics such as information entropy or model it as a task-agnostic token classification problem that fails to capture task-specific information. To address these issues, we propose a novel and efficient reinforcement learning (RL) based task-aware prompt compression method. |
SHIVAM SHANDILYA et. al. | arxiv-cs.CL | 2024-09-19 |
287 | Introducing The Large Medical Model: State of The Art Healthcare Cost and Risk Prediction with Transformers Trained on Patient Event Sequences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces the Large Medical Model (LMM), a generative pre-trained transformer (GPT) designed to guide and predict the broad facets of patient care and healthcare administration. |
RICKY SAHU et. al. | arxiv-cs.LG | 2024-09-19 |
288 | Recommendation with Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a taxonomy that categorizes DGMs into three types: ID-driven models, large language models (LLMs), and multimodal models. |
YASHAR DELDJOO et. al. | arxiv-cs.IR | 2024-09-18 |
289 | Self-Supervised Pre-training Tasks for An FMRI Time-series Transformer in Autism Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address over-fitting in small datasets and enhance the model performance, we propose self-supervised pre-training tasks to reconstruct the randomly masked fMRI time-series data, investigating the effects of various masking strategies. |
Yinchi Zhou; Peiyu Duan; Yuexi Du; Nicha C. Dvornek; | arxiv-cs.CV | 2024-09-18 |
290 | AlignBot: Aligning VLM-powered Customized Task Planning with User Reminders Through Fine-Tuning for Household Robots Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents AlignBot, a novel framework designed to optimize VLM-powered customized task planning for household robots by effectively aligning with user reminders. |
PENGAN CHEN et. al. | arxiv-cs.RO | 2024-09-18 |
291 | Program Slicing in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the application of large language models (LLMs) to both static and dynamic program slicing, with a focus on Java programs. |
Kimya Khakzad Shahandashti; Mohammad Mahdi Mohajer; Alvine Boaye Belle; Song Wang; Hadi Hemmati; | arxiv-cs.SE | 2024-09-18 |
292 | American Sign Language to Text Translation Using Transformer and Seq2Seq with LSTM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares the Transformer with the Sequence-to-Sequence (Seq2Seq) model in translating sign language to text. |
Gregorius Guntur Sunardi Putra; Adifa Widyadhani Chanda D’Layla; Dimas Wahono; Riyanarto Sarno; Agus Tri Haryono; | arxiv-cs.CL | 2024-09-17 |
293 | Small Language Models Can Outperform Humans in Short Creative Writing: A Study Comparing SLMs with Humans and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we evaluate the creative fiction writing abilities of a fine-tuned small language model (SLM), BART Large, and compare its performance to humans and two large language models (LLMs): GPT-3.5 and GPT-4o. |
Guillermo Marco; Luz Rello; Julio Gonzalo; | arxiv-cs.CL | 2024-09-17 |
294 | Adaptive Large Language Models By Layerwise Attention Shortcuts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they are based on simply stacking the same blocks in dozens of layers and processing information sequentially from one block to another. In this paper, we propose to challenge this and introduce adaptive computations for LLM-like setups, which allow the final layer to attend to all of the intermediate layers as it deems fit through the attention mechanism, thereby introducing computational \textbf{attention shortcuts}. |
Prateek Verma; Mert Pilanci; | arxiv-cs.CL | 2024-09-16 |
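One plausible reading of the shortcut mechanism described above is sketched below: the final layer cross-attends over the stacked outputs of earlier layers. How the per-layer states are pooled and mixed is an assumption here, not the paper's specification.

```python
import torch
import torch.nn as nn

class LayerwiseShortcutAttention(nn.Module):
    def __init__(self, d_model=256, n_head=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_head, batch_first=True)

    def forward(self, top, intermediates):
        # top: (B, T, D); intermediates: list of (B, T, D) earlier-layer outputs.
        memory = torch.cat(intermediates, dim=1)  # concatenate along sequence axis
        out, _ = self.attn(query=top, key=memory, value=memory)
        return top + out                          # residual mix of layerwise context

B, T, D = 2, 5, 256
states = [torch.randn(B, T, D) for _ in range(6)]
print(LayerwiseShortcutAttention()(states[-1], states[:-1]).shape)
```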
295 | Code Vulnerability Detection: A Comparative Analysis of Emerging Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study aims to shed light on the capabilities of LLMs in vulnerability detection, contributing to the enhancement of software security practices across diverse open-source repositories. |
Shaznin Sultana; Sadia Afreen; Nasir U. Eisty; | arxiv-cs.SE | 2024-09-16 |
296 | LLMs for Clinical Risk Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares the efficacy of GPT-4 and clinalytix Medical AI in predicting the clinical risk of delirium development. |
Mohamed Rezk; Patricia Cabanillas Silva; Fried-Michael Dahlweid; | arxiv-cs.CL | 2024-09-16 |
297 | Can GPT-O1 Kill All Bugs? An Evaluation of GPT-Family LLMs on QuixBugs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, inspired by the recent public release of the GPT-o1 models, we conduct the first study to compare the effectiveness of different versions of the GPT-family models in APR. |
Haichuan Hu; Ye Shang; Guolin Xu; Congqing He; Quanjun Zhang; | arxiv-cs.SE | 2024-09-16 |
298 | SelECT-SQL: Self-correcting Ensemble Chain-of-Thought for Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce SelECT-SQL, a novel in-context learning solution that uses an algorithmic combination of chain-of-thought (CoT) prompting, self-correction, and ensemble methods to yield a new state-of-the-art result on challenging Text-to-SQL benchmarks. |
Ke Shen; Mayank Kejriwal; | arxiv-cs.CL | 2024-09-16 |
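The ensemble step of the combination described above can be sketched as sampling several chain-of-thought completions and keeping the majority SQL answer. `generate` below stands in for any LLM call (an assumption), and the paper's self-correction pass is omitted for brevity.

```python
from collections import Counter

def select_sql(question, schema, generate, n_samples=5):
    """Sample several chain-of-thought completions and return the
    majority SQL answer after light normalization."""
    candidates = [generate(question, schema) for _ in range(n_samples)]
    normalized = [" ".join(sql.lower().split()) for sql in candidates]
    best, _ = Counter(normalized).most_common(1)[0]
    return best

# usage: select_sql("List all users", schema_text, my_llm_call)
```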
299 | Investigating The Impact of Code Comment Inconsistency on Bug Introducing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our research investigates the impact of code-comment inconsistency on bug introduction using large language models, specifically GPT-3.5. |
Shiva Radmanesh; Aaron Imani; Iftekhar Ahmed; Mohammad Moshirpour; | arxiv-cs.SE | 2024-09-16 |
300 | CAT: Customized Transformer Accelerator Framework on Versal ACAP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It is far more flexible than a GPU in hardware customization and offers a better, smaller design solution space than a traditional FPGA. Therefore, this paper proposes the Customized Transformer Accelerator Framework (CAT), through which a family of customized Transformer accelerators can be derived on Versal ACAP. The CAT framework embodies an abstract accelerator architecture that deconstructs the Transformer and efficiently maps it onto hardware, exposing a variety of customizable properties. |
Wenbo Zhang; Yiqi Liu; Zhenshan Bao; | arxiv-cs.AR | 2024-09-15 |
301 | GP-GPT: Large Language Model for Gene-Phenotype Mapping Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the complex traits and heterogeneity of multi-source genomics data pose significant challenges when adapting these models to the bioinformatics and biomedical field. To address these challenges, we present GP-GPT, the first specialized large language model for genetic-phenotype knowledge representation and genomics relation analysis. |
YANJUN LYU et. al. | arxiv-cs.CL | 2024-09-15 |
302 | Detection Made Easy: Potentials of Large Language Models for Solidity Vulnerabilities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive investigation of the use of large language models (LLMs) and their capabilities in detecting OWASP Top Ten vulnerabilities in Solidity. |
Md Tauseef Alam; Raju Halder; Abyayananda Maiti; | arxiv-cs.CR | 2024-09-15 |
303 | Leveraging Open-Source Large Language Models for Native Language Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Native Language Identification (NLI) – the task of identifying the native language (L1) of a person based on their writing in the second language (L2) – has applications in forensics, marketing, and second language acquisition. Historically, conventional machine learning approaches that heavily rely on extensive feature engineering have outperformed transformer-based language models on this task. |
Yee Man Ng; Ilia Markov; | arxiv-cs.CL | 2024-09-15 |
304 | Evaluating Authenticity and Quality of Image Captions Via Sentiment and Semantic Analyses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes an evaluation method focused on sentiment and semantic richness. |
Aleksei Krotov; Alison Tebo; Dylan K. Picart; Aaron Dean Algave; | arxiv-cs.CV | 2024-09-14 |
305 | Undergrads Are All You Have Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The paper also demonstrates that GPT-UGRD is cheaper and easier to train and operate than transformer models. In this paper, we outline the implementation, application, multi-tenanting, and social implications of using this new model in research and other contexts. |
Ashe Neth; | arxiv-cs.CY | 2024-09-13 |
306 | Guiding Vision-Language Model Selection for Visual Question-Answering Across Tasks, Domains, and Knowledge Types Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces a comprehensive framework for evaluating VLMs tailored to VQA tasks in practical settings. |
Neelabh Sinha; Vinija Jain; Aman Chadha; | arxiv-cs.CV | 2024-09-13 |
307 | Autoregressive + Chain of Thought = Recurrent: Recurrence’s Role in Language Models’ Computability and A Revisit of Recurrent Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we thoroughly investigate the influence of recurrent structures in neural models on their reasoning abilities and computability, contrasting the role autoregression plays in the neural models’ computational power. |
Xiang Zhang; Muhammad Abdul-Mageed; Laks V. S. Lakshmanan; | arxiv-cs.CL | 2024-09-13 |
308 | Enhanced Online Grooming Detection Employing Context Determination and Message-Level Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper’s contributions include improved detection methodologies and the potential for application in various scenarios, addressing gaps in current literature and practices. |
Jake Street; Isibor Ihianle; Funminiyi Olajide; Ahmad Lotfi; | arxiv-cs.LG | 2024-09-12 |
309 | SDformer: Efficient End-to-End Transformer for Depth Completion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a different window-based Transformer architecture for depth completion tasks named Sparse-to-Dense Transformer (SDformer). |
JIAN QIAN et. al. | arxiv-cs.CV | 2024-09-12 |
310 | Towards Fairer Health Recommendations: Finding Informative Unbiased Samples Via Word Sense Disambiguation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, some of these terms, especially those related to race and ethnicity, can carry different meanings (e.g., white matter of spinal cord). To address this issue, we propose the use of Word Sense Disambiguation models to refine dataset quality by removing irrelevant sentences. |
GAVIN BUTTS et. al. | arxiv-cs.CL | 2024-09-11 |
311 | A Novel Mathematical Framework for Objective Characterization of Ideas Through Vector Embeddings in LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This method suffers from limitations such as human judgment errors, bias, and oversight. Addressing this gap, our study introduces a comprehensive mathematical framework for automated analysis to objectively evaluate the plethora of ideas generated by CAI systems and/or humans. |
B. Sankar; Dibakar Sen; | arxiv-cs.AI | 2024-09-11 |
312 | A Fine-grained Sentiment Analysis of App Reviews Using Large Language Models: An Evaluation Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Analyzing user reviews for sentiment towards app features can provide valuable insights into users’ perceptions of app functionality and their evolving needs. |
Faiz Ali Shah; Ahmed Sabir; Rajesh Sharma; | arxiv-cs.CL | 2024-09-11 |
313 | GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we address the challenges of using LLM-as-a-Judge when evaluating grounded answers generated by RAG systems. |
Sacha Muller; António Loison; Bilel Omrani; Gautier Viaud; | arxiv-cs.CL | 2024-09-10 |
314 | FairHome: A Fair Housing and Fair Lending Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a Fair Housing and Fair Lending dataset (FairHome): A dataset with around 75,000 examples across 9 protected categories. |
Anusha Bagalkotkar; Aveek Karmakar; Gabriel Arnson; Ondrej Linda; | arxiv-cs.LG | 2024-09-09 |
315 | Harmonic Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) are becoming very popular and are used for many different purposes, including creative tasks in the arts. |
Anna Kruspe; | arxiv-cs.CL | 2024-09-09 |
316 | Retrofitting Temporal Graph Neural Networks with Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose TF-TGN, which uses Transformer decoder as the backbone model for TGNN to enjoy Transformer’s codebase for efficient training. |
QIANG HUANG et. al. | arxiv-cs.LG | 2024-09-09 |
317 | NOVI : Chatbot System for University Novice with BERT and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To mitigate the difficulties of university freshmen in adapting to university life, we developed NOVI, a chatbot system based on GPT-4o. |
Yoonji Nam; TaeWoong Seo; Gyeongcheol Shin; Sangji Lee; JaeEun Im; | arxiv-cs.CL | 2024-09-09 |
318 | Can Large Language Models Unlock Novel Scientific Research Ideas? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study explores the capability of LLMs in generating novel research ideas based on information from research papers. |
Sandeep Kumar; Tirthankar Ghosal; Vinayak Goyal; Asif Ekbal; | arxiv-cs.CL | 2024-09-09 |
319 | Identifying The Sources of Ideological Bias in GPT Models Through Linguistic Variation in Output Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we provide an original approach to identifying ideological bias in generative models, showing that bias can stem from both the training data and the filtering algorithm. |
Christina Walker; Joan C. Timoneda; | arxiv-cs.CL | 2024-09-09 |
320 | Low Latency Transformer Inference on FPGAs for Physics Applications with Hls4ml Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents an efficient implementation of transformer architectures in Field-Programmable Gate Arrays(FPGAs) using hls4ml. |
ZHIXING JIANG et. al. | arxiv-cs.LG | 2024-09-08 |
321 | The Emergence of Large Language Models (LLM) As A Tool in Literature Reviews: An LLM Automated Systematic Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: This study aims to summarize the usage of Large Language Models (LLMs) in the process of creating a scientific review. |
Dmitry Scherbakov; Nina Hubig; Vinita Jansari; Alexander Bakumenko; Leslie A. Lenert; | arxiv-cs.DL | 2024-09-06 |
322 | Hermes: Memory-Efficient Pipeline Inference for Large Models on Edge Devices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on the PIPELOAD mechanism, we present Hermes, a framework optimized for large model inference on edge devices. |
XUEYUAN HAN et. al. | arxiv-cs.DC | 2024-09-06 |
323 | LLM-based Multi-agent Poetry Generation in Non-cooperative Environments Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Under the rationale that the learning process of the poetry generation systems should be more human-like and their output more diverse and novel, we introduce a framework based on social learning where we emphasize non-cooperative interactions besides cooperative interactions to encourage diversity. |
Ran Zhang; Steffen Eger; | arxiv-cs.CL | 2024-09-05 |
324 | CA-BERT: Leveraging Context Awareness for Enhanced Multi-Turn Chat Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional models often struggle with determining when additional context is necessary for generating appropriate responses. This paper introduces Context-Aware BERT (CA-BERT), a transformer-based model specifically fine-tuned to address this challenge. |
Minghao Liu; Mingxiu Sui; Yi Nan; Cangqing Wang; Zhijie Zhou; | arxiv-cs.CL | 2024-09-05 |
325 | CACER: Clinical Concept Annotations for Cancer Events and Relations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Clinical Concept Annotations for Cancer Events and Relations (CACER), a novel corpus with fine-grained annotations for over 48,000 medical problems and drug events and 10,000 drug-problem and problem-problem relations. |
YUJUAN FU et. al. | arxiv-cs.CL | 2024-09-05 |
326 | Detecting Calls to Action in Multimodal Content: Analysis of The 2021 German Federal Election Campaign on Instagram Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the automated classification of Calls to Action (CTAs) within the 2021 German Instagram election campaign to advance the understanding of mobilization in social media contexts. |
Michael Achmann-Denkler; Jakob Fehle; Mario Haim; Christian Wolff; | arxiv-cs.SI | 2024-09-04 |
327 | OpenFact at CheckThat! 2024: Combining Multiple Attack Methods for Effective Adversarial Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the experiments and results for the CheckThat! |
WŁODZIMIERZ LEWONIEWSKI et. al. | arxiv-cs.CL | 2024-09-04 |
328 | MobileUNETR: A Lightweight End-To-End Hybrid Vision Transformer For Efficient Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Finally, many Transformer-based approaches rely primarily on CNN-based decoders, overlooking the benefits of Transformer-based decoding models. Recognizing these limitations, we address the need for efficient, lightweight solutions by introducing MobileUNETR, which aims to overcome the performance constraints associated with both CNNs and Transformers while minimizing model size, presenting a promising stride towards efficient image segmentation. |
Shehan Perera; Yunus Erzurumlu; Deepak Gulati; Alper Yilmaz; | arxiv-cs.CV | 2024-09-04 |
329 | Dialogue You Can Trust: Human and AI Perspectives on Generated Conversations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Utilizing the GPT-4o API, we generated a diverse dataset of conversations and conducted a two-part experimental analysis. |
Ike Ebubechukwu; Johane Takeuchi; Antonello Ceravola; Frank Joublin; | arxiv-cs.CL | 2024-09-03 |
330 | LifeGPT: Topology-Agnostic Generative Pretrained Transformer Model for Cellular Automata Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Modeling and predicting such intricate behavior without explicit knowledge of the system’s underlying topology presents a significant challenge, motivating the development of algorithms that can generalize across various grid configurations and boundary conditions. We develop a decoder-only generative pretrained transformer (GPT) model to solve this problem, showing that our model can simulate Life on a toroidal grid with no prior knowledge on the size of the grid, or its periodic boundary conditions (LifeGPT). |
Jaime A. Berkovich; Markus J. Buehler; | arxiv-cs.AI | 2024-09-03 |
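Since the model above is trained to emulate Life transitions on a toroidal grid, the ground-truth update is easy to state in code. The sketch below generates (state, next-state) pairs of the kind such a model would be trained on; the flattening into token sequences is an assumption about the data format.

```python
import numpy as np

def life_step(grid):
    # One Game of Life update; np.roll wraps edges, giving a toroidal grid.
    n = sum(np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            if (dy, dx) != (0, 0))
    return ((n == 3) | ((grid == 1) & (n == 2))).astype(np.uint8)

rng = np.random.default_rng(0)
g = rng.integers(0, 2, size=(8, 8), dtype=np.uint8)
pair = (g.flatten(), life_step(g).flatten())  # a (state, next state) training pair
```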
331 | How Privacy-Savvy Are Large Language Models? A Case Study on Compliance and Privacy Technical Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper seeks to address this gap by providing a comprehensive case study evaluating LLMs’ performance in privacy-related tasks such as privacy information extraction (PIE), legal and regulatory key point detection (KPD), and question answering (QA) with respect to privacy policies and data protection regulations. We introduce a Privacy Technical Review (PTR) framework, highlighting its role in mitigating privacy risks during the software development life-cycle. |
XICHOU ZHU et. al. | arxiv-cs.CL | 2024-09-03 |
332 | Beyond ChatGPT: Enhancing Software Quality Assurance Tasks with Diverse LLMs and Validation Techniques Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: There remains a gap in understanding the performance of various LLMs in this critical domain. This paper aims to address this gap by conducting a comprehensive investigation into the capabilities of several LLMs across two SQA tasks: fault localization and vulnerability detection. |
Ratnadira Widyasari; David Lo; Lizi Liao; | arxiv-cs.SE | 2024-09-02 |
333 | The Role of Transformer Models in Advancing Blockchain Technology: A Systematic Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This review aims to provide new perspectives and a research foundation for the integrated development of blockchain technology and machine learning, supporting further innovation and application expansion of blockchain technology. |
TIANXU LIU et. al. | arxiv-cs.LG | 2024-09-02 |
334 | Towards Faster Graph Partitioning Via Pre-training and Inductive Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Following the IEEE HPEC Graph Challenge and recent advances in pre-training techniques (e.g., large-language models), we propose PR-GPT (Pre-trained & Refined Graph ParTitioning) based on a novel pre-training & refinement paradigm. |
MENG QIN et. al. | arxiv-cs.LG | 2024-09-01 |
335 | Research on LLM Acceleration Using The High-Performance RISC-V Processor Xiangshan (Nanhu Version) Based on The Open-Source Matrix Instruction Set Extension (Vector Dot Product) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main contributions of this paper are as follows: to match the characteristics of large language models, custom instructions extending the RISC-V instruction set were added to perform vector dot-product calculations, accelerating the computation of large language models on dedicated vector dot-product acceleration hardware. |
XU-HAO CHEN et. al. | arxiv-cs.AR | 2024-09-01 |
336 | An Empirical Study on Information Extraction Using Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our results suggest a visible performance gap between GPT-4 and state-of-the-art (SOTA) IE methods. To alleviate this problem, considering the LLMs’ human-like characteristics, we propose and analyze the effects of a series of simple prompt-based methods, which can be generalized to other LLMs and NLP tasks. |
RIDONG HAN et. al. | arxiv-cs.CL | 2024-08-31 |
337 | From Text to Emotion: Unveiling The Emotion Annotation Capabilities of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the potential of Large Language Models (LLMs), specifically GPT4, in automating or assisting emotion annotation. |
Minxue Niu; Mimansa Jaiswal; Emily Mower Provost; | arxiv-cs.CL | 2024-08-30 |
338 | Retrieval-Augmented Natural Language Reasoning for Explainable Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a new VQA-NLE model, ReRe (Retrieval-augmented natural language Reasoning), which leverages retrieved information from memory to aid in generating accurate answers and persuasive explanations without relying on complex networks and extra datasets. |
Su Hyeon Lim; Minkuk Kim; Hyeon Bae Kim; Seong Tae Kim; | arxiv-cs.CV | 2024-08-30 |
339 | Improving Extraction of Clinical Event Contextual Properties from Electronic Health Records: A Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study performs a comparative analysis of various natural language models for medical text classification. |
SHUBHAM AGARWAL et. al. | arxiv-cs.CL | 2024-08-30 |
340 | Can Large Language Models Address Open-Target Stance Detection? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Open-Target Stance Detection (OTSD), the most realistic task where targets are neither seen during training nor provided as input. |
Abu Ubaida Akash; Ahmed Fahmy; Amine Trabelsi; | arxiv-cs.CL | 2024-08-30 |
341 | Finding Frames with BERT: A Transformer-based Approach to Generic News Frame Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The availability of digital data offers new possibilities for studying how specific aspects of social reality are made more salient in online communication but also raises challenges related to the scaling of framing analysis and its adoption to new research areas (e.g. studying the impact of artificial intelligence-powered systems on representation of societally relevant issues). To address these challenges, we introduce a transformer-based approach for generic news frame detection in Anglophone online content. |
Vihang Jumle; Mykola Makhortykh; Maryna Sydorova; Victoria Vziatysheva; | arxiv-cs.CL | 2024-08-30 |
342 | Assessing Generative Language Models in Classification Tasks: Performance and Self-Evaluation Capabilities in The Environmental and Climate Change Domain Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Through this research, we aim to contribute to the ongoing discussion on the utility and effectiveness of generative LMs in addressing some of the planet’s most urgent issues, highlighting their strengths and limitations in the context of ecology and CC. |
Francesca Grasso; Stefano Locci; | arxiv-cs.CL | 2024-08-30 |
343 | ProGRes: Prompted Generative Rescoring on ASR N-Best Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a novel method that uses instruction-tuned LLMs to dynamically expand the n-best speech recognition hypotheses with new hypotheses generated through appropriately-prompted LLMs. |
Ada Defne Tur; Adel Moumen; Mirco Ravanelli; | arxiv-cs.CL | 2024-08-30 |
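The core move described above, handing the n-best list to an instruction-tuned LLM so it can propose new hypotheses, can be sketched as prompt construction; the wording below is illustrative, not the paper's actual prompt.

```python
def build_rescoring_prompt(nbest):
    """Format an ASR n-best list into a prompt asking an LLM to
    propose a corrected transcription."""
    lines = "\n".join(f"{i + 1}. {hyp}" for i, hyp in enumerate(nbest))
    return (
        "These are candidate transcriptions of the same utterance:\n"
        f"{lines}\n"
        "Suggest the most plausible transcription, fixing likely errors."
    )

print(build_rescoring_prompt(["i sore a cat", "i saw a cat", "eye saw a cat"]))
```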
344 | Assessing Large Language Models for Online Extremism Research: Identification, Explanation, and New Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates the performance of Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-Trained Transformers (GPT) in detecting and classifying online domestic extremist posts. We collected social media posts containing far-right and far-left ideological keywords and manually labeled them as extremist or non-extremist. |
Beidi Dong; Jin R. Lee; Ziwei Zhu; Balassubramanian Srinivasan; | arxiv-cs.CL | 2024-08-29 |
345 | MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Following current trends in machine learning, we have created a foundation model for the MAPF problems called MAPF-GPT. |
Anton Andreychuk; Konstantin Yakovlev; Aleksandr Panov; Alexey Skrynnik; | arxiv-cs.MA | 2024-08-29 |
346 | Training Ultra Long Context Language Model with Fully Pipelined Distributed Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Fully Pipelined Distributed Transformer (FPDT) for efficiently training long-context LLMs with extreme hardware efficiency. |
JINGHAN YAO et. al. | arxiv-cs.DC | 2024-08-29 |
347 | Can AI Replace Human Subjects? A Large-Scale Replication of Psychological Experiments with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find that GPT-4 successfully replicates 76.0 percent of main effects and 47.0 percent of interaction effects observed in the original studies, closely mirroring human responses in both direction and significance. |
Ziyan Cui; Ning Li; Huaikang Zhou; | arxiv-cs.CL | 2024-08-29 |
348 | FRACTURED-SORRY-Bench: Framework for Revealing Attacks in Conversational Turns Undermining Refusal Efficacy and Defenses Over SORRY-Bench (Automated Multi-shot Jailbreaks) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces FRACTURED-SORRY-Bench, a framework for evaluating the safety of Large Language Models (LLMs) against multi-turn conversational attacks. |
Aman Priyanshu; Supriti Vijay; | arxiv-cs.CL | 2024-08-28 |
349 | Unleashing The Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an Audio-Language-Referenced SAM 2 (AL-Ref-SAM 2) pipeline to explore the training-free paradigm for audio and language-referenced video object segmentation, namely AVS and RVOS tasks. |
SHAOFEI HUANG et. al. | arxiv-cs.CV | 2024-08-28 |
350 | The Mamba in The Llama: Distilling and Accelerating Hybrid Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Linear RNN architectures, like Mamba, can be competitive with Transformer models in language modeling while having advantageous deployment characteristics. Given the focus on training large-scale Transformer models, we consider the challenge of converting these pretrained models for deployment. |
Junxiong Wang; Daniele Paliotta; Avner May; Alexander M. Rush; Tri Dao; | arxiv-cs.LG | 2024-08-27 |
351 | A Review of Transformer-Based Models for Computer Vision Tasks: Capturing Global Context and Spatial Relationships Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this review paper, we provide an extensive overview of various transformer architectures adapted for computer vision tasks. |
Gracile Astlin Pereira; Muhammad Hussain; | arxiv-cs.CV | 2024-08-27 |
352 | Speech Recognition Transformers: Topological-lingualism Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The paper presents a comprehensive survey of transformer techniques oriented in speech modality. |
Shruti Singh; Muskaan Singh; Virender Kadyan; | arxiv-cs.CL | 2024-08-27 |
353 | One-layer Transformers Fail to Solve The Induction Heads Task Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A simple communication complexity argument proves that no one-layer transformer can solve the induction heads task unless its size is exponentially larger than the size sufficient … |
Clayton Sanford; Daniel Hsu; Matus Telgarsky; | arxiv-cs.LG | 2024-08-26 |
354 | Evaluating Large Language Models on Spatial Tasks: A Multi-Task Benchmarking Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluated multiple models, including OpenAI’s gpt-3.5-turbo, gpt-4o, and ZhipuAI’s glm-4, through a two-phase testing approach. |
LIUCHANG XU et. al. | arxiv-cs.CL | 2024-08-26 |
355 | Beyond Detection: Leveraging Large Language Models for Cyber Attack Prediction in IoT Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite considerable efforts in attack detection, intrusion detection systems remain mostly reactive, responding to specific patterns or observed anomalies. This work proposes a proactive approach to anticipate and mitigate malicious activities before they cause damage. |
Alaeddine Diaf; Abdelaziz Amara Korba; Nour Elislem Karabadji; Yacine Ghamri-Doudane; | arxiv-cs.CR | 2024-08-26 |
356 | Vision-Language and Large Language Model Performance in Gastroenterology: GPT, Claude, Llama, Phi, Mistral, Gemma, and Quantized Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Methods: We used 300 gastroenterology board exam-style multiple-choice questions, 138 of which contain images to systematically assess the impact of model configurations and parameters and prompt engineering strategies utilizing GPT-3.5. |
SEYED AMIR AHMAD SAFAVI-NAINI et. al. | arxiv-cs.CL | 2024-08-25 |
357 | LowCLIP: Adapting The CLIP Model Architecture for Low-Resource Languages in Multimodal Image Retrieval Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address challenges in vision-language retrieval for low-resource languages, we integrated the CLIP model architecture and employed several techniques to balance computational efficiency with performance. |
Ali Asgarov; Samir Rustamov; | arxiv-cs.CV | 2024-08-25 |
358 | Bidirectional Awareness Induction in Autoregressive Seq2Seq Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Bidirectional Awareness Induction (BAI), a training method that leverages a subset of elements in the network, the Pivots, to perform bidirectional learning without breaking the autoregressive constraints. |
Jia Cheng Hu; Roberto Cavicchioli; Alessandro Capotondi; | arxiv-cs.CL | 2024-08-25 |
359 | Preliminary Investigations of A Multi-Faceted Robust and Synergistic Approach in Semiconductor Electron Micrograph Analysis: Integrating Vision Transformers with Large Language and Multimodal Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces an innovative architecture that leverages the generative capabilities of zero-shot prompting in Large Language Models (LLMs) such as GPT-4 (language only), the predictive ability of few-shot (in-context) learning in Large Multimodal Models (LMMs) such as GPT-4(V)ision, and fuses knowledge across image-based and linguistic insights for accurate nanomaterial category prediction. |
Sakhinana Sagar Srinivas; Geethan Sannidhi; Sreeja Gangasani; Chidaksh Ravuru; Venkataramana Runkana; | arxiv-cs.CV | 2024-08-24 |
360 | CNN-Transformer Rectified Collaborative Learning for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this paper proposes a CNN-Transformer rectified collaborative learning (CTRCL) framework to learn stronger CNN-based and Transformer-based models for MIS tasks via the bi-directional knowledge transfer between them. |
LANHU WU et. al. | arxiv-cs.CV | 2024-08-24 |
361 | Utilizing Large Language Models for Named Entity Recognition in Traditional Chinese Medicine Against COVID-19 Literature: Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: To explore and compare the performance of ChatGPT and other state-of-the-art LLMs on domain-specific NER tasks covering different entity types and domains in TCM against COVID-19 literature. Methods: We established a dataset of 389 articles on TCM against COVID-19, and manually annotated 48 of them with 6 types of entities belonging to 3 domains as the ground truth, against which the NER performance of LLMs can be assessed. |
XU TONG et. al. | arxiv-cs.CL | 2024-08-24 |
362 | Enhancing Multi-hop Reasoning Through Knowledge Erasure in Large Language Model Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building on the validated hypothesis, we propose a novel knowledge editing method that incorporates a Knowledge Erasure mechanism for Large language model Editing (KELE). |
MENGQI ZHANG et. al. | arxiv-cs.CL | 2024-08-22 |
363 | Enhancing Automated Program Repair with Solution Design Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This raises a compelling question: How can we leverage DR scattered across the issue logs to efficiently enhance APR? To investigate this premise, we introduce DRCodePilot, an approach designed to augment GPT-4-Turbo’s APR capabilities by incorporating DR into the prompt instruction. |
JIUANG ZHAO et. al. | arxiv-cs.SE | 2024-08-21 |
364 | Hypformer: Exploring Efficient Transformer Fully in Hyperbolic Space Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This stems primarily from: (i) the absence of well-defined modules in hyperbolic space, including linear transformation layers, LayerNorm layers, activation functions, dropout operations, etc.; (ii) the quadratic time complexity of the existing hyperbolic self-attention module w.r.t. the number of input tokens, which hinders its scalability. To address these challenges, we propose Hypformer, a novel hyperbolic Transformer based on the Lorentz model of hyperbolic geometry. |
MENGLIN YANG et. al. | kdd | 2024-08-21 |
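For context on the geometry Hypformer builds on: in the Lorentz model, points live on the hyperboloid ⟨x, x⟩_L = −1 under the Minkowski inner product, and geodesic distance is arccosh(−⟨x, y⟩_L). A minimal sketch of these standard primitives (textbook formulas, not Hypformer's proposed modules):

```python
import numpy as np

def minkowski_inner(x, y):
    """Minkowski inner product <x, y>_L = -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + np.dot(x[1:], y[1:])

def lorentz_distance(x, y):
    """Geodesic distance on the hyperboloid <x, x>_L = -1 (curvature -1)."""
    return np.arccosh(np.clip(-minkowski_inner(x, y), 1.0, None))

def lift(v):
    """Lift a Euclidean point v onto the hyperboloid by solving for x0."""
    return np.concatenate([[np.sqrt(1.0 + np.dot(v, v))], v])

x, y = lift(np.array([0.3, -0.1])), lift(np.array([-0.2, 0.4]))
print(lorentz_distance(x, y))
```

The missing "well-defined modules" the highlight mentions are exactly operations like LayerNorm or dropout that have no canonical analogue on this curved surface.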
365 | Clinical Context-aware Radiology Report Generation from Medical Images Using Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the use of the transformer model for radiology report generation from chest X-rays. |
Sonit Singh; | arxiv-cs.CL | 2024-08-21 |
366 | BURExtract-Llama: An LLM for Clinical Concept Extraction in Breast Ultrasound Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents a pipeline for developing an in-house LLM to extract clinical information from radiology reports. |
YUXUAN CHEN et. al. | arxiv-cs.CL | 2024-08-21 |
367 | The Self-Contained Negation Test Set Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we build on Gubelmann and Handschuh (2022), which studies how PLMs’ predictions change as a function of the polarity of English inputs. |
David Kletz; Pascal Amsili; Marie Candito; | arxiv-cs.CL | 2024-08-21 |
368 | Maximum-Entropy Regularized Decision Transformer with Reward Relabelling for Dynamic Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we introduce a novel methodology named Max-Entropy enhanced Decision Transformer with Reward Relabeling for Offline RLRS (EDT4Rec). |
Xiaocong Chen; Siyu Wang; Lina Yao; | kdd | 2024-08-21 |
369 | GSTran: Joint Geometric and Semantic Coherence for Point Cloud Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In parallel, inaccurate modeling of long-distance contextual dependencies when utilizing global information can also impact model performance. To address these issues, we propose GSTran, a novel transformer network tailored for the segmentation task. |
ABIAO LI et. al. | arxiv-cs.CV | 2024-08-21 |
370 | Mixed Sparsity Training: Achieving 4$\times$ FLOP Reduction for Transformer Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) have made significant strides in complex tasks, yet their widespread adoption is impeded by substantial computational demands. |
Pihe Hu; Shaolong Li; Longbo Huang; | arxiv-cs.LG | 2024-08-21 |
371 | Selene: Pioneering Automated Proof in Software Verification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Selene in this paper, which is the first project-level automated proof benchmark constructed based on the real-world industrial-level operating system microkernel, seL4. |
Lichen Zhang; Shuai Lu; Nan Duan; | acl | 2024-08-20 |
372 | The MERSA Dataset and A Transformer-Based Approach for Speech Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the Multimodal Emotion Recognition and Sentiment Analysis (MERSA) dataset, which includes both natural and scripted speech recordings, transcribed text, physiological data, and self-reported emotional surveys collected from 150 participants over a two-week period. |
Enshi Zhang; Rafael Trujillo; Christian Poellabauer; | acl | 2024-08-20 |
373 | Language Modeling on Tabular Data: A Survey of Foundations, Techniques and Evolution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Subsequently, methods leveraging pre-trained language models like BERT have been developed, which require less data and yield enhanced performance. |
YUCHENG RUAN et. al. | arxiv-cs.CL | 2024-08-20 |
374 | Automated Detection of Algorithm Debt in Deep Learning Frameworks: An Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Aim: Our goal is to improve AD detection performance of various ML/DL models. |
Emmanuel Iko-Ojo Simon; Chirath Hettiarachchi; Alex Potanin; Hanna Suominen; Fatemeh Fard; | arxiv-cs.SE | 2024-08-20 |
375 | Self-Evolving GPT: A Lifelong Autonomous Experiential Learner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they rely on manual efforts to acquire and apply such experience for each task, which is not feasible for the growing demand for LLMs and the variety of user questions. To address this issue, we design a lifelong autonomous experiential learning framework based on LLMs to explore whether LLMs can imitate human ability for learning and utilizing experience. |
JINGLONG GAO et. al. | acl | 2024-08-20 |
376 | Acquiring Clean Language Models from Backdoor Poisoned Datasets By Downscaling Frequency Space Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate the learning mechanisms of backdoor LMs in the frequency space by Fourier analysis. |
Zongru Wu; Zhuosheng Zhang; Pengzhou Cheng; Gongshen Liu; | acl | 2024-08-20 |
377 | ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose ChiMed-GPT, a new benchmark LLM designed explicitly for the Chinese medical domain, which undergoes a comprehensive training regime with pre-training, SFT, and RLHF. |
Yuanhe Tian; Ruyi Gan; Yan Song; Jiaxing Zhang; Yongdong Zhang; | acl | 2024-08-20 |
378 | Dependency Transformer Grammars: Integrating Dependency Structures Into Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Syntactic Transformer language models aim to achieve better generalization through simultaneously modeling syntax trees and sentences. |
Yida Zhao; Chao Lou; Kewei Tu; | acl | 2024-08-20 |
379 | GPT Is Not An Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores whether an LLM (specifically, GPT-3. |
Virginia Felkner; Jennifer Thompson; Jonathan May; | acl | 2024-08-20 |
380 | CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Incorrect initial angles between Q and K can cause misestimation in modeling rotary position embedding of the closest tokens. To address this issue, we propose Collinear Constrained Attention mechanism, namely CoCA. |
SHIYI ZHU et. al. | acl | 2024-08-20 |
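Background for this entry: rotary position embedding (RoPE) rotates paired coordinates of Q and K by position-dependent angles so that attention scores depend only on relative position; CoCA's collinear constraint on the initial Q/K alignment is the paper's contribution and is not reproduced here. A sketch of the vanilla RoPE baseline, assuming the split-half pairing convention:

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Standard rotary position embedding for a vector x of even
    dimension at integer position pos (split-half pairing)."""
    half = x.shape[-1] // 2
    theta = pos * base ** (-np.arange(half) / half)   # per-pair angles
    x1, x2 = x[:half], x[half:]
    return np.concatenate([x1 * np.cos(theta) - x2 * np.sin(theta),
                           x1 * np.sin(theta) + x2 * np.cos(theta)])

q, k = np.random.randn(8), np.random.randn(8)
# Scores depend only on the relative offset: shifting both positions by 10
# leaves the dot product unchanged.
print(np.allclose(rope(q, 3) @ rope(k, 7), rope(q, 13) @ rope(k, 17)))
```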
381 | MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a novel **map**-guided **GPT**-based agent, dubbed **MapGPT**, which introduces an online linguistic-formed map to encourage the global exploration. |
JIAQI CHEN et. al. | acl | 2024-08-20 |
382 | Your Transformer Is Secretly Linear Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper reveals a novel linear characteristic exclusive to transformer decoders, including models like GPT, LLaMA, OPT, BLOOM and others. |
ANTON RAZZHIGAEV et. al. | acl | 2024-08-20 |
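One way to picture the paper's claim: collect the hidden states entering and leaving a decoder block, fit a single linear map between them, and measure how much variance it explains. A simplified probe along those lines (the authors' actual metric involves careful normalization; this R² version is only illustrative):

```python
import numpy as np

def linearity_score(H_in, H_out):
    """Fit H_out ≈ H_in @ W by least squares and return the R² of the
    fit; values near 1 mean the block acts almost linearly on its input.
    H_in, H_out: (n_tokens, d) hidden states before/after one block."""
    W, *_ = np.linalg.lstsq(H_in, H_out, rcond=None)
    resid = H_out - H_in @ W
    return 1.0 - (resid**2).sum() / ((H_out - H_out.mean(0))**2).sum()

# Toy check: an almost-linear "block" scores near 1. Real hidden states
# would come from a model forward pass, not random data.
H = np.random.randn(512, 64)
H_next = H @ np.random.randn(64, 64) + 0.01 * np.random.randn(512, 64)
print(linearity_score(H, H_next))
```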
383 | CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the absence of a comprehensive benchmark impedes progress in this field. To bridge this gap, we introduce CharacterEval, a Chinese benchmark for comprehensive RPCA assessment, complemented by a tailored high-quality dataset. |
QUAN TU et. al. | acl | 2024-08-20 |
384 | MultiLegalPile: A 689GB Multilingual Legal Corpus IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, so far, few datasets are available for specialized critical domains such as law and the available ones are often small and only in English. To fill this gap, we curate and release MultiLegalPile, a 689GB corpus in 24 languages from 17 jurisdictions. |
Joel Niklaus; Veton Matoshi; Matthias Stürmer; Ilias Chalkidis; Daniel Ho; | acl | 2024-08-20 |
385 | Mission: Impossible Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we develop a set of synthetic impossible languages of differing complexity, each designed by systematically altering English data with unnatural word orders and grammar rules. |
Julie Kallini; Isabel Papadimitriou; Richard Futrell; Kyle Mahowald; Christopher Potts; | acl | 2024-08-20 |
386 | Advancing Parameter Efficiency in Fine-tuning Via Representation Editing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the promising performance of current PEFT methods, they present challenges in hyperparameter selection, such as determining the rank of LoRA or Adapter, or specifying the length of soft prompts. In addressing these challenges, we propose a novel approach to fine-tuning neural models, termed Representation EDiting (RED), which scales and biases the representation produced at each layer. |
MULING WU et. al. | acl | 2024-08-20 |
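Taken at face value, the edit the highlight describes is an elementwise scale and bias learned over each frozen layer's output, so only 2 × hidden_size parameters train per layer. A minimal sketch under that reading (module name, initialization, and placement are our assumptions, not the paper's exact design):

```python
import torch
import torch.nn as nn

class RepresentationEdit(nn.Module):
    """Hypothetical per-layer representation edit: scale and bias the
    hidden states of a frozen layer; only these two vectors train."""
    def __init__(self, hidden_size):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(hidden_size))   # identity at init
        self.bias = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, hidden_states):
        return hidden_states * self.scale + self.bias

edit = RepresentationEdit(hidden_size=768)
h = torch.randn(2, 16, 768)   # (batch, seq, hidden) from a frozen layer
print(edit(h).shape)          # torch.Size([2, 16, 768])
```

Note how this sidesteps the hyperparameters the highlight complains about: there is no rank to choose and no prompt length to tune.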
387 | Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs By Sampling with People Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by methods from cognitive science, we propose an iterative method for simultaneously eliciting conversational tones and sentences, where participants alternate between two tasks: (1) one participant identifies the tone of a given sentence and (2) a different participant generates a sentence based on that tone. |
Dun-Ming Huang; Pol Van Rijn; Ilia Sucholutsky; Raja Marjieh; Nori Jacoby; | acl | 2024-08-20 |
388 | An Empirical Analysis on Large Language Models in Debate Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we investigate the capabilities and inherent biases of advanced large language models (LLMs) such as GPT-3. |
Xinyi Liu; Pinxin Liu; Hangfeng He; | acl | 2024-08-20 |
389 | Tree Transformer’s Disambiguation Ability of Prepositional Phrase Attachment and Garden Path Effects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For comparison, we evaluate a pretrained supervised BiLSTM-based model trained on constituency parsing as sequence labelling (Gómez-Rodríguez and Vilares, 2018). |
Lingling Zhou; Suzan Verberne; Gijs Wijnholds; | acl | 2024-08-20 |
390 | MELA: Multilingual Evaluation of Linguistic Acceptability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present the largest benchmark to date on linguistic acceptability: Multilingual Evaluation of Linguistic Acceptability (MELA), with 46K samples covering 10 languages from a diverse set of language families. |
ZIYIN ZHANG et. al. | acl | 2024-08-20 |
391 | Language Models Can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates whether LLMs can generalize from labeled examples of predefined tasks to novel tasks. |
Anwoy Chatterjee; Eshaan Tanwar; Subhabrata Dutta; Tanmoy Chakraborty; | acl | 2024-08-20 |
392 | Linear Transformers with Learnable Kernel Functions Are Better In-Context Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Mirroring the Transformer’s in-context adeptness, it became a strong contender in the field. In our work, we present a singular, elegant alteration to the Based kernel that amplifies its In-Context Learning abilities, evaluated with the Multi-Query Associative Recall task and overall language modeling, as demonstrated on the Pile dataset. |
YAROSLAV AKSENOV et. al. | acl | 2024-08-20 |
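Background for this entry: kernelized linear attention replaces softmax(QKᵀ)V with φ(Q)(φ(K)ᵀV), which costs linear rather than quadratic time in sequence length; the paper's contribution is making the kernel φ itself learnable. A non-causal sketch with a fixed stand-in feature map (elu(x)+1), not the Based kernel:

```python
import numpy as np

def linear_attention(Q, K, V, phi):
    """Kernelized attention: phi(Q) @ (phi(K).T @ V), normalized per
    query. The small (r, d) summary phi(K).T @ V is built in O(n)."""
    Qf, Kf = phi(Q), phi(K)
    out = Qf @ (Kf.T @ V)           # (n, d) without forming an n×n matrix
    z = Qf @ Kf.sum(axis=0)         # per-query normalizer
    return out / z[:, None]

phi = lambda X: np.where(X > 0, X + 1.0, np.exp(X))   # elu(x)+1, a classic choice
Q, K, V = (np.random.randn(128, 32) for _ in range(3))
print(linear_attention(Q, K, V, phi).shape)   # (128, 32)
```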
393 | Crafting Tomorrow’s Headlines: Neural News Generation and Detection in English, Turkish, Hungarian, and Persian Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we take a significant step by introducing a benchmark dataset designed for neural news detection in four languages: English, Turkish, Hungarian, and Persian. |
CEM ÜYÜK et. al. | arxiv-cs.CL | 2024-08-20 |
394 | D2LLM: Decomposed and Distilled Large Language Models for Semantic Search Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present D2LLM (Decomposed and Distilled LLMs) for semantic search, which combines the best of both worlds. |
Zihan Liao; Hang Yu; Jianguo Li; Jun Wang; Wei Zhang; | acl | 2024-08-20 |
395 | Rhyme-aware Chinese Lyric Generator Based on GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enhance the rhyming quality of generated lyrics, we incorporate integrated rhyme information into our model, thereby improving lyric generation performance. |
YIXIAO YUAN et. al. | arxiv-cs.CL | 2024-08-19 |
396 | GPT-based Textile Pilling Classification Using 3D Point Cloud Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on PointGPT, the GPT-like big model of point cloud analysis, we incorporate the global features of the input point cloud extracted from the non-parametric network into it, thus proposing the PointGPT+NN model. |
Yu Lu; YuYu Chen; Gang Zhou; Zhenghua Lan; | arxiv-cs.CV | 2024-08-19 |
397 | How Well Do Large Language Models Serve As End-to-End Secure Code Producers? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a systematic investigation into LLMs’ inherent potential to generate code with fewer vulnerabilities. |
JIANIAN GONG et. al. | arxiv-cs.SE | 2024-08-19 |
398 | Demystifying The Communication Characteristics for Distributed Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use GPT-based language models as a case study of the transformer architecture due to their ubiquity. |
QUENTIN ANTHONY et. al. | arxiv-cs.DC | 2024-08-19 |
399 | Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a method that is able to distill a pretrained Transformer architecture into alternative architectures such as state space models (SSMs). |
Aviv Bick; Kevin Y. Li; Eric P. Xing; J. Zico Kolter; Albert Gu; | arxiv-cs.LG | 2024-08-19 |
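Abstracted, the recipe trains the subquadratic student to match the pretrained transformer teacher's output distribution. A generic logit-distillation objective as a stand-in (the paper also aligns internal representations; this is not its exact loss):

```python
import torch
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, T=2.0):
    """Temperature-scaled KL divergence between teacher and student
    next-token distributions; gradients flow only to the student."""
    s = F.log_softmax(student_logits / T, dim=-1)
    t = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * (T * T)

student = torch.randn(4, 50257, requires_grad=True)  # e.g. SSM output logits
teacher = torch.randn(4, 50257)                      # frozen transformer logits
print(distill_loss(student, teacher).item())
```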
400 | STransformer: A Modular Approach for Extracting Inter-Sequential and Temporal Information for Time-Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we review and categorize existing Transformer-based models into two main types: (1) modifications to the model structure and (2) modifications to the input data. |
JIAHENG YIN et. al. | arxiv-cs.LG | 2024-08-19 |
401 | Active Learning for Identifying Disaster-Related Tweets: A Comparison with Keyword Filtering and Generic Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We compare the classification performance of a keyword filtering approach, a RoBERTa model fine-tuned with generic data from CrisisLex, a base RoBERTa model trained with AL, and a fine-tuned RoBERTa model trained with AL. |
David Hanny; Sebastian Schmidt; Bernd Resch; | arxiv-cs.CL | 2024-08-19 |
402 | A Unified Framework for Interpretable Transformers Using PDEs and Information Theory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel unified theoretical framework for understanding Transformer architectures by integrating Partial Differential Equations (PDEs), Neural Information Flow Theory, and Information Bottleneck Theory. |
Yukun Zhang; | arxiv-cs.LG | 2024-08-18 |
403 | A Strategy to Combine 1stGen Transformers and Open LLMs for Automatic Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares three 1stTRs (BERT, RoBERTa, and BART) with two open LLMs (Llama 2 and Bloom) across 11 sentiment analysis datasets. |
CLAUDIO M. V. DE ANDRADE et. al. | arxiv-cs.CL | 2024-08-18 |
404 | Attention Is A Smoothed Cubic Spline Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We highlight a perhaps important but hitherto unobserved insight: The attention module in a transformer is a smoothed cubic spline. |
Zehua Lai; Lek-Heng Lim; Yucong Liu; | arxiv-cs.AI | 2024-08-18 |
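For reference, the module the claim concerns is standard scaled dot-product attention:

```latex
\mathrm{Attn}(X) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V,
\qquad Q = XW_Q,\quad K = XW_K,\quad V = XW_V .
```

Ignoring the softmax, each output entry is a degree-3 polynomial in the entries of X (degree 2 from QKᵀ times degree 1 from V), which gives a rough intuition for where "cubic" enters, with the softmax supplying the smoothing.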
405 | From Specifications to Prompts: On The Future of Generative LLMs in Requirements Engineering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative LLMs, such as GPT, have the potential to revolutionize Requirements Engineering (RE) by automating tasks in new ways. This column explores the novelties and introduces … |
Andreas Vogelsang; | arxiv-cs.SE | 2024-08-17 |
406 | See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Designing tasks and finding LLMs’ limitations are becoming increasingly important. In this paper, we investigate the question of whether an LLM can discover its own limitations from the errors it makes. |
YULONG CHEN et. al. | arxiv-cs.CL | 2024-08-16 |
407 | MAT-SED: A Masked Audio Transformer with Masked-Reconstruction Based Pre-training for Sound Event Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a pure Transformer-based SED model with masked-reconstruction based pre-training, termed MAT-SED. |
Pengfei Cai; Yan Song; Kang Li; Haoyu Song; Ian McLoughlin; | arxiv-cs.SD | 2024-08-16 |
408 | Extracting Sentence Embeddings from Pretrained Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Methods: Given the 110M-parameter BERT’s hidden representations from multiple layers and multiple tokens, we tried various ways to extract optimal sentence representations. |
Lukas Stankevičius; Mantas Lukoševičius; | arxiv-cs.CL | 2024-08-15 |
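A common baseline among the extraction methods such studies compare is mean-pooling token vectors from a chosen hidden layer. A sketch with the Hugging Face transformers API (the layer choice is illustrative, and this is the baseline, not the paper's best-performing method):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

def sentence_embedding(text, layer=-1):
    """Mean-pool the token vectors of one hidden layer, ignoring padding."""
    batch = tok(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).hidden_states[layer]   # (1, seq_len, 768)
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)        # (1, 768)

print(sentence_embedding("Transformers produce contextual token vectors.").shape)
```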
409 | Leveraging Web-Crawled Data for High-Quality Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We argue that although the web-crawled data often has formatting errors causing semantic inaccuracies, it can still serve as a valuable source for high-quality supervised fine-tuning in specific domains without relying on advanced models like GPT-4. |
Jing Zhou; Chenglin Jiang; Wei Shen; Xiao Zhou; Xiaonan He; | arxiv-cs.CL | 2024-08-15 |
410 | Transformers and Large Language Models for Efficient Intrusion Detection Systems: A Comprehensive Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This survey paper provides a comprehensive analysis of the utilization of Transformers and LLMs in cyber-threat detection systems. |
Hamza Kheddar; | arxiv-cs.CR | 2024-08-14 |
411 | MultiSurf-GPT: Facilitating Context-Aware Reasoning with Large-Scale Language Models for Multimodal Surface Sensing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose MultiSurf-GPT, which utilizes the advanced capabilities of GPT-4o to process and interpret diverse modalities (radar, microscope and multispectral data) uniformly based on prompting strategies (zero-shot and few-shot prompting). |
YONGQUAN HU et. al. | arxiv-cs.HC | 2024-08-14 |
412 | Evaluating Cultural Adaptability of A Large Language Model Via Simulation of Synthetic Personas Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our analysis shows that specifying a person’s country of residence improves GPT-3.5’s alignment with their responses. |
Louis Kwok; Michal Bravansky; Lewis D. Griffin; | arxiv-cs.CL | 2024-08-13 |
413 | Generative AI for Automatic Topic Labelling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes to assess the reliability of three LLMs, namely flan, GPT-4o, and GPT-4 mini for topic labelling. |
Diego Kozlowski; Carolina Pradier; Pierre Benz; | arxiv-cs.CL | 2024-08-13 |
414 | Pragmatic Inference of Scalar Implicature By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates how Large Language Models (LLMs), particularly BERT (Devlin et al., 2019) and GPT-2 (Radford et al., 2019), engage in pragmatic inference of scalar implicature, such as that of the scalar term ‘some’. |
Ye-eun Cho; Seong mook Kim; | arxiv-cs.CL | 2024-08-13 |
415 | Sumotosima: A Framework and Dataset for Classifying and Summarizing Otoscopic Images Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel resource efficient deep learning and transformer based framework, Sumotosima (Summarizer for otoscopic images), an end-to-end pipeline for classification followed by summarization. |
Eram Anwarul Khan; Anas Anwarul Haq Khan; | arxiv-cs.CV | 2024-08-13 |
416 | Large Language Models for Secure Code Assessment: A Multi-Language Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we evaluate the effectiveness of LLMs in detecting and classifying Common Weakness Enumerations (CWE) using different prompt and role strategies. |
Kohei Dozono; Tiago Espinha Gasiba; Andrea Stocco; | arxiv-cs.SE | 2024-08-12 |
417 | The Language of Trauma: Modeling Traumatic Event Descriptions Across Domains with Explainable AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, studies traditionally focus on a single aspect of trauma, often neglecting the transferability of findings across different scenarios. We address this gap by training language models with progressing complexity on trauma-related datasets, including genocide-related court data, a Reddit dataset on post-traumatic stress disorder (PTSD), counseling conversations, and Incel forum posts. |
Miriam Schirmer; Tobias Leemann; Gjergji Kasneci; Jürgen Pfeffer; David Jurgens; | arxiv-cs.CL | 2024-08-12 |
418 | Spacetime $E(n)$-Transformer: Equivariant Attention for Spatio-temporal Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an $E(n)$-equivariant Transformer architecture for spatio-temporal graph data. |
Sergio G. Charles; | arxiv-cs.LG | 2024-08-12 |
419 | Is It A Work or Leisure Travel? Applying Text Classification to Identify Work-related Travel on Social Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a model to predict whether a trip is leisure or work-related, utilizing state-of-the-art Automatic Text Classification (ATC) models such as BERT, RoBERTa, and BART to enhance the understanding of user travel purposes and improve recommendation accuracy in specific travel scenarios. |
Lucas Félix; Washington Cunha; Jussara Almeida; | arxiv-cs.SI | 2024-08-12 |
420 | Body Transformer: Leveraging Robot Embodiment for Policy Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite notable evidence of successful deployment of this architecture in the context of robot learning, we claim that vanilla transformers do not fully exploit the structure of the robot learning problem. Therefore, we propose Body Transformer (BoT), an architecture that leverages the robot embodiment by providing an inductive bias that guides the learning process. |
Carmelo Sferrazza; Dun-Ming Huang; Fangchen Liu; Jongmin Lee; Pieter Abbeel; | arxiv-cs.RO | 2024-08-12 |
421 | A Perspective on Large Language Models, Intelligent Machines, and Knowledge Acquisition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there is a huge gap between LLMs’ and human capabilities for understanding abstract concepts and reasoning. We discuss these issues in a larger philosophical context of human knowledge acquisition and the Turing test. |
Vladimir Cherkassky; Eng Hock Lee; | arxiv-cs.CL | 2024-08-12 |
422 | Cross-Lingual Conversational Speech Summarization with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We build a baseline cascade-based system using open-source speech recognition and machine translation models. |
Max Nelson; Shannon Wotherspoon; Francis Keith; William Hartmann; Matthew Snover; | arxiv-cs.CL | 2024-08-12 |
423 | A RAG-Based Question-Answering Solution for Cyber-Attack Investigation and Attribution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the constantly evolving field of cybersecurity, it is imperative for analysts to stay abreast of the latest attack trends and pertinent information that aids in the investigation and attribution of cyber-attacks. In this work, we introduce the first question-answering (QA) model and its application, which provides cybersecurity experts with information about cyber-attack investigation and attribution. |
Sampath Rajapaksha; Ruby Rani; Erisa Karafili; | arxiv-cs.CR | 2024-08-12 |
424 | Chain of Condition: Construct, Verify and Solve Conditions for Conditional Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing approaches struggle with CQA due to two challenges: (1) precisely identifying necessary conditions and the logical relationship, and (2) verifying conditions to detect any that are missing. In this paper, we propose a novel prompting approach, Chain of condition, by first identifying all conditions and constructing their logical relationships explicitly according to the document, then verifying whether these conditions are satisfied, finally solving the logical expression to indicate any missing conditions and generating the answer accordingly. |
Jiuheng Lin; Yuxuan Lai; Yansong Feng; | arxiv-cs.CL | 2024-08-10 |
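The described construct-verify-solve pipeline maps naturally onto a four-step prompt. A hypothetical skeleton following those steps (the wording is ours; the authors' actual prompt is not reproduced here):

```python
# Hypothetical prompt skeleton mirroring the chain-of-condition steps.
CHAIN_OF_CONDITION = """Document: {document}
Question: {question}

Step 1: List every condition the document attaches to the answer.
Step 2: Combine the conditions into a logical expression (AND/OR).
Step 3: For each condition, state whether the question satisfies it.
Step 4: Solve the expression; give the answer and any missing conditions."""

prompt = CHAIN_OF_CONDITION.format(
    document="Visitors enter free if they are under 18 or hold a student ID.",
    question="Can a 20-year-old student enter for free?",
)
print(prompt)
```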
425 | Evaluating The Capability of Large Language Models to Personalize Science Texts for Diverse Middle-school-age Learners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, GPT-4 was used to profile student learning preferences based on choices made during a training session. |
Michael Vaccaro Jr; Mikayla Friday; Arash Zaghi; | arxiv-cs.HC | 2024-08-09 |
426 | Retrieval-augmented Code Completion for Local Projects Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on using LLMs with around 160 million parameters that are suitable for local execution and augmentation with retrieval from local projects. |
Marko Hostnik; Marko Robnik-Šikonja; | arxiv-cs.SE | 2024-08-09 |
427 | From Text to Insight: Leveraging Large Language Models for Performance Evaluation in Management Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through comparative analyses across two studies, including various task performance outputs, we demonstrate that LLMs can serve as a reliable and even superior alternative to human raters in evaluating knowledge-based performance outputs, which are a key contribution of knowledge workers. |
Ning Li; Huaikang Zhou; Mingze Xu; | arxiv-cs.CL | 2024-08-09 |
428 | Multi-Class Intrusion Detection Based on Transformer for IoT Networks Using CIC-IoT-2023 Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study uses deep learning methods to explore the Internet of Things (IoT) network intrusion detection method based on the CIC-IoT-2023 dataset. This dataset contains extensive … |
Shu-Ming Tseng; Yan-Qi Wang; Yung-Chung Wang; | Future Internet | 2024-08-08 |
429 | Transformer Explainer: Interactive Learning of Text-Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Transformer Explainer, an interactive visualization tool designed for non-experts to learn about Transformers through the GPT-2 model. |
AEREE CHO et. al. | arxiv-cs.LG | 2024-08-08 |
430 | Enhancing Journalism with AI: A Study of Contextualized Image Captioning for News Articles Using LLMs and LMMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores how LLMs and LMMs can assist journalistic practice by generating contextualised captions for images accompanying news articles. |
Aliki Anagnostopoulou; Thiago Gouvea; Daniel Sonntag; | arxiv-cs.CL | 2024-08-08 |
431 | Towards Explainable Network Intrusion Detection Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current state-of-the-art NIDS rely on artificial benchmarking datasets, resulting in skewed performance when applied to real-world networking environments. Therefore, we compare the GPT-4 and LLama3 models against traditional architectures and transformer-based models to assess their ability to detect malicious NetFlows without depending on artificially skewed datasets, but solely on their vast pre-trained acquired knowledge. |
Paul R. B. Houssel; Priyanka Singh; Siamak Layeghy; Marius Portmann; | arxiv-cs.CR | 2024-08-08 |
432 | Is Child-Directed Speech Effective Training Data for Language Models? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: What are the features of the data they receive, and how do these features support language modeling objectives? To investigate this question, we train GPT-2 and RoBERTa models on 29M words of English child-directed speech and a new matched, synthetic dataset (TinyDialogues), comparing to OpenSubtitles, Wikipedia, and a heterogeneous blend of datasets from the BabyLM challenge. |
Steven Y. Feng; Noah D. Goodman; Michael C. Frank; | arxiv-cs.CL | 2024-08-07 |
433 | Bi-Level Spatial and Channel-aware Transformer for Learned Image Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these nonlinear approaches frequently overlook the frequency characteristics of images, which limits their compression efficiency. To address this issue, we propose a novel Transformer-based image compression method that enhances the transformation stage by considering frequency components within the feature map. |
Hamidreza Soltani; Erfan Ghasemi; | arxiv-cs.CV | 2024-08-07 |
434 | A Comparison of LLM Finetuning Methods & Evaluation Metrics with Travel Chatbot Use Case Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We used two pretrained LLMs for our fine-tuning research: LLaMa 2 7B and Mistral 7B. |
Sonia Meyer; Shreya Singh; Bertha Tam; Christopher Ton; Angel Ren; | arxiv-cs.CL | 2024-08-07 |
435 | Evaluating Source Code Quality with Large Language Models: A Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper aims to describe and analyze the results obtained using LLMs as a static analysis tool, evaluating the overall quality of code. |
Igor Regis da Silva Simões; Elaine Venson; | arxiv-cs.SE | 2024-08-07 |
436 | Image-to-LaTeX Converter for Mathematical Formulas and Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this project, we train a vision encoder-decoder model to generate LaTeX code from images of mathematical formulas and text. |
Daniil Gurgurov; Aleksey Morshnev; | arxiv-cs.CL | 2024-08-07 |
437 | Accuracy and Consistency of LLMs in The Registered Dietitian Exam: The Impact of Prompt Engineering and Knowledge Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose to employ the Registered Dietitian (RD) exam to conduct a standard and comprehensive evaluation of state-of-the-art LLMs, GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro, assessing both accuracy and consistency in nutrition queries. |
Iman Azimi; Mohan Qi; Li Wang; Amir M. Rahmani; Youlin Li; | arxiv-cs.CL | 2024-08-06 |
438 | Training LLMs to Recognize Hedges in Spontaneous Narratives Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: After an error analysis on the top performing approaches, we used an LLM-in-the-Loop approach to improve the gold standard coding, as well as to highlight cases in which hedges are ambiguous in linguistically interesting ways that will guide future research. |
Amie J. Paige; Adil Soubki; John Murzaku; Owen Rambow; Susan E. Brennan; | arxiv-cs.CL | 2024-08-06 |
439 | HeTraX: Energy Efficient 3D Heterogeneous Manycore Architecture for Transformer Acceleration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose the design of a three-dimensional heterogeneous architecture referred to as HeTraX specifically optimized to accelerate end-to-end transformer models. |
Pratyush Dhingra; Janardhan Rao Doppa; Partha Pratim Pande; | arxiv-cs.AR | 2024-08-06 |
440 | Evaluating The Translation Performance of Large Language Models Based on Euas-20 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the significant progress in translation performance achieved by large language models, machine translation still faces many challenges. Therefore, in this paper, we construct the Euas-20 dataset to evaluate, for researchers and developers, the translation performance of large language models, their translation ability across different languages, and the effect of pre-training data on that ability. |
Yan Huang; Wei Liu; | arxiv-cs.CL | 2024-08-06 |
441 | PTM4Tag+: Tag Recommendation of Stack Overflow Posts with Pre-trained Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the recent success of pre-trained models (PTMs) in natural language processing (NLP), we present PTM4Tag+, a tag recommendation framework for Stack Overflow posts that utilizes PTMs in language modeling. |
JUNDA HE et. al. | arxiv-cs.SE | 2024-08-05 |
442 | Context Conquers Parameters: Outperforming Proprietary LLM in Commit Message Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the use of proprietary LLMs like GPT-4 in coding tasks raises privacy and sustainability concerns, which may hinder their industrial adoption. Considering that open-source LLMs have achieved competitive performance in developer tasks such as compiler validation, this study investigates whether they can be used to generate commit messages that are comparable with OMG. |
Aaron Imani; Iftekhar Ahmed; Mohammad Moshirpour; | arxiv-cs.SE | 2024-08-05 |
443 | Evaluating The Performance of Large Language Models for SDG Mapping (Technical Report) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we compare the performance of various language models on the Sustainable Development Goal (SDG) mapping task, using the output of GPT-4o as the baseline. |
Hui Yin; Amir Aryani; Nakul Nambiar; | arxiv-cs.LG | 2024-08-04 |
444 | AVESFormer: Efficient Transformer Design for Real-Time Audio-Visual Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce AVESFormer, the first real-time Audio-Visual Efficient Segmentation transformer that is simultaneously fast, efficient, and lightweight. |
ZILI WANG et. al. | arxiv-cs.CV | 2024-08-03 |
445 | MiniCPM-V: A GPT-4V Level MLLM on Your Phone IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present MiniCPM-V, a series of efficient MLLMs deployable on end-side devices. |
YUAN YAO et. al. | arxiv-cs.CV | 2024-08-03 |
446 | QFormer: An Efficient Quaternion Transformer for Image Denoising Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Secondly, DCNN- or Transformer-based image denoising models usually have a large number of parameters, high computational complexity, and slow inference speed. To resolve these issues, this paper proposes a highly efficient Quaternion Transformer (QFormer) for image denoising. |
Bo Jiang; Yao Lu; Guangming Lu; Bob Zhang; | ijcai | 2024-08-03 |
447 | Class-consistent Contrastive Learning Driven Cross-dimensional Transformer for 3D Medical Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer emerges as an active research topic in medical image analysis. Yet, three substantial challenges limit the effectiveness of both 2D and 3D Transformers in 3D medical … |
Qikui Zhu; Chuan Fu; Shuo Li; | ijcai | 2024-08-03 |
448 | FreqFormer: Frequency-aware Transformer for Lightweight Image Super-resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the significant progress in SR, Transformer-based SR methods (e.g., SwinIR) still suffer from the problems of heavy computation cost and low-frequency preference, while ignoring the reconstruction of rich high-frequency information, hence hindering the representational power of Transformers. To address these issues, in this paper, we propose a novel Frequency-aware Transformer (FreqFormer) for lightweight image SR. |
TAO DAI et. al. | ijcai | 2024-08-03 |
449 | Cross-Problem Learning for Solving Vehicle Routing Problems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes the cross-problem learning to assist heuristics training for different downstream VRP variants. |
ZHUOYI LIN et. al. | ijcai | 2024-08-03 |
450 | X-Light: Cross-City Traffic Signal Control Using Transformer on Transformer As Meta Multi-Agent Reinforcement Learner Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, a persisting issue remains: how can we obtain a multi-agent traffic signal control algorithm with remarkable transferability across diverse cities? In this paper, we propose a Transformer on Transformer (TonT) model for cross-city meta multi-agent traffic signal control, named X-Light: we input full Markov Decision Process trajectories; the Lower Transformer aggregates the states, actions, and rewards among the target intersection and its neighbors within a city; and the Upper Transformer learns general decision trajectories across different cities. |
HAOYUAN JIANG et. al. | ijcai | 2024-08-03 |
451 | TCR-GPT: Integrating Autoregressive Model and Reinforcement Learning for T-Cell Receptor Repertoires Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce TCR-GPT, a probabilistic model built on a decoder-only transformer architecture, designed to uncover and replicate sequence patterns in TCR repertoires. |
Yicheng Lin; Dandan Zhang; Yun Liu; | arxiv-cs.LG | 2024-08-02 |
452 | Reconsidering Degeneration of Token Embeddings with Definitions for Encoder-based Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: On the basis of this analysis, we propose DefinitionEMB, a method that utilizes definitions to re-construct isotropically distributed and semantics-related token embeddings for encoder-based PLMs while maintaining original robustness during fine-tuning. |
Ying Zhang; Dongyuan Li; Manabu Okumura; | arxiv-cs.CL | 2024-08-02 |
453 | Advancing Mental Health Pre-Screening: A New Custom GPT for Psychological Distress Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces ‘Psycho Analyst’, a custom GPT model based on OpenAI’s GPT-4, optimized for pre-screening mental health disorders. |
Jinwen Tang; Yi Shang; | arxiv-cs.CY | 2024-08-02 |
454 | High-Throughput Phenotyping of Clinical Text Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct a performance comparison of GPT-4 and GPT-3.5-Turbo. |
Daniel B. Hier; S. Ilyas Munzir; Anne Stahlfeld; Tayo Obafemi-Ajayi; Michael D. Carrithers; | arxiv-cs.CL | 2024-08-02 |
455 | Efficacy of Large Language Models in Systematic Reviews Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the effectiveness of Large Language Models (LLMs) in interpreting existing literature through a systematic review of the relationship between Environmental, Social, and Governance (ESG) factors and financial performance. |
Aaditya Shah; Shridhar Mehendale; Siddha Kanthi; | arxiv-cs.CL | 2024-08-02 |
456 | Toward Automatic Relevance Judgment Using Vision–Language Models for Image–Text Retrieval Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Vision–Language Models (VLMs) have demonstrated success across diverse applications, yet their potential to assist in relevance judgments remains uncertain. |
Jheng-Hong Yang; Jimmy Lin; | arxiv-cs.IR | 2024-08-02 |
457 | Multilevel Intrusion Detection Based on Transformer and Wavelet Transform for IoT Data Security Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The Internet of Things (IoT) technology and systems have penetrated every aspect of our lives and generated enormous economic benefits. At the same time, research on the data … |
Peifeng Liang; Lina Yang; Z. Xiong; Xuemin Zhang; Gang Liu; | IEEE Internet of Things Journal | 2024-08-01 |
458 | Transformer-Based Reinforcement Learning for Scalable Multi-UAV Area Coverage Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Compared with terrestrial networks, unmanned aerial vehicles (UAVs) have the characteristics of flexible deployment and strong adaptability, which are an important supplement to … |
DEZHI CHEN et. al. | IEEE Transactions on Intelligent Transportation Systems | 2024-08-01 |
459 | MNAT-Net: Multi-Scale Neighborhood Aggregation Transformer Network for Point Cloud Classification and Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Accurate understanding of 3D objects in complex scenes plays essential roles in the fields of intelligent transportation and autonomous driving technology. Recent deep neural … |
Xuchu Wang; Yue Yuan; | IEEE Transactions on Intelligent Transportation Systems | 2024-08-01 |
460 | Leveraging Large Language Models (LLMs) for Traffic Management at Urban Intersections: The Case of Mixed Traffic Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the ability of a Large Language Model (LLM), specifically GPT-4o-mini, to improve traffic management at urban intersections. |
Sari Masri; Huthaifa I. Ashqar; Mohammed Elhenawy; | arxiv-cs.CL | 2024-08-01 |
461 | MAE-EEG-Transformer: A Transformer-based Approach Combining Masked Autoencoder and Cross-individual Data Augmentation Pre-training for EEG Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Miao Cai; Yu Zeng; | Biomed. Signal Process. Control. | 2024-08-01 |
462 | Granting GPT-4 License and Opportunity: Enhancing Accuracy and Confidence Estimation for Few-Shot Event Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The present effort explores methods for effective confidence estimation with GPT-4, using few-shot learning for event detection in the BETTER ontology as a vehicle. |
Steven Fincke; Adrien Bibal; Elizabeth Boschee; | arxiv-cs.AI | 2024-08-01 |
463 | OmniParser for Pure Vision Based GUI Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we argue that the power of multimodal models like GPT-4V as a general agent on multiple operating systems across different applications is largely underestimated due to the lack of a robust screen parsing technique capable of: 1) reliably identifying interactable icons within the user interface, and 2) understanding the semantics of various elements in a screenshot and accurately associating the intended action with the corresponding region on the screen. To fill these gaps, we introduce OmniParser, a comprehensive method for parsing user interface screenshots into structured elements, which significantly enhances the ability of GPT-4V to generate actions that can be accurately grounded in the corresponding regions of the interface. |
Yadong Lu; Jianwei Yang; Yelong Shen; Ahmed Awadallah; | arxiv-cs.CV | 2024-07-31 |
464 | The Llama 3 Herd of Models IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a new set of foundation models, called Llama 3. |
ABHIMANYU DUBEY et. al. | arxiv-cs.AI | 2024-07-31 |
465 | Generative Expressive Conversational Speech Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In addition, due to the limitations of small-scale datasets containing scripted recording styles, they often fail to simulate real natural conversational styles. To address the above issues, we propose a novel generative expressive CSS system, termed GPT-Talker. |
Rui Liu; Yifan Hu; Yi Ren; Xiang Yin; Haizhou Li; | arxiv-cs.CL | 2024-07-31 |
466 | Automated Software Vulnerability Static Code Analysis Using Generative Pre-Trained Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Ultimately, we find that the GPT models that we evaluated are not suitable for fully automated vulnerability scanning because the false positive and false negative rates are too high to likely be useful in practice. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CR | 2024-07-31 |
467 | Performance of Recent Large Language Models for A Low-Resourced Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) have shown significant advances in the past year. |
Ravindu Jayakody; Gihan Dias; | arxiv-cs.CL | 2024-07-31 |
468 | Robust Load Prediction of Power Network Clusters Based on Cloud-Model-Improved Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The Cloud Model Improved Transformer (CMIT) method presents an innovative approach, integrating the Transformer model with the cloud model via the particle swarm optimization algorithm, with the aim of achieving robust and precise power load predictions. |
Cheng Jiang; Gang Lu; Xue Ma; Di Wu; | arxiv-cs.LG | 2024-07-30 |
469 | Interpretable Pre-Trained Transformers for Heart Time-Series Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we employ this framework to the analysis of clinical heart time-series data, to create two pre-trained general purpose cardiac models, termed PPG-PT and ECG-PT. |
Harry J. Davies; James Monsen; Danilo P. Mandic; | arxiv-cs.LG | 2024-07-30 |
470 | Comparison of Large Language Models for Generating Contextually Relevant Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The contribution of this research is the analysis of the capacity of LLMs for Automatic Question Generation in education. |
IVO LODOVICO MOLINA et. al. | arxiv-cs.CL | 2024-07-30 |
471 | Enhancing Agricultural Machinery Management Through Advanced LLM Integration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel approach that leverages large language models (LLMs), particularly GPT-4, combined with multi-round prompt engineering to enhance decision-making processes in agricultural machinery management. |
Emily Johnson; Noah Wilson; | arxiv-cs.CL | 2024-07-30 |
472 | Legal Minds, Algorithmic Decisions: How LLMs Apply Constitutional Principles in Complex Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conduct an empirical analysis of how large language models (LLMs), specifically GPT-4, interpret constitutional principles in complex decision-making scenarios. |
Camilla Bignotti; Carolina Camassa; | arxiv-cs.CL | 2024-07-29 |
473 | Sentiment Analysis of Lithuanian Online Reviews Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address sentiment analysis of Lithuanian five-star-based online reviews from multiple domains that we collect and clean. |
Brigita Vileikytė; Mantas Lukoševičius; Lukas Stankevičius; | arxiv-cs.CL | 2024-07-29 |
474 | DuA: Dual Attentive Transformer in Long-Term Continuous EEG Emotion Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these methods encounter significant challenges in real-life scenarios where emotional states evolve over extended periods. To address this issue, we propose a Dual Attentive (DuA) transformer framework for long-term continuous EEG emotion analysis. |
YUE PAN et. al. | arxiv-cs.HC | 2024-07-29 |
475 | Survey and Taxonomy: The Role of Data-Centric AI in Transformer-Based Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there is a gap regarding the integration of transformer-based TSF and data-centric AI. This survey aims to close this gap via an extensive literature review based on the proposed taxonomy. |
Jingjing Xu; Caesar Wu; Yuan-Fang Li; Gregoire Danoy; Pascal Bouvry; | arxiv-cs.LG | 2024-07-29 |
476 | AutoScale: Automatic Prediction of Compute-optimal Data Composition for Training LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We derive an analytical model for the dependence of optimal weights on data scale and introduce *AutoScale*, a novel, practical approach for optimizing data compositions at potentially large training data scales. |
FEIYANG KANG et. al. | arxiv-cs.LG | 2024-07-29 |
477 | Motamot: A Dataset for Revealing The Supremacy of Large Language Models Over Transformer Models in Bengali Political Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we investigate political sentiment analysis during Bangladeshi elections, specifically examining how effectively Pre-trained Language Models (PLMs) and Large Language Models (LLMs) capture complex sentiment characteristics. |
FATEMA TUJ JOHORA FARIA et. al. | arxiv-cs.CL | 2024-07-28 |
478 | The Impact of LoRA Adapters for LLMs on Clinical NLP Classification Under Data Limitations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning Large Language Models (LLMs) for clinical Natural Language Processing (NLP) poses significant challenges due to the domain gap and limited data availability. |
Thanh-Dung Le; Ti Ti Nguyen; Vu Nguyen Ha; | arxiv-cs.CL | 2024-07-27 |
479 | FarSSiBERT: A Novel Transformer-based Model for Semantic Similarity Measurement of Persian Social Networks Informal Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a new transformer-based model to measure semantic similarity between Persian informal short texts from social networks. |
Seyed Mojtaba Sadjadi; Zeinab Rajabi; Leila Rabiei; Mohammad-Shahram Moin; | arxiv-cs.CL | 2024-07-27 |
480 | GPT Deciphering Fedspeak: Quantifying Dissent Among Hawks and Doves Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We use GPT-4 to quantify dissent among members on the topic of inflation. |
DENIS PESKOFF et. al. | arxiv-cs.AI | 2024-07-26 |
481 | QT-TDM: Planning with Transformer Dynamics Model and Autoregressive Q-Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the success of the Transformer architecture in natural language processing and computer vision, we investigate the use of Transformers in Reinforcement Learning (RL), specifically in modeling the environment’s dynamics using Transformer Dynamics Models (TDMs). |
Mostafa Kotb; Cornelius Weber; Muhammad Burhan Hafez; Stefan Wermter; | arxiv-cs.LG | 2024-07-26 |
482 | Is Larger Always Better? Evaluating and Prompting Large Language Models for Non-generative Medical Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study benchmarks various models, including GPT-based LLMs, BERT-based models, and traditional clinical predictive models, for non-generative medical tasks utilizing renowned datasets. |
YINGHAO ZHU et. al. | arxiv-cs.CL | 2024-07-26 |
483 | Using GPT-4 to Guide Causal Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we are interested in the ability of LLMs to identify causal relationships. |
Anthony C. Constantinou; Neville K. Kitson; Alessio Zanga; | arxiv-cs.AI | 2024-07-26 |
484 | Automatic Detection of Moral Values in Music Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moral values play a fundamental role in how we evaluate information, make decisions, and form judgements around important social issues. |
Vjosa Preniqi; Iacopo Ghinassi; Julia Ive; Kyriaki Kalimeri; Charalampos Saitis; | arxiv-cs.CY | 2024-07-26 |
485 | Gene Regulatory Network Inference from Pre-trained Single-Cell Transcriptomics Transformer with Joint Graph Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel joint graph learning approach that combines the rich contextual representations learned by pre-trained single-cell language models with the structured knowledge encoded in GRNs using graph neural networks (GNNs). |
Sindhura Kommu; Yizhi Wang; Yue Wang; Xuan Wang; | arxiv-cs.LG | 2024-07-25 |
486 | HDL-GPT: High-Quality HDL Is All You Need Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents Hardware Description Language Generative Pre-trained Transformers (HDL-GPT), a novel approach that leverages the vast repository of open-source Hardware Description Language (HDL) code to train superior-quality large code models. |
BHUVNESH KUMAR et. al. | arxiv-cs.LG | 2024-07-25 |
487 | The Power of Combining Data and Knowledge: GPT-4o Is An Effective Interpreter of Machine Learning Models in Predicting Lymph Node Metastasis of Lung Cancer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel ensemble method that combines the medical knowledge acquired by LLMs with the latent patterns identified by machine learning models to enhance LNM prediction performance. |
Danqing Hu; Bing Liu; Xiaofeng Zhu; Nan Wu; | arxiv-cs.CL | 2024-07-25 |
488 | My Ontologist: Evaluating BFO-Based AI for Definition Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through iterative development of a specialized GPT model named My Ontologist, we aimed to generate BFO-conformant ontologies. |
Carter Benson; Alec Sculley; Austin Liebers; John Beverley; | arxiv-cs.DB | 2024-07-24 |
489 | Testing Large Language Models on Driving Theory Knowledge and Skills for Connected Autonomous Vehicles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate a new approach of applying remote or edge LLMs to support autonomous driving. |
Zuoyin Tang; Jianhua He; Dashuai Pei; Kezhong Liu; Tao Gao; | arxiv-cs.AI | 2024-07-24 |
490 | Cost-effective Instruction Learning for Pathology Vision and Language Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here we propose a cost-effective instruction learning framework for conversational pathology named CLOVER. |
KAITAO CHEN et. al. | arxiv-cs.AI | 2024-07-24 |
491 | SDoH-GPT: Using Large Language Models to Extract Social Determinants of Health (SDoH) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study we introduced SDoH-GPT, a simple and effective few-shot Large Language Model (LLM) method leveraging contrastive examples and concise instructions to extract SDoH without relying on extensive medical annotations or costly human intervention. |
BERNARDO CONSOLI et. al. | arxiv-cs.CL | 2024-07-24 |
492 | Artificial Intelligence in Extracting Diagnostic Data from Dental Records Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research addresses the issue of missing structured data in dental records by extracting diagnostic information from unstructured text. |
YAO-SHUN CHUANG et. al. | arxiv-cs.CL | 2024-07-23 |
493 | Can Large Language Models Automatically Jailbreak GPT-4V? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our study, we introduce AutoJailbreak, an innovative automatic jailbreak technique inspired by prompt optimization. |
YUANWEI WU et. al. | arxiv-cs.CL | 2024-07-23 |
494 | OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While open-source LLMs offer solutions to these concerns, they typically underperform commercial models in RTL code generation tasks, primarily due to the scarcity of high-quality open-source RTL datasets. To address this challenge, we introduce OriGen, a fully open-source framework that incorporates self-reflection capabilities and a novel dataset augmentation methodology for generating high-quality, large-scale RTL code. |
FAN CUI et. al. | arxiv-cs.AR | 2024-07-23 |
495 | Inverted Activations: Reducing Memory Footprint in Neural Network Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a modification to the handling of activation tensors in pointwise nonlinearity layers. |
Georgii Novikov; Ivan Oseledets; | arxiv-cs.LG | 2024-07-22 |
496 | RadioRAG: Factual Large Language Models for Enhanced Diagnostics in Radiology Using Dynamic Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) have advanced the field of artificial intelligence (AI) in medicine. |
SOROOSH TAYEBI ARASTEH et. al. | arxiv-cs.CL | 2024-07-22 |
497 | Can GPT-4 Learn to Analyse Moves in Research Article Abstracts? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we employ the affordances of GPT-4 to automate the annotation process by using natural language prompts. |
Danni Yu; Marina Bondi; Ken Hyland; | arxiv-cs.CL | 2024-07-22 |
498 | Dissecting Multiplication in Transformers: Insights Into LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on observation and analysis, we infer that the deficiency of transformers in multiplication tasks stems from their difficulty in calculating successive carryovers and caching intermediate results, and we confirm this inference through experiments. Guided by these findings, we propose improvements to enhance transformer performance on multiplication tasks. |
Luyu Qiu; Jianing Li; Chi Su; Chen Jason Zhang; Lei Chen; | arxiv-cs.CL | 2024-07-22 |
499 | KWT-Tiny: RISC-V Accelerated, Embedded Keyword Spotting Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the adaptation of Transformer-based models for edge devices through the quantisation and hardware acceleration of the ARM Keyword Transformer (KWT) model on a RISC-V platform. |
Aness Al-Qawlaq; Ajay Kumar M; Deepu John; | arxiv-cs.AR | 2024-07-22 |
500 | Efficient Visual Transformer By Learnable Token Merging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel and compact transformer block, Transformer with Learnable Token Merging (LTM), or LTM-Transformer. |
Yancheng Wang; Yingzhen Yang; | arxiv-cs.CV | 2024-07-21 |
501 | Unipa-GPT: Large Language Models for University-oriented QA in Italian Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our experiments we adopted both the Retrieval Augmented Generation (RAG) approach and fine-tuning to develop the system. |
Irene Siragusa; Roberto Pirrone; | arxiv-cs.CL | 2024-07-19 |
502 | LLMs Left, Right, and Center: Assessing GPT’s Capabilities to Label Political Bias from Web Domains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given the subjective nature of political labels, third-party bias ratings like those from Ad Fontes Media, AllSides, and Media Bias/Fact Check (MBFC) are often used in research to analyze news source diversity. This study aims to determine if GPT-4 can replicate these human ratings on a seven-degree scale (far-left to far-right). |
Raphael Hernandes; Giulio Corsi; | arxiv-cs.CL | 2024-07-19 |
503 | Adaptive Foundation Models for Online Decisions: HyperAgent with Fast Incremental Uncertainty Estimation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce GPT-HyperAgent, an augmentation of GPT with HyperAgent for uncertainty-aware, scalable exploration in contextual bandits, a fundamental online decision problem involving natural language input. |
Yingru Li; Jiawei Xu; Zhi-Quan Luo; | arxiv-cs.LG | 2024-07-18 |
504 | Can Open-Source LLMs Compete with Commercial Models? Exploring The Few-Shot Performance of Current GPT Models in Biomedical Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We participated in the 12th BioASQ challenge, which is a retrieval augmented generation (RAG) setting, and explored the performance of the current models Claude 3 Opus, GPT-3.5-turbo, and Mixtral 8x7b with in-context learning (zero-shot, few-shot) and QLoRa fine-tuning. |
Samy Ateia; Udo Kruschwitz; | arxiv-cs.CL | 2024-07-18 |
505 | Evaluating Large Language Models for Anxiety and Depression Classification Using Counseling and Psychotherapy Transcripts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We aim to evaluate the efficacy of traditional machine learning and large language models (LLMs) in classifying anxiety and depression from long conversational transcripts. |
Junwei Sun; Siqi Ma; Yiran Fan; Peter Washington; | arxiv-cs.CL | 2024-07-18 |
506 | A Light-weight and Efficient Punctuation and Word Casing Prediction Model for On-device Streaming ASR Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a light-weight and efficient model that jointly predicts punctuation and word casing in real time. |
Jian You; Xiangfeng Li; | arxiv-cs.CL | 2024-07-18 |
507 | ARTEMIS: A Mixed Analog-Stochastic In-DRAM Accelerator for Transformer Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose ARTEMIS, a mixed analog-stochastic in-DRAM accelerator for transformer models. |
Salma Afifi; Ishan Thakkar; Sudeep Pasricha; | arxiv-cs.AR | 2024-07-17 |
508 | Sharif-STR at SemEval-2024 Task 1: Transformer As A Regression Model for Fine-Grained Scoring of Textual Semantic Relations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we delve into the investigation of sentence-level STR within Track A (Supervised) by leveraging fine-tuning techniques on the RoBERTa transformer. |
SEYEDEH FATEMEH EBRAHIMI et. al. | arxiv-cs.CL | 2024-07-17 |
509 | Frequency Guidance Matters: Skeletal Action Recognition By Frequency-Aware Mixed Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the existing transformer-based approaches heavily rely on the naive attention mechanism for capturing the spatiotemporal features, which falls short in learning discriminative representations for actions that exhibit similar motion patterns. To address this challenge, we introduce the Frequency-aware Mixed Transformer (FreqMixFormer), specifically designed for recognizing similar skeletal actions with subtle discriminative motions. |
WENHAN WU et. al. | arxiv-cs.CV | 2024-07-17 |
510 | LLMs-in-the-loop Part-1: Expert Small AI Models for Bio-Medical Text Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel LLMs-in-the-loop approach to develop supervised neural machine translation models optimized specifically for medical texts. |
Bunyamin Keles; Murat Gunay; Serdar I. Caglar; | arxiv-cs.CL | 2024-07-16 |
511 | Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we explore the effectiveness of fine-tuning a RoBERTa-base transformer, a powerful neural architecture, to address MGT detection as a binary classification task. |
SEYEDEH FATEMEH EBRAHIMI et. al. | arxiv-cs.CL | 2024-07-16 |
512 | Educational Personalized Learning Path Planning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite its potential, traditional PLPP systems often lack adaptability, interactivity, and transparency. This paper proposes a novel approach integrating Large Language Models (LLMs) with prompt engineering to address these challenges. |
Chee Ng; Yuen Fung; | arxiv-cs.CL | 2024-07-16 |
513 | Does Refusal Training in LLMs Generalize to The Past Tense? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We systematically evaluate this method on Llama-3 8B, Claude-3.5 Sonnet, GPT-3.5 Turbo, Gemma-2 9B, Phi-3-Mini, GPT-4o mini, GPT-4o, o1-mini, o1-preview, and R2D2 models using GPT-3.5 Turbo as a reformulation model. |
Maksym Andriushchenko; Nicolas Flammarion; | arxiv-cs.CL | 2024-07-16 |
514 | A Transformer-based Approach for Augmenting Software Engineering Chatbots Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, previous studies show that creating a high-quality training dataset for software engineering chatbots is expensive in terms of both resources and time. Therefore, in this paper, we present an automated transformer-based approach to augment software engineering chatbot datasets. |
Ahmad Abdellatif; Khaled Badran; Diego Elias Costa; Emad Shihab; | arxiv-cs.SE | 2024-07-16 |
515 | Rethinking Transformer-based Multi-document Summarization: An Empirical Investigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To thoroughly examine the behaviours of Transformer-based MDS models, this paper presents five empirical studies on (1) measuring the impact of document boundary separators quantitatively; (2) exploring the effectiveness of different mainstream Transformer structures; (3) examining the sensitivity of the encoder and decoder; (4) discussing different training strategies; and (5) discovering repetition in summary generation. |
Congbo Ma; Wei Emma Zhang; Dileepa Pitawela; Haojie Zhuang; Yanfeng Shu; | arxiv-cs.CL | 2024-07-16 |
516 | GPT Assisted Annotation of Rhetorical and Linguistic Features for Interpretable Propaganda Technique Detection in News Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our contribution is a set of features, their properties, definitions, and examples in a machine-readable format, along with the code for RhetAnn and the GPT prompts and fine-tuning procedures for advancing state-of-the-art interpretable propaganda technique detection. |
Kyle Hamilton; Luca Longo; Bojan Bozic; | arxiv-cs.CL | 2024-07-16 |
517 | ECoh: Turn-level Coherence Evaluation for Multilingual Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by the need for lightweight, open source, and multilingual dialogue evaluators, this paper introduces GenResCoh (Generated Responses targeting Coherence). |
John Mendonça; Isabel Trancoso; Alon Lavie; | arxiv-cs.CL | 2024-07-16 |
518 | R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, rigorous insights are provided into the influence of jamming LLM word embeddings in SFL by deriving an expression for the ML training loss divergence and showing that it is upper-bounded by the mean squared error (MSE). |
ALADIN DJUHERA et. al. | arxiv-cs.LG | 2024-07-16 |
519 | Review-Feedback-Reason (ReFeR): A Novel Framework for NLG Evaluation and Reasoning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Assessing the quality of Natural Language Generation (NLG) outputs, such as those produced by large language models (LLMs), poses significant challenges. Traditional approaches … |
Yaswanth Narsupalli; Abhranil Chandra; Sreevatsa Muppirala; Manish Gupta; Pawan Goyal; | ArXiv | 2024-07-16 |
520 | Scientific QA System with Verifiable Answers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce the VerifAI project, a pioneering open-source scientific question-answering system, designed to provide answers that are not only referenced but also automatically vetted and verifiable. |
ADELA LJAJIĆ et. al. | arxiv-cs.CL | 2024-07-16 |
521 | GPT-4V Cannot Generate Radiology Reports Yet Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we perform a systematic evaluation of GPT-4V in generating radiology reports on two chest X-ray report datasets: MIMIC-CXR and IU X-Ray. |
Yuyang Jiang; Chacha Chen; Dang Nguyen; Benjamin M. Mervak; Chenhao Tan; | arxiv-cs.CY | 2024-07-16 |
522 | Large Language Models As Misleading Assistants in Conversation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate the ability of LLMs to be deceptive in the context of providing assistance on a reading comprehension task, using LLMs as proxies for human users. |
BETTY LI HOU et. al. | arxiv-cs.CL | 2024-07-16 |
523 | GPT Sonography: Hand Gesture Decoding from Forearm Ultrasound Images Via VLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that GPT-4o can decode hand gestures from forearm ultrasound data even with no fine-tuning, and improves with few-shot, in-context learning. |
Keshav Bimbraw; Ye Wang; Jing Liu; Toshiaki Koike-Akino; | arxiv-cs.CV | 2024-07-15 |
524 | Beyond Binary: Multiclass Paraphasia Detection with Generative Pretrained Transformers and End-to-End Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present novel approaches that use a generative pretrained transformer (GPT) to identify paraphasias from transcripts as well as two end-to-end approaches that focus on modeling both automatic speech recognition (ASR) and paraphasia classification as multiple sequences vs. a single sequence. |
Matthew Perez; Aneesha Sampath; Minxue Niu; Emily Mower Provost; | arxiv-cs.CL | 2024-07-15 |
525 | Transformer-based Drum-level Prediction in A Boiler Plant with Delayed Relations Among Multivariates Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Leveraging the capabilities of Transformer architectures, this study aims to develop an accurate and robust predictive framework to anticipate water level fluctuations and facilitate proactive control strategies. |
Gang Su; Sun Yang; Zhishuai Li; | arxiv-cs.LG | 2024-07-15 |
526 | Leveraging LLM-Respondents for Item Evaluation: A Psychometric Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, item calibration is time-consuming and costly, requiring a sufficient number of respondents for the response process. We explore using six different LLMs (GPT-3.5, GPT-4, Llama 2, Llama 3, Gemini-Pro, and Cohere Command R Plus) and various combinations of them using sampling methods to produce responses with psychometric properties similar to human answers. |
Yunting Liu; Shreya Bhandari; Zachary A. Pardos; | arxiv-cs.CY | 2024-07-15 |
527 | Generalizable Tip-of-the-Tongue Retrieval with LLM Re-ranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper studies the generalization capabilities of existing retrieval methods with ToT queries in multiple domains. |
Lu\'{\i}s Borges; Rohan Jha; Jamie Callan; Bruno Martins; | sigir | 2024-07-14 |
528 | DistillSeq: A Framework for Safety Alignment Testing in Large Language Models Using Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Subsequently, we deploy two distinct strategies for generating malicious queries: one based on a syntax tree approach, and the other leveraging an LLM-based method. |
Mingke Yang; Yuqi Chen; Yi Liu; Ling Shi; | arxiv-cs.SE | 2024-07-14 |
529 | Beyond Accuracy: Investigating Error Types in GPT-4 Responses to USMLE Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, current works use GPT-4 to only predict the correct option without providing any explanation and thus do not provide any insight into the thinking process and reasoning used by GPT-4 or other LLMs. Therefore, we introduce a new domain-specific error taxonomy derived from collaboration with medical students. |
SOUMYADEEP ROY et. al. | sigir | 2024-07-14 |
530 | Legal Statute Identification: A Case Study Using State-of-the-Art Datasets and Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we reproduce several LSI models on two popular LSI datasets and study the effect of the above-mentioned challenges. |
Shounak Paul; Rajas Bhatt; Pawan Goyal; Saptarshi Ghosh; | sigir | 2024-07-14 |
531 | Enhancing Emotion Prediction in News Headlines: Insights from ChatGPT and Seq2Seq Models for Free-Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The free-text explanations also contain more sentimental context than the news headlines alone and can serve as a better input to emotion classification models. Therefore, in this work we explored generating emotion explanations from headlines by training a sequence-to-sequence transformer model and by using a pretrained large language model, ChatGPT (GPT-4). |
GE GAO et. al. | arxiv-cs.CL | 2024-07-14 |
532 | Reflections on The Coding Ability of LLMs for Analyzing Market Research Surveys Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present the first systematic study of applying large language models (in our case, GPT-3.5 and GPT-4) for the automatic coding (multi-class classification) problem in market research. |
Shi Zong; Santosh Kolagati; Amit Chaudhary; Josh Seltzer; Jimmy Lin; | sigir | 2024-07-14 |
533 | CodeV: Empowering LLMs for Verilog Generation Through Multi-Level Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: (2) LLMs like GPT-3.5 excel in summarizing Verilog code rather than generating it. Based on these observations, this paper introduces CodeV, a series of open-source instruction-tuned Verilog generation LLMs. |
YANG ZHAO et. al. | arxiv-cs.PL | 2024-07-14 |
534 | Surprising Efficacy of Fine-Tuned Transformers for Fact-Checking Over Larger Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the challenges associated with establishing an end-to-end fact-checking pipeline in a real-world context, covering over 90 languages. |
Vinay Setty; | sigir | 2024-07-14 |
535 | Drop Your Decoder: Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The usage of additional Transformer-based decoders also incurs significant computational costs. In this study, we aim to shed light on this issue by revealing that masked auto-encoder (MAE) pre-training with enhanced decoding significantly improves the term coverage of input tokens in dense representations, compared to vanilla BERT checkpoints. |
Guangyuan Ma; Xing Wu; Zijia Lin; Songlin Hu; | sigir | 2024-07-14 |
536 | Graph Transformers: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Beyond technical analysis, we discuss the applications of graph transformer models for node-level, edge-level, and graph-level tasks, exploring their potential in other application scenarios as well. |
AHSAN SHEHZAD et. al. | arxiv-cs.LG | 2024-07-13 |
537 | Document-level Clinical Entity and Relation Extraction Via Knowledge Base-Guided Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Generative pre-trained transformer (GPT) models have shown promise in clinical entity and relation extraction tasks because of their precise extraction and contextual understanding capability. |
Kriti Bhattarai; Inez Y. Oh; Zachary B. Abrams; Albert M. Lai; | arxiv-cs.CL | 2024-07-13 |
538 | Causality Extraction from Medical Text Using Large Language Models (LLMs) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the potential of natural language models, including large language models, to extract causal relations from medical texts, specifically from Clinical Practice Guidelines (CPGs). |
Seethalakshmi Gopalakrishnan; Luciana Garbayo; Wlodek Zadrozny; | arxiv-cs.CL | 2024-07-13 |
539 | Robustness of LLMs to Perturbations in Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To that end, we artificially introduce varying levels of noise into a diverse set of datasets and systematically evaluate LLMs’ robustness against the corrupt variations of the original text. |
Ayush Singh; Navpreet Singh; Shubham Vatsal; | arxiv-cs.CL | 2024-07-12 |
540 | ASTPrompter: Weakly Supervised Automated Language Model Red-Teaming to Identify Likely Toxic Prompts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we propose a reinforcement learning formulation of the LLM red-teaming task that allows us to discover prompts that both (1) trigger toxic outputs from a frozen defender and (2) have low perplexity as scored by that defender. |
Amelia F. Hardy; Houjun Liu; Bernard Lange; Mykel J. Kochenderfer; | arxiv-cs.CL | 2024-07-12 |
541 | EVOLVE: Predicting User Evolution and Network Dynamics in Social Media Using Fine-Tuned GPT-like Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we propose a predictive method to understand how a user evolves on social media throughout their life and to forecast the next stage of their evolution. |
Ismail Hossain; Md Jahangir Alam; Sai Puppala; Sajedul Talukder; | arxiv-cs.SI | 2024-07-12 |
542 | The Two Sides of The Coin: Hallucination Generation and Detection with LLMs As Evaluators for LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents our participation in the CLEF ELOQUENT HalluciGen shared task, where the goal is to develop evaluators for both generating and detecting hallucinated content. |
ANH THU MARIA BUI et. al. | arxiv-cs.AI | 2024-07-12 |
543 | Movie Recommendation with Poster Attention Via Multi-modal Transformer Feature Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a multi-modal movie recommendation system that extracts features from each movie's well-designed poster and its narrative text description. |
Linhan Xia; Yicheng Yang; Ziou Chen; Zheng Yang; Shengxin Zhu; | arxiv-cs.IR | 2024-07-12 |
544 | On Exact Bit-level Reversible Transformers Without Changing Architectures Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we present the BDIA-transformer, which is an exact bit-level reversible transformer that uses an unchanged standard architecture for inference. |
Guoqiang Zhang; J. P. Lewis; W. B. Kleijn; | arxiv-cs.LG | 2024-07-12 |
545 | Show, Don’t Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To evaluate the models’ ability to generalize beyond their training data, we introduce two additional games. |
Gonçalo Hora de Carvalho; Oscar Knap; Robert Pollice; | arxiv-cs.AI | 2024-07-12 |
546 | Detect Llama — Finding Vulnerabilities in Smart Contracts Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we test the hypothesis that although OpenAI’s GPT-4 performs well generally, we can fine-tune open-source models to outperform GPT-4 in smart contract vulnerability detection. |
Peter Ince; Xiapu Luo; Jiangshan Yu; Joseph K. Liu; Xiaoning Du; | arxiv-cs.CR | 2024-07-11 |
547 | LLMs’ Morphological Analyses of Complex FST-generated Finnish Words Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work aims to shed light on the issue by evaluating state-of-the-art LLMs in a task of morphological analysis of complex Finnish noun forms. |
Anssi Moisio; Mathias Creutz; Mikko Kurimo; | arxiv-cs.CL | 2024-07-11 |
548 | GPT-4 Is Judged More Human Than Humans in Displaced and Inverted Turing Tests Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We found that both AI and displaced human judges were less accurate than interactive interrogators, with below chance accuracy overall. |
Ishika Rathi; Sydney Taylor; Benjamin K. Bergen; Cameron R. Jones; | arxiv-cs.HC | 2024-07-11 |
549 | Teaching Transformers Causal Reasoning Through Axiomatic Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since interventional data is costly to generate, we study to what extent an agent can learn causal reasoning from passive data. |
Aniket Vashishtha; Abhinav Kumar; Abbavaram Gowtham Reddy; Vineeth N Balasubramanian; Amit Sharma; | arxiv-cs.LG | 2024-07-10 |
550 | FsPONER: Few-shot Prompt Optimization for Named Entity Recognition in Domain-specific Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, none of the previous approaches has investigated the efficiency of LLM-based few-shot learning in domain-specific scenarios. To address this gap, we introduce FsPONER, a novel approach for optimizing few-shot prompts, and evaluate its performance on domain-specific NER datasets, with a focus on industrial manufacturing and maintenance, while using multiple LLMs — GPT-4-32K, GPT-3.5-Turbo, LLaMA 2-chat, and Vicuna. |
Yongjian Tang; Rakebul Hasan; Thomas Runkler; | arxiv-cs.CL | 2024-07-10 |
551 | ROSA: Random Subspace Adaptation for Efficient Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we propose Random Subspace Adaptation (ROSA), a method that outperforms previous PEFT methods by a significant margin, while maintaining a zero latency overhead during inference time. |
Marawan Gamal Abdel Hameed; Aristides Milios; Siva Reddy; Guillaume Rabusseau; | arxiv-cs.LG | 2024-07-10 |
552 | PEER: Expertizing Domain-Specific Tasks with A Multi-Agent Framework and Tuning Methods Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: High performance requires sophisticated processing techniques, yet managing multiple agents within a complex workflow often proves costly and challenging. To address this, we introduce the PEER (Plan, Execute, Express, Review) multi-agent framework. |
YIYING WANG et. al. | arxiv-cs.AI | 2024-07-09 |
553 | Prompting Techniques for Secure Code Generation: A Systematic Investigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the impact of different prompting techniques on the security of code generated from natural language (NL) instructions by LLMs. |
Catherine Tony; Nicolás E. Díaz Ferreyra; Markus Mutas; Salem Dhiff; Riccardo Scandariato; | arxiv-cs.SE | 2024-07-09 |
554 | Mixture-of-Modules: Reinventing Transformers As Dynamic Assemblies of Modules Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that MoM provides not only a unified framework for Transformers and their numerous variants but also a flexible and learnable approach for reducing redundancy in Transformer parameterization. |
ZHUOCHENG GONG et. al. | arxiv-cs.CL | 2024-07-09 |
555 | Using Large Language Models for Generating Smart Contracts for Health Insurance from Textual Policies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To assess the LLM output, we propose completeness, soundness, clarity, syntax, and functioning code as metrics. |
Inwon Kang; William Van Woensel; Oshani Seneviratne; | arxiv-cs.CL | 2024-07-09 |
556 | A Comparison of Vulnerability Feature Extraction Methods from Textual Attack Patterns Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we examine five feature extraction methods (TF-IDF, LSI, BERT, MiniLM, RoBERTa) and find that Term Frequency-Inverse Document Frequency (TF-IDF) outperforms the other four methods with a precision of 75% and an F1 score of 64%. |
Refat Othman; Bruno Rossi; Russo Barbara; | arxiv-cs.CR | 2024-07-09 |
557 | Multilingual Blending: LLM Safety Alignment Evaluation with Language Mixture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce Multilingual Blending, a mixed-language query-response scheme designed to evaluate the safety alignment of various state-of-the-art LLMs (e.g., GPT-4o, GPT-3.5, Llama3) under sophisticated, multilingual conditions. |
Jiayang Song; Yuheng Huang; Zhehua Zhou; Lei Ma; | arxiv-cs.CL | 2024-07-09 |
558 | Short Answer Scoring with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View |
Lan Jiang; Nigel Bosch; | ACM Conference on Learning @ Scale | 2024-07-09 |
559 | Cross-domain Few-shot In-context Learning for Enhancing Traffic Sign Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a cross-domain few-shot in-context learning method based on the MLLM for enhancing traffic sign recognition (TSR). |
YAOZONG GAN et. al. | arxiv-cs.CV | 2024-07-08 |
560 | On The Power of Convolution Augmented Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, recent architectural recipes, such as state-space models, have bridged the performance gap. Motivated by this, we examine the benefits of Convolution-Augmented Transformer (CAT) for recall, copying, and length generalization tasks. |
Mingchen Li; Xuechen Zhang; Yixiao Huang; Samet Oymak; | arxiv-cs.LG | 2024-07-08 |
561 | Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work introduces the Multimodal Diffusion Transformer (MDT), a novel diffusion policy framework that excels at learning versatile behavior from multimodal goal specifications with few language annotations. |
Moritz Reuss; Ömer Erdinç Yağmurlu; Fabian Wenzel; Rudolf Lioutikov; | arxiv-cs.RO | 2024-07-08 |
562 | Surprising Gender Biases in GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present seven experiments exploring gender biases in GPT. |
Raluca Alexandra Fulgu; Valerio Capraro; | arxiv-cs.CY | 2024-07-08 |
563 | Enhancing Computer Programming Education with LLMs: A Study on Effective Prompt Engineering for Python Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study underscores the crucial role of prompt engineering in maximizing the educational benefits of LLMs. By systematically categorizing and testing these strategies, we provide a comprehensive framework for both educators and students to optimize LLM-based learning experiences. |
Tianyu Wang; Nianjun Zhou; Zhixiong Chen; | arxiv-cs.AI | 2024-07-07 |
564 | Image-Conditional Diffusion Transformer for Underwater Image Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by the recent advance in generative models, we propose a novel UIE method based on image-conditional diffusion transformer (ICDT). |
XINGYANG NIE et. al. | arxiv-cs.CV | 2024-07-07 |
565 | MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: The rapid advancement of Large Language Models (LLMs) and Large Multimodal Models (LMMs) has heightened the demand for AI-based scientific assistants capable of understanding … |
ZEKUN LI et. al. | ArXiv | 2024-07-06 |
566 | Are Large Language Models Strategic Decision Makers? A Study of Performance and Bias in Two-Player Non-Zero-Sum Games Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) have been increasingly used in real-world settings, yet their strategic decision-making abilities remain largely unexplored. |
Nathan Herr; Fernando Acero; Roberta Raileanu; María Pérez-Ortiz; Zhibin Li; | arxiv-cs.AI | 2024-07-05 |
567 | Using LLMs to Label Medical Papers According to The CIViC Evidence Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the sequence classification problem CIViC Evidence to the field of medical NLP. |
Markus Hisch; Xing David Wang; | arxiv-cs.CL | 2024-07-05 |
568 | Generalists Vs. Specialists: Evaluating Large Language Models for Urdu Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we compare general-purpose models, GPT-4-Turbo and Llama-3-8b, with special-purpose models (XLM-Roberta-large, mT5-large, and Llama-3-8b) that have been fine-tuned on specific tasks. |
Samee Arif; Abdul Hameed Azeemi; Agha Ali Raza; Awais Athar; | arxiv-cs.CL | 2024-07-05 |
569 | GPT Vs RETRO: Exploring The Intersection of Retrieval and Parameter-Efficient Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we apply PEFT methods (P-tuning, Adapters, and LoRA) to a modified Retrieval-Enhanced Transformer (RETRO) and a baseline GPT model across several sizes, ranging from 823 million to 48 billion parameters. |
Aleksander Ficek; Jiaqi Zeng; Oleksii Kuchaiev; | arxiv-cs.CL | 2024-07-05 |
570 | Associative Recurrent Memory Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper addresses the challenge of creating a neural architecture for very long sequences that requires constant time for processing new information at each time step. |
Ivan Rodkin; Yuri Kuratov; Aydar Bulatov; Mikhail Burtsev; | arxiv-cs.CL | 2024-07-05 |
571 | GPT-4 Vs. Human Translators: A Comprehensive Evaluation of Translation Quality Across Languages, Domains, and Expertise Levels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study comprehensively evaluates the translation quality of Large Language Models (LLMs), specifically GPT-4, against human translators of varying expertise levels across multiple language pairs and domains. |
JIANHAO YAN et. al. | arxiv-cs.CL | 2024-07-04 |
572 | TrackPGD: A White-box Attack Using Binary Masks Against Robust Transformer Trackers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel white-box attack named TrackPGD, which relies on the predicted object binary mask to attack robust transformer trackers. |
Fatemeh Nourilenjan Nokabadi; Yann Batiste Pequignot; Jean-Francois Lalonde; Christian Gagné; | arxiv-cs.CV | 2024-07-04 |
573 | HYBRINFOX at CheckThat! 2024 — Task 2: Enriching BERT Models with The Expert System VAGO for Subjectivity Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the HYBRINFOX method used to solve Task 2 (Subjectivity detection) of the CLEF 2024 CheckThat! |
MORGANE CASANOVA et. al. | arxiv-cs.CL | 2024-07-04 |
574 | From Data to Commonsense Reasoning: The Use of Large Language Models for Explainable AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the effectiveness of large language models (LLMs) on different QA tasks with a focus on their abilities in reasoning and explainability. |
Stefanie Krause; Frieder Stolzenburg; | arxiv-cs.AI | 2024-07-04 |
575 | Adaptive Step-size Perception Unfolding Network with Non-local Hybrid Attention for Hyperspectral Image Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To overcome the aforementioned drawbacks, we propose an adaptive step-size perception unfolding network (ASPUN), a deep unfolding network based on the FISTA algorithm, which uses an adaptive step-size perception module to estimate the update step-size of each spectral channel. |
Yanan Yang; Like Xin; | arxiv-cs.CV | 2024-07-04 |
576 | Towards Automating Text Annotation: A Case Study on Semantic Proximity Annotation Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main aim of this paper is to reuse human annotation guidelines along with some annotated data to design automatic prompts for LLMs, focusing on the semantic proximity annotation task. |
Sachin Yadav; Tejaswi Choppa; Dominik Schlechtweg; | arxiv-cs.CL | 2024-07-04 |
577 | CATT: Character-based Arabic Tashkeel Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces a new approach to training Arabic text diacritization (ATD) models. |
Faris Alasmary; Orjuwan Zaafarani; Ahmad Ghannam; | arxiv-cs.CL | 2024-07-03 |
578 | Regurgitative Training: The Value of Real Data in Training Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: What happens if we train a new Large Language Model (LLM) using data that are at least partially generated by other LLMs? |
Jinghui Zhang; Dandan Qiao; Mochen Yang; Qiang Wei; | arxiv-cs.CL | 2024-07-03 |
579 | Mast Kalandar at SemEval-2024 Task 8: On The Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: SemEval 2024 introduces the task of Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, aiming to develop automated systems for identifying machine-generated text and detecting potential misuse. In this paper, we (i) propose a RoBERTa-BiLSTM-based classifier designed to classify text into two categories, AI-generated or human, and (ii) conduct a comparative study of our model with baseline approaches to evaluate its effectiveness. |
Jainit Sushil Bafna; Hardik Mittal; Suyash Sethia; Manish Shrivastava; Radhika Mamidi; | arxiv-cs.CL | 2024-07-03 |
580 | Large Language Models As Evaluators for Scientific Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study explores how well the state-of-the-art Large Language Models (LLMs), like GPT-4 and Mistral, can assess the quality of scientific summaries or, more fittingly, scientific syntheses, comparing their evaluations to those of human annotators. |
Julia Evans; Jennifer D’Souza; Sören Auer; | arxiv-cs.CL | 2024-07-03 |
581 | InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present InternLM-XComposer-2.5 (IXC-2.5), a versatile large vision-language model that supports long-contextual input and output. |
PAN ZHANG et. al. | arxiv-cs.CV | 2024-07-03 |
582 | Assessing The Code Clone Detection Capability of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to assess the performance of two advanced Large Language Models (LLMs), GPT-3.5 and GPT-4, in the task of code clone detection. |
Zixian Zhang; Takfarinas Saber; | arxiv-cs.SE | 2024-07-02 |
583 | RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel instruction fine-tuning framework RankRAG, which instruction-tunes a single LLM for the dual purpose of context ranking and answer generation in RAG. |
YUE YU et. al. | arxiv-cs.CL | 2024-07-02 |
584 | Hypformer: Exploring Efficient Hyperbolic Transformer Fully in Hyperbolic Space Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This stems primarily from: (i) the absence of well-defined modules in hyperbolic space, including linear transformation layers, LayerNorm layers, activation functions, dropout operations, etc., and (ii) the quadratic time complexity of the existing hyperbolic self-attention module w.r.t. the number of input tokens, which hinders its scalability. To address these challenges, we propose Hypformer, a novel hyperbolic Transformer based on the Lorentz model of hyperbolic geometry. |
MENGLIN YANG et. al. | arxiv-cs.LG | 2024-07-01 |
585 | FATFusion: A Functional-anatomical Transformer for Medical Image Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View |
Wei Tang; Fazhi He; | Inf. Process. Manag. | 2024-07-01 |
586 | Beyond Numeric Awards: In-Context Dueling Bandits with LLM Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, LLMs struggle to converge even when explicitly prompted to do so and are sensitive to prompt variations. To overcome these issues, we introduce a hybrid algorithm: LLM-Enhanced Adaptive Dueling (LEAD), which takes advantage of both in-context decision-making capabilities of LLMs and theoretical guarantees inherited from classic DB algorithms. |
Fanzeng Xia; Hao Liu; Yisong Yue; Tongxin Li; | arxiv-cs.LG | 2024-07-01 |
587 | Transformer Autoencoder for K-means Efficient Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Wenhao Wu; Weiwei Wang; Xixi Jia; Xiangchu Feng; | Eng. Appl. Artif. Intell. | 2024-07-01 |
588 | MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents MMLongBench-Doc, a long-context, multi-modal benchmark comprising 1,062 expert-annotated questions. |
YUBO MA et. al. | arxiv-cs.CV | 2024-07-01 |
589 | Token-disentangling Mutual Transformer for Multimodal Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
GUANGHAO YIN et. al. | Eng. Appl. Artif. Intell. | 2024-07-01 |
590 | Analyzing Persuasive Strategies in Meme Texts: A Fusion of Language Models with Paraphrase Enrichment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes our approach to hierarchical multi-label detection of persuasion techniques in meme texts. |
Kota Shamanth Ramanath Nayak; Leila Kosseim; | arxiv-cs.CL | 2024-07-01 |
591 | Image-to-Text Logic Jailbreak: Your Imagination Can Help You Do Anything Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With the integration of visual and text inputs in VLMs, new security issues emerge, as malicious attackers can exploit multiple modalities to achieve their objectives. |
Xiaotian Zou; Ke Li; Yongkang Chen; | arxiv-cs.CR | 2024-07-01 |
592 | Parm: Efficient Training of Large Sparsely-Activated Models with Dedicated Schedules Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the wide adoption of hybrid parallel paradigms like model parallelism, expert parallelism, and expert-sharding parallelism (i.e., MP+EP+ESP) to support MoE model training on GPU clusters, the training efficiency is hindered by communication costs introduced by these parallel paradigms. To address this limitation, we propose Parm, a system that accelerates MP+EP+ESP training by designing two dedicated schedules for placing communication tasks. |
XINGLIN PAN et. al. | arxiv-cs.DC | 2024-06-30 |
593 | LegalTurk Optimized BERT for Multi-Label Text Classification and NER Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce our innovative modified pre-training approach by combining diverse masking strategies. |
Farnaz Zeidi; Mehmet Fatih Amasyali; Çiğdem Erol; | arxiv-cs.CL | 2024-06-30 |
594 | WallFacer: Harnessing Multi-dimensional Ring Parallelism for Efficient Long Sequence Model Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current methods are either constrained by the number of attention heads or excessive communication overheads. To address this problem, we propose WallFacer, a multi-dimensional distributed training system for long sequences, fostering an efficient communication paradigm and providing additional tuning flexibility for communication arrangements. |
ZIMING LIU et. al. | arxiv-cs.DC | 2024-06-30 |
595 | LLM-Generated Natural Language Meets Scaling Laws: New Explorations and Data Augmentation Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nonetheless, prior research raises two primary concerns: first, whether the natural language generated by LLMs (LLMNL) truly aligns with human natural language (HNL), a critical foundational question; second, that augmented data is randomly generated by the LLM, implying that not all data may possess equal training value, which could impede classifier performance. To address these challenges, we introduce scaling laws to intrinsically calculate LLMNL and HNL. |
Zhenhua Wang; Guang Xu; Ming Ren; | arxiv-cs.CL | 2024-06-29 |
596 | Machine Learning Predictors for Min-Entropy Estimation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Utilizing data from Generalized Binary Autoregressive Models, a subset of Markov processes, we demonstrate that machine learning models (including a hybrid of convolutional and recurrent Long Short-Term Memory layers and the transformer-based GPT-2 model) outperform traditional NIST SP 800-90B predictors in certain scenarios. |
Javier Blanco-Romero; Vicente Lorenzo; Florina Almenares Mendoza; Daniel Díaz-Sánchez; | arxiv-cs.LG | 2024-06-28 |
597 | Can GPT-4 Help Detect Quit Vaping Intentions? An Exploration of Automatic Data Annotation Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we extracted a sample dataset from one vaping sub-community on Reddit to analyze users’ quit-vaping intentions. |
SAI KRISHNA REVANTH VURUMA et. al. | arxiv-cs.CL | 2024-06-28 |
598 | FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose FRED, a wafer-scale interconnect that is tailored for the high-BW requirements of wafer-scale networks and can efficiently execute communication patterns of different parallelization strategies. |
Saeed Rashidi; William Won; Sudarshan Srinivasan; Puneet Gupta; Tushar Krishna; | arxiv-cs.AR | 2024-06-27 |
599 | Fine-tuned Network Relies on Generic Representation to Solve Unseen Cognitive Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning pretrained language models has shown promising results on a wide range of tasks, but when encountering a novel task, do they rely more on generic pretrained representation, or develop brand new task-specific solutions? |
Dongyan Lin; | arxiv-cs.LG | 2024-06-27 |
600 | NTFormer: A Composite Node Tokenized Graph Transformer for Node Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a new graph Transformer called NTFormer to address this issue. |
Jinsong Chen; Siyu Jiang; Kun He; | arxiv-cs.LG | 2024-06-27 |
601 | The Model Arena for Cross-lingual Sentiment Analysis: A Comparative Study in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Sentiment analysis serves as a pivotal component in Natural Language Processing (NLP). |
Xiliang Zhu; Shayna Gardiner; Tere Roldán; David Rossouw; | arxiv-cs.CL | 2024-06-27 |
602 | BADGE: BADminton Report Generation and Evaluation with LLM Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel framework named BADGE, designed for this purpose using LLM. |
Shang-Hsuan Chiang; Lin-Wei Chao; Kuang-Da Wang; Chih-Chuan Wang; Wen-Chih Peng; | arxiv-cs.CL | 2024-06-26 |
603 | Improving Entity Recognition Using Ensembles of Deep Learning and Fine-tuned Large Language Models: A Case Study on Adverse Event Extraction from Multiple Sources Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we utilized reports and posts from the VAERS (n=621), Twitter (n=9,133), and Reddit (n=131) as our corpora. |
YIMING LI et. al. | arxiv-cs.CL | 2024-06-25 |
604 | SetBERT: Enhancing Retrieval Performance for Boolean Logic and Set Operation Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce SetBERT, a fine-tuned BERT-based model designed to enhance query embeddings for set operations and Boolean logic queries, such as Intersection (AND), Difference (NOT), and Union (OR). |
Quan Mai; Susan Gauch; Douglas Adams; | arxiv-cs.CL | 2024-06-25 |
605 | This Paper Had The Smartest Reviewers — Flattery Detection Utilising An Audio-Textual Transformer-Based Approach Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Its automatic detection can thus enhance the naturalness of human-AI interactions. To meet this need, we present a novel audio-textual dataset comprising 20 hours of speech and train machine learning models for automatic flattery detection. |
LUKAS CHRIST et. al. | arxiv-cs.SD | 2024-06-25 |
606 | CTBench: A Comprehensive Benchmark for Evaluating Language Model Capabilities in Clinical Trial Design Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: CTBench is introduced as a benchmark to assess language models (LMs) in aiding clinical study design. |
NAFIS NEEHAL et. al. | arxiv-cs.CL | 2024-06-25 |
607 | Unambiguous Recognition Should Not Rely Solely on Natural Language Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This bias stems from the inherent characteristics of the dataset. To mitigate this bias, we propose a LaTeX printed text recognition model trained on a mixed dataset of pseudo-formulas and pseudo-text. |
Renqing Luo; Yuhan Xu; | arxiv-cs.CV | 2024-06-24 |
608 | The GPT-WritingPrompts Dataset: A Comparative Analysis of Character Portrayal in Short Stories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We quantify and compare the emotional and descriptive features of storytelling from both generative processes, human and machine, along a set of six dimensions. |
Xi Yu Huang; Krishnapriya Vishnubhotla; Frank Rudzicz; | arxiv-cs.CL | 2024-06-24 |
609 | Exploring The Capability of Mamba in Speech Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compared Mamba with state-of-the-art Transformer variants for various speech applications, including ASR, text-to-speech, spoken language understanding, and speech summarization. |
Koichi Miyazaki; Yoshiki Masuyama; Masato Murata; | arxiv-cs.SD | 2024-06-24 |
610 | GPT-4V Explorations: Mining Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the application of the GPT-4V(ision) large visual language model to autonomous driving in mining environments, where traditional systems often falter in understanding intentions and making accurate decisions during emergencies. |
Zixuan Li; | arxiv-cs.CV | 2024-06-24 |
611 | Exploring Factual Entailment with NLI: A News Media Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the relationship between factuality and Natural Language Inference (NLI) by introducing FactRel — a novel annotation scheme that models \textit{factual} rather than \textit{textual} entailment, and use it to annotate a dataset of naturally occurring sentences from news articles. |
Guy Mor-Lan; Effi Levi; | arxiv-cs.CL | 2024-06-24 |
612 | Using GPT-4 Turbo to Automatically Identify Defeaters in Assurance Cases Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Assurance cases (ACs) are convincing arguments, supported by a body of evidence and aiming at demonstrating that a system will function as intended. Producers of systems can rely … |
K. K. SHAHANDASHTI et. al. | 2024 IEEE 32nd International Requirements Engineering … | 2024-06-24 |
613 | DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present DreamBench++, a human-aligned benchmark automated by advanced multimodal GPT models. |
YUANG PENG et. al. | arxiv-cs.CV | 2024-06-24 |
614 | OlympicArena Medal Ranks: Who Is The Most Intelligent AI So Far? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this report, we pose the following question: Who is the most intelligent AI model to date, as measured by the OlympicArena (an Olympic-level, multi-discipline, multi-modal benchmark for superintelligent AI)? |
Zhen Huang; Zengzhi Wang; Shijie Xia; Pengfei Liu; | arxiv-cs.CL | 2024-06-24 |
615 | CausalFormer: An Interpretable Transformer for Temporal Causal Discovery Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To facilitate the utilization of the whole deep learning models in temporal causal discovery, we proposed an interpretable transformer-based causal discovery model termed CausalFormer, which consists of the causality-aware transformer and the decomposition-based causality detector. |
LINGBAI KONG et. al. | arxiv-cs.LG | 2024-06-24 |
616 | Multi-Scale Temporal Difference Transformer for Video-Text Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they commonly neglect the transformer’s inferior ability to model local temporal information. To tackle this problem, we propose a transformer variant named Multi-Scale Temporal Difference Transformer (MSTDT). |
Ni Wang; Dongliang Liao; Xing Xu; | arxiv-cs.CV | 2024-06-23 |
617 | GraphEval2000: Benchmarking and Improving Large Language Models on Graph Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, recent studies have identified limitations in LLMs’ ability to reason about graph-structured data. To address this gap, we introduce GraphEval2000, the first comprehensive graph dataset, comprising 40 graph data structure problems along with 2000 test cases. |
Qiming Wu; Zichen Chen; Will Corcoran; Misha Sra; Ambuj K. Singh; | arxiv-cs.AI | 2024-06-23 |
618 | Beyond Individual Facts: Investigating Categorical Knowledge Locality of Taxonomy and Meronomy Concepts in GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate a broader view of knowledge location, that of concepts or clusters of related information, instead of disparate individual facts. |
Christopher Burger; Yifan Hu; Thai Le; | arxiv-cs.LG | 2024-06-22 |
619 | How Effective Is GPT-4 Turbo in Generating School-Level Questions from Textbooks Based on Bloom’s Revised Taxonomy? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate the effectiveness of GPT-4 Turbo in generating educational questions from NCERT textbooks in zero-shot mode. |
Subhankar Maity; Aniket Deroy; Sudeshna Sarkar; | arxiv-cs.CL | 2024-06-21 |
620 | Toward Informal Language Processing: Knowledge of Slang in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Using movie subtitles, we construct a dataset that supports evaluation on a diverse set of tasks pertaining to automatic processing of slang. |
Zhewei Sun; Qian Hu; Rahul Gupta; Richard Zemel; Yang Xu; | naacl | 2024-06-20 |
621 | VertAttack: Taking Advantage of Text Classifiers’ Horizontal Vision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Vertically written words will not be recognized by a classifier. In contrast, humans are easily able to recognize and read words written both horizontally and vertically. Hence, a human adversary could write problematic words vertically and the meaning would still be preserved to other humans. We simulate such an attack, VertAttack. |
Jonathan Rusert; | naacl | 2024-06-20 |
622 | MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to perform a thorough evaluation of the non-English capabilities of SoTA LLMs (GPT-3. |
SANCHIT AHUJA et. al. | naacl | 2024-06-20 |
623 | Unveiling Divergent Inductive Biases of LLMs on Temporal Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the adeptness of Large Language Models (LLMs) in discerning patterns and relationships from data, their inherent comprehension of temporal dynamics remains a formidable challenge. This research meticulously explores these intrinsic challenges within LLMs, with a specific emphasis on evaluating the performance of GPT-3. |
Sindhu Kishore; Hangfeng He; | naacl | 2024-06-20 |
624 | Transformers Can Represent N-gram Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the relationship between transformer LMs and n-gram LMs, a simple and historically relevant class of language models. |
Anej Svete; Ryan Cotterell; | naacl | 2024-06-20 |
625 | Communication-Efficient Byzantine-Resilient Federated Zero-Order Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CYBER-0, the first zero-order optimization algorithm for memory-and-communication efficient Federated Learning, resilient to Byzantine faults. |
Afonso de Sá Delgado Neto; Maximilian Egger; Mayank Bakshi; Rawad Bitar; | arxiv-cs.LG | 2024-06-20 |
626 | A Symbolic Framework for Evaluating Mathematical Reasoning and Generalisation with Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a methodology for generating and perturbing detailed derivations of equations at scale, aided by a symbolic engine, to evaluate the generalisability of Transformers to out-of-distribution mathematical reasoning problems. |
Jordan Meadows; Marco Valentino; Damien Teney; Andre Freitas; | naacl | 2024-06-20 |
627 | Does GPT Really Get It? A Hierarchical Scale to Quantify Human Vs AI’s Understanding of Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use the hierarchy to design and conduct a study with human subjects (undergraduate and graduate students) as well as large language models (generations of GPT), revealing interesting similarities and differences. |
Mirabel Reid; Santosh S. Vempala; | arxiv-cs.AI | 2024-06-20 |
628 | Revisiting Zero-Shot Abstractive Summarization in The Era of Large Language Models from The Perspective of Position Bias Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We characterize and study zero-shot abstractive summarization in Large Language Models (LLMs) by measuring position bias, which we propose as a general formulation of the more restrictive lead bias phenomenon studied previously in the literature. |
Anshuman Chhabra; Hadi Askari; Prasant Mohapatra; | naacl | 2024-06-20 |
629 | SemRoDe: Macro Adversarial Training to Learn Representations That Are Robust to Word-Level Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial Training strategy to enhance the robustness of LMs. |
Brian Formento; Wenjie Feng; Chuan-Sheng Foo; Anh Tuan Luu; See-Kiong Ng; | naacl | 2024-06-20 |
630 | Does GPT-4 Pass The Turing Test? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: AI models with the ability to masquerade as humans could have widespread societal consequences, and we analyse the effectiveness of different strategies and criteria for judging humanlikeness. |
Cameron Jones; Ben Bergen; | naacl | 2024-06-20 |
631 | Removing RLHF Protections in GPT-4 Via Fine-Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show the contrary: fine-tuning allows attackers to remove RLHF protections with as few as 340 examples and a 95% success rate. |
QIUSI ZHAN et. al. | naacl | 2024-06-20 |
632 | Struc-Bench: Are Large Language Models Good at Generating Complex Structured Tabular Data? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study assesses LLMs’ proficiency in structuring tables and introduces a novel fine-tuning method, cognizant of data structures, to bolster their performance. |
XIANGRU TANG et. al. | naacl | 2024-06-20 |
633 | A Continued Pretrained LLM Approach for Automatic Medical Note Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce HEAL, the first continuously trained 13B LLaMA2-based LLM that is purpose-built for medical conversations and measured on automated scribing. |
DONG YUAN et. al. | naacl | 2024-06-20 |
634 | On Retrieval Augmentation and The Limitations of Language Model Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we rule out one previously posited possibility: the “softmax bottleneck.” |
TING-RUI CHIANG et. al. | naacl | 2024-06-20 |
635 | CPopQA: Ranking Cultural Concept Popularity By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the extent to which an LLM effectively captures corpus-level statistical trends of concepts for reasoning, especially long-tail ones, is largely underexplored. In this study, we introduce a novel few-shot question-answering task (CPopQA) that examines LLMs’ statistical ranking abilities for long-tail cultural concepts (e.g., holidays), particularly focusing on these concepts’ popularity in the United States and the United Kingdom, respectively. |
Ming Jiang; Mansi Joshi; | naacl | 2024-06-20 |
636 | ChatGPT As Research Scientist: Probing GPT’s Capabilities As A Research Librarian, Research Ethicist, Data Generator and Data Predictor Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: How good a research scientist is ChatGPT? We systematically probed the capabilities of GPT-3.5 and GPT-4 across four central components of the scientific process: as a Research … |
Steven A. Lehr; Aylin Caliskan; Suneragiri Liyanage; Mahzarin R. Banaji; | arxiv-cs.AI | 2024-06-20 |
637 | SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using Training Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these models are extremely memory intensive during their fine-tuning process, making them difficult to deploy on GPUs with limited memory resources. To address this issue, we introduce a new tool called SlimFit that reduces the memory requirements of these models by dynamically analyzing their training dynamics and freezing less-contributory layers during fine-tuning. |
ARASH ARDAKANI et. al. | naacl | 2024-06-20 |
638 | CryptoGPT: A 7B Model Rivaling GPT-4 in The Task of Analyzing and Classifying Real-time Financial News Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: CryptoGPT: a 7B model competing with GPT-4 in a specific task — The Impact of Automatic Annotation and Strategic Fine-Tuning via QLoRA. In this article, we present a method aimed at refining a dedicated LLM of reasonable quality with limited resources in an industrial setting via CryptoGPT. |
Ying Zhang; Matthieu Petit Guillaume; Aurélien Krauth; Manel Labidi; | arxiv-cs.AI | 2024-06-20 |
639 | Branch-Solve-Merge Improves Large Language Model Evaluation and Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their performance can fall short, due to the model’s lack of coherence and inability to plan and decompose the problem. We propose Branch-Solve-Merge (BSM), a Large Language Model program (Schlag et al., 2023) for tackling such challenging natural language tasks. |
SWARNADEEP SAHA et. al. | naacl | 2024-06-20 |
640 | Metacognitive Prompting Improves Understanding in Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce Metacognitive Prompting (MP), a strategy inspired by human introspective reasoning processes. |
Yuqing Wang; Yun Zhao; | naacl | 2024-06-20 |
641 | Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates how LLMs, specifically GPT-3.5 and GPT-4, can develop tailored questions for Grade 9 math, aligning with active learning principles. |
Hamdireza Rouzegar; Masoud Makrehchi; | arxiv-cs.CL | 2024-06-19 |
642 | A Decision-Making GPT Model Augmented with Entropy Regularization for Autonomous Vehicles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, the decision-making challenges associated with autonomous vehicles are conceptualized through the framework of the Constrained Markov Decision Process (CMDP) and approached as a sequence modeling problem. |
JIAQI LIU et. al. | arxiv-cs.RO | 2024-06-19 |
643 | Fine-Tuning BERTs for Definition Extraction from Mathematical Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we fine-tuned three pre-trained BERT models on the task of definition extraction from mathematical English written in LaTeX. |
Lucy Horowitz; Ryan Hathaway; | arxiv-cs.CL | 2024-06-19 |
644 | Putting GPT-4o to The Sword: A Comprehensive Evaluation of Language, Vision, Speech, and Multimodal Proficiency Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: As large language models (LLMs) continue to advance, evaluating their comprehensive capabilities becomes significant for their application in various fields. This research study … |
SAKIB SHAHRIAR et. al. | ArXiv | 2024-06-19 |
645 | SwinStyleformer Is A Favorable Choice for Image Inversion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the first pure Transformer structure inversion network called SwinStyleformer, which can compensate for the shortcomings of the CNNs inversion framework by handling long-range dependencies and learning the global structure of objects. |
Jiawei Mao; Guangyi Zhao; Xuesong Yin; Yuanqi Chang; | arxiv-cs.CV | 2024-06-18 |
646 | Generating Educational Materials with Different Levels of Readability Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the leveled-text generation task, aiming to rewrite educational materials to specific readability levels while preserving meaning. |
Chieh-Yang Huang; Jing Wei; Ting-Hao ‘Kenneth’ Huang; | arxiv-cs.CL | 2024-06-18 |
647 | ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce ChatGLM, an evolving family of large language models that we have been developing over time. |
TEAM GLM et. al. | arxiv-cs.CL | 2024-06-18 |
648 | Adversarial Attacks on Multimodal Agents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that multimodal agents raise new safety risks, even though attacking agents is more challenging than prior attacks due to limited access to and knowledge about the environment. |
Chen Henry Wu; Jing Yu Koh; Ruslan Salakhutdinov; Daniel Fried; Aditi Raghunathan; | arxiv-cs.LG | 2024-06-18 |
649 | What Makes Two Language Models Think Alike? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Do architectural differences significantly affect the way models represent and process language? We propose a new approach, based on metric-learning encoding models (MLEMs), as a first step to answer this question. |
Jeanne Salle; Louis Jalouzot; Nur Lan; Emmanuel Chemla; Yair Lakretz; | arxiv-cs.CL | 2024-06-18 |
650 | Vernacular? I Barely Know Her: Challenges with Style Control and Stereotyping Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a thorough analysis and discussion of the results. |
ANKIT AICH et. al. | arxiv-cs.CL | 2024-06-18 |
651 | A Two-dimensional Zero-shot Dialogue State Tracking Evaluation Method Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a two-dimensional zero-shot evaluation method for DST using GPT-4, which divides the evaluation into two dimensions: accuracy and completeness. |
Ming Gu; Yan Yang; | arxiv-cs.CL | 2024-06-17 |
652 | Minimal Self in Humanoid Robot Alter3 Driven By Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Alter3, a humanoid robot that demonstrates spontaneous motion generation through the integration of GPT-4, a Large Language Model (LLM). |
Takahide Yoshida; Suzune Baba; Atsushi Masumori; Takashi Ikegami; | arxiv-cs.RO | 2024-06-17 |
653 | Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we identify a pitfall of vanilla iterative DPO – improved response quality can lead to increased verbosity. |
JIE LIU et. al. | arxiv-cs.CL | 2024-06-17 |
654 | Cultural Conditioning or Placebo? On The Effectiveness of Socio-Demographic Prompting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we systematically probe four LLMs (Llama 3, Mistral v0.2, GPT-3.5 Turbo and GPT-4) with prompts that are conditioned on culturally sensitive and non-sensitive cues, on datasets that are supposed to be culturally sensitive (EtiCor and CALI) or neutral (MMLU and ETHICS). |
SAGNIK MUKHERJEE et. al. | arxiv-cs.CL | 2024-06-17 |
655 | DB-GPT-Hub: Towards Open Benchmarking Text-to-SQL Empowered By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present DB-GPT-Hub, an open benchmark suite for LLM-empowered text-to-SQL, which primarily focuses on tuning LLMs at large scales. |
FAN ZHOU et. al. | arxiv-cs.DB | 2024-06-17 |
656 | GPT-Powered Elicitation Interview Script Generator for Requirements Engineering Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We employ a prompt chaining approach to mitigate the output length constraint of GPT to be able to generate thorough and detailed interview scripts. |
Binnur Görer; Fatma Başak Aydemir; | arxiv-cs.SE | 2024-06-17 |
657 | Large Language Model Tokenizer Bias: A Case Study and Solution on GPT-4o Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This misrepresentation results in the propagation of ‘under-trained’ or ‘untrained’ tokens, which perpetuate biases and pose serious concerns related to data security and ethical standards. We aim to dissect the tokenization mechanics of GPT-4o, illustrating how its simplified token-handling methods amplify these risks and offer strategic solutions to mitigate associated security and ethical issues. |
Jin Yang; Zhiqiang Wang; Yanbin Lin; Zunduo Zhao; | arxiv-cs.CL | 2024-06-17 |
658 | Look Further Ahead: Testing The Limits of GPT-4 in Path Planning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, they still face challenges with long-horizon planning. To study this, we propose path planning tasks as a platform to evaluate LLMs’ ability to navigate long trajectories under geometric constraints. |
Mohamed Aghzal; Erion Plaku; Ziyu Yao; | arxiv-cs.AI | 2024-06-17 |
659 | WellDunn: On The Robustness and Explainability of Language Models and Large Language Models in Identifying Wellness Dimensions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Language Models (LMs) are being proposed for mental health applications where the heightened risk of adverse outcomes means predictive performance may not be a sufficient litmus test of a model’s utility in clinical practice. |
SEYEDALI MOHAMMADI et. al. | arxiv-cs.AI | 2024-06-17 |
660 | Promises, Outlooks and Challenges of Diffusion Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For example, autoregressive token generation is notably slow and can be prone to \textit{exposure bias}. Diffusion-based language models were proposed as an alternative to autoregressive generation to address some of these limitations. |
Justin Deschenaux; Caglar Gulcehre; | arxiv-cs.CL | 2024-06-17 |
661 | Large Language Models for Automatic Milestone Detection in Group Discussions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate an LLM’s performance on recordings of a group oral communication task in which utterances are often truncated or not well-formed. |
ZHUOXU DUAN et. al. | arxiv-cs.CL | 2024-06-16 |
662 | Distilling Opinions at Scale: Incremental Opinion Summarization Using XL-OPSUMM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To mitigate this, we propose a scalable framework called Xl-OpSumm that generates summaries incrementally. |
SRI RAGHAVA MUDDU et. al. | arxiv-cs.CL | 2024-06-16 |
663 | Generating Tables from The Parametric Knowledge of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For evaluation, we introduce a novel benchmark, WikiTabGen which contains 100 curated Wikipedia tables. |
Yevgeni Berkovitch; Oren Glickman; Amit Somech; Tomer Wolfson; | arxiv-cs.CL | 2024-06-16 |
664 | KGPA: Robustness Evaluation for Large Language Models Via Cross-Domain Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a framework that systematically evaluates the robustness of LLMs under adversarial attack scenarios by leveraging knowledge graphs (KGs). |
AIHUA PEI et. al. | arxiv-cs.CL | 2024-06-16 |
665 | ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, we present Video Diffusion GPT (ViD-GPT). |
Kaifeng Gao; Jiaxin Shi; Hanwang Zhang; Chunping Wang; Jun Xiao; | arxiv-cs.CV | 2024-06-16 |
666 | Breaking Boundaries: Investigating The Effects of Model Editing on Cross-linguistic Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper strategically identifies the need for linguistic equity by examining several knowledge editing techniques in multilingual contexts. |
SOMNATH BANERJEE et. al. | arxiv-cs.CL | 2024-06-16 |
667 | Exposing The Achilles’ Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel dataset MWP-MISTAKE, incorporating MWPs with both correct and incorrect reasoning steps generated through rule-based methods and smaller language models. |
Joykirat Singh; Akshay Nambi; Vibhav Vineet; | arxiv-cs.CL | 2024-06-16 |
668 | ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning Via Shared Low-Rank Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces an approach to optimize Parameter Efficient Fine Tuning (PEFT) for Pretrained Language Models (PLMs) by implementing a Shared Low Rank Adaptation (ShareLoRA). |
Yurun Song; Junchen Zhao; Ian G. Harris; Sangeetha Abdu Jyothi; | arxiv-cs.CL | 2024-06-15 |
669 | Multilingual Large Language Models and Curse of Multilinguality Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Multilingual Large Language Models (LLMs) have gained considerable popularity among Natural Language Processing (NLP) researchers and practitioners. These models, trained on huge datasets, show proficiency across various languages and demonstrate effectiveness in numerous downstream tasks. |
Daniil Gurgurov; Tanja Bäumel; Tatiana Anikina; | arxiv-cs.CL | 2024-06-15 |
670 | The Devil Is in The Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we try to unveil the mystery of social bias inside language models by introducing the concept of {\sc Social Bias Neurons}. |
YAN LIU et. al. | arxiv-cs.CL | 2024-06-14 |
671 | Optimizing Layer-Fused Scheduling of Transformer Networks on Multi-accelerator Platforms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work enables extensive hardware/mapping exploration by extending the DSE framework Stream towards support for transformers across a wide variety of hardware architectures and different execution schedules. |
Steven Colleman; Arne Symons; Victor J. B. Jung; Marian Verhelst; | arxiv-cs.AR | 2024-06-14 |
672 | GPT-4o: Visual Perception Performance of Multimodal Large Language Models in Piglet Activity Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The initial evaluation experiments in this study validate the potential of multimodal large language models in livestock scene video understanding and provide new directions and references for future research on animal behavior video understanding. |
Yiqi Wu; Xiaodan Hu; Ziming Fu; Siling Zhou; Jiangong Li; | arxiv-cs.CV | 2024-06-14 |
673 | Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, bidirectional feature propagation is highly sensitive to inaccurate optical flow in blurry frames, leading to error accumulation during the propagation process. To address these issues, we propose BSSTNet, a Blur-aware Spatio-temporal Sparse Transformer Network. |
Huicong Zhang; Haozhe Xie; Hongxun Yao; | cvpr | 2024-06-13 |
674 | Talking Heads: Understanding Inter-layer Communication in Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We analyze a mechanism used in two LMs to selectively inhibit items in a context in one task, and find that it underlies a commonly used abstraction across many context-retrieval behaviors. |
Jack Merullo; Carsten Eickhoff; Ellie Pavlick; | arxiv-cs.CL | 2024-06-13 |
675 | MoMask: Generative Masked Modeling of 3D Human Motions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce MoMask, a novel masked modeling framework for text-driven 3D human motion generation. |
Chuan Guo; Yuxuan Mu; Muhammad Gohar Javed; Sen Wang; Li Cheng; | cvpr | 2024-06-13 |
676 | GROD: Enhancing Generalization of Transformer with Out-of-Distribution Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel approach based on OOD detection, termed the Generate Rounded OOD Data (GROD) algorithm, which significantly bolsters the generalization performance of transformer networks across various tasks. |
Yijin Zhou; Yuguang Wang; | arxiv-cs.LG | 2024-06-13 |
677 | OmniMotionGPT: Animal Motion Generation with Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our paper aims to generate diverse and realistic animal motion sequences from textual descriptions without a large-scale animal text-motion dataset. |
ZHANGSIHAO YANG et. al. | cvpr | 2024-06-13 |
678 | SDPose: Tokenized Pose Estimation Via Circulation-Guide Self-Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Those transformer-based models that require fewer resources are prone to under-fitting due to their smaller scale and thus perform notably worse than their larger counterparts. Given this conundrum, we introduce SDPose, a new self-distillation method for improving the performance of small transformer-based models. |
SICHEN CHEN et. al. | cvpr | 2024-06-13 |
679 | MoST: Motion Style Transformer Between Diverse Action Contents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This challenge lies in the lack of clear separation between content and style of a motion. To tackle this challenge, we propose a novel motion style transformer that effectively disentangles style from content and generates a plausible motion with transferred style from a source motion. |
Boeun Kim; Jungho Kim; Hyung Jin Chang; Jin Young Choi; | cvpr | 2024-06-13 |
680 | ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Most existing studies are devoted to designing vision-specific transformers to solve the above problems, which introduce additional pre-training costs. Therefore, we present a plain, pre-training-free and feature-enhanced ViT backbone with Convolutional Multi-scale feature interaction, named ViT-CoMer, which facilitates bidirectional interaction between CNN and transformer. |
Chunlong Xia; Xinliang Wang; Feng Lv; Xin Hao; Yifeng Shi; | cvpr | 2024-06-13 |
681 | Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose VisualFactChecker (VFC), a flexible, training-free pipeline that generates high-fidelity and detailed captions for both 2D images and 3D objects. |
YUNHAO GE et. al. | cvpr | 2024-06-13 |
682 | Condition-Aware Neural Network for Controlled Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. |
Han Cai; Muyang Li; Qinsheng Zhang; Ming-Yu Liu; Song Han; | cvpr | 2024-06-13 |
683 | Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a large-scale study of linguistic bias exhibited by ChatGPT covering ten dialects of English (Standard American English, Standard British English, and eight widely spoken non-standard varieties from around the world). |
EVE FLEISIG et. al. | arxiv-cs.CL | 2024-06-13 |
684 | Permutation Equivariance of Transformers and Its Applications Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose our definition of permutation equivariance, a broader concept covering both inter- and intra-token permutation in the forward and backward propagation of neural networks. |
HENGYUAN XU et. al. | cvpr | 2024-06-13 |
685 | GPT-Fabric: Smoothing and Folding Fabric By Leveraging Pre-Trained Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose GPT-Fabric for the canonical tasks of fabric smoothing and folding, where GPT directly outputs an action informing a robot where to grasp and pull a fabric. |
Vedant Raval; Enyu Zhao; Hejia Zhang; Stefanos Nikolaidis; Daniel Seita; | arxiv-cs.RO | 2024-06-13 |
686 | GPT-4V(ision) Is A Human-Aligned Evaluator for Text-to-3D Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents an automatic, versatile, and human-aligned evaluation metric for text-to-3D generative models. |
TONG WU et. al. | cvpr | 2024-06-13 |
687 | General Point Model Pretraining with Autoencoding and Autoregressive Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the General Language Model, we propose a General Point Model (GPM) that seamlessly integrates autoencoding and autoregressive tasks in a point cloud transformer. |
ZHE LI et. al. | cvpr | 2024-06-13 |
688 | Complex Image-Generative Diffusion Transformer for Audio Denoising Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to enhance audio denoising performance, this paper introduces a complex image-generative diffusion transformer that captures more information from the complex Fourier domain. |
Junhui Li; Pu Wang; Jialu Li; Youshan Zhang; | arxiv-cs.SD | 2024-06-13 |
689 | Mean-Shift Feature Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Transformer models developed in NLP make a great impact on computer vision fields, producing promising performance on various tasks. |
Takumi Kobayashi; | cvpr | 2024-06-13 |
690 | AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, achieving faithful attributions for the entirety of a black-box transformer model and maintaining computational efficiency is an unsolved challenge. By extending the Layer-wise Relevance Propagation attribution method to handle attention layers, we address these challenges effectively. |
REDUAN ACHTIBAT et. al. | icml | 2024-06-12 |
691 | Long Is More for Alignment: A Simple But Tough-to-Beat Baseline for Instruction Fine-Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: LIMA (NeurIPS 2023) and AlpaGasus (ICLR 2024) are state-of-the-art methods for selecting such high-quality examples, either via manual curation or using GPT-3.5-Turbo as a quality scorer. We show that the extremely simple baseline of selecting the 1,000 instructions with longest responses—that intuitively contain more learnable information and are harder to overfit—from standard datasets can consistently outperform these sophisticated methods according to GPT-4 and PaLM-2 as judges, while remaining competitive on the Open LLM benchmarks that test factual knowledge. |
Hao Zhao; Maksym Andriushchenko; Francesco Croce; Nicolas Flammarion; | icml | 2024-06-12 |
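The baseline in this highlight is concrete enough to sketch directly. Below is a minimal Python illustration of the longest-response selection rule; the toy dataset and its "instruction"/"response" field names are assumptions for illustration, not the paper's data format.

```python
# Minimal sketch of the "keep the k longest responses" selection baseline.
dataset = [
    {"instruction": "Define entropy.", "response": "Entropy measures uncertainty."},
    {"instruction": "Explain transformers.",
     "response": "Transformers are neural networks built around self-attention, "
                 "which lets every token attend to every other token."},
]

def select_longest(examples, k=1000):
    # Rank by response length (whitespace tokens as a cheap proxy) and keep the top k.
    return sorted(examples, key=lambda ex: len(ex["response"].split()), reverse=True)[:k]

print(select_longest(dataset, k=1)[0]["instruction"])  # -> "Explain transformers."
```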
692 | Accelerating Transformer Pre-training with 2:4 Sparsity Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: First, we define a “flip rate” to monitor the stability of a 2:4 training process. Utilizing this metric, we propose three techniques to preserve accuracy: to modify the sparse-refined straight-through estimator by applying the masked decay term on gradients, to determine a feasible decay factor in warm-up stage, and to enhance the model’s quality by a dense fine-tuning procedure near the end of pre-training. |
Yuezhou Hu; Kang Zhao; Weiyu Huang; Jianfei Chen; Jun Zhu; | icml | 2024-06-12 |
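For context, 2:4 sparsity keeps the two largest-magnitude weights in every contiguous group of four, a pattern modern GPUs can accelerate. A hedged PyTorch sketch of building such a mask follows; the paper's training-time machinery (flip-rate monitoring, masked decay, dense fine-tuning) is omitted.

```python
import torch

def two_four_mask(w: torch.Tensor) -> torch.Tensor:
    # For every contiguous group of 4 weights, keep the 2 largest-magnitude
    # entries and zero the rest (the N:M = 2:4 pattern).
    groups = w.reshape(-1, 4)
    topk = groups.abs().topk(2, dim=-1).indices
    mask = torch.zeros_like(groups, dtype=torch.bool).scatter_(-1, topk, True)
    return mask.reshape(w.shape)

w = torch.randn(2, 8)
print(w * two_four_mask(w))  # two entries in each group of four are zeroed
```

The paper's "flip rate" would then track how often this mask changes between consecutive training steps.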
693 | In-context Learning on Function Classes Unveiled for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given some training examples, a pre-trained model can make accurate predictions on an unseen input. |
Zhijie Wang; Bo Jiang; Shuai Li; | icml | 2024-06-12 |
694 | Asymmetry in Low-Rank Adapters of Foundation Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by an effort to investigate the different roles of LoRA matrices during fine-tuning, this paper characterizes and leverages unexpected asymmetry in the importance of low-rank adapter matrices. |
JIACHENG ZHU et. al. | icml | 2024-06-12 |
695 | Label-aware Hard Negative Sampling Strategies with Momentum Contrastive Learning for Implicit Hate Speech Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Label-aware Hard Negative sampling strategies (LAHN) that encourage the model to learn detailed features from hard negative samples, instead of naive negative samples in random batch, using momentum-integrated contrastive learning. |
Jaehoon Kim; Seungwan Jin; Sohyun Park; Someen Park; Kyungsik Han; | arxiv-cs.CL | 2024-06-12 |
696 | An Empirical Study of Mamba-based Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In a controlled setting (e.g., same data), however, studies so far have only presented small scale experiments comparing SSMs to Transformers. To understand the strengths and weaknesses of these architectures at larger scales, we present a direct comparison between 8B-parameter Mamba, Mamba-2, and Transformer models trained on the same datasets of up to 3.5T tokens. |
ROGER WALEFFE et. al. | arxiv-cs.LG | 2024-06-12 |
697 | Stealing Part of A Production Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI’s ChatGPT or Google’s PaLM-2. |
NICHOLAS CARLINI et. al. | icml | 2024-06-12 |
698 | Trainable Transformer in Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a new efficient construction, Transformer in Transformer (in short, TINT), that allows a transformer to simulate and fine-tune more complex models during inference (e.g., pre-trained language models). |
Abhishek Panigrahi; Sadhika Malladi; Mengzhou Xia; Sanjeev Arora; | icml | 2024-06-12 |
699 | Entropy-Reinforced Planning with Large Language Models for Drug Discovery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we propose ERP, Entropy-Reinforced Planning for Transformer Decoding, which employs an entropy-reinforced planning algorithm to enhance the Transformer decoding process and strike a balance between exploitation and exploration. |
Xuefeng Liu; Chih-chan Tien; Peng Ding; Songhao Jiang; Rick L. Stevens; | icml | 2024-06-12 |
700 | Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by previous theoretical study of static version of the attention multiplication problem [Zandieh, Han, Daliri, and Karbasi ICML 2023, Alman and Song NeurIPS 2023], we formally define a dynamic version of attention matrix multiplication problem. |
Jan van den Brand; Zhao Song; Tianyi Zhou; | icml | 2024-06-12 |
701 | Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the slow speed of current trackers limits their applicability on devices with constrained computational resources. To address this challenge, we introduce ABTrack, an adaptive computation framework that adaptively bypassing transformer blocks for efficient visual tracking. |
XIANGYANG YANG et. al. | arxiv-cs.CV | 2024-06-12 |
702 | GPT-4V(ision) Is A Generalist Web Agent, If Grounded IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we explore the potential of LMMs like GPT-4V as a generalist web agent that can follow natural language instructions to complete tasks on any given website. |
Boyuan Zheng; Boyu Gou; Jihyung Kil; Huan Sun; Yu Su; | icml | 2024-06-12 |
703 | Timer: Generative Pre-trained Transformers Are Large Time Series Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To change the status quo of training scenario-specific small models from scratch, this paper aims at the early development of large time series models (LTSM). |
YONG LIU et. al. | icml | 2024-06-12 |
704 | Enhancing Vision Transformer: Amplifying Non-Linearity in Feedforward Network Module Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on improving the FFN module within the vision transformer. |
YIXING XU et. al. | icml | 2024-06-12 |
705 | PolySketchFormer: Fast Transformers Via Sketching Polynomial Kernels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent theoretical results indicate the intractability of sub-quadratic softmax attention approximation under reasonable complexity assumptions. This paper addresses this challenge by first demonstrating that polynomial attention with high degree can effectively replace softmax without sacrificing model quality. |
Praneeth Kacham; Vahab Mirrokni; Peilin Zhong; | icml | 2024-06-12 |
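As a rough illustration of the idea, the sketch below swaps softmax for an even-degree polynomial score with row normalization. It is an assumption-laden toy: the sketching step that gives PolySketchFormer its speedup, and any causal masking, are omitted.

```python
import torch

def polynomial_attention(q, k, v, degree=4, eps=1e-6):
    # An even degree keeps scores non-negative, so row normalization plays
    # the role of softmax; this is the conceptual swap, not the fast version.
    scores = (q @ k.transpose(-2, -1)) ** degree
    weights = scores / (scores.sum(dim=-1, keepdim=True) + eps)
    return weights @ v

q = k = v = torch.randn(1, 5, 8)
print(polynomial_attention(q, k, v).shape)  # torch.Size([1, 5, 8])
```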
706 | Ditto: Quantization-aware Secure Inference of Transformers Upon MPC Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the integration of quantization widely used in plaintext inference into the MPC domain remains unclear. To bridge this gap, we propose the framework named Ditto to enable more efficient quantization-aware secure Transformer inference. |
HAOQI WU et. al. | icml | 2024-06-12 |
707 | Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these challenges, we first introduce LoCoV1, a 12-task benchmark constructed to measure long-context retrieval where chunking is not possible or not effective. We next present the M2-BERT retrieval encoder, an 80M parameter state-space encoder model built from the Monarch Mixer architecture, capable of scaling to documents up to 32K tokens long. |
Jon Saad-Falcon; Daniel Y Fu; Simran Arora; Neel Guha; Christopher Re; | icml | 2024-06-12 |
708 | Do Efficient Transformers Really Save Computation? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to understand the capabilities and limitations of efficient Transformers, specifically the Sparse Transformer and the Linear Transformer. |
KAI YANG et. al. | icml | 2024-06-12 |
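For reference, the Linear Transformer studied here computes attention through a feature map phi so the key-value summary can be built once, costing O(n d^2) rather than O(n^2 d). A minimal non-causal sketch, assuming the common phi(x) = elu(x) + 1 choice:

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v):
    # phi(x) = elu(x) + 1 keeps features positive, so the normalizer is nonzero.
    phi_q, phi_k = F.elu(q) + 1, F.elu(k) + 1
    kv = phi_k.transpose(-2, -1) @ v          # (d, d) key-value summary
    z = phi_k.sum(dim=-2)                     # sum of key features
    return (phi_q @ kv) / (phi_q @ z.unsqueeze(-1))

q = k = v = torch.randn(1, 6, 8)
print(linear_attention(q, k, v).shape)  # torch.Size([1, 6, 8])
```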
709 | InstructRetro: Instruction Tuning Post Retrieval-Augmented Pretraining IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Retro 48B, the largest LLM pretrained with retrieval. |
BOXIN WANG et. al. | icml | 2024-06-12 |
710 | In-Context Principle Learning from Mistakes IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Nonetheless, all ICL-based approaches only learn from correct input-output pairs. In this paper, we revisit this paradigm, by learning more from the few given input-output examples. |
TIANJUN ZHANG et. al. | icml | 2024-06-12 |
711 | Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, future superhuman models will behave in complex ways too difficult for humans to reliably evaluate; humans will only be able to *weakly supervise* superhuman models. We study an analogy to this problem: can weak model supervision elicit the full capabilities of a much stronger model? |
COLLIN BURNS et. al. | icml | 2024-06-12 |
712 | Discrete Diffusion Modeling By Estimating The Ratios of The Data Distribution IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Crucially, standard diffusion models rely on the well-established theory of score matching, but efforts to generalize this to discrete structures have not yielded the same empirical gains. In this work, we bridge this gap by proposing score entropy, a novel loss that naturally extends score matching to discrete spaces, integrates seamlessly to build discrete diffusion models, and significantly boosts performance. |
Aaron Lou; Chenlin Meng; Stefano Ermon; | icml | 2024-06-12 |
713 | Fine-Tuned ‘Small’ LLMs (Still) Significantly Outperform Zero-Shot Generative AI Models in Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that smaller, fine-tuned LLMs (still) consistently and significantly outperform larger, zero-shot prompted models in text classification. |
Martin Juan José Bucher; Marco Martini; | arxiv-cs.CL | 2024-06-12 |
714 | How Smooth Is Attention? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a detailed study of the Lipschitz constant of self-attention in several practical scenarios, discussing the impact of the sequence length $n$ and layer normalization on the local Lipschitz constant of both unmasked and masked self-attention. |
Valérie Castin; Pierre Ablin; Gabriel Peyré; | icml | 2024-06-12 |
715 | Outlier-Efficient Hopfield Layers for Large Transformer-Based Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an Outlier-Efficient Modern Hopfield Model (termed `OutEffHop`) and use it to address the outlier inefficiency problem of training gigantic transformer-based models. |
JERRY YAO-CHIEH HU et. al. | icml | 2024-06-12 |
716 | Rethinking Generative Large Language Model Evaluation for Semantic Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we introduce an RWQ-Elo rating system, engaging 24 LLMs such as GPT-4, GPT-3.5, Google-Gemini-Pro and LLaMA-1/-2, in a two-player competitive format, with GPT-4 serving as the judge. |
Fangyun Wei; Xi Chen; Lin Luo; | icml | 2024-06-12 |
717 | Prodigy: An Expeditiously Adaptive Parameter-Free Learner IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Prodigy, an algorithm that provably estimates the distance to the solution $D$, which is needed to set the learning rate optimally. |
Konstantin Mishchenko; Aaron Defazio; | icml | 2024-06-12 |
718 | Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that, for linearized transformer networks, ICL can be made explicit and permanent through the inclusion of bias terms. |
Brian K Chen; Tianyang Hu; Hui Jin; Hwee Kuan Lee; Kenji Kawaguchi; | icml | 2024-06-12 |
719 | What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the capabilities of the transformer architecture with varying depth. |
Xingwu Chen; Difan Zou; | icml | 2024-06-12 |
720 | Position: On The Possibilities of AI-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce guidelines on the required text data quantity, either through sample size or sequence length, for reliable AI text detection, through derivations of sample complexity bounds. |
SOURADIP CHAKRABORTY et. al. | icml | 2024-06-12 |
721 | How Language Model Hallucinations Can Snowball IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: A major risk of using language models in practical applications is their tendency to hallucinate incorrect statements. To study this, we construct three question-answering datasets where LMs often state an incorrect answer which is followed by an explanation with at least one incorrect claim. |
Muru Zhang; Ofir Press; William Merrill; Alisa Liu; Noah A. Smith; | icml | 2024-06-12 |
722 | Prototypical Transformer As Unified Motion Learners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Prototypical Transformer (ProtoFormer), a general and unified framework that approaches various motion tasks from a prototype perspective. |
CHENG HAN et. al. | icml | 2024-06-12 |
723 | Privacy-Preserving Embedding Via Look-up Table Evaluation with Fully Homomorphic Encryption Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, our study proposes an efficient algorithm for privacy-preserving embedding via look-up table evaluation with HE (HELUT) by developing an encrypted indicator function (EIF) that assures high precision with the use of the approximate HE scheme (CKKS). |
Jae-yun Kim; Saerom Park; Joohee Lee; Jung Hee Cheon; | icml | 2024-06-12 |
724 | Visual Transformer with Differentiable Channel Selection: An Information Bottleneck Inspired Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel and compact transformer block, Transformer with Differentiable Channel Selection, or DCS-Transformer. |
Yancheng Wang; Ping Li; Yingzhen Yang; | icml | 2024-06-12 |
725 | SpikeZIP-TF: Conversion Is All You Need for Transformer-based SNN Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce a novel ANN-to-SNN conversion method called SpikeZIP-TF, where ANN and SNN are exactly equivalent, thus incurring no accuracy degradation. |
KANG YOU et. al. | icml | 2024-06-12 |
726 | Improving Autoformalization Using Type Checking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our analysis shows that the performance of these models is largely limited by their inability to generate formal statements that successfully type-check (i.e., are syntactically correct and consistent with types) – with a whopping 86.6% of GPT-4o errors starting from a type-check failure. In this work, we propose a method to fix this issue through decoding with type-check filtering, where we initially sample a diverse set of candidate formalizations for an informal statement, then use the Lean proof assistant to filter out candidates that do not type-check. |
Auguste Poiroux; Gail Weiss; Viktor Kunčak; Antoine Bosselut; | arxiv-cs.CL | 2024-06-11 |
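The decoding-with-filtering loop described here is straightforward to outline. In the sketch below, `formalize` and `lean_type_checks` are hypothetical stand-ins (stubbed so the example runs) for an LLM sampling call and an invocation of the Lean proof assistant.

```python
import random

def formalize(stmt: str) -> str:
    # Hypothetical LLM call, stubbed with canned candidates for illustration.
    return random.choice(["theorem t : 1 + 1 = 2 := rfl", "theorem t : 1 + 1 ="])

def lean_type_checks(candidate: str) -> bool:
    # Hypothetical wrapper around Lean; here a toy syntactic check.
    return candidate.endswith(":= rfl")

def autoformalize(stmt: str, n_samples: int = 16):
    # Sample diverse candidates, then keep only those that type-check.
    candidates = [formalize(stmt) for _ in range(n_samples)]
    well_typed = [c for c in candidates if lean_type_checks(c)]
    return well_typed[0] if well_typed else None

print(autoformalize("one plus one equals two"))
```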
727 | Anomaly Detection on Unstable Logs with GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we report on an experimental comparison of a fine-tuned LLM and alternative models for anomaly detection on unstable logs. |
Fatemeh Hadadi; Qinghua Xu; Domenico Bianculli; Lionel Briand; | arxiv-cs.SE | 2024-06-11 |
728 | Towards Generalized Hydrological Forecasting Using Transformer Models for 120-Hour Streamflow Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Utilizing data from the preceding 72 hours, including precipitation, evapotranspiration, and discharge values, we developed a generalized model to predict future streamflow. |
Bekir Z. Demiray; Ibrahim Demir; | arxiv-cs.LG | 2024-06-11 |
729 | LT4SG@SMM4H24: Tweets Classification for Digital Epidemiology of Childhood Health Outcomes Using Pre-Trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents our approaches for the SMM4H24 Shared Task 5 on the binary classification of English tweets reporting children’s medical disorders. |
Dasun Athukoralage; Thushari Atapattu; Menasha Thilakaratne; Katrina Falkner; | arxiv-cs.CL | 2024-06-11 |
730 | Bilingual Sexism Classification: Fine-Tuned XLM-RoBERTa and GPT-3.5 Few-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to improve sexism identification in bilingual contexts (English and Spanish) by leveraging natural language processing models. |
AmirMohammad Azadi; Baktash Ansari; Sina Zamani; | arxiv-cs.CL | 2024-06-11 |
731 | Unveiling The Safety of GPT-4o: An Empirical Study Using Jailbreak Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, this paper adopts a series of multi-modal and uni-modal jailbreak attacks on 4 commonly used benchmarks encompassing three modalities (i.e., text, speech, and image), which involves the optimization of over 4,000 initial text queries and the analysis and statistical evaluation of nearly 8,000 responses from GPT-4o. |
Zonghao Ying; Aishan Liu; Xianglong Liu; Dacheng Tao; | arxiv-cs.CR | 2024-06-10 |
732 | LLM-dCache: Improving Tool-Augmented LLMs with GPT-Driven Localized Data Caching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce LLM-dCache to optimize data accesses by treating cache operations as callable API functions exposed to the tool-augmented agent. |
SIMRANJIT SINGH et. al. | arxiv-cs.DC | 2024-06-10 |
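Entry 732's core idea, treating cache operations as callable API functions exposed to the tool-augmented agent, can be sketched as two tool functions over a TTL-backed in-memory store. Everything below (names, schema, TTL policy) is an illustrative assumption, not the paper's interface.

```python
# Hedged sketch: cache reads/writes exposed as agent tools (illustrative).
import time

_CACHE: dict = {}
TTL_SECONDS = 300.0  # assumed expiry policy

def cache_get(key: str):
    """Tool: return a cached value if present and not expired, else None."""
    entry = _CACHE.get(key)
    if entry is None:
        return None
    stored_at, value = entry
    if time.time() - stored_at > TTL_SECONDS:
        del _CACHE[key]  # evict stale entry
        return None
    return value

def cache_put(key: str, value) -> None:
    """Tool: store a value under a key with the current timestamp."""
    _CACHE[key] = (time.time(), value)

# A tool schema like this would be handed to the agent so the LLM itself
# decides when to reuse cached data instead of re-issuing an API call.
TOOLS = [
    {"name": "cache_get", "description": "Look up previously fetched data."},
    {"name": "cache_put", "description": "Store fetched data for reuse."},
]
```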
733 | LLM-Powered Multimodal AI Conversations for Diabetes Prevention Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The global prevalence of diabetes remains high despite rising life expectancy with improved quality and access to healthcare services. The significant burden that diabetes imposes … |
Dung Dao; Jun Yi Claire Teo; Wenru Wang; Hoang D. Nguyen; | Proceedings of the 1st ACM Workshop on AI-Powered Q&A … | 2024-06-10 |
734 | In-Context Learning and Fine-Tuning GPT for Argument Mining Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an ICL strategy for ATC combining kNN-based examples selection and majority vote ensembling. |
Jérémie Cabessa; Hugo Hernault; Umer Mushtaq; | arxiv-cs.CL | 2024-06-10 |
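The ICL strategy highlighted in entry 734, kNN-based example selection combined with majority-vote ensembling, reduces to a short generic recipe. In this sketch, embed and ask_llm are hypothetical stand-ins for an embedding model and a GPT call; nothing here is the authors' code.

```python
# Hedged sketch of kNN demonstration selection + majority-vote ensembling.
import collections
import numpy as np

def knn_examples(query_vec, train_vecs, train_items, k=8):
    """Return the k labeled training examples closest in cosine similarity."""
    sims = train_vecs @ query_vec / (
        np.linalg.norm(train_vecs, axis=1) * np.linalg.norm(query_vec))
    return [train_items[i] for i in np.argsort(-sims)[:k]]

def classify(query, train_vecs, train_items, embed, ask_llm, votes=5):
    demos = knn_examples(embed(query), train_vecs, train_items)
    prompt = "\n".join(f"Text: {t}\nLabel: {y}" for t, y in demos)
    prompt += f"\nText: {query}\nLabel:"
    # Sample several completions and keep the majority label.
    answers = [ask_llm(prompt) for _ in range(votes)]
    return collections.Counter(answers).most_common(1)[0][0]
```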
735 | Validating LLM-Generated Programs with Metamorphic Prompt Testing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Research is required to comprehensively explore these critical concerns surrounding LLM-generated code. In this paper, we propose a novel solution called metamorphic prompt testing to address these challenges. |
Xiaoyin Wang; Dakai Zhu; | arxiv-cs.SE | 2024-06-10 |
736 | Annotation Alignment: Comparing LLM and Human Annotations of Conversational Safety Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: GPT-4 achieves a Pearson correlation of $r = 0.59$ with the average annotator rating, higher than the median annotator’s correlation with the average ($r=0.51$). We show that larger datasets are needed to resolve whether LLMs exhibit disparities in how well they correlate with different demographic groups. |
Rajiv Movva; Pang Wei Koh; Emma Pierson; | arxiv-cs.CL | 2024-06-10 |
737 | Symmetric Dot-Product Attention for Efficient Training of BERT Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose an alternative compatibility function for the self-attention mechanism introduced by the Transformer architecture. |
Martin Courtois; Malte Ostendorff; Leonhard Hennig; Georg Rehm; | arxiv-cs.CL | 2024-06-10 |
738 | Hidden Holes: Topological Aspects of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The methods developed in this paper are novel in the field and based on mathematical apparatus that might be unfamiliar to the target audience. |
Stephen Fitz; Peter Romero; Jiyan Jonas Schneider; | arxiv-cs.CL | 2024-06-09 |
739 | Multi-attribute Auction-based Resource Allocation for Twins Migration in Vehicular Metaverses: A GPT-based DRL Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, resource-intensive VT updating and the high mobility of vehicles require intensive computation, communication, and storage resources, especially for their migration among RSUs with limited coverage. To address these issues, we propose an attribute-aware auction-based mechanism to optimize resource allocation during VT migration by considering both price and non-monetary attributes, e.g., location and reputation. |
YONGJU TONG et. al. | arxiv-cs.AI | 2024-06-08 |
740 | MaTableGPT: GPT-based Table Data Extractor from Materials Science Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the tables reported in materials science papers exist in highly diverse forms; thus, rule-based extractions are an ineffective approach. To overcome this challenge, we present MaTableGPT, which is a GPT-based table data extractor from the materials science literature. |
GYEONG HOON YI et. al. | arxiv-cs.CL | 2024-06-08 |
741 | G-Transformer: Counterfactual Outcome Prediction Under Dynamic and Time-varying Treatment Regimes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present G-Transformer for counterfactual outcome prediction under dynamic and time-varying treatment strategies. |
Hong Xiong; Feng Wu; Leon Deng; Megan Su; Li-wei H Lehman; | arxiv-cs.LG | 2024-06-08 |
742 | Do LLMs Recognize Me, When I Is Not Me: Assessment of LLMs Understanding of Turkish Indexical Pronouns in Indexical Shift Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first study examining indexical shift in any language, releasing a Turkish dataset specifically designed for this purpose. |
Metehan Oğuz; Yusuf Umut Ciftci; Yavuz Faruk Bakman; | arxiv-cs.CL | 2024-06-08 |
743 | Automata Extraction from Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an automata extraction algorithm specifically designed for Transformer models. |
Yihao Zhang; Zeming Wei; Meng Sun; | arxiv-cs.LG | 2024-06-08 |
744 | Are Large Language Models More Empathetic Than Humans? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive study exploring the empathetic responding capabilities of four state-of-the-art LLMs: GPT-4, LLaMA-2-70B-Chat, Gemini-1.0-Pro, and Mixtral-8x7B-Instruct in comparison to a human baseline. |
Anuradha Welivita; Pearl Pu; | arxiv-cs.CL | 2024-06-07 |
745 | VTrans: Accelerating Transformer Compression with Variational Information Bottleneck Based Pruning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, they require extensive compression time with large datasets to maintain performance in pruned models. To address these challenges, we propose VTrans, an iterative pruning framework guided by the Variational Information Bottleneck (VIB) principle. |
Oshin Dutta; Ritvik Gupta; Sumeet Agarwal; | arxiv-cs.LG | 2024-06-07 |
746 | Concept Formation and Alignment in Language Models: Bridging Statistical Patterns in Latent Space to Concept Taxonomy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a mechanism for identifying concepts and their hierarchical organization within the semantic representations learned by various LMs, encompassing a spectrum from early models like Glove to the transformer-based language models like ALBERT and T5. |
Mehrdad Khatir; Chandan K. Reddy; | arxiv-cs.CL | 2024-06-07 |
747 | Advancing Semantic Textual Similarity Modeling: A Regression Framework with Translated ReLU and Smooth K2 Loss Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite its efficiency, Sentence-BERT tackles STS tasks from a classification perspective, overlooking the progressive nature of semantic relationships, which results in suboptimal performance. To bridge this gap, this paper presents an innovative regression framework and proposes two simple yet effective loss functions: Translated ReLU and Smooth K2 Loss. |
Bowen Zhang; Chunping Li; | arxiv-cs.CL | 2024-06-07 |
748 | BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper outlines our approach to SemEval 2024 Task 9, BRAINTEASER: A Novel Task Defying Common Sense. |
Baktash Ansari; Mohammadmostafa Rostamkhani; Sauleh Eetemadi; | arxiv-cs.CL | 2024-06-07 |
749 | Low-Resource Cross-Lingual Summarization Through Few-Shot Learning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the few-shot XLS performance of various models, including Mistral-7B-Instruct-v0.2, GPT-3.5, and GPT-4. |
Gyutae Park; Seojin Hwang; Hwanhee Lee; | arxiv-cs.CL | 2024-06-07 |
750 | Transformer Conformal Prediction for Time Series Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a conformal prediction method for time series using the Transformer architecture to capture long-memory and long-range dependencies. |
Junghwan Lee; Chen Xu; Yao Xie; | arxiv-cs.LG | 2024-06-07 |
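Entry 750 adapts conformal prediction to Transformer forecasters; for readers new to the technique, the generic split-conformal recipe it builds on is short enough to show in full. The Transformer-specific scoring is the paper's contribution and is not reproduced here.

```python
# Textbook split-conformal interval; `predict` is any point forecaster.
import numpy as np

def conformal_interval(predict, X_cal, y_cal, x_new, alpha=0.1):
    """Return a (lo, hi) interval with ~(1 - alpha) coverage, assuming
    exchangeable calibration residuals."""
    residuals = np.abs(y_cal - predict(X_cal))   # calibration errors
    n = len(residuals)
    # Finite-sample-corrected quantile of the residuals.
    q = np.quantile(residuals, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
    point = predict(x_new)
    return point - q, point + q
```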
751 | Mixture-of-Agents Enhances Large Language Model Capabilities IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: With the growing number of LLMs, how to harness the collective expertise of multiple LLMs is an exciting open direction. Toward this goal, we propose a new approach that leverages the collective strengths of multiple LLMs through a Mixture-of-Agents (MoA) methodology. |
Junlin Wang; Jue Wang; Ben Athiwaratkun; Ce Zhang; James Zou; | arxiv-cs.CL | 2024-06-07 |
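The Mixture-of-Agents pattern in entry 751 can be sketched in a few lines: proposer models answer independently, each refines its draft with the others' drafts in context, and an aggregator synthesizes the final answer. call_model is a hypothetical chat-completion wrapper; the layer count and prompts are assumptions.

```python
# Hedged sketch of a Mixture-of-Agents pipeline (all prompts illustrative).
def mixture_of_agents(question, proposers, aggregator, call_model, layers=2):
    answers = [call_model(m, question) for m in proposers]
    for _ in range(layers - 1):
        # Each proposer refines its answer given the others' drafts.
        context = "\n\n".join(f"Reference answer {i + 1}:\n{a}"
                              for i, a in enumerate(answers))
        answers = [call_model(m, f"{context}\n\nQuestion: {question}")
                   for m in proposers]
    drafts = "\n\n".join(answers)
    return call_model(aggregator,
                      "Synthesize the single best answer from the drafts "
                      f"below.\n{drafts}\n\nQuestion: {question}")
```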
752 | Logic Synthesis with Generative Deep Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a logic synthesis rewriting operator based on the Circuit Transformer model, named ctrw (Circuit Transformer Rewriting), which incorporates the following techniques: (1) a two-stage training scheme for the Circuit Transformer tailored for logic synthesis, with iterative improvement of optimality through self-improvement training; (2) integration of the Circuit Transformer with state-of-the-art rewriting techniques to address scalability issues, allowing for guided DAG-aware rewriting. |
XIHAN LI et. al. | arxiv-cs.LO | 2024-06-07 |
753 | Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Interestingly, our study presents conflicting evidence for the role of the quality of KG tuples in generating implicit explanations. |
NEEMESH YADAV et. al. | arxiv-cs.CL | 2024-06-06 |
754 | Exploring The Latest LLMs for Leaderboard Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore three types of contextual inputs to the models: DocTAET (Document Title, Abstract, Experimental Setup, and Tabular Information), DocREC (Results, Experiments, and Conclusions), and DocFULL (entire document). |
Salomon Kabongo; Jennifer D’Souza; Sören Auer; | arxiv-cs.CL | 2024-06-06 |
755 | GameBench: Evaluating Strategic Reasoning Abilities of LLM Agents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite several demonstrations of using large language models in complex, strategic scenarios, a comprehensive framework for evaluating agents’ performance across the various types of reasoning found in games is still lacking. To address this gap, we introduce GameBench, a cross-domain benchmark for evaluating strategic reasoning abilities of LLM agents. |
ANTHONY COSTARELLI et. al. | arxiv-cs.CL | 2024-06-06 |
756 | MUSE: Flexible Voiceprint Receptive Fields and Multi-Path Fusion Enhanced Taylor Transformer for U-Net-based Speech Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Multi-path Enhanced Taylor (MET) Transformer based U-net for Speech Enhancement (MUSE), a lightweight speech enhancement network built upon the Unet architecture. |
Zizhen Lin; Xiaoting Chen; Junyu Wang; | arxiv-cs.SD | 2024-06-06 |
757 | Global Clipper: Enhancing Safety and Reliability of Transformer-based Object Detection Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the Global Clipper and Global Hybrid Clipper, effective mitigation strategies specifically designed for transformer-based models. |
QUTUB SYED SHA et. al. | arxiv-cs.CV | 2024-06-05 |
758 | CSI-GPT: Integrating Generative Pre-Trained Transformer with Federated-Tuning to Acquire Downlink Massive MIMO Channels Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, by integrating the generative pre-trained Transformer (GPT) with federated-tuning, we propose a CSI-GPT approach to realize efficient downlink CSI acquisition. |
YE ZENG et. al. | arxiv-cs.IT | 2024-06-05 |
759 | From Tarzan to Tolkien: Controlling The Language Proficiency Level of LLMs for Content Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the problem of controlling the difficulty level of text generated by Large Language Models (LLMs) for contexts where end-users are not fully proficient, such as language learners. |
Ali Malik; Stephen Mayhew; Chris Piech; Klinton Bicknell; | arxiv-cs.CL | 2024-06-05 |
760 | The Good, The Bad, and The Hulk-like GPT: Analyzing Emotional Decisions of Large Language Models in Cooperation and Bargaining Games Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel methodology and the framework to study both, the decision-making of LLMs and their alignment with human behavior under emotional states. |
MIKHAIL MOZIKOV et. al. | arxiv-cs.AI | 2024-06-05 |
761 | Learning to Grok: Emergence of In-context Learning and Skill Composition in Modular Arithmetic Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we study the emergence of in-context learning and skill composition in a collection of modular arithmetic tasks. |
Tianyu He; Darshil Doshi; Aritra Das; Andrey Gromov; | arxiv-cs.LG | 2024-06-04 |
762 | Multi-layer Learnable Attention Mask for Multimodal Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Comprehensive experimental validation on various datasets, such as MADv2, QVHighlights, ImageNet 1K, and MSRVTT, demonstrates the efficacy of the LAM, exemplifying its ability to enhance model performance while mitigating redundant computations. This pioneering approach represents a significant advance in understanding complex scenarios, such as movie understanding. |
Wayner Barrios; SouYoung Jin; | arxiv-cs.CV | 2024-06-04 |
763 | Probing The Category of Verbal Aspect in Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate how pretrained language models (PLMs) encode the grammatical category of verbal aspect in Russian. |
Anisia Katinskaia; Roman Yangarber; | arxiv-cs.CL | 2024-06-04 |
764 | Randomized Geometric Algebra Methods for Convex Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce randomized algorithms to Clifford’s Geometric Algebra, generalizing randomized linear algebra to hypercomplex vector spaces. |
Yifei Wang; Sungyoon Kim; Paul Chu; Indu Subramaniam; Mert Pilanci; | arxiv-cs.LG | 2024-06-04 |
765 | Too Big to Fail: Larger Language Models Are Disproportionately Resilient to Induction of Dementia-Related Linguistic Anomalies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Previous findings show that changes in PPL when masking attention layers in pre-trained transformer-based NLMs reflect linguistic anomalies associated with Alzheimer’s disease dementia. Building upon this, we explore a novel bidirectional attention head ablation method that exhibits properties attributed to the concepts of cognitive and brain reserve in human brain studies, which postulate that people with more neurons in the brain and more efficient processing are more resilient to neurodegeneration. |
Changye Li; Zhecheng Sheng; Trevor Cohen; Serguei Pakhomov; | arxiv-cs.CL | 2024-06-04 |
766 | A Temporal Kolmogorov-Arnold Transformer for Time Series Forecasting IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Capturing complex temporal patterns and relationships within multivariate data streams is a difficult task. We propose the Temporal Kolmogorov-Arnold Transformer (TKAT), a novel attention-based architecture designed to address this task using Temporal Kolmogorov-Arnold Networks (TKANs). |
Remi Genet; Hugo Inzirillo; | arxiv-cs.LG | 2024-06-04 |
767 | Eliciting The Priors of Large Language Models Using Iterated In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop a prompt-based workflow for eliciting prior distributions from LLMs. |
Jian-Qiao Zhu; Thomas L. Griffiths; | arxiv-cs.CL | 2024-06-03 |
768 | Reproducibility Study on Adversarial Attacks Against Robust Transformer Trackers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our empirical study focuses on evaluating adversarial robustness of object trackers based on bounding box versus binary mask predictions, and attack methods at different levels of perturbations. |
Fatemeh Nourilenjan Nokabadi; Jean-François Lalonde; Christian Gagné; | arxiv-cs.CV | 2024-06-03 |
769 | Superhuman Performance in Urology Board Questions By An Explainable Large Language Model Enabled for Context Integration of The European Association of Urology Guidelines: The UroBot Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: UroBot was developed using OpenAI’s GPT-3.5, GPT-4, and GPT-4o models, employing retrieval-augmented generation (RAG) and the latest 2023 guidelines from the European Association of Urology (EAU). |
MARTIN J. HETZ et. al. | arxiv-cs.CL | 2024-06-03 |
770 | SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to bridge the gap between Code LLMs’ reliance on static text data and the need for semantic understanding for complex tasks like debugging and program repair. |
YANGRUIBO DING et. al. | arxiv-cs.CL | 2024-06-03 |
771 | Prototypical Transformer As Unified Motion Learners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Prototypical Transformer (ProtoFormer), a general and unified framework that approaches various motion tasks from a prototype perspective. |
CHENG HAN et. al. | arxiv-cs.CV | 2024-06-03 |
772 | In-Context Learning of Physical Properties: Few-Shot Adaptation to Out-of-Distribution Molecular Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address the question: can we leverage in-context learning to predict out-of-distribution materials properties? |
GRZEGORZ KASZUBA et. al. | arxiv-cs.LG | 2024-06-03 |
773 | Drive As Veteran: Fine-tuning of An Onboard Large Language Model for Highway Autonomous Driving Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Due to the limitations of network communication conditions for calling GPT online, onboard deployment of Large Language Models for autonomous driving is needed. In this … |
YUJIN WANG et. al. | 2024 IEEE Intelligent Vehicles Symposium (IV) | 2024-06-02 |
774 | Annotation Guidelines-Based Knowledge Augmentation: Towards Enhancing Large Language Models for Educational Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose the Annotation Guidelines-based Knowledge Augmentation (AGKA) approach to improve LLMs. |
SHIQI LIU et. al. | arxiv-cs.CL | 2024-06-02 |
775 | RoBERTa-BiLSTM: A Context-Aware Hybrid Model for Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a novel hybrid deep learning model, RoBERTa-BiLSTM, which combines the Robustly Optimized BERT Pretraining Approach (RoBERTa) with Bidirectional Long Short-Term Memory (BiLSTM) networks. |
Md. Mostafizer Rahman; Ariful Islam Shiplu; Yutaka Watanobe; Md. Ashad Alam; | arxiv-cs.CL | 2024-06-01 |
776 | SwinFG: A Fine-grained Recognition Scheme Based on Swin Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhipeng Ma; Xiaoyu Wu; Anzhuo Chu; Lei Huang; Zhiqiang Wei; | Expert Syst. Appl. | 2024-06-01 |
777 | FuzzyTP-BERT: Enhancing Extractive Text Summarization with Fuzzy Topic Modeling and Transformer Networks Related Papers Related Patents Related Grants Related Venues Related Experts View |
Aytuğ Onan; Hesham A. Alhumyani; | J. King Saud Univ. Comput. Inf. Sci. | 2024-06-01 |
778 | Multi-granularity Cross Transformer Network for Person Re-identification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yanping Li; Duoqian Miao; Hongyun Zhang; Jie Zhou; Cairong Zhao; | Pattern Recognit. | 2024-06-01 |
779 | Mobile Transformer Accelerator Exploiting Various Line Sparsity and Tile-Based Dynamic Quantization Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer models are difficult to employ in mobile devices due to their memory- and computation-intensive properties. Accordingly, there is ongoing research on various methods … |
Eunji Kwon; Jongho Yoon; Seokhyeong Kang; | IEEE Transactions on Computer-Aided Design of Integrated … | 2024-06-01 |
780 | EdgeTran: Device-Aware Co-Search of Transformers for Efficient Inference on Mobile Edge Platforms Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Automated design of efficient transformer models has recently attracted significant attention from industry and academia. However, most works only focus on certain metrics while … |
Shikhar Tuli; N. Jha; | IEEE Transactions on Mobile Computing | 2024-06-01 |
781 | Bidirectional Interaction of CNN and Transformer for Image Inpainting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jialu Liu; Maoguo Gong; Yuan Gao; Yihe Lu; Hao Li; | Knowl. Based Syst. | 2024-06-01 |
782 | Multimodal Metadata Assignment for Cultural Heritage Artifacts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We develop a multimodal classifier for the cultural heritage domain using a late fusion approach and introduce a novel dataset. |
LUIS REI et. al. | arxiv-cs.CV | 2024-06-01 |
783 | Beyond Metrics: Evaluating LLMs’ Effectiveness in Culturally Nuanced, Low-Resource Real-World Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our evaluation includes both quantitative analysis using metrics like F1 score and qualitative assessment of LLMs’ explanations for their predictions. We find that, while Mistral-7b and Mixtral-8x7b achieved high F1 scores, they and other LLMs such as GPT-3.5-Turbo, Llama-2-70b, and Gemma-7b struggled to understand linguistic and contextual nuances and lacked transparency in their decision-making, as observed from their explanations. |
MILLICENT OCHIENG et. al. | arxiv-cs.CL | 2024-06-01 |
784 | LiteFormer: A Lightweight and Efficient Transformer for Rotating Machine Fault Diagnosis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer has shown impressive performance on global feature modeling in many applications. However, two drawbacks induced by its intrinsic architecture limit its application, … |
WENJUN SUN et. al. | IEEE Transactions on Reliability | 2024-06-01 |
785 | Transformer-based Fall Detection in Videos Related Papers Related Patents Related Grants Related Venues Related Experts View |
Adrián Núñez-Marcos; I. Arganda-Carreras; | Eng. Appl. Artif. Intell. | 2024-06-01 |
786 | A Comparison of Correspondence Analysis with PMI-based Word Embedding Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we link correspondence analysis (CA) to the factorization of the PMI matrix. |
Qianqian Qi; Ayoub Bagheri; David J. Hessen; Peter G. M. van der Heijden; | arxiv-cs.CL | 2024-05-31 |
787 | QClusformer: A Quantum Transformer-based Framework for Unsupervised Visual Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce QClusformer, a pioneering Transformer-based framework leveraging quantum machines to tackle unsupervised vision clustering challenges. |
XUAN-BAC NGUYEN et. al. | arxiv-cs.CV | 2024-05-30 |
788 | Bi-Directional Transformers Vs. Word2vec: Discovering Vulnerabilities in Lifted Compiled Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Detecting vulnerabilities within compiled binaries is challenging due to lost high-level code structures and other factors such as architectural dependencies, compilers, and optimization options. To address these obstacles, this research explores vulnerability detection using natural language processing (NLP) embedding techniques with word2vec, BERT, and RoBERTa to learn semantics from intermediate representation (LLVM IR) code. |
Gary A. McCully; John D. Hastings; Shengjie Xu; Adam Fortier; | arxiv-cs.CR | 2024-05-30 |
789 | The Point of View of A Sentiment: Towards Clinician Bias Detection in Psychiatric Notes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By leveraging large language models, this work aims to identify the sentiment expressed in psychiatric clinical notes based on the reader’s point of view. |
Alissa A. Valentine; Lauren A. Lepow; Alexander W. Charney; Isotta Landi; | arxiv-cs.CL | 2024-05-30 |
790 | Automatic Graph Topology-Aware Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes an evolutionary graph Transformer architecture search framework (EGTAS) to automate the construction of strong graph Transformers. |
CHAO WANG et. al. | arxiv-cs.NE | 2024-05-30 |
791 | Hyper-Transformer for Amodal Completion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Learning shape priors is crucial for effective amodal completion, but traditional methods often rely on two-stage processes or additional information, leading to inefficiencies and potential error accumulation. To address these shortcomings, we introduce a novel framework named the Hyper-Transformer Amodal Network (H-TAN). |
Jianxiong Gao; Xuelin Qian; Longfei Liang; Junwei Han; Yanwei Fu; | arxiv-cs.CV | 2024-05-30 |
792 | DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the knowledge gap, we propose a new benchmark named DevEval, which has three advances. |
JIA LI et. al. | arxiv-cs.CL | 2024-05-30 |
793 | Divide-and-Conquer Meets Consensus: Unleashing The Power of Functions in Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose FunCoder, a code generation framework incorporating the divide-and-conquer strategy with functional consensus. |
JINGCHANG CHEN et. al. | arxiv-cs.CL | 2024-05-30 |
794 | Towards Next-Generation Urban Decision Support Systems Through AI-Powered Generation of Scientific Ontology Using Large Language Models – A Case in Optimizing Intermodal Freight Transportation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The incorporation of Artificial Intelligence (AI) models into various optimization systems is on the rise. However, addressing complex urban and environmental management … |
JOSE TUPAYACHI et. al. | ArXiv | 2024-05-29 |
795 | Reverse Image Retrieval Cues Parametric Memory in Multimodal LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: RIR robustly improves knowledge-intensive visual question answering (VQA) of GPT-4V by 37-43%, GPT-4 Turbo by 25-27%, and GPT-4o by 18-20% in terms of open-ended VQA evaluation metrics. To our surprise, we discover that RIR helps the model to better access its own world knowledge. |
Jialiang Xu; Michael Moor; Jure Leskovec; | arxiv-cs.CL | 2024-05-29 |
796 | Evaluating Zero-Shot GPT-4V Performance on 3D Visual Question Answering Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As interest in reformulating the 3D Visual Question Answering (VQA) problem in the context of foundation models grows, it is imperative to assess how these new paradigms influence existing closed-vocabulary datasets. In this case study, we evaluate the zero-shot performance of foundational models (GPT-4 Vision and GPT-4) on well-established 3D VQA benchmarks, namely 3D-VQA and ScanQA. |
Simranjit Singh; Georgios Pavlakos; Dimitrios Stamoulis; | arxiv-cs.CV | 2024-05-29 |
797 | Multi-objective Cross-task Learning Via Goal-conditioned GPT-based Decision Transformers for Surgical Robot Task Automation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new learning-based framework by leveraging the strong reasoning capability of the GPT-based architecture to automate surgical robotic tasks. |
Jiawei Fu; Yonghao Long; Kai Chen; Wang Wei; Qi Dou; | arxiv-cs.RO | 2024-05-29 |
798 | MDS-ViTNet: Improving Saliency Prediction for Eye-Tracking with Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel methodology we call MDS-ViTNet (Multi Decoder Saliency by Vision Transformer Network) for enhancing visual saliency prediction or eye-tracking. |
Polezhaev Ignat; Goncharenko Igor; Iurina Natalya; | arxiv-cs.CV | 2024-05-29 |
799 | A Multi-Source Retrieval Question Answering Framework Based on RAG Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing RAG paradigms are inevitably influenced by erroneous retrieval information, thereby reducing the reliability and correctness of generated results. Therefore, to improve the relevance of retrieval information, this study proposes a method that replaces traditional retrievers with GPT-3.5, leveraging its vast corpus knowledge to generate retrieval information. |
RIDONG WU et. al. | arxiv-cs.IR | 2024-05-29 |
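Entry 799's replacement of the retriever with GPT-3.5 amounts to a two-step prompt chain: generate a background passage, then answer conditioned on it. ask_llm below is a hypothetical wrapper around a chat call, not the paper's code.

```python
# Hedged sketch of generate-then-answer in place of document retrieval.
def generate_then_answer(question, ask_llm):
    # Step 1: elicit background knowledge instead of retrieving documents.
    context = ask_llm(
        "Write a short, factual background passage that would help answer "
        f"this question: {question}")
    # Step 2: answer conditioned only on the generated context.
    return ask_llm(
        f"Context:\n{context}\n\nUsing only the context above, "
        f"answer: {question}")
```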
800 | Table-GPT: Table Fine-tuned GPT for Diverse Table Tasks IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Language models, such as GPT-3 and ChatGPT, demonstrate remarkable abilities to follow diverse human instructions and perform a wide range of tasks, using instruction fine-tuning. … |
PENG LI et. al. | Proc. ACM Manag. Data | 2024-05-29 |
801 | Are You Sure? Rank Them Again: Repeated Ranking For Better Preference Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose the Repeat Ranking method, in which we evaluate the same responses multiple times and train only on those that are consistently ranked. |
Peter Devine; | arxiv-cs.CL | 2024-05-29 |
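A minimal sketch of the Repeat Ranking filter from entry 801, assuming a hypothetical rank_once LLM-judge call that returns an ordering of the candidate responses:

```python
# Hedged sketch: keep a preference example only if repeated rankings agree.
def consistently_ranked(responses, rank_once, repeats=3):
    rankings = [tuple(rank_once(responses)) for _ in range(repeats)]
    # Train on this example only when every pass produced the same order.
    if all(r == rankings[0] for r in rankings):
        return list(rankings[0])
    return None  # inconsistent rankings: discard from the dataset
```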
802 | AutoBreach: Universal and Adaptive Jailbreaking with Efficient Wordplay-Guided Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we rethink the approach to jailbreaking LLMs and formally define three essential properties from the attacker’s perspective, which help guide the design of jailbreak methods. |
JIAWEI CHEN et. al. | arxiv-cs.CV | 2024-05-29 |
803 | LMO-DP: Optimizing The Randomization Mechanism for Differentially Private Fine-Tuning (Large) Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they rely heavily on the Gaussian mechanism, which may overly perturb the gradients and degrade the accuracy, especially in stronger privacy regimes (e.g., the privacy budget $\epsilon < 3$). To address such limitations, we propose a novel Language Model-based Optimal Differential Privacy (LMO-DP) mechanism, which takes the first step to enable the tight composition of accurately fine-tuning (large) language models with a sub-optimal DP mechanism, even in strong privacy regimes (e.g., $0.1\leq \epsilon<3$). |
QIN YANG et. al. | arxiv-cs.CR | 2024-05-29 |
804 | Beyond Agreement: Diagnosing The Rationale Alignment of Automated Essay Scoring Methods Based on Linguistically-informed Counterfactuals Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our proposed method, using counterfactual intervention assisted by Large Language Models (LLMs), reveals that BERT-like models primarily focus on sentence-level features, whereas LLMs such as GPT-3.5, GPT-4 and Llama-3 are sensitive to conventions & accuracy, language complexity, and organization, indicating a more comprehensive rationale alignment with scoring rubrics. |
Yupei Wang; Renfen Hu; Zhe Zhao; | arxiv-cs.CL | 2024-05-29 |
805 | Voice Jailbreak Attacks Against GPT-4o Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present the first systematic measurement of jailbreak attacks against the voice mode of GPT-4o. |
Xinyue Shen; Yixin Wu; Michael Backes; Yang Zhang; | arxiv-cs.CR | 2024-05-29 |
806 | Data-Efficient Approach to Humanoid Control Via Fine-Tuning A Pre-Trained GPT on Action Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we train a GPT on a large dataset of noisy expert policy rollout observations from a humanoid motion dataset as a pre-trained model and fine tune that model on a smaller dataset of noisy expert policy rollout observations and actions to autoregressively generate physically plausible motion trajectories. |
Siddharth Padmanabhan; Kazuki Miyazawa; Takato Horii; Takayuki Nagai; | arxiv-cs.RO | 2024-05-28 |
807 | Can GPT Redefine Medical Understanding? Evaluating GPT on Biomedical Machine Reading Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate GPT on four closed-book biomedical MRC benchmarks. |
Shubham Vatsal; Ayush Singh; | arxiv-cs.CL | 2024-05-28 |
808 | Notes on Applicability of GPT-4 to Document Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We perform a missing, reproducible evaluation of all publicly available GPT-4 family models concerning the Document Understanding field, where it is frequently required to comprehend the spatial arrangement of text and visual clues in addition to textual semantics. |
Łukasz Borchmann; | arxiv-cs.CL | 2024-05-28 |
809 | Delving Into Differentially Private Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such ‘reduction’ is done by first identifying the hardness unique to DP Transformer training: the attention distraction phenomenon and a lack of compatibility with existing techniques for efficient gradient clipping. To deal with these two issues, we propose the Re-Attention Mechanism and Phantom Clipping, respectively. |
YOULONG DING et. al. | arxiv-cs.LG | 2024-05-28 |
810 | I See You: Teacher Analytics with GPT-4 Vision-Powered Observational Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our approach aims to revolutionize teachers’ assessment of students’ practices by leveraging Generative Artificial Intelligence (GenAI) to offer detailed insights into classroom dynamics. |
UNGGI LEE et. al. | arxiv-cs.HC | 2024-05-28 |
811 | How Ready Are Generative Pre-trained Large Language Models for Explaining Bengali Grammatical Errors? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Grammatical error correction (GEC) tools, powered by advanced generative artificial intelligence (AI), competently correct linguistic inaccuracies in user input. However, they … |
Subhankar Maity; Aniket Deroy; Sudeshna Sarkar; | ArXiv | 2024-05-27 |
812 | PivotMesh: Generic 3D Mesh Generation Via Pivot Vertices Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a generic and scalable mesh generation framework PivotMesh, which makes an initial attempt to extend the native mesh generation to large-scale datasets. |
Haohan Weng; Yikai Wang; Tong Zhang; C. L. Philip Chen; Jun Zhu; | arxiv-cs.CV | 2024-05-27 |
813 | Multi-objective Representation for Numbers in Clinical Narratives Using CamemBERT-bio Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research aims to classify numerical values extracted from medical documents across seven distinct physiological categories, employing CamemBERT-bio. |
Boammani Aser Lompo; Thanh-Dung Le; | arxiv-cs.CL | 2024-05-27 |
814 | Are Self-Attentions Effective for Time Series Forecasting? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we shift the focus from evaluating the overall Transformer architecture to specifically examining the effectiveness of self-attention for time series forecasting. |
Dongbin Kim; Jinseong Park; Jaewook Lee; Hoki Kim; | arxiv-cs.LG | 2024-05-27 |
815 | Vision-and-Language Navigation Generative Pretrained Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our proposal, the Vision-and-Language Navigation Generative Pretrained Transformer (VLN-GPT), adopts a transformer decoder model (GPT2) to model trajectory sequence dependencies, bypassing the need for historical encoding modules. |
Wen Hanlin; | arxiv-cs.AI | 2024-05-27 |
816 | Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While previous approaches to 3D human motion generation have achieved notable success, they often rely on extensive training and are limited to specific tasks. To address these challenges, we introduce Motion-Agent, an efficient conversational framework designed for general human motion generation, editing, and understanding. |
QI WU et. al. | arxiv-cs.CV | 2024-05-27 |
817 | RLAIF-V: Aligning MLLMs Through Open-Source AI Feedback for Super GPT-4V Trustworthiness IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce RLAIF-V, a novel framework that aligns MLLMs in a fully open-source paradigm for super GPT-4V trustworthiness. |
TIANYU YU et. al. | arxiv-cs.CL | 2024-05-27 |
818 | InversionView: A General-Purpose Method for Reading Information from Neural Activations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The inner workings of neural networks can be better understood if we can fully decipher the information encoded in neural activations. In this paper, we argue that this information is embodied by the subset of inputs that give rise to similar activations. |
Xinting Huang; Madhur Panwar; Navin Goyal; Michael Hahn; | arxiv-cs.LG | 2024-05-27 |
819 | LoReTrack: Efficient and Accurate Low-Resolution Transformer Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Albeit faster, this considerably hurts tracking accuracy due to information loss in low-resolution tracking. In this paper, we aim to mitigate such information loss to boost the performance of low-resolution Transformer tracking via dual knowledge distillation from a frozen high-resolution (but not larger) Transformer tracker. |
Shaohua Dong; Yunhe Feng; Qing Yang; Yuewei Lin; Heng Fan; | arxiv-cs.CV | 2024-05-27 |
820 | Deployment of Large Language Models to Control Mobile Robots at The Edge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the possibility of intuitive human-robot interaction through the application of Natural Language Processing (NLP) and Large Language Models (LLMs) in mobile robotics. This work aims to explore the feasibility of using these technologies for edge-based deployment, where traditional cloud dependencies are eliminated. |
PASCAL SIKORSKI et. al. | arxiv-cs.RO | 2024-05-27 |
821 | Assessing LLMs Suitability for Knowledge Graph Completion Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Recent work has shown the capability of Large Language Models (LLMs) to solve tasks related to Knowledge Graphs, such as Knowledge Graph Completion, even in Zero- or Few-Shot … |
Vasile Ionut Remus Iga; Gheorghe Cosmin Silaghi; | arxiv-cs.CL | 2024-05-27 |
822 | Performance Evaluation of Reddit Comments Using Machine Learning and Natural Language Processing Methods in Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the efficacy of sentiment analysis models is hindered by the lack of expansive and fine-grained emotion datasets. To address this gap, our study leverages the GoEmotions dataset, comprising a diverse range of emotions, to evaluate sentiment analysis methods across a substantial corpus of 58,000 comments. |
Xiaoxia Zhang; Xiuyuan Qi; Zixin Teng; | arxiv-cs.CL | 2024-05-26 |
823 | Disentangling and Integrating Relational and Sensory Information in Transformer Architectures Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we distinguish between two types of information: sensory information about the properties of individual objects, and relational information about the relationships between objects. |
Awni Altabaa; John Lafferty; | arxiv-cs.LG | 2024-05-26 |
824 | M$^3$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents M$^3$GPT, an advanced Multimodal, Multitask framework for Motion comprehension and generation. |
MINGSHUANG LUO et. al. | arxiv-cs.CV | 2024-05-25 |
825 | Accelerating Transformers with Spectrum-Preserving Token Merging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Prior works have proposed algorithms based on Bipartite Soft Matching (BSM), which divides tokens into distinct sets and merges the top k similar tokens. |
HOAI-CHAU TRAN et. al. | arxiv-cs.LG | 2024-05-25 |
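For context on entry 825, the Bipartite Soft Matching baseline it improves on fits in a short sketch: split tokens alternately into two sets, link each token in one set to its most similar partner in the other, and merge the top-r pairs by averaging. This simplified reading (sequential averaging, cosine similarity) is an assumption, not the paper's spectrum-preserving method.

```python
# Hedged sketch of Bipartite Soft Matching (BSM) token merging.
import numpy as np

def bsm_merge(tokens: np.ndarray, r: int) -> np.ndarray:
    """tokens: (n, d) array; returns roughly (n - r, d) merged tokens."""
    a, b = tokens[0::2], tokens[1::2]                   # alternate split
    unit = lambda x: x / np.linalg.norm(x, axis=1, keepdims=True)
    sims = unit(a) @ unit(b).T                          # cosine similarities
    partner = sims.argmax(axis=1)                       # best match in B
    best = sims[np.arange(len(a)), partner]
    merged_idx = np.argsort(-best)[:r]                  # top-r A tokens
    b = b.copy()
    for i in merged_idx:                                # average into partner
        b[partner[i]] = (b[partner[i]] + a[i]) / 2.0
    keep = np.setdiff1d(np.arange(len(a)), merged_idx)  # unmerged A tokens
    return np.concatenate([a[keep], b], axis=0)
```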
826 | Activator: GLU Activation Function As The Core Component of A Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Experimental assessments show that the proposed modifications and reductions offer performance competitive with baseline architectures, supporting this work’s aim of establishing a more efficient yet capable alternative to the traditional attention mechanism as the core component of transformer architectures. |
Abdullah Nazhat Abdullah; Tarkan Aydin; | arxiv-cs.CV | 2024-05-24 |
827 | Incremental Comprehension of Garden-Path Sentences By Large Language Models: Semantic Interpretation, Syntactic Re-Analysis, and Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we investigate the processing of garden-path sentences and the fate of lingering misinterpretations using four large language models (LLMs): GPT-2, LLaMA-2, Flan-T5, and RoBERTa. |
ANDREW LI et. al. | arxiv-cs.CL | 2024-05-24 |
828 | Transformer-XL for Long Sequence Tasks in Robotic Learning from Demonstration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents an innovative application of Transformer-XL for long sequence tasks in robotic learning from demonstrations (LfD). |
Gao Tianci; | arxiv-cs.RO | 2024-05-24 |
829 | The Buffer Mechanism for Multi-Step Information Reasoning in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we constructed a symbolic dataset to investigate the mechanisms by which Transformer models employ a vertical thinking strategy, based on their inherent structure, and a horizontal thinking strategy, based on Chain of Thought, to achieve multi-step reasoning. |
ZHIWEI WANG et. al. | arxiv-cs.AI | 2024-05-24 |
830 | GPTZoo: A Large-scale Dataset of GPTs for The Research Community Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To support academic research on GPTs, we introduce GPTZoo, a large-scale dataset comprising 730,420 GPT instances. |
Xinyi Hou; Yanjie Zhao; Shenao Wang; Haoyu Wang; | arxiv-cs.SE | 2024-05-24 |
831 | GPT Is Not An Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores whether an LLM (specifically, GPT-3.5-Turbo) can assist with the task of developing a bias benchmark dataset from responses to an open-ended community survey. |
Virginia K. Felkner; Jennifer A. Thompson; Jonathan May; | arxiv-cs.CL | 2024-05-24 |
832 | A Comparative Analysis of Distributed Training Strategies for GPT-2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The rapid advancement in Large Language Models has been met with significant challenges in their training processes, primarily due to their considerable computational and memory demands. This research examines parallelization techniques developed to address these challenges, enabling the efficient and scalable training of Large Language Models. |
Ishan Patwardhan; Shubham Gandhi; Om Khare; Amit Joshi; Suraj Sawant; | arxiv-cs.DC | 2024-05-24 |
833 | SMART: Scalable Multi-agent Real-time Motion Generation Via Next-token Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Data-driven autonomous driving motion generation tasks are frequently impacted by the limitations of dataset size and the domain gap between datasets, which precludes their extensive application in real-world scenarios. To address this issue, we introduce SMART, a novel autonomous driving motion generation paradigm that models vectorized map and agent trajectory data into discrete sequence tokens. |
Wei Wu; Xiaoxin Feng; Ziyan Gao; Yuheng Kan; | arxiv-cs.RO | 2024-05-24 |
834 | Transformer-based Federated Learning for Multi-Label Remote Sensing Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the capability of state-of-the-art transformer architectures (which are MLP-Mixer, ConvMixer, PoolFormer) to address the challenges related to non-IID training data across various clients in the context of FL for multi-label classification (MLC) problems in remote sensing (RS). |
Barış Büyüktaş; Kenneth Weitzel; Sebastian Völkers; Felix Zailskas; Begüm Demir; | arxiv-cs.CV | 2024-05-24 |
835 | Comet: A Communication-efficient and Performant Approximation for Private Transformer Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel plug-in method Comet to effectively reduce the communication cost without compromising the inference performance. |
Xiangrui Xu; Qiao Zhang; Rui Ning; Chunsheng Xin; Hongyi Wu; | arxiv-cs.LG | 2024-05-24 |
836 | SG-Adapter: Enhancing Text-to-Image Generation with Scene Graph Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce the Scene Graph Adapter(SG-Adapter), leveraging the structured representation of scene graphs to rectify inaccuracies in the original text embeddings. |
GUIBAO SHEN et. al. | arxiv-cs.CV | 2024-05-24 |
837 | PoinTramba: A Hybrid Transformer-Mamba Framework for Point Cloud Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce PoinTramba, a pioneering hybrid framework that synergizes the analytical power of the Transformer with the remarkable computational efficiency of Mamba for enhanced point cloud analysis. |
ZICHENG WANG et. al. | arxiv-cs.CV | 2024-05-24 |
838 | An Evaluation of Estimative Uncertainty in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares estimative uncertainty in commonly used large language models (LLMs) like GPT-4 and ERNIE-4 to that of humans, and to each other. |
Zhisheng Tang; Ke Shen; Mayank Kejriwal; | arxiv-cs.CL | 2024-05-23 |
839 | Grokked Transformers Are Implicit Reasoners: A Mechanistic Journey to The Edge of Generalization IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We study whether transformers can learn to implicitly reason over parametric knowledge, a skill that even the most capable language models struggle with. |
Boshi Wang; Xiang Yue; Yu Su; Huan Sun; | arxiv-cs.CL | 2024-05-23 |
840 | CulturePark: Boosting Cross-cultural Understanding in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by cognitive theories on social communication, this paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection. |
CHENG LI et. al. | arxiv-cs.AI | 2024-05-23 |
841 | Understanding The Training and Generalization of Pretrained Transformer for Sequential Decision Making Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider the supervised pre-trained transformer for a class of sequential decision-making problems. |
HANZHAO WANG et. al. | arxiv-cs.LG | 2024-05-23 |
842 | JiuZhang3.0: Efficiently Improving Mathematical Reasoning By Training Small Data Synthesis Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To reduce the cost, based on open-source available texts, we propose an efficient way that trains a small LLM for math problem synthesis, to efficiently generate sufficient high-quality pre-training data. |
KUN ZHOU et. al. | arxiv-cs.CL | 2024-05-23 |
843 | CEEBERT: Cross-Domain Inference in Early Exit BERT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose an online learning algorithm named Cross-Domain Inference in Early Exit BERT (CeeBERT) that dynamically determines early exits of samples based on the level of confidence at each exit point. |
Divya Jyoti Bajpai; Manjesh Kumar Hanawal; | arxiv-cs.CL | 2024-05-23 |
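The confidence-based early exiting that CeeBERT (entry 843) tunes online can be sketched as follows for a single example; the per-exit thresholds are exactly what the paper learns, so the fixed values here are placeholders.

```python
# Hedged sketch of confidence-thresholded early exit (single example).
import torch.nn.functional as F

def early_exit_forward(layers, exit_heads, x, thresholds):
    """layers/exit_heads/thresholds: parallel lists; x: (seq_len, hidden)."""
    pred = conf = None
    for layer, head, tau in zip(layers, exit_heads, thresholds):
        x = layer(x)                                    # run one encoder layer
        probs = F.softmax(head(x).mean(dim=0), dim=-1)  # pool tokens, classify
        conf, pred = probs.max(dim=-1)
        if conf.item() >= tau:                          # confident enough: stop
            break
    return pred.item(), conf.item()
```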
844 | AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We introduce AutoCoder, the first Large Language Model to surpass GPT-4 Turbo (April 2024) and GPT-4o in pass@1 on the Human Eval benchmark test ($\mathbf{90.9\%}$ vs. … |
Bin Lei; Yuchen Li; Qiuwu Chen; | ArXiv | 2024-05-23 |
845 | ViHateT5: Enhancing Hate Speech Detection in Vietnamese With A Unified Text-to-Text Transformer Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this research, we introduce ViHateT5, a T5-based model pre-trained on our proposed large-scale domain-specific dataset named VOZ-HSD. |
Luan Thanh Nguyen; | arxiv-cs.CL | 2024-05-22 |
846 | PuTR: A Pure Transformer for Decoupled and Online Multi-Object Tracking Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we demonstrate that the trajectory graph is a directed acyclic graph, which can be represented by an object sequence arranged by frame and a binary adjacency matrix. |
Chongwei Liu; Haojie Li; Zhihui Wang; Rui Xu; | arxiv-cs.CV | 2024-05-22 |
847 | Generative AI and Large Language Models for Cyber Security: All Insights You Need Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper provides a comprehensive review of the future of cybersecurity through Generative AI and Large Language Models (LLMs). |
MOHAMED AMINE FERRAG et. al. | arxiv-cs.CR | 2024-05-21 |
848 | How Reliable AI Chatbots Are for Disease Prediction from Patient Complaints? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Artificial Intelligence (AI) chatbots leveraging Large Language Models (LLMs) are gaining traction in healthcare for their potential to automate patient interactions and aid clinical decision-making. |
Ayesha Siddika Nipu; K M Sajjadul Islam; Praveen Madiraju; | arxiv-cs.AI | 2024-05-21 |
849 | Quantifying Emergence in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a quantifiable solution for estimating emergence. |
Hang Chen; Xinyu Yang; Jiaying Zhu; Wenya Wang; | arxiv-cs.CL | 2024-05-21 |
850 | Transformer in Touch: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This review aims to comprehensively outline the application and development of Transformers in tactile technology. |
Jing Gao; Ning Cheng; Bin Fang; Wenjuan Han; | arxiv-cs.LG | 2024-05-21 |
851 | Automated Hardware Logic Obfuscation Framework Using GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Obfus-chat, a novel framework leveraging Generative Pre-trained Transformer (GPT) models to automate the obfuscation process. |
BANAFSHEH SABER LATIBARI et. al. | arxiv-cs.CR | 2024-05-20 |
852 | GPT-4 Jailbreaks Itself with Near-Perfect Success Using Self-Explanation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Iterative Refinement Induced Self-Jailbreak (IRIS), a novel approach that leverages the reflective capabilities of LLMs for jailbreaking with only black-box access. |
Govind Ramesh; Yao Dou; Wei Xu; | arxiv-cs.CR | 2024-05-20 |
853 | From Chatbots to Phishbots?: Phishing Scam Generation in Commercial Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The advanced capabilities of Large Language Models (LLMs) have made them invaluable across various applications, from conversational agents and content creation to data analysis, … |
PRIYANKA NANAYAKKARA et. al. | 2024 IEEE Symposium on Security and Privacy (SP) | 2024-05-19 |
854 | Zero-Shot Stance Detection Using Contextual Data Generation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this approach, we aim to fine-tune an existing model at test time. |
Ghazaleh Mahmoudi; Babak Behkamkia; Sauleh Eetemadi; | arxiv-cs.CL | 2024-05-19 |
855 | Enhancing User Experience in Large Language Models Through Human-centered Design: Integrating Theoretical Insights with An Experimental Study to Meet Diverse Software Learning Needs with A Single Document Knowledge Base Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The experimental results demonstrate the effect of different elements’ forms and organizational methods in the document, as well as GPT’s relevant configurations, on the interaction effectiveness between GPT and software learners. |
Yuchen Wang; Yin-Shan Lin; Ruixin Huang; Jinyin Wang; Sensen Liu; | arxiv-cs.HC | 2024-05-19 |
856 | DaVinci at SemEval-2024 Task 9: Few-shot Prompting GPT-3.5 for Unconventional Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we tackle both types of questions using few-shot prompting on GPT-3.5 and gain insights regarding the difference in the nature of the two types. |
Suyash Vardhan Mathur; Akshett Rai Jindal; Manish Shrivastava; | arxiv-cs.CL | 2024-05-19 |
857 | Few-Shot API Attack Detection: Overcoming Data Scarcity with GAN-Inspired Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel few-shot detection approach motivated by Natural Language Processing (NLP) and advanced Generative Adversarial Network (GAN)-inspired techniques. |
Udi Aharon; Revital Marbel; Ran Dubin; Amit Dvir; Chen Hajaj; | arxiv-cs.CR | 2024-05-18 |
858 | GPTs Window Shopping: An Analysis of The Landscape of Custom ChatGPT Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Customization comes in the form of prompt-tuning, analysis of reference resources, browsing, and external API interactions, alongside a promise of revenue sharing for created custom GPTs. In this work, we peer into the window of the GPT Store and measure its impact. |
Benjamin Zi Hao Zhao; Muhammad Ikram; Mohamed Ali Kaafar; | arxiv-cs.SI | 2024-05-17 |
859 | Benchmarking Large Language Models on CFLUE — A Chinese Financial Language Understanding Evaluation Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose CFLUE, the Chinese Financial Language Understanding Evaluation benchmark, designed to assess the capability of LLMs across various dimensions. |
Jie Zhu; Junhui Li; Yalong Wen; Lifan Guo; | arxiv-cs.CL | 2024-05-17 |
860 | Benchmarking Large Language Models on CFLUE – A Chinese Financial Language Understanding Evaluation Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In light of recent breakthroughs in large language models (LLMs) that have revolutionized natural language processing (NLP), there is an urgent need for new benchmarks to keep … |
Jie Zhu; Junhui Li; Yalong Wen; Lifan Guo; | Annual Meeting of the Association for Computational … | 2024-05-17 |
861 | HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Identifying optimal model configurations under specific hardware constraints is becoming essential but remains challenging due to the computational load of exhaustive training and evaluation on multiple devices. To address this, we introduce HW-GPT-Bench, a hardware-aware benchmark that utilizes surrogate predictions to approximate various hardware metrics across 13 devices for architectures in the GPT-2 family, with architectures containing up to 1.55B parameters. |
RHEA SANJAY SUKTHANKER et al. | arxiv-cs.LG | 2024-05-16 |
862 | Leveraging Large Language Models for Automated Web-Form-Test Generation: An Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To the best of our knowledge, no comparative study examining different LLMs has yet been reported for web-form-test generation. |
TAO LI et al. | arxiv-cs.SE | 2024-05-16 |
863 | GPT Store Mining and Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our findings aim to enhance understanding of the GPT ecosystem, providing valuable insights for future research, development, and policy-making in generative AI. |
Dongxun Su; Yanjie Zhao; Xinyi Hou; Shenao Wang; Haoyu Wang; | arxiv-cs.LG | 2024-05-16 |
864 | Comparing The Efficacy of GPT-4 and Chat-GPT in Mental Health Care: A Blind Assessment of Large Language Models for Psychological Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: This study aims to compare the performance of two large language models, GPT-4 and Chat-GPT, in responding to a set of 18 psychological prompts, to assess their potential applicability in mental health care settings. |
Birger Moell; | arxiv-cs.CL | 2024-05-15 |
865 | ALPINE: Unveiling The Planning Capability of Autoregressive Learning in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present the findings of our Project ALPINE, which stands for “Autoregressive Learning for Planning In NEtworks”. |
SIWEI WANG et al. | arxiv-cs.LG | 2024-05-15 |
866 | Optimization Techniques for Sentiment Analysis Based on LLM (GPT-3) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to explore sentiment analysis optimization techniques based on large pre-trained language models such as GPT-3 to improve model performance and effect and further promote the development of natural language processing (NLP). |
Tong Zhan; Chenxi Shi; Yadong Shi; Huixiang Li; Yiyu Lin; | arxiv-cs.CL | 2024-05-15 |
867 | GPT-3.5 for Grammatical Error Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the application of GPT-3.5 for Grammatical Error Correction (GEC) in multiple languages in several settings: zero-shot GEC, fine-tuning for GEC, and using GPT-3.5 to re-rank correction hypotheses generated by other GEC models. |
Anisia Katinskaia; Roman Yangarber; | arxiv-cs.CL | 2024-05-14 |
868 | Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a theoretical framework that sheds light on the memorization process and performance dynamics of transformer-based language models. |
Xueyan Niu; Bo Bai; Lei Deng; Wei Han; | arxiv-cs.LG | 2024-05-14 |
869 | Towards Robust Audio Deepfake Detection: An Evolving Benchmark for Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Continual learning, which acts as an effective tool for detecting newly emerged deepfake audio while maintaining performance on older types, lacks a well-constructed and user-friendly evaluation framework. To address this gap, we introduce EVDA, a benchmark for evaluating continual learning methods in deepfake audio detection. |
Xiaohui Zhang; Jiangyan Yi; Jianhua Tao; | arxiv-cs.SD | 2024-05-14 |
870 | Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work describes a concurrent programming framework for quantitatively analyzing the efficiency challenges in serving multiple long-context requests under a limited GPU high-bandwidth memory (HBM) regime. |
Yao Fu; | arxiv-cs.LG | 2024-05-14 |
871 | Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their capabilities in turning visual figures into executable code have not been evaluated thoroughly. To address this, we introduce Plot2Code, a comprehensive visual coding benchmark designed for a fair and in-depth assessment of MLLMs. |
CHENGYUE WU et al. | arxiv-cs.CL | 2024-05-13 |
872 | PRECYSE: Predicting Cybersickness Using Transformer for Multimodal Time-Series Sensor Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Cybersickness, a factor that hinders user immersion in VR, has been the subject of ongoing attempts at AI-based prediction. Previous studies have used CNN and LSTM for prediction … |
Dayoung Jeong; Kyungsik Han; | Proceedings of the ACM on Interactive, Mobile, Wearable and … | 2024-05-13 |
873 | COLA: Cross-city Mobility Transformer for Human Trajectory Simulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we are motivated to explore the intriguing problem of mobility transfer across cities, grasping the universal patterns of human trajectories to augment the powerful Transformer with external mobility data. |
Yu Wang; Tongya Zheng; Yuxuan Liang; Shunyu Liu; Mingli Song; | www | 2024-05-13 |
874 | Decision Mamba Architectures Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce two novel methods, Decision Mamba (DM) and Hierarchical Decision Mamba (HDM), aimed at enhancing the performance of the Transformer models. |
André Correia; Luís A. Alexandre; | arxiv-cs.LG | 2024-05-13 |
875 | Coding Historical Causes of Death Data with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates the feasibility of using pre-trained generative Large Language Models (LLMs) to automate the assignment of ICD-10 codes to historical causes of death. |
BJØRN PEDERSEN et al. | arxiv-cs.LG | 2024-05-13 |
876 | Can GNN Be Good Adapter for LLMs? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, they ignore the zero-shot inference capabilities of LLMs. Therefore, we propose GraphAdapter, which uses a graph neural network (GNN) as an efficient adapter in collaboration with LLMs to tackle TAGs. |
XUANWEN HUANG et al. | www | 2024-05-13 |
877 | Open-vocabulary Auditory Neural Decoding Using FMRI-prompted LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel method, the Brain Prompt GPT (BP-GPT). |
Xiaoyu Chen; Changde Du; Che Liu; Yizhe Wang; Huiguang He; | arxiv-cs.HC | 2024-05-13 |
878 | L(u)PIN: LLM-based Political Ideology Nowcasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a method to analyze ideological positions of individual parliamentary representatives by leveraging the latent knowledge of LLMs. |
Ken Kato; Annabelle Purnomo; Christopher Cochrane; Raeid Saqur; | arxiv-cs.CL | 2024-05-12 |
879 | Limited Ability of LLMs to Simulate Human Psychological Behaviours: A Psychometric Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we prompt OpenAI’s flagship models, GPT-3.5 and GPT-4, to assume different personas and respond to a range of standardized measures of personality constructs. |
Nikolay B Petrov; Gregory Serapio-García; Jason Rentfrow; | arxiv-cs.CL | 2024-05-12 |
880 | Can Language Models Explain Their Own Classification Behavior? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates whether LLMs can give faithful high-level explanations of their own internal processes. To explore this, we introduce a dataset, ArticulateRules, of few-shot text-based classification tasks generated by simple rules. |
Dane Sherburn; Bilal Chughtai; Owain Evans; | arxiv-cs.LG | 2024-05-12 |
881 | Integrating Expertise in LLMs: Crafting A Customized Nutrition Assistant with Refined Template Instructions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) have the potential to contribute to the fields of nutrition and dietetics in generating food product explanations that facilitate informed food … |
Annalisa Szymanski; Brianna L Wimer; Oghenemaro Anuyah; H. Eicher-Miller; Ronald A Metoyer; | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
882 | ClassMeta: Designing Interactive Virtual Classmate to Promote VR Classroom Participation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Peer influence plays a crucial role in promoting classroom participation, where behaviors from active students can contribute to a collective classroom learning experience. … |
ZIYI LIU et al. | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
883 | Retrieval Enhanced Zero-Shot Video Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to take advantage of existing pre-trained large-scale vision and language models to directly generate captions with test time adaptation. |
YUNCHUAN MA et al. | arxiv-cs.CV | 2024-05-11 |
884 | TacoERE: Cluster-aware Compression for Event Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing work mainly focuses on directly modeling the entire document, which cannot effectively handle long-range dependencies and information redundancy. To address these issues, we propose a cluster-aware compression method for improving event relation extraction (TacoERE), which explores a compression-then-extraction paradigm. |
YONG GUAN et al. | arxiv-cs.CL | 2024-05-10 |
885 | Residual-based Attention Physics-informed Neural Networks for Spatio-Temporal Ageing Assessment of Transformers Operated in Renewable Power Plants Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article introduces a spatio-temporal model for transformer winding temperature and ageing estimation, which leverages physics-based partial differential equations (PDEs) with data-driven Neural Networks (NN) in a Physics Informed Neural Networks (PINNs) configuration to improve prediction accuracy and acquire spatio-temporal resolution. |
IBAI RAMIREZ et al. | arxiv-cs.LG | 2024-05-10 |
886 | Multimodal LLMs Struggle with Basic Visual Network Analysis: A VNA Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that while GPT-4 consistently outperforms LLaVa, both models struggle with every visual network analysis task we propose. |
Evan M. Williams; Kathleen M. Carley; | arxiv-cs.CV | 2024-05-10 |
887 | A Lightweight Sparse Focus Transformer for Remote Sensing Image Change Captioning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing transformer-based RSICC methods face challenges, e.g., high parameter counts and high computational complexity caused by the self-attention operation in the transformer encoder component. To alleviate these issues, this paper proposes a Sparse Focus Transformer (SFT) for the RSICC task. |
Dongwei Sun; Yajie Bao; Junmin Liu; Xiangyong Cao; | arxiv-cs.CV | 2024-05-10 |
888 | ChatGPTest: Opportunities and Cautionary Tales of Utilizing AI for Questionnaire Pretesting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article explores the use of GPT models as a tool for pretesting survey questionnaires, particularly in the early stages of survey design. |
Francisco Olivos; Minhui Liu; | arxiv-cs.CY | 2024-05-10 |
889 | Reddit-Impacts: A Named Entity Recognition Dataset for Analyzing Clinical and Social Effects of Substance Use Derived from Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Reddit-Impacts, a challenging Named Entity Recognition (NER) dataset curated from subreddits dedicated to discussions on prescription and illicit opioids, as well as medications for opioid use disorder. |
YAO GE et al. | arxiv-cs.CL | 2024-05-09 |
890 | People Cannot Distinguish GPT-4 from A Human in A Turing Test Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We evaluated 3 systems (ELIZA, GPT-3.5 and GPT-4) in a randomized, controlled, and preregistered Turing test. Human participants had a 5-minute conversation with either a human or … |
Cameron R. Jones; Benjamin K. Bergen; | ArXiv | 2024-05-09 |
891 | Spectral Superresolution Using Transformer with Convolutional Spectral Self-Attention Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Hyperspectral images (HSI) find extensive application across numerous domains of study. Spectral superresolution (SSR) refers to reconstructing HSIs from readily available RGB … |
Xiaomei Liao; Lirong He; Jiayou Mao; Meng Xu; | Remote. Sens. | 2024-05-09 |
892 | Ditto: Quantization-aware Secure Inference of Transformers Upon MPC Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the integration of quantization, widely used in plaintext inference, into the MPC domain remains unclear. To bridge this gap, we propose the framework named Ditto to enable more efficient quantization-aware secure Transformer inference. |
HAOQI WU et al. | arxiv-cs.CR | 2024-05-08 |
893 | Optimizing Software Vulnerability Detection Using RoBERTa and Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View |
Cho Xuan Do; Nguyen Trong Luu; Phuong Thi Lan Nguyen; | Autom. Softw. Eng. | 2024-05-08 |
894 | Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we leverage a decision transformer architecture for topic recommendation in counseling conversations between patients and mental health professionals. |
Aylin Gunal; Baihan Lin; Djallel Bouneffouf; | arxiv-cs.CL | 2024-05-08 |
895 | Open Source Language Models Can Provide Feedback: Evaluating LLMs’ Ability to Help Students Using GPT-4-As-A-Judge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by recent work that has utilised very powerful LLMs, such as GPT-4, to evaluate the outputs produced by less powerful models, we conduct an automated analysis of the quality of the feedback produced by several open source models using a dataset from an introductory programming course. |
CHARLES KOUTCHEME et al. | arxiv-cs.CL | 2024-05-08 |
896 | Evaluating Text Summaries Generated By Large Language Models Using OpenAI’s GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research examines the effectiveness of OpenAI’s GPT models as independent evaluators of text summaries generated by six transformer-based models from Hugging Face: DistilBART, BERT, ProphetNet, T5, BART, and PEGASUS. |
Hassan Shakil; Atqiya Munawara Mahi; Phuoc Nguyen; Zeydy Ortiz; Mamoun T. Mardini; | arxiv-cs.CL | 2024-05-07 |
897 | Utilizing GPT to Enhance Text Summarization: A Strategy to Minimize Hallucinations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we use the DistilBERT model to generate extractive summaries and the T5 model to generate abstractive summaries. |
Hassan Shakil; Zeydy Ortiz; Grant C. Forbes; | arxiv-cs.CL | 2024-05-07 |
898 | A Transformer with Stack Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite underpinning remarkably capable large language models, transformers are unable to model many context-free language tasks. In an attempt to address this limitation in the modeling power of transformer-based language models, we propose augmenting them with a differentiable, stack-based attention mechanism. |
Jiaoda Li; Jennifer C. White; Mrinmaya Sachan; Ryan Cotterell; | arxiv-cs.CL | 2024-05-07 |
899 | The Silicon Ceiling: Auditing GPT’s Race and Gender Biases in Hiring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) are increasingly being introduced in workplace settings, with the goals of improving efficiency and fairness. |
Lena Armstrong; Abbey Liu; Stephen MacNeil; Danaë Metaxa; | arxiv-cs.CY | 2024-05-07 |
900 | GPT-Enabled Cybersecurity Training: A Tailored Approach for Effective Awareness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the limitations of traditional Cybersecurity Awareness and Training (CSAT) programs and proposes an innovative solution using Generative Pre-Trained Transformers (GPT) to address these shortcomings. |
Nabil Al-Dhamari; Nathan Clarke; | arxiv-cs.CR | 2024-05-07 |
901 | How Does GPT-2 Predict Acronyms? Extracting and Understanding A Circuit Via Mechanistic Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on understanding how GPT-2 Small performs the task of predicting three-letter acronyms. |
Jorge García-Carrasco; Alejandro Maté; Juan Trujillo; | arxiv-cs.LG | 2024-05-07 |
902 | Structured Click Control in Transformer-based Interactive Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To improve the robustness of the response, we propose a structured click intent model based on graph neural networks, which adaptively obtains graph nodes via the global similarity of user-clicked Transformer tokens. |
Long Xu; Yongquan Chen; Rui Huang; Feng Wu; Shiwu Lai; | arxiv-cs.CV | 2024-05-07 |
903 | Addressing Data Scarcity in The Medical Domain: A GPT-Based Approach for Synthetic Data Generation and Feature Extraction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This research confronts the persistent challenge of data scarcity in medical machine learning by introducing a pioneering methodology that harnesses the capabilities of Generative … |
F. Sufi; | Inf. | 2024-05-06 |
904 | Towards A Human-in-the-Loop LLM Approach to Collaborative Discourse Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, LLMs have not yet been used to characterize synergistic learning in students’ collaborative discourse. In this exploratory work, we take a first step towards adopting a human-in-the-loop prompt engineering approach with GPT-4-Turbo to summarize and categorize students’ synergistic learning during collaborative discourse. |
Clayton Cohn; Caitlin Snyder; Justin Montenegro; Gautam Biswas; | arxiv-cs.CL | 2024-05-06 |
905 | Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite their widespread occurrence and potential impacts, our understanding of influence campaigns is limited by manual analysis of messages and subjective interpretation of their observable behavior. In this paper, we explore whether these limitations can be mitigated with large language models (LLMs), using GPT-3.5 as a case-study for coordinated campaign annotation. |
Keith Burghardt; Kai Chen; Kristina Lerman; | arxiv-cs.CL | 2024-05-06 |
906 | Detecting Anti-Semitic Hate Speech Using Transformer-based Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we developed a new data labeling technique and established a proof of concept targeting anti-Semitic hate speech, utilizing a variety of transformer models such as BERT (arXiv:1810.04805), DistilBERT (arXiv:1910.01108), RoBERTa (arXiv:1907.11692), and LLaMA-2 (arXiv:2307.09288), complemented by the LoRA fine-tuning approach (arXiv:2106.09685). |
Dengyi Liu; Minghao Wang; Andrew G. Catlin; | arxiv-cs.CL | 2024-05-06 |
907 | Anchored Answers: Unravelling Positional Bias in GPT-2’s Multiple-Choice Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This anchored bias challenges the integrity of GPT-2’s decision-making process, as it skews performance based on the position rather than the content of the choices in MCQs. In this study, we utilise the mechanistic interpretability approach to identify the internal modules within GPT-2 models responsible for this bias. |
Ruizhe Li; Yanjun Gao; | arxiv-cs.CL | 2024-05-06 |
908 | Traffic Performance GPT (TP-GPT): Real-Time Data Informed Intelligent ChatBot for Transportation Surveillance and Management Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, real-time traffic data access is typically limited due to privacy concerns. To bridge this gap, the integration of Large Language Models (LLMs) into the domain of traffic management presents a transformative approach to addressing the complexities and challenges inherent in modern transportation systems. |
Bingzhang Wang; Muhammad Monjurul Karim; Chenxi Liu; Yinhai Wang; | arxiv-cs.MA | 2024-05-05 |
909 | Can Large Language Models Make The Grade? An Empirical Study Evaluating LLMs Ability to Mark Short Answer Questions in K-12 Education Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper reports on a series of experiments with a novel dataset evaluating how well Large Language Models (LLMs) can mark (i.e., grade) open-text responses to short-answer questions. Specifically, we explore how well different combinations of GPT version and prompt engineering strategies performed at marking real student answers to short-answer questions across different domain areas (Science and History) and grade levels (spanning ages 5-16), using a new, never-used-before dataset from Carousel, a quizzing platform. |
Owen Henkel; Adam Boxer; Libby Hills; Bill Roberts; | arxiv-cs.CL | 2024-05-05 |
910 | Unraveling The Dominance of Large Language Models Over Transformer Models for Bangla Natural Language Inference: A Comprehensive Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study addresses the underexplored area of evaluating LLMs in low-resourced languages such as Bengali. |
Fatema Tuj Johora Faria; Mukaffi Bin Moin; Asif Iftekher Fahim; Pronay Debnath; Faisal Muhammad Shah; | arxiv-cs.CL | 2024-05-05 |
911 | A Combination of BERT and Transformer for Vietnamese Spelling Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, to our knowledge, there is no implementation in Vietnamese yet. Therefore, in this study, a combination of the Transformer architecture (the state of the art for encoder-decoder models) and BERT was proposed to deal with Vietnamese spelling correction. |
Hieu Ngo Trung; Duong Tran Ham; Tin Huynh; Kiem Hoang; | arxiv-cs.CL | 2024-05-04 |
912 | U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on self-attention with downsampled tokens, we propose a series of U-shaped DiTs (U-DiTs) in the paper and conduct extensive experiments to demonstrate the extraordinary performance of U-DiT models. |
YUCHUAN TIAN et al. | arxiv-cs.CV | 2024-05-04 |
913 | Exploring Combinatorial Problem Solving with Large Language Models: A Case Study on The Travelling Salesman Problem Using GPT-3.5 Turbo Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we investigate the potential of LLMs to solve the Travelling Salesman Problem (TSP). |
Mahmoud Masoud; Ahmed Abdelhay; Mohammed Elhenawy; | arxiv-cs.CL | 2024-05-03 |
914 | Analyzing Narrative Processing in Large Language Models (LLMs): Using GPT4 to Test BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the present study, we have taken the first steps toward using LLMs as a model to understand fundamental mechanisms of language processing in neural networks, in order to make predictions and generate hypotheses on how the human brain does language processing. |
PATRICK KRAUSS et al. | arxiv-cs.CL | 2024-05-03 |
915 | Structural Pruning of Pre-trained Language Models Via Neural Architecture Search Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Unlike traditional pruning methods with fixed thresholds, we propose to adopt a multi-objective approach that identifies the Pareto optimal set of sub-networks, allowing for a more flexible and automated compression process. |
Aaron Klein; Jacek Golebiowski; Xingchen Ma; Valerio Perrone; Cedric Archambeau; | arxiv-cs.LG | 2024-05-03 |
916 | REASONS: A Benchmark for REtrieval and Automated CitationS Of ScieNtific Sentences Using Public and Proprietary LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we investigate whether large language models (LLMs) are capable of generating references based on two forms of sentence queries: (a) Direct Queries, LLMs are asked to provide author names of the given research article, and (b) Indirect Queries, LLMs are asked to provide the title of a mentioned article when given a sentence from a different article. |
DEEPA TILWANI et al. | arxiv-cs.CL | 2024-05-03 |
917 | GroupedMixer: An Entropy Model with Group-wise Token-Mixers for Learned Image Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel transformer-based entropy model called GroupedMixer, which enjoys both faster coding speed and better compression performance than previous transformer-based methods. |
DAXIN LI et al. | arxiv-cs.CV | 2024-05-02 |
918 | The Effectiveness of LLMs As Annotators: A Comparative Overview and Empirical Analysis of Direct Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper provides a comparative overview of twelve studies investigating the potential of LLMs in labelling data. |
Maja Pavlovic; Massimo Poesio; | arxiv-cs.CL | 2024-05-02 |
919 | UQA: Corpus for Urdu Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces UQA, a novel dataset for question answering and text comprehension in Urdu, a low-resource language with over 70 million native speakers. |
Samee Arif; Sualeha Farid; Awais Athar; Agha Ali Raza; | arxiv-cs.CL | 2024-05-02 |
920 | Investigating Wit, Creativity, and Detectability of Large Language Models in Domain-Specific Writing Style Adaptation of Reddit’s Showerthoughts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recent Large Language Models (LLMs) have shown the ability to generate content that is difficult or impossible to distinguish from human writing. |
TOLGA BUZ et al. | arxiv-cs.CL | 2024-05-02 |
921 | Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, they do not possess the ability to evaluate based on custom evaluation criteria, focusing instead on general attributes like helpfulness and harmlessness. To address these issues, we introduce Prometheus 2, a more powerful evaluator LM than its predecessor that closely mirrors human and GPT-4 judgements. |
SEUNGONE KIM et al. | arxiv-cs.CL | 2024-05-02 |
922 | Investigating Wit, Creativity, and Detectability of Large Language Models in Domain-Specific Writing Style Adaptation of Reddit’s Showerthoughts Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent Large Language Models (LLMs) have shown the ability to generate content that is difficult or impossible to distinguish from human writing. We investigate the ability of … |
TOLGA BUZ et al. | STARSEM | 2024-05-02 |
923 | Memory-Augmented Autoencoder Based Continuous Authentication on Smartphones With Conditional Transformer GANs Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Over the last years, sensor-based continuous authentication on mobile devices has achieved great success on personal information protection. These proposed mechanisms, however, … |
YANTAO LI et al. | IEEE Transactions on Mobile Computing | 2024-05-01 |
924 | Vision Transformer: To Discover The Four Secrets of Image Patches Related Papers Related Patents Related Grants Related Venues Related Experts View |
TAO ZHOU et al. | Inf. Fusion | 2024-05-01 |
925 | A Named Entity Recognition and Topic Modeling-based Solution for Locating and Better Assessment of Natural Disasters in Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fully explore the potential of social media content in disaster informatics, access to relevant content and the correct geo-location information is critical. In this paper, we propose a three-step solution for tackling these challenges. |
Ayaz Mehmood; Muhammad Tayyab Zamir; Muhammad Asif Ayub; Nasir Ahmad; Kashif Ahmad; | arxiv-cs.CL | 2024-05-01 |
926 | Chat-GPT; Validating Technology Acceptance Model (TAM) in Education Sector Via Ubiquitous Learning Mechanism IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
N. SAIF et al. | Comput. Hum. Behav. | 2024-05-01 |
927 | Semantic Perceptive Infrared and Visible Image Fusion Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
XIN YANG et al. | Pattern Recognit. | 2024-05-01 |
928 | FedViT: Federated Continual Learning of Vision Transformer at Edge Related Papers Related Patents Related Grants Related Venues Related Experts View |
XIAOJIANG ZUO et al. | Future Gener. Comput. Syst. | 2024-05-01 |
929 | Collaborative Compensative Transformer Network for Salient Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jun Chen; Heye Zhang; Mingming Gong; Zhifan Gao; | Pattern Recognit. | 2024-05-01 |
930 | How Can I Get It Right? Using GPT to Rephrase Incorrect Trainee Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the recent advancements of large language models (LLMs), our study employed the GPT-4 model to build an explanatory feedback system. |
JIONGHAO LIN et al. | arxiv-cs.CL | 2024-05-01 |
931 | Joint Pixel and Frequency Feature Learning and Fusion Via Channel-Wise Transformer for High-Efficiency Learned In-Loop Filter in VVC Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Block-based video codecs such as Versatile Video Coding (VVC)/H.266, High Efficiency Video Coding (HEVC)/H.265, Advanced Video Coding (AVC)/H.264, etc., inherently introduce … |
B. Kathariya; Zhu Li; G. V. D. Auwera; | IEEE Transactions on Circuits and Systems for Video … | 2024-05-01 |
932 | Energy-informed Graph Transformer Model for Solid Mechanical Analyses Related Papers Related Patents Related Grants Related Venues Related Experts View |
Bo Feng; Xiaoping Zhou; | Commun. Nonlinear Sci. Numer. Simul. | 2024-05-01 |
933 | Transformer Dense Center Network for Liver Tumor Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
JINLIN MA et al. | Biomed. Signal Process. Control. | 2024-05-01 |
934 | Harmonic LLMs Are Trustworthy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an intuitive method to test the robustness (stability and explainability) of any black-box LLM in real time via its local deviation from harmoniticity, denoted as γ. |
Nicholas S. Kersting; Mohammad Rahman; Suchismitha Vedala; Yang Wang; | arxiv-cs.LG | 2024-04-30 |
935 | Do Large Language Models Understand Conversational Implicature — A Case Study with A Chinese Sitcom Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce SwordsmanImp, the first Chinese multi-turn-dialogue-based dataset aimed at conversational implicature, sourced from dialogues in the Chinese sitcom My Own Swordsman. |
Shisen Yue; Siyuan Song; Xinyuan Cheng; Hai Hu; | arxiv-cs.CL | 2024-04-30 |
936 | Do Large Language Models Understand Conversational Implicature – A Case Study with A Chinese Sitcom Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Understanding the non-literal meaning of an utterance is critical for large language models (LLMs) to become human-like social communicators. In this work, we introduce … |
Shisen Yue; Siyuan Song; Xinyuan Cheng; Hai Hu; | ArXiv | 2024-04-30 |
937 | How Can I Improve? Using GPT to Highlight The Desired and Undesired Parts of Open-ended Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our aim is to equip tutors with actionable, explanatory feedback during online training lessons. |
JIONGHAO LIN et al. | arxiv-cs.CL | 2024-04-30 |
938 | RSCaMa: Remote Sensing Image Change Captioning with State Space Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite previous methods progressing in the spatial change perception, there are still weaknesses in joint spatial-temporal modeling. To address this, in this paper, we propose a novel RSCaMa model, which achieves efficient joint spatial-temporal modeling through multiple CaMa layers, enabling iterative refinement of bi-temporal features. |
CHENYANG LIU et al. | arxiv-cs.CV | 2024-04-29 |
939 | Ethical Reasoning and Moral Value Alignment of LLMs Depend on The Language We Prompt Them in Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, moral values are not universal, but rather influenced by language and culture. This paper explores how three prominent LLMs — GPT-4, ChatGPT, and Llama2-70B-Chat — perform ethical reasoning in different languages and whether their moral judgement depends on the language in which they are prompted. |
Utkarsh Agarwal; Kumar Tanmay; Aditi Khandelwal; Monojit Choudhury; | arxiv-cs.CL | 2024-04-29 |
940 | GPT-4 Passes Most of The 297 Written Polish Board Certification Examinations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Methods: We developed a software program to download and process PES exams and tested the performance of GPT models using the OpenAI Application Programming Interface (API). |
Jakub Pokrywka; Jeremi Kaczmarek; Edward Gorzelańczyk; | arxiv-cs.CL | 2024-04-29 |
941 | Time Machine GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new approach: a series of point-in-time LLMs called Time Machine GPT (TiMaGPT), specifically designed to be nonprognosticative. |
Felix Drinkall; Eghbal Rahimikia; Janet B. Pierrehumbert; Stefan Zohren; | arxiv-cs.CL | 2024-04-29 |
942 | Can GPT-4 Do L2 Analytic Assessment? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we perform a series of experiments using GPT-4 in a zero-shot fashion on a publicly available dataset annotated with holistic scores based on the Common European Framework of Reference and aim to extract detailed information about their underlying analytic components. |
Stefano Bannò; Hari Krishna Vydana; Kate M. Knill; Mark J. F. Gales; | arxiv-cs.CL | 2024-04-29 |
943 | PatentGPT: A Large Language Model for Intellectual Property Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the application of large language models in the Intellectual Property (IP) domain is challenging due to the strong need for specialized knowledge, privacy protection, and processing of extremely long text in this field. In this technical report, we present for the first time a low-cost, standardized procedure for training IP-oriented LLMs, meeting the unique requirements of the IP domain. |
ZILONG BAI et. al. | arxiv-cs.CL | 2024-04-28 |
944 | Can Perplexity Predict Fine-Tuning Performance? An Investigation of Tokenization Effects on Sequential Language Models for Nepali Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, studies of how subwording affects the understanding capacity of language models have been few and limited to a handful of languages. To reduce this gap, we used 6 different tokenization schemes to pretrain relatively small language models in Nepali and used the learned representations to finetune on several downstream tasks. |
Nishant Luitel; Nirajan Bekoju; Anand Kumar Sah; Subarna Shakya; | arxiv-cs.CL | 2024-04-28 |
945 | CLFT: Camera-LiDAR Fusion Transformer for Semantic Segmentation in Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, the vision transformer was the breakthrough that successfully brought the multi-head attention mechanism to computer vision applications. Therefore, we propose a vision-transformer-based network to carry out camera-LiDAR fusion for semantic segmentation applied to autonomous driving. |
Junyi Gu; Mauro Bellone; Tomáš Pivoňka; Raivo Sell; | arxiv-cs.CV | 2024-04-27 |
946 | GPT for Games: A Scoping Review (2020-2023) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a scoping review of 55 articles to explore GPT’s potential for games, offering researchers a comprehensive understanding of the current applications and identifying both emerging trends and unexplored areas. |
Daijin Yang; Erica Kleinman; Casper Harteveld; | arxiv-cs.HC | 2024-04-27 |
947 | Detection of Conspiracy Theories Beyond Keyword Bias in German-Language Telegram Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our work addresses the task of detecting conspiracy theories in German Telegram messages. |
Milena Pustet; Elisabeth Steffen; Helena Mihaljević; | arxiv-cs.CL | 2024-04-27 |
948 | MRScore: Evaluating Radiology Report Generation with LLM-based Reward System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces MRScore, an automatic evaluation metric tailored for radiology report generation by leveraging Large Language Models (LLMs). |
YUNYI LIU et al. | arxiv-cs.CL | 2024-04-27 |
949 | CRISPR-GPT: An LLM Agent for Automated Design of Gene-Editing Experiments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce CRISPR-GPT, an LLM agent augmented with domain knowledge and external tools to automate and enhance the design process of CRISPR-based gene-editing experiments. |
KAIXUAN HUANG et al. | arxiv-cs.AI | 2024-04-27 |
950 | ChatGPT Is Here to Help, Not to Replace Anybody — An Evaluation of Students’ Opinions On Integrating ChatGPT In CS Courses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, 52 first-year CS students were surveyed in order to assess their views on technologies with code-generation capabilities, both from academic and professional perspectives. |
Bruno Pereira Cipriano; Pedro Alves; | arxiv-cs.ET | 2024-04-26 |
951 | Enhancing Legal Compliance and Regulation Analysis with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With Industry 4.0 revolutionizing the food industry and with the General Data Protection Regulation (GDPR) reshaping privacy policies and data processing agreements, there is a growing gap between regulatory analysis and recent technological advancements. This study aims to bridge this gap by leveraging LLMs, namely BERT and GPT models, to accurately classify legal provisions and automate compliance checks. |
Shabnam Hassani; | arxiv-cs.SE | 2024-04-26 |
952 | UMass-BioNLP at MEDIQA-M3G 2024: DermPrompt — A Systematic Exploration of Prompt Engineering with GPT-4V for Dermatological Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a novel approach to diagnosing clinical dermatology cases by integrating large multimodal models, specifically leveraging the capabilities of GPT-4V under a retriever and a re-ranker framework. |
PARTH VASHISHT et al. | arxiv-cs.AI | 2024-04-26 |
953 | ChatGPT Is Here to Help, Not to Replace Anybody – An Evaluation of Students’ Opinions On Integrating ChatGPT In CS Courses Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) like GPT and Bard are capable of producing code based on textual descriptions, with remarkable efficacy. Such technology will have profound … |
Bruno Pereira Cipriano; P. Alves; | ArXiv | 2024-04-26 |
954 | Prompting Towards Alleviating Code-Switched Data Scarcity in Under-Resourced Languages with GPT As A Pivot Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: There is therefore a notable opportunity to refine prompting guidelines to yield sentences suitable for the fine-tuning of language models. We propose a framework for augmenting the diversity of synthetically generated code-switched data using GPT and propose leveraging this technology to mitigate data scarcity in low-resourced languages, underscoring the essential role of native speakers in this process. |
Michelle Terblanche; Kayode Olaleye; Vukosi Marivate; | arxiv-cs.CL | 2024-04-26 |
955 | TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present TinyChart, an efficient MLLM for chart understanding with only 3B parameters. |
LIANG ZHANG et al. | arxiv-cs.CV | 2024-04-25 |
956 | Player-Driven Emergence in LLM-Driven Game Narrative Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore how interaction with large language models (LLMs) can give rise to emergent behaviors, empowering players to participate in the evolution of game narratives. |
XIANGYU PENG et al. | arxiv-cs.CL | 2024-04-25 |
957 | Influence of Solution Efficiency and Valence of Instruction on Additive and Subtractive Solution Strategies in Humans and GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative artificial intelligences, especially large language models (LLMs), are increasingly being used, necessitating transparency about their capabilities. While prior studies … |
Lydia Uhler; Verena Jordan; Jürgen Buder; Markus Huff; Frank Papenmeier; | arxiv-cs.CL | 2024-04-25 |
958 | Exploring Internal Numeracy in Language Models: A Case Study on ALBERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It has been found that Transformer-based language models have the ability to perform basic quantitative reasoning. In this paper, we propose a method for studying how these models internally represent numerical data, and use our proposal to analyze the ALBERT family of language models. |
Ulme Wennberg; Gustav Eje Henter; | arxiv-cs.CL | 2024-04-25 |
959 | Promoting CNNs with Cross-Architecture Knowledge Distillation for Efficient Monocular Depth Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To effectively distill valuable information from the transformer teacher and bridge the gap between convolution and transformer features, we introduce a method to acclimate the teacher with a ghost decoder. |
Zhimeng Zheng; Tao Huang; Gongsheng Li; Zuyi Wang; | arxiv-cs.CV | 2024-04-25 |
960 | IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We evaluate a wide range of proprietary and open-source LLMs including GPT-3.5, GPT-4, PaLM-2, mT5, Gemma, BLOOM and LLaMA on IndicGenBench in a variety of settings. |
Harman Singh; Nitish Gupta; Shikhar Bharadwaj; Dinesh Tewari; Partha Talukdar; | arxiv-cs.CL | 2024-04-25 |
961 | Towards Efficient Patient Recruitment for Clinical Trials: Application of A Prompt-Based Learning Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we aimed to evaluate the performance of a prompt-based large language model for the cohort selection task from unstructured medical notes collected in the EHR. |
Mojdeh Rahmanian; Seyed Mostafa Fakhrahmad; Seyedeh Zahra Mousavi; | arxiv-cs.CL | 2024-04-24 |
962 | An Automated Learning Model for Twitter Sentiment Analysis Using Ranger AdaBelief Optimizer Based Bidirectional Long Short Term Memory Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Sentiment analysis is an automated approach used to analyse textual data and describe public opinion. Sentiment analysis has a major role in creating … |
Sasirekha Natarajan; Smitha Kurian; P. Divakarachari; Przemysław Falkowski‐Gilski; | Expert Syst. J. Knowl. Eng. | 2024-04-24 |
963 | The Promise and Challenges of Using LLMs to Accelerate The Screening Process of Systematic Reviews Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our objective is to investigate if Large Language Models (LLMs) can accelerate title-abstract screening by simplifying abstracts for human screeners, and automating title-abstract screening. |
Aleksi Huotala; Miikka Kuutila; Paul Ralph; Mika Mäntylä; | arxiv-cs.CL | 2024-04-24 |
964 | Automated Creation of Source Code Variants of A Cryptographic Hash Function Implementation Using Generative Pre-Trained Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study examines the ability of GPT models to generate novel and correct versions, and notably very insecure versions, of implementations of the cryptographic hash function SHA-1. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CR | 2024-04-24 |
965 | GeckOpt: LLM System Efficiency Via Intent-Based Tool Selection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this preliminary study, we investigate a GPT-driven intent-based reasoning approach to streamline tool selection for large language models (LLMs) aimed at system efficiency. By … |
Michael Fore; Simranjit Singh; Dimitrios Stamoulis; | Proceedings of the Great Lakes Symposium on VLSI 2024 | 2024-04-24 |
966 | A Comprehensive Survey on Evaluating Large Language Model Applications in The Medical Industry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since the inception of the Transformer architecture in 2017, Large Language Models (LLMs) such as GPT and BERT have evolved significantly, impacting various industries with their advanced capabilities in language understanding and generation. These models have shown potential to transform the medical field, highlighting the necessity for specialized evaluation frameworks to ensure their effective and ethical deployment. |
Yining Huang; Keke Tang; Meilian Chen; Boyuan Wang; | arxiv-cs.CL | 2024-04-24 |
967 | Transformers Can Represent n-gram Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the relationship between transformer LMs and n-gram LMs, a simple and historically relevant class of language models. |
Anej Svete; Ryan Cotterell; | arxiv-cs.CL | 2024-04-23 |
968 | Pyramid Hierarchical Transformer for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The traditional Transformer model encounters challenges with variable-length input sequences, particularly in Hyperspectral Image Classification (HSIC), leading to efficiency and scalability concerns. To overcome this, we propose a pyramid-based hierarchical transformer (PyFormer). |
Muhammad Ahmad; Muhammad Hassaan Farooq Butt; Manuel Mazzara; Salvatore Distifano; | arxiv-cs.CV | 2024-04-23 |
969 | Combining Retrieval and Classification: Balancing Efficiency and Accuracy in Duplicate Bug Report Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this context, complex models, despite their high accuracy potential, can be time-consuming, while more efficient models may compromise on accuracy. To address this issue, we propose a transformer-based system designed to strike a balance between time efficiency and accuracy performance. |
Qianru Meng; Xiao Zhang; Guus Ramackers; Joost Visser; | arxiv-cs.SE | 2024-04-23 |
970 | Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on a specific use case, pharmaceutical manufacturing investigations, and propose that leveraging historical records of manufacturing incidents and deviations in an organization can be beneficial for addressing and closing new cases, or de-risking new manufacturing campaigns. |
Hossein Salami; Brandye Smith-Goettler; Vijay Yadav; | arxiv-cs.CL | 2024-04-23 |
971 | PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present the first, end-to-end large-scale empirical evaluation of clinical trial matching using real-world EHRs. |
SHASHI KANT GUPTA et al. | arxiv-cs.CL | 2024-04-23 |
972 | From Complexity to Clarity: How AI Enhances Perceptions of Scientists and The Public’s Understanding of Science Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper evaluated the effectiveness of using generative AI to simplify science communication and enhance the public’s understanding of science. |
David M. Markowitz; | arxiv-cs.CL | 2024-04-23 |
973 | Pixels and Predictions: Potential of GPT-4V in Meteorological Imagery Analysis and Forecast Communication Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates GPT-4V’s ability to interpret meteorological charts and communicate weather hazards appropriately to the user, despite challenges of hallucinations, where generative AI delivers coherent, confident, but incorrect responses. We assess GPT-4V’s competence via its web interface ChatGPT in two tasks: (1) generating a severe-weather outlook from weather-chart analysis and conducting self-evaluation, revealing an outlook that corresponds well with a Storm Prediction Center human-issued forecast; and (2) producing hazard summaries in Spanish and English from weather charts. |
JOHN R. LAWSON et al. | arxiv-cs.CL | 2024-04-22 |
974 | Marking: Visual Grading with Highlighting Errors and Annotating Missing Bits Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Marking, a novel grading task that enhances automated grading systems by performing an in-depth analysis of student responses and providing students with visual highlights. |
Shashank Sonkar; Naiming Liu; Debshila B. Mallick; Richard G. Baraniuk; | arxiv-cs.CL | 2024-04-22 |
975 | A Multimodal Feature Distillation with CNN-Transformer Network for Brain Tumor Segmentation with Incomplete Modalities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a Multimodal feature distillation with Convolutional Neural Network (CNN)-Transformer hybrid network (MCTSeg) for accurate brain tumor segmentation with missing modalities. |
Ming Kang; Fung Fung Ting; Raphaël C. -W. Phan; Zongyuan Ge; Chee-Ming Ting; | arxiv-cs.CV | 2024-04-22 |
976 | Automated Multi-Language to English Machine Translation Using Generative Pre-Trained Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we examine using local Generative Pretrained Transformer (GPT) models to perform automated zero-shot, black-box, sentence-wise, multi-natural-language translation into English text. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CL | 2024-04-22 |
977 | How Well Can LLMs Echo Us? Evaluating AI Chatbots’ Role-Play Ability with ECHO Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Such an oversight limits the potential for advancements in digital human clones and non-player characters in video games. To bridge this gap, we introduce ECHO, an evaluative framework inspired by the Turing test. |
MAN TIK NG et. al. | arxiv-cs.CL | 2024-04-22 |
978 | Pre-Calc: Learning to Use The Calculator Improves Numeracy in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Pre-Calc, a simple pre-finetuning objective of learning to use the calculator for both encoder-only and encoder-decoder architectures, formulated as a discriminative and generative task respectively. |
Vishruth Veerendranath; Vishwa Shah; Kshitish Ghate; | arxiv-cs.CL | 2024-04-22 |
979 | Assessing GPT-4-Vision’s Capabilities in UML-Based Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The emergence of advanced neural networks has opened up new ways in automated code generation from conceptual models, promising to enhance software development processes. This paper presents a preliminary evaluation of GPT-4-Vision, a state-of-the-art deep learning model, and its capabilities in transforming Unified Modeling Language (UML) class diagrams into fully operating Java class files. |
Gábor Antal; Richárd Vozár; Rudolf Ferenc; | arxiv-cs.SE | 2024-04-22 |
980 | What Do Transformers Know About Government? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates what insights about linguistic features and what knowledge about the structure of natural language can be obtained from the encodings in transformer language models. In particular, we explore how BERT encodes the government relation between constituents in a sentence. |
JUE HOU et. al. | arxiv-cs.CL | 2024-04-22 |
981 | SVGEditBench: A Benchmark Dataset for Quantitative Assessment of LLM’s SVG Editing Capabilities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For quantitative evaluation of LLMs’ ability to edit SVG, we propose SVGEditBench. |
Kunato Nishina; Yusuke Matsui; | arxiv-cs.CV | 2024-04-21 |
982 | Automated Text Mining of Experimental Methodologies from Biomedical Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes the fine-tuned DistilBERT, a methodology-specific, pre-trained generative classification language model for mining biomedical texts. |
Ziqing Guo; | arxiv-cs.CL | 2024-04-21 |
983 | Evaluating Subword Tokenization: Alien Subword Composition and OOV Generalization Challenge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As a solution, we propose a combined intrinsic-extrinsic evaluation framework for subword tokenization. |
KHUYAGBAATAR BATSUREN et. al. | arxiv-cs.CL | 2024-04-20 |
984 | Do English Named Entity Recognizers Work Well on Global Englishes? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As such, it is unclear whether they generalize for analyzing use of English globally. To test this, we build a newswire dataset, the Worldwide English NER Dataset, to analyze NER model performance on low-resource English variants from around the world. |
Alexander Shan; John Bauer; Riley Carlson; Christopher Manning; | arxiv-cs.CL | 2024-04-20 |
985 | Transformer-Based Classification Outcome Prediction for Multimodal Stroke Treatment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a multi-modal fusion framework Multitrans based on the Transformer architecture and self-attention mechanism. |
Danqing Ma; Meng Wang; Ao Xiang; Zongqing Qi; Qin Yang; | arxiv-cs.CV | 2024-04-19 |
986 | Crowdsourcing Public Attitudes Toward Local Services Through The Lens of Google Maps Reviews: An Urban Density-based Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel data source and methodological framework that can be easily adapted to different regions, offering useful insights into public sentiment toward the built environment and shedding light on how planning policies can be designed to handle related challenges. |
Lingyao Li; Songhua Hu; Atiyya Shaw; Libby Hemphill; | arxiv-cs.SI | 2024-04-19 |
987 | Enabling Natural Zero-Shot Prompting on Encoder Models Via Statement-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we propose Statement-Tuning, a technique that models discriminative tasks as a set of finite statements and trains an encoder model to discriminate between the potential statements to determine the label. |
Ahmed Elshabrawy; Yongxin Huang; Iryna Gurevych; Alham Fikri Aji; | arxiv-cs.CL | 2024-04-19 |
988 | TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present our submission to the unconstrained subtask of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages for morphological annotation, POS-tagging, lemmatization, character- and word-level gap-filling. |
Aleksei Dorkin; Kairit Sirts; | arxiv-cs.CL | 2024-04-19 |
989 | Linearly-evolved Transformer for Pan-sharpening Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The standard modeling rules within these promising pan-sharpening methods are to roughly stack the transformer variants in a cascaded manner. Despite the remarkable advancement, their success may come at a huge cost in model parameters and FLOPs, preventing their application on low-resource satellites. To address this trade-off between favorable performance and expensive computation, we tailor an efficient linearly-evolved transformer variant and employ it to construct a lightweight pan-sharpening framework. |
JUNMING HOU et. al. | arxiv-cs.CV | 2024-04-19 |
990 | Augmenting Emotion Features in Irony Detection with Large Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel method for irony detection, applying Large Language Models (LLMs) with prompt-based learning to facilitate emotion-centric text augmentation. |
Yucheng Lin; Yuhan Xia; Yunfei Long; | arxiv-cs.CL | 2024-04-18 |
991 | Large Language Models in Targeted Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we investigate the use of decoder-based generative transformers for extracting sentiment towards the named entities in Russian news articles. |
Nicolay Rusnachenko; Anton Golubev; Natalia Loukachevitch; | arxiv-cs.CL | 2024-04-18 |
992 | MIDGET: Music Conditioned 3D Dance Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a MusIc conditioned 3D Dance GEneraTion model, named MIDGET, based on a Dance motion Vector Quantised Variational AutoEncoder (VQ-VAE) model and a Motion Generative Pre-Training (GPT) model, to generate vibrant and high-quality dances that match the music rhythm. |
Jinwu Wang; Wei Mao; Miaomiao Liu; | arxiv-cs.SD | 2024-04-18 |
993 | EmrQA-msquad: A Medical Dataset Structured with The SQuAD V2.0 Framework, Enriched with EmrQA Medical Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: One key solution involves integrating specialized medical datasets and creating dedicated ones. This strategic approach enhances the accuracy of question answering systems (QAS), contributing to advancements in clinical decision-making and medical research. |
Jimenez Eladio; Hao Wu; | arxiv-cs.CL | 2024-04-18 |
994 | Dubo-SQL: Diverse Retrieval-Augmented Generation and Fine Tuning for Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce two new methods, Dubo-SQL v1 and v2. |
Dayton G. Thorpe; Andrew J. Duberstein; Ian A. Kinsey; | arxiv-cs.CL | 2024-04-18 |
995 | Transformer Tricks: Removing Weights for Skipless Transformers Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: He and Hofmann (arXiv:2311.01906) detailed a skipless transformer without the V and P (post-attention projection) linear layers, which reduces the total number of weights. … |
Nils Graef; | arxiv-cs.LG | 2024-04-18 |
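As a rough illustration of the idea summarized above (removing the V and P projections so that attention weights mix the input vectors directly), here is a minimal single-head PyTorch sketch; it is an assumption-laden reading of the abstract, not the paper's implementation.

```python
# Minimal sketch: attention with no value (V) or post-attention projection (P)
# matrices. The only learned weights are W_q and W_k; attention scores re-weight
# the raw inputs. Single head, no skip connection, for brevity.
import math
import torch
import torch.nn as nn

class SkiplessAttentionNoVP(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.w_q = nn.Linear(d_model, d_model, bias=False)
        self.w_k = nn.Linear(d_model, d_model, bias=False)
        self.d = d_model

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, d)
        q, k = self.w_q(x), self.w_k(x)
        attn = torch.softmax(q @ k.transpose(-2, -1) / math.sqrt(self.d), dim=-1)
        return attn @ x  # values are the inputs themselves: no V, no P

x = torch.randn(2, 5, 64)
print(SkiplessAttentionNoVP(64)(x).shape)  # torch.Size([2, 5, 64])
```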
996 | Octopus V3: Technical Report for On-device Sub-billion Multimodal AI Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a multimodal model that incorporates the concept of a functional token, specifically designed for AI agent applications. |
Wei Chen; Zhiyuan Li; | arxiv-cs.CL | 2024-04-17 |
997 | CAUS: A Dataset for Question Generation Based on Human Cognition Leveraging Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Curious About Uncertain Scene (CAUS) dataset, designed to enable Large Language Models, specifically GPT-4, to emulate human cognitive processes for resolving uncertainties. |
Minjung Shin; Donghyun Kim; Jeh-Kwang Ryu; | arxiv-cs.AI | 2024-04-17 |
998 | MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we show how to build small fact-checking models that have GPT-4-level performance but for 400x lower cost. |
Liyan Tang; Philippe Laban; Greg Durrett; | arxiv-cs.CL | 2024-04-16 |
999 | CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an attribution-oriented Chain-of-Thought reasoning method to enhance the accuracy of attributions. |
Moshe Berchansky; Daniel Fleischer; Moshe Wasserblat; Peter Izsak; | arxiv-cs.CL | 2024-04-16 |
1000 | Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present COME-robot, the first closed-loop framework utilizing the GPT-4V vision-language foundation model for open-ended reasoning and adaptive planning in real-world scenarios. |
PEIYUAN ZHI et. al. | arxiv-cs.RO | 2024-04-15 |
1001 | AIGeN: An Adversarial Approach for Instruction Generation in VLN Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose AIGeN, a novel architecture inspired by Generative Adversarial Networks (GANs) that produces meaningful and well-formed synthetic instructions to improve navigation agents’ performance. |
Niyati Rawal; Roberto Bigazzi; Lorenzo Baraldi; Rita Cucchiara; | arxiv-cs.CV | 2024-04-15 |
1002 | Transformers, Contextualism, and Polysemy Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, I argue that we can extract from the way the transformer architecture works a theory of the relationship between context and meaning. |
Jumbly Grindrod; | arxiv-cs.CL | 2024-04-15 |
1003 | Leveraging GPT-like LLMs to Automate Issue Labeling Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Issue labeling is a crucial task for the effective management of software projects. To date, several approaches have been put forth for the automatic assignment of labels to issue … |
Giuseppe Colavito; F. Lanubile; Nicole Novielli; L. Quaranta; | 2024 IEEE/ACM 21st International Conference on Mining … | 2024-04-15 |
1004 | Zero-shot Building Age Classification from Facade Image Using GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A building’s age of construction is crucial for supporting many geospatial applications. Much current research focuses on estimating building age from facade images … |
ZICHAO ZENG et. al. | ArXiv | 2024-04-15 |
1005 | Benchmarking Llama2, Mistral, Gemma and GPT for Factuality, Toxicity, Bias and Propensity for Hallucinations Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: This paper introduces fourteen novel datasets for the evaluation of Large Language Models’ safety in the context of enterprise tasks. A method was devised to evaluate a model’s … |
David Nadeau; Mike Kroutikov; Karen McNeil; Simon Baribeau; | ArXiv | 2024-04-15 |
1006 | Demonstration of DB-GPT: Next Generation Data Interaction System Empowered By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present DB-GPT, a revolutionary and product-ready Python library that integrates LLMs into traditional data interaction tasks to enhance user experience and accessibility. |
SIQIAO XUE et. al. | arxiv-cs.AI | 2024-04-15 |
1007 | Harnessing GPT-4V(ision) for Insurance: A Preliminary Exploration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore GPT-4V’s capabilities in the insurance domain. |
Chenwei Lin; Hanjia Lyu; Jiebo Luo; Xian Xu; | arxiv-cs.CV | 2024-04-15 |
1008 | Few-shot Name Entity Recognition on StackOverflow Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: StackOverflow, with its vast question repository and limited labeled examples, raises an annotation challenge for us. We address this gap by proposing RoBERTa+MAML, a few-shot named entity recognition (NER) method leveraging meta-learning. |
Xinwei Chen; Kun Li; Tianyou Song; Jiangjian Guo; | arxiv-cs.CL | 2024-04-14 |
1009 | Assessing The Impact of GPT-4 Turbo in Generating Defeaters for Assurance Cases Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Assurance cases (ACs) are structured arguments that allow verifying the correct implementation of the created systems’ non-functional requirements (e.g., safety, security). This … |
KIMYA KHAKZAD SHAHANDASHTI et. al. | 2024 IEEE/ACM First International Conference on AI … | 2024-04-14 |
1010 | Fine Tuning Large Language Model for Secure Code Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: AI pair programmers, such as GitHub’s Copilot, have shown great success in automatic code generation. However, such large language model-based code generation techniques face the … |
Junjie Li; Aseem Sangalay; Cheng Cheng; Yuan Tian; Jinqiu Yang; | 2024 IEEE/ACM First International Conference on AI … | 2024-04-14 |
1011 | Improving Domain Generalization in Speech Emotion Recognition with Whisper Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformers have been used successfully in a variety of settings, including Speech Emotion Recognition (SER). However, use of the latest transformer base models in domain … |
Erik Goron; Lena Asai; Elias Rut; Martin Dinov; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1012 | LLET: Lightweight Lexicon-Enhanced Transformer for Chinese NER Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The Flat-LAttice Transformer (FLAT) has achieved notable success in Chinese named entity recognition (NER) by integrating lexical information into the widely-used Transformer … |
Zongcheng Ji; Yinlong Xiao; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1013 | A Lightweight Transformer-based Neural Network for Large-scale Masonry Arch Bridge Point Cloud Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer architecture based on the attention mechanism achieves impressive results in natural language processing (NLP) tasks. This paper transfers the successful experience to … |
Yixiong Jing; Brian Sheil; S. Acikgoz; | Comput. Aided Civ. Infrastructure Eng. | 2024-04-14 |
1014 | Inheritune: Training Smaller Yet More Attentive Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Layers in this state are unable to learn anything meaningful and are mostly redundant; we refer to these as lazy layers. The goal of this paper is to train smaller models by eliminating this structural inefficiency without compromising performance. |
Sunny Sanyal; Ravid Shwartz-Ziv; Alexandros G. Dimakis; Sujay Sanghavi; | arxiv-cs.CL | 2024-04-12 |
1015 | Constrained C-Test Generation Via Mixed-Integer Programming Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work proposes a novel method to generate C-Tests, a deviated form of cloze tests (a gap-filling exercise) in which only the last part of a word is turned into a gap. |
Ji-Ung Lee; Marc E. Pfetsch; Iryna Gurevych; | arxiv-cs.CL | 2024-04-12 |
1016 | CreativEval: Evaluating Creativity of LLM-Based Hardware Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this research gap, we present CreativEval, a framework for evaluating the creativity of LLMs within the context of generating hardware designs. |
Matthew DeLorenzo; Vasudev Gohil; Jeyavijayan Rajendran; | arxiv-cs.CL | 2024-04-12 |
1017 | Is Next Token Prediction Sufficient for GPT? Exploration on Code Logic Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to prove it, we introduce a new task, Logically Equivalent Code Selection, which necessitates the selection of logically equivalent code from a candidate set, given a query code. |
MENGNAN QI et. al. | arxiv-cs.PL | 2024-04-12 |
1018 | Small Models Are (Still) Effective Cross-Domain Argument Extractors Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, detailed explorations of these techniques’ ability to actually enable this transfer are lacking. In this work, we provide such a study, exploring zero-shot transfer using both techniques on six major EAE datasets at both the sentence and document levels. |
William Gantt; Aaron Steven White; | arxiv-cs.CL | 2024-04-12 |
1019 | Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore how large language models (LLMs) can be used to generate and evaluate multiple-choice reading comprehension items. |
Andreas Säuberli; Simon Clematide; | arxiv-cs.CL | 2024-04-11 |
1020 | Measuring Geographic Diversity of Foundation Models with A Natural Language–based Geo-guessing Experiment on GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: If we consider the resulting models as knowledge bases in their own right, this may open up new avenues for understanding places through the lens of machines. In this work, we adopt this thinking and select GPT-4, a state-of-the-art representative in the family of multimodal large language models, to study its geographic diversity regarding how well geographic features are represented. |
Zilong Liu; Krzysztof Janowicz; Kitty Currier; Meilin Shi; | arxiv-cs.CY | 2024-04-11 |
1021 | Remembering Transformer for Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing data fine-tuning and regularization methods necessitate task identity information during inference and cannot eliminate interference among different tasks, while soft parameter sharing approaches encounter the problem of an increasing model parameter size. To tackle these challenges, we propose the Remembering Transformer, inspired by the brain’s Complementary Learning Systems (CLS). |
Yuwei Sun; Ippei Fujisawa; Arthur Juliani; Jun Sakuma; Ryota Kanai; | arxiv-cs.LG | 2024-04-11 |
1022 | From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We analyze how well pre-trained large language models (e.g., Llama2, GPT-4, Claude 3, etc.) can do linear and non-linear regression when given in-context examples, without any additional training or gradient updates. |
Robert Vacareanu; Vlad-Andrei Negru; Vasile Suciu; Mihai Surdeanu; | arxiv-cs.CL | 2024-04-11 |
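The in-context regression setup above is easy to reproduce: training pairs are serialized into the prompt and the model completes the output for a held-out input. A minimal sketch follows, with illustrative formatting that may differ from the paper's:

```python
# Build an in-context regression prompt: serialize (x, y) pairs as text,
# append the query x, and let the model complete the output value.
def regression_prompt(train_pairs, query_x):
    lines = [f"Input: {x:.2f}\nOutput: {y:.2f}" for x, y in train_pairs]
    lines.append(f"Input: {query_x:.2f}\nOutput:")
    return "\n".join(lines)

# Noisy samples from y = 3x + 1; the model must infer the mapping in context.
pairs = [(0.5, 2.6), (1.0, 4.1), (2.0, 6.9), (3.0, 10.2)]
print(regression_prompt(pairs, 4.0))
```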
1023 | On Training Data Influence of GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents GPTfluence, a novel approach that leverages a featurized simulation to assess the impact of training examples on the training dynamics of GPT models. |
YEKUN CHAI et. al. | arxiv-cs.CL | 2024-04-11 |
1024 | Map Reading and Analysis with GPT-4V(ision) Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In late 2023, the image-reading capability added to a Generative Pre-trained Transformer (GPT) framework provided the opportunity to potentially revolutionize the way we view and … |
Jinwen Xu; Ran Tao; | ISPRS Int. J. Geo Inf. | 2024-04-11 |
1025 | LLM Agents Can Autonomously Exploit One-day Vulnerabilities IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that LLM agents can autonomously exploit one-day vulnerabilities in real-world systems. |
Richard Fang; Rohan Bindu; Akul Gupta; Daniel Kang; | arxiv-cs.CR | 2024-04-11 |
1026 | Reflectance Estimation for Proximity Sensing By Vision-Language Models: Utilizing Distributional Semantics for Low-Level Cognition in Robotics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we verify that 1) LLMs such as GPT-3.5 and GPT-4 can estimate an object’s reflectance using only text as input; and 2) VLMs such as CLIP can increase their generalization capabilities in reflectance estimation from images. |
Masashi Osada; Gustavo A. Garcia Ricardez; Yosuke Suzuki; Tadahiro Taniguchi; | arxiv-cs.RO | 2024-04-11 |
1027 | Measuring Geographic Diversity of Foundation Models with A Natural Language-based Geo-guessing Experiment on GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative AI based on foundation models provides a first glimpse into the world represented by machines trained on vast amounts of multimodal data ingested by these … |
Zilong Liu; Krzysztof Janowicz; Kitty Currier; Meilin Shi; | ArXiv | 2024-04-11 |
1028 | Simpler Becomes Harder: Do LLMs Exhibit A Coherent Behavior on Simplified Corpora? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Text simplification seeks to improve readability while retaining the original content and meaning. Our study investigates whether pre-trained classifiers also maintain such coherence by comparing their predictions on both original and simplified inputs. |
Miriam Anschütz; Edoardo Mosca; Georg Groh; | arxiv-cs.CL | 2024-04-10 |
1029 | Characterizing Multimodal Long-form Summarization: A Case Study on Financial Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a computational framework for characterizing multimodal long-form summarization and investigate the behavior of Claude 2.0/2.1, GPT-4/3.5, and Cohere. |
Tianyu Cao; Natraj Raman; Danial Dervovic; Chenhao Tan; | arxiv-cs.CL | 2024-04-09 |
1030 | Generative Pre-Trained Transformer for Symbolic Regression Base In-Context Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose FormulaGPT, which trains a GPT using massive sparse reward learning histories of reinforcement learning-based SR algorithms as training data. |
YANJIE LI et. al. | arxiv-cs.LG | 2024-04-09 |
1031 | InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, this research advances the patch division paradigm by introducing a novel extension: dynamic resolution with automatic patch configuration. |
XIAOYI DONG et. al. | arxiv-cs.CV | 2024-04-09 |
1032 | PetKaz at SemEval-2024 Task 8: Can Linguistics Capture The Specifics of LLM-generated Text? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present our submission to the SemEval-2024 Task 8 Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, focusing on the detection of machine-generated texts (MGTs) in English. |
Kseniia Petukhova; Roman Kazakov; Ekaterina Kochmar; | arxiv-cs.CL | 2024-04-08 |
1033 | OPSD: An Offensive Persian Social Media Dataset and Its Baseline Evaluations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While numerous datasets in the English language exist in this domain, few equivalent resources are available for the Persian language. To address this gap, this paper introduces two offensive datasets. |
MEHRAN SAFAYANI et. al. | arxiv-cs.CL | 2024-04-08 |
1034 | Use of A Structured Knowledge Base Enhances Metadata Curation By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates the potential of large language models (LLMs), specifically GPT-4, to improve adherence to metadata standards. |
SOWMYA S. SUNDARAM et. al. | arxiv-cs.AI | 2024-04-08 |
1035 | VulnHunt-GPT: A Smart Contract Vulnerabilities Detector Based on OpenAI ChatGPT Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Smart contracts are self-executing programs that can run on a blockchain. Due to the fact of being immutable after their deployment on blockchain, it is crucial to ensure their … |
Biagio Boi; Christian Esposito; Sokjoon Lee; | Proceedings of the 39th ACM/SIGAPP Symposium on Applied … | 2024-04-08 |
1036 | Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While these models demonstrate remarkable performance on general datasets, they can struggle in specialized domains such as medicine, where unique domain-specific terminologies, domain-specific abbreviations, and varying document structures are common. This paper explores strategies for adapting these models to domain-specific requirements, primarily through continuous pre-training on domain-specific data. |
AHMAD IDRISSI-YAGHIR et. al. | arxiv-cs.CL | 2024-04-08 |
1037 | Relation Extraction Using Large Language Models: A Case Study on Acupuncture Point Locations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to compare the performance of GPT with traditional deep learning models (Long Short-Term Memory (LSTM) and Bidirectional Encoder Representations from Transformers for Biomedical Text Mining (BioBERT)) in extracting acupoint-related location relations and assess the impact of pretraining and fine-tuning on GPT’s performance. |
YIMING LI et. al. | arxiv-cs.CL | 2024-04-08 |
1038 | PagPassGPT: Pattern Guided Password Guessing Via Generative Pretrained Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Amidst the surge in deep learning-based password guessing models, challenges of generating high-quality passwords and reducing duplicate passwords persist. To address these challenges, we present PagPassGPT, a password guessing model constructed on Generative Pretrained Transformer (GPT). |
XINGYU SU et. al. | arxiv-cs.CR | 2024-04-07 |
1039 | Clinical Trials Protocol Authoring Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This report embarks on a mission to revolutionize clinical trial protocol development through the integration of advanced AI technologies. |
Morteza Maleki; SeyedAli Ghahari; | arxiv-cs.CE | 2024-04-07 |
1040 | RecGPT: Generative Personalized Prompts for Sequential Recommendation Via ChatGPT Training Paradigm Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: ChatGPT has achieved remarkable success in natural language understanding. Considering that recommendation is indeed a conversation between users and the system with items as … |
YABIN ZHANG et. al. | ArXiv | 2024-04-06 |
1041 | Joint Visual and Text Prompting for Improved Object-Centric Perception with Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel approach, Joint Visual and Text Prompting (VTPrompt), that employs fine-grained visual information to enhance the capability of MLLMs in VQA, especially for object-oriented perception. |
SONGTAO JIANG et. al. | arxiv-cs.CL | 2024-04-06 |
1042 | Scope Ambiguities in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite this, there has been little research into how modern large language models treat them. In this paper, we investigate how different versions of certain autoregressive language models — GPT-2, GPT-3/3.5, Llama 2 and GPT-4 — treat scope ambiguous sentences, and compare this with human judgments. |
Gaurav Kamath; Sebastian Schuster; Sowmya Vajjala; Siva Reddy; | arxiv-cs.CL | 2024-04-05 |
1043 | Evaluating LLMs at Detecting Errors in LLM Responses Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work introduces ReaLMistake, the first error detection benchmark consisting of objective, realistic, and diverse errors made by LLMs. |
RYO KAMOI et. al. | arxiv-cs.CL | 2024-04-04 |
1044 | Outlier-Efficient Hopfield Layers for Large Transformer-Based Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an Outlier-Efficient Modern Hopfield Model (termed $\mathrm{OutEffHop}$) and use it to address the outlier inefficiency problem of training gigantic transformer-based models. |
JERRY YAO-CHIEH HU et. al. | arxiv-cs.LG | 2024-04-04 |
1045 | Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Various jailbreak attacks have been proposed to red-team Large Language Models (LLMs) and revealed the vulnerable safeguards of LLMs. Besides, some methods are not limited to the … |
SHUO CHEN et. al. | arxiv-cs.LG | 2024-04-04 |
1046 | FGeo-TP: A Language Model-Enhanced Solver for Euclidean Geometry Problems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The application of contemporary artificial intelligence techniques to address geometric problems and automated deductive proofs has always been a grand challenge to the … |
Yiming He; Jia Zou; Xiaokai Zhang; Na Zhu; Tuo Leng; | Symmetry | 2024-04-03 |
1047 | BCAmirs at SemEval-2024 Task 4: Beyond Words: A Multimodal and Multilingual Exploration of Persuasion in Memes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by this, our team engaged in SemEval-2024 Task 4, a hierarchical multi-label classification task designed to identify rhetorical and psychological persuasion techniques embedded within memes. To tackle this problem, we introduced a caption generation step to assess the modality gap and the impact of additional semantic information from images, which improved our result. |
Amirhossein Abaskohi; Amirhossein Dabiriaghdam; Lele Wang; Giuseppe Carenini; | arxiv-cs.CL | 2024-04-03 |
1048 | GPT-DETOX: An In-Context Learning-Based Paraphraser for Text Detoxification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose GPT-DETOX as a framework for prompt-based in-context learning for text detoxification using GPT-3.5 Turbo. |
Ali Pesaranghader; Nikhil Verma; Manasa Bharadwaj; | arxiv-cs.CL | 2024-04-03 |
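To make the prompt-based in-context detoxification setup above concrete, here is a minimal sketch against the OpenAI chat API; the exemplar pairs and prompt wording are invented for illustration and are not the paper's actual prompts.

```python
# Few-shot detoxification sketch: (toxic, neutral) exemplar pairs go into the
# prompt, and the model paraphrases a new toxic input. Exemplars are made up.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

EXEMPLARS = [
    ("This code is garbage and you are clueless.",
     "This code has problems, and I think you may be missing some context."),
    ("Shut up, nobody asked for your opinion.",
     "I would prefer to hear other perspectives right now."),
]

def detoxify(text: str) -> str:
    shots = "\n".join(f"Toxic: {t}\nNeutral: {n}" for t, n in EXEMPLARS)
    prompt = (f"Rewrite the toxic sentence as a neutral paraphrase.\n\n"
              f"{shots}\nToxic: {text}\nNeutral:")
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content.strip()
```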
1049 | Task Agnostic Architecture for Algorithm Induction Via Implicit Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Taking this trend of generalization to the extreme suggests the possibility of a single deep network architecture capable of solving all tasks. This position paper aims to explore developing such a unified architecture and proposes a theoretical framework of how it could be constructed. |
Sahil J. Sindhi; Ignas Budvytis; | arxiv-cs.LG | 2024-04-03 |
1050 | NLP at UC Santa Cruz at SemEval-2024 Task 5: Legal Answer Validation Using Few-Shot Multi-Choice QA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present two approaches to solving the task of legal answer validation, given an introduction to the case, a question and an answer candidate. |
Anish Pahilajani; Samyak Rajesh Jain; Devasha Trivedi; | arxiv-cs.CL | 2024-04-03 |
1051 | UTeBC-NLP at SemEval-2024 Task 9: Can LLMs Be Lateral Thinkers? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Through participating in SemEval-2024, task 9, Sentence Puzzle sub-task, we explore prompt engineering methods: chain-of-thought (CoT) and direct prompting, enhancing with informative descriptions, and employing contextualizing prompts using a retrieval-augmented generation (RAG) pipeline. |
Pouya Sadeghi; Amirhossein Abaskohi; Yadollah Yaghoobzadeh; | arxiv-cs.CL | 2024-04-03 |
1052 | Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this way, we achieve 100% attack success rate — according to GPT-4 as a judge — on Vicuna-13B, Mistral-7B, Phi-3-Mini, Nemotron-4-340B, Llama-2-Chat-7B/13B/70B, Llama-3-Instruct-8B, Gemma-7B, GPT-3.5, GPT-4o, and R2D2 from HarmBench that was adversarially trained against the GCG attack. |
Maksym Andriushchenko; Francesco Croce; Nicolas Flammarion; | arxiv-cs.CR | 2024-04-02 |
1053 | Accelerating Transformer Pre-Training with 2:4 Sparsity Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Training large transformers is slow, but recent innovations on GPU architecture give us an advantage. NVIDIA Ampere GPUs can execute a fine-grained 2:4 sparse matrix … |
Yuezhou Hu; Kang Zhao; Wei Huang; Jianfei Chen; Jun Zhu; | ArXiv | 2024-04-02 |
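The 2:4 pattern referenced above means that in every contiguous group of four weights, at most two are nonzero; this is the structure Ampere tensor cores accelerate. A minimal sketch of projecting a dense matrix onto that pattern (the paper's contribution is the training recipe, which is not reproduced here):

```python
# Project a weight tensor onto 2:4 structured sparsity: in each group of four
# consecutive weights, keep the two largest magnitudes and zero the rest.
import torch

def two_four_mask(w: torch.Tensor) -> torch.Tensor:
    groups = w.reshape(-1, 4)                     # assumes numel % 4 == 0
    idx = groups.abs().topk(2, dim=1).indices     # top-2 magnitudes per group
    mask = torch.zeros_like(groups).scatter_(1, idx, 1.0)
    return mask.reshape(w.shape)

w = torch.randn(8, 8)
w_sparse = w * two_four_mask(w)
print((w_sparse != 0).float().mean())  # tensor(0.5000): exactly 2 of every 4 kept
```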
1054 | SGSH: Stimulate Large Language Models with Skeleton Heuristics for Knowledge Base Question Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose SGSH, a simple and effective framework to Stimulate GPT-3.5 with Skeleton Heuristics to enhance KBQG. |
SHASHA GUO et. al. | arxiv-cs.CL | 2024-04-02 |
1055 | Release of Pre-Trained Models for The Japanese Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: AI democratization aims to create a world in which the average person can utilize AI techniques. |
KEI SAWADA et. al. | arxiv-cs.CL | 2024-04-02 |
1056 | Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first comprehensive benchmarking study of LLMs across diverse Persian language tasks. |
AMIRHOSSEIN ABASKOHI et. al. | arxiv-cs.CL | 2024-04-02 |
1057 | METAL: Towards Multilingual Meta-Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a framework for an end-to-end assessment of LLMs as evaluators in multilingual scenarios. |
Rishav Hada; Varun Gumma; Mohamed Ahmed; Kalika Bali; Sunayana Sitaram; | arxiv-cs.CL | 2024-04-02 |
1058 | RDTN: Residual Densely Transformer Network for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yan Li; Xiaofei Yang; Dong Tang; Zheng-yang Zhou; | Expert Syst. Appl. | 2024-04-01 |
1059 | GPT-COPE: A Graph-Guided Point Transformer for Category-Level Object Pose Estimation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Category-level object pose estimation aims to predict the 6D pose and 3D metric size of objects from given categories. Due to significant intra-class shape variations among … |
Lu Zou; Zhangjin Huang; Naijie Gu; Guoping Wang; | IEEE Transactions on Circuits and Systems for Video … | 2024-04-01 |
1060 | BERT-Enhanced Retrieval Tool for Homework Plagiarism Detection System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In existing research, detection of high-level plagiarism is still a challenge due to the lack of high-quality datasets. In this paper, we propose a plagiarized-text data generation method based on GPT-3.5, which produces a text plagiarism detection dataset of 32,927 pairs covering a wide range of plagiarism methods, bridging the gap in this part of research. |
Jiarong Xian; Jibao Yuan; Peiwei Zheng; Dexian Chen; Nie Yuntao; | arxiv-cs.CL | 2024-04-01 |
1061 | Leveraging YOLO-World and GPT-4V LMMs for Zero-Shot Person Detection and Action Recognition in Drone Imagery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we explore the potential of zero-shot Large Multimodal Models (LMMs) in the domain of drone perception. |
Christian Limberg; Artur Gonçalves; Bastien Rigault; Helmut Prendinger; | arxiv-cs.CV | 2024-04-01 |
1062 | An Innovative GPT-based Open-source Intelligence Using Historical Cyber Incident Reports Related Papers Related Patents Related Grants Related Venues Related Experts View |
F. Sufi; | Nat. Lang. Process. J. | 2024-04-01 |
1063 | ScopeViT: Scale-Aware Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
XUESONG NIE et. al. | Pattern Recognit. | 2024-04-01 |
1064 | LLM-RadJudge: Achieving Radiologist-Level Evaluation for X-Ray Report Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a novel evaluation framework using large language models (LLMs) to compare radiology reports for assessment. |
Zilong Wang; Xufang Luo; Xinyang Jiang; Dongsheng Li; Lili Qiu; | arxiv-cs.CL | 2024-04-01 |
1065 | Large Language Model Evaluation Via Multi AI Agents: Preliminary Results Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite extensive efforts to examine LLMs from various perspectives, there is a noticeable lack of multi-agent AI models specifically designed to evaluate the performance of different LLMs. To address this gap, we introduce a novel multi-agent AI model that aims to assess and compare the performance of various LLMs. |
Zeeshan Rasheed; Muhammad Waseem; Kari Systä; Pekka Abrahamsson; | arxiv-cs.SE | 2024-04-01 |
1066 | Time Domain Speech Enhancement with CNN and Time-attention Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
N. Saleem; T. S. Gunawan; Sami Dhahbi; Sami Bourouis; | Digit. Signal Process. | 2024-04-01 |
1067 | Advancing AI with Integrity: Ethical Challenges and Solutions in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We identify and address ethical issues through empirical studies. |
Richard Kimera; Yun-Seon Kim; Heeyoul Choi; | arxiv-cs.CL | 2024-04-01 |
1068 | TWIN-GPT: Digital Twins for Clinical Trials Via Large Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a large language model-based digital twin creation approach, called TWIN-GPT. |
YUE WANG et. al. | arxiv-cs.LG | 2024-04-01 |
1069 | Vision Transformer Models for Mobile/edge Devices: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View |
SEUNG IL LEE et. al. | Multim. Syst. | 2024-04-01 |
1070 | Syntactic Robustness for LLM-based Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on prompts that ask for code that solves an equation for its variables, given the equation’s coefficients as input. |
Laboni Sarker; Mara Downing; Achintya Desai; Tevfik Bultan; | arxiv-cs.SE | 2024-04-01 |
1071 | Condition-Aware Neural Network for Controlled Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. |
HAN CAI et. al. | arxiv-cs.CV | 2024-04-01 |
1072 | CHOPS: CHat with CustOmer Profile Systems for Customer Service with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a practical dataset, the CPHOS-dataset, which includes a database, guiding files, and QA pairs collected from CPHOS, an online platform that facilitates the organization of simulated Physics Olympiads for high school teachers and students. |
JINGZHE SHI et. al. | arxiv-cs.CL | 2024-03-31 |
1073 | EvoCodeBench: An Evolving Code Generation Benchmark Aligned with Real-World Code Repositories IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a new benchmark, EvoCodeBench, which addresses the preceding problems and has three primary advances. |
Jia Li; Ge Li; Xuanming Zhang; Yihong Dong; Zhi Jin; | arxiv-cs.CL | 2024-03-31 |
1074 | Jetsons at FinNLP 2024: Towards Understanding The ESG Impact of A News Article Using Transformer-based Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we describe the different approaches explored by the Jetsons team for the Multi-Lingual ESG Impact Duration Inference (ML-ESG-3) shared task. |
PARAG PRAVIN DAKLE et. al. | arxiv-cs.CL | 2024-03-30 |
1075 | Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite efforts to create open-source models, their limited parameters often result in insufficient multi-step reasoning capabilities required for solving complex medical problems. To address this, we introduce Meerkat, a new family of medical AI systems ranging from 7 to 70 billion parameters. |
HYUNJAE KIM et. al. | arxiv-cs.CL | 2024-03-30 |
1076 | A Hybrid Transformer and Attention Based Recurrent Neural Network for Robust and Interpretable Sentiment Analysis of Tweets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose TRABSA, a hybrid framework integrating transformer-based architectures, attention mechanisms, and BiLSTM networks to address this. |
Md Abrar Jahin; Md Sakib Hossain Shovon; M. F. Mridha; Md Rashedul Islam; Yutaka Watanobe; | arxiv-cs.CL | 2024-03-30 |
1077 | Edinburgh Clinical NLP at SemEval-2024 Task 2: Fine-tune Your Model Unless You Have Access to GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a PEFT method to improve the consistency of LLMs by merging adapters that were fine-tuned separately using triplet and language modelling objectives. |
Aryo Pradipta Gema; Giwon Hong; Pasquale Minervini; Luke Daines; Beatrice Alex; | arxiv-cs.CL | 2024-03-30 |
1078 | Cross-lingual Named Entity Corpus for Slavic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a corpus manually annotated with named entities for six Slavic languages – Bulgarian, Czech, Polish, Slovenian, Russian, and Ukrainian. |
Jakub Piskorski; Michał Marcińczuk; Roman Yangarber; | arxiv-cs.CL | 2024-03-30 |
1079 | A Comprehensive Study on NLP Data Augmentation for Hate Speech Detection: Legacy Methods, BERT, and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In pursuit of suitable data augmentation methods, this study explores both established legacy approaches and contemporary practices such as Large Language Models (LLMs), including GPT, for hate speech detection. |
Md Saroar Jahan; Mourad Oussalah; Djamila Romaissa Beddia; Jhuma kabir Mim; Nabil Arhab; | arxiv-cs.CL | 2024-03-30 |
1080 | Transformer Based Pluralistic Image Completion with Reduced Information Loss Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The indices of quantized pixels are used as tokens for the inputs and prediction targets of the transformer. To mitigate the associated information loss, we propose a new transformer-based framework called PUT. |
QIANKUN LIU et. al. | arxiv-cs.CV | 2024-03-30 |
1081 | Spread Your Wings: A Radial Strip Transformer for Image Deblurring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Radial Strip Transformer (RST), which is a transformer-based architecture that restores the blur images in a polar coordinate system instead of a Cartesian one. |
DUOSHENG CHEN et. al. | arxiv-cs.CV | 2024-03-30 |
1082 | ChatGPT V.s. Media Bias: A Comparative Study of GPT-3.5 and Fine-tuned Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Can ChatGPT detect media bias? This study seeks to answer this question by leveraging the Media Bias Identification Benchmark (MBIB) to assess ChatGPT’s competency in distinguishing six categories of media bias, juxtaposed against fine-tuned models such as BART, ConvBERT, and GPT-2. |
Zehao Wen; Rabih Younes; | arxiv-cs.CL | 2024-03-29 |
1083 | Classifying Conspiratorial Narratives At Scale: False Alarms and Erroneous Connections Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first large-scale classification study using posts from the most active conspiracy-related Reddit forums and find that only one-third of the posts are classified as positive. |
Ahmad Diab; Rr. Nefriana; Yu-Ru Lin; | arxiv-cs.CL | 2024-03-29 |
1084 | ReALM: Reference Resolution As Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper demonstrates how LLMs can be used to create an extremely effective system to resolve references of various types, by showing how reference resolution can be converted into a language modeling problem, despite involving forms of entities like those on screen that are not traditionally conducive to being reduced to a text-only modality. |
JOEL RUBEN ANTONY MONIZ et. al. | arxiv-cs.CL | 2024-03-29 |
1085 | Shallow Cross-Encoders for Low-Latency Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, keeping search latencies low is important for user satisfaction and energy usage. In this paper, we show that weaker shallow transformer models (i.e., transformers with a limited number of layers) actually perform better than full-scale models when constrained to these practical low-latency settings since they can estimate the relevance of more documents in the same time budget. |
Aleksandr V. Petrov; Sean MacAvaney; Craig Macdonald; | arxiv-cs.IR | 2024-03-29 |
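As a back-of-the-envelope illustration of the shallow-cross-encoder trade-off above, the sketch below truncates a public 6-layer cross-encoder to its first two layers; the paper trains shallow models properly rather than slicing a deep one, so this conveys only the latency intuition, not the method.

```python
# Truncate a BERT-style cross-encoder to a few layers so it can score many
# more query-document pairs within a fixed latency budget. Illustration only:
# a sliced (untrained-at-depth-2) model will lose accuracy.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "cross-encoder/ms-marco-MiniLM-L-6-v2"   # a public 6-layer cross-encoder
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)
model.bert.encoder.layer = model.bert.encoder.layer[:2]  # keep 2 of 6 layers

inputs = tok(["what is a transformer"], ["A transformer is a neural network."],
             return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    print(model(**inputs).logits)  # relevance score from the truncated model
```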
1086 | Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that off-the-shelf text-based Transformers, with no additional training, can perform few-shot in-context visual imitation learning, mapping visual observations to action sequences that emulate the demonstrator’s behaviour. We achieve this by transforming visual observations (inputs) and trajectories of actions (outputs) into sequences of tokens that a text-pretrained Transformer (GPT-4 Turbo) can ingest and generate, via a framework we call Keypoint Action Tokens (KAT). |
Norman Di Palo; Edward Johns; | arxiv-cs.RO | 2024-03-28 |
1087 | TOMGPT: Reliable Text-Only Training Approach for Cost-Effective Multi-modal Large Language Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Multi-modal large language models (MLLMs), such as GPT-4, exhibit great comprehension capabilities on human instruction, as well as zero-shot ability on new downstream multi-modal … |
YUNKAI CHEN et. al. | ACM Transactions on Knowledge Discovery from Data | 2024-03-28 |
1088 | A Review of Multi-Modal Large Language and Vision Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) have recently emerged as a focal point of research and application, driven by their unprecedented ability to understand and generate text with … |
Kilian Carolan; Laura Fennelly; A. Smeaton; | ArXiv | 2024-03-28 |
1089 | Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we delve into several mechanisms employed by Transformer-based language models (LLMs) for factual recall tasks. |
ANG LV et. al. | arxiv-cs.CL | 2024-03-28 |
1090 | Decision Mamba: Reinforcement Learning Via Sequence Modeling with Selective State Spaces Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work contributes to the advancement of sequential decision-making models, suggesting that the architecture and training methodology of neural networks can significantly impact their performance in complex tasks, and highlighting the potential of Mamba as a valuable tool for improving the efficacy of Transformer-based models in reinforcement learning scenarios. |
Toshihiro Ota; | arxiv-cs.LG | 2024-03-28 |
1091 | Reshaping Free-Text Radiology Notes Into Structured Reports With Generative Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a pipeline to extract information from free-text radiology reports that fits the items of the reference structured report (SR) registry proposed by a national society of interventional and medical radiology, focusing on CT staging of patients with lymphoma. |
LAURA BERGOMI et. al. | arxiv-cs.CL | 2024-03-27 |
1092 | PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a multimodal interactive robot (PhysicsAssistant) built on YOLOv8 object detection, cameras, speech recognition, and an LLM-based chatbot to assist students in physics lab investigations. |
Ehsan Latif; Ramviyas Parasuraman; Xiaoming Zhai; | arxiv-cs.RO | 2024-03-27 |
1093 | The Topos of Transformer Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a theoretical analysis of the expressivity of the transformer architecture through the lens of topos theory. |
Mattia Jacopo Villani; Peter McBurney; | arxiv-cs.LG | 2024-03-27 |
1094 | SemRoDe: Macro Adversarial Training to Learn Representations That Are Robust to Word-Level Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial Training strategy to enhance the robustness of LMs. |
Brian Formento; Wenjie Feng; Chuan Sheng Foo; Luu Anh Tuan; See-Kiong Ng; | arxiv-cs.CL | 2024-03-27 |
1095 | 3P-LLM: Probabilistic Path Planning Using Large Language Model for Autonomous Robot Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research assesses the feasibility of using an LLM (OpenAI’s GPT-3.5-turbo chatbot) for robotic path planning. |
Ehsan Latif; | arxiv-cs.RO | 2024-03-27 |
1096 | RankMamba: Benchmarking Mamba’s Document Ranking Performance in The Era of Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we examine Mamba’s efficacy through the lens of a classical IR task — document ranking. |
Zhichao Xu; | arxiv-cs.IR | 2024-03-27 |
1097 | A Survey on Large Language Models from Concept to Implementation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent advancements in Large Language Models (LLMs), particularly those built on Transformer architectures, have significantly broadened the scope of natural language processing (NLP) applications, transcending their initial use in chatbot technology. This paper investigates the multifaceted applications of these models, with an emphasis on the GPT series. |
Chen Wang; Jin Zhao; Jiaqi Gong; | arxiv-cs.CL | 2024-03-27 |
1098 | Evaluating Large Language Models for Health-Related Text Classification Tasks with Public Social Media Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We developed three approaches for leveraging LLMs for text classification: employing LLMs as zero-shot classifiers, using LLMs as annotators to annotate training data for supervised classifiers, and utilizing LLMs with few-shot examples for augmentation of manually annotated data. |
Yuting Guo; Anthony Ovadje; Mohammed Ali Al-Garadi; Abeed Sarker; | arxiv-cs.CL | 2024-03-27 |
1099 | AcTED: Automatic Acquisition of Typical Event Duration for Semi-supervised Temporal Commonsense QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a voting-driven semi-supervised approach to automatically acquire the typical duration of an event and use it as pseudo-labeled data. |
FELIX VIRGO et. al. | arxiv-cs.CL | 2024-03-27 |
1100 | From Traditional Recommender Systems to GPT-Based Chatbots: A Survey of Recent Developments and Future Directions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recommender systems are a key technology for many applications, such as e-commerce, streaming media, and social media. Traditional recommender systems rely on collaborative … |
TAMIM M. AL-HASAN et. al. | Big Data Cogn. Comput. | 2024-03-27 |
1101 | Evaluating The Efficacy of Prompt-Engineered Large Multimodal Models Versus Fine-Tuned Vision Transformers in Image-Based Security Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The success of Large Language Models (LLMs) has led to a parallel rise in the development of Large Multimodal Models (LMMs), which have begun to transform a variety of applications. These sophisticated multimodal models are designed to interpret and analyze complex data by integrating multiple modalities such as text and images, thereby opening new avenues for a range of applications. |
Fouad Trad; Ali Chehab; | arxiv-cs.AI | 2024-03-26 |
1102 | Enhancing Legal Document Retrieval: A Multi-Phase Approach with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research focuses on maximizing the potential of prompting by placing it as the final phase of the retrieval system, preceded by the support of two phases: BM25 Pre-ranking and BERT-based Re-ranking. |
HAI-LONG NGUYEN et. al. | arxiv-cs.CL | 2024-03-26 |
1103 | Verbing Weirds Language (Models): Evaluation of English Zero-Derivation in Five LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We design a task for testing lexical-syntactic flexibility — the degree to which models can generalize over words in a construction with a non-prototypical part of speech. |
David R. Mortensen; Valentina Izrailevitch; Yunze Xiao; Hinrich Schütze; Leonie Weissweiler; | arxiv-cs.CL | 2024-03-26 |
1104 | State Space Models As Foundation Models: A Control Theoretic Overview Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: In recent years, there has been a growing interest in integrating linear state-space models (SSM) in deep neural network architectures of foundation models. This is exemplified by … |
Carmen Amo Alonso; Jerome Sieber; M. Zeilinger; | ArXiv | 2024-03-25 |
1105 | Communication-Efficient Model Parallelism for Distributed In-Situ Transformer Inference Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer models have shown significant success in a wide range of tasks. Meanwhile, massive resources required by its inference prevent scenarios with resource-constrained … |
YUANXIN WEI et. al. | 2024 Design, Automation & Test in Europe Conference & … | 2024-03-25 |
1106 | Decoding Probing: Revealing Internal Linguistic Structures in Neural Language Models Using Minimal Pairs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by cognitive neuroscience studies, we introduce a novel ‘decoding probing’ method that uses the minimal pairs benchmark (BLiMP) to probe internal linguistic characteristics in neural language models layer by layer. |
Linyang He; Peili Chen; Ercong Nie; Yuanning Li; Jonathan R. Brennan; | arxiv-cs.CL | 2024-03-25 |
1107 | CYGENT: A Cybersecurity Conversational Agent with Log Summarization Powered By GPT-3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response to the escalating cyber-attacks in the modern IT and IoT landscape, we developed CYGENT, a conversational agent framework powered by GPT-3.5 turbo model, designed to aid system administrators in ensuring optimal performance and uninterrupted resource availability. |
Prasasthy Balasubramanian; Justin Seby; Panos Kostakos; | arxiv-cs.CR | 2024-03-25 |
1108 | LLM-Guided Formal Verification Coupled with Mutation Testing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The increasing complexity of modern hardware designs poses significant challenges for design verification, particularly defining and verifying properties and invariants manually. … |
Muhammad Hassan; Sallar Ahmadi-Pour; Khushboo Qayyum; C. Jha; Rolf Drechsler; | 2024 Design, Automation & Test in Europe Conference & … | 2024-03-25 |
1109 | GPT-4 Understands Discourse at Least As Well As Humans Do Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We test whether a leading AI system GPT-4 understands discourse as well as humans do, using a standardized test of discourse comprehension. Participants are presented with brief … |
Thomas Shultz; Jamie Wise; Ardavan Salehi Nobandegani; | arxiv-cs.CL | 2024-03-25 |
1110 | Grammatical Vs Spelling Error Correction: An Investigation Into The Responsiveness of Transformer-based Language Models Using BART and MarianMT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This project aims to analyze the different kinds of errors that occur in text documents. |
Rohit Raju; Peeta Basa Pati; SA Gandheesh; Gayatri Sanjana Sannala; Suriya KS; | arxiv-cs.CL | 2024-03-25 |
1111 | TextGT: A Double-View Graph Transformer on Text for Aspect-Based Sentiment Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Aspect-based sentiment analysis (ABSA) is aimed at predicting the sentiment polarities of the aspects included in a sentence instead of the whole sentence itself, and is a … |
Shuo Yin; Guoqiang Zhong; | AAAI Conference on Artificial Intelligence | 2024-03-24 |
1112 | Automatic Short Answer Grading for Finnish with ChatGPT Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Automatic short answer grading (ASAG) seeks to mitigate the burden on teachers by leveraging computational methods to evaluate student-constructed text responses. Large language … |
Li-Hsin Chang; Filip Ginter; | AAAI Conference on Artificial Intelligence | 2024-03-24 |
1113 | Anomaly Detection and Localization in Optical Networks Using Vision Transformer and SOP Monitoring Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We introduce an innovative vision transformer approach to identify and precisely locate high-risk events, including fiber cut precursors, in state-of-polarization derived … |
K. ABDELLI et. al. | 2024 Optical Fiber Communications Conference and Exhibition … | 2024-03-24 |
1114 | Can Language Models Pretend Solvers? Logic Code Simulation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Subsequently, we introduce a pioneering LLM-based code simulation technique, Dual Chains of Logic (DCoL). |
MINYU CHEN et. al. | arxiv-cs.AI | 2024-03-24 |
1115 | GPT-Enabled Digital Twin Assistant for Multi-task Cooperative Management in Autonomous Optical Network Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A GPT-enabled digital twin (DT) assistant is implemented with the capabilities of intention understanding, analysis, reasoning, and complex multi-task collaboration, which … |
YAO ZHANG et. al. | 2024 Optical Fiber Communications Conference and Exhibition … | 2024-03-24 |
1116 | WaveFormer: Wavelet Transformer for Noise-Robust Video Inpainting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Video inpainting aims to fill in the missing regions of the video frames with plausible content. Benefiting from the outstanding long-range modeling capacity, the … |
Zhiliang Wu; Changchang Sun; Hanyu Xuan; Gaowen Liu; Yan Yan; | AAAI Conference on Artificial Intelligence | 2024-03-24 |
1117 | A Transformer Approach for Electricity Price Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a novel approach to electricity price forecasting (EPF) using a pure Transformer model. |
Oscar Llorente; Jose Portela; | arxiv-cs.LG | 2024-03-24 |
1118 | Using Large Language Models for OntoClean-based Ontology Refinement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the integration of Large Language Models (LLMs) such as GPT-3.5 and GPT-4 into the ontology refinement process, specifically focusing on the OntoClean methodology. |
Yihang Zhao; Neil Vetter; Kaveh Aryan; | arxiv-cs.AI | 2024-03-23 |
1119 | VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To support future research, CafeBERT is made publicly available. |
Phong Nguyen-Thuan Do; Son Quoc Tran; Phu Gia Hoang; Kiet Van Nguyen; Ngan Luu-Thuy Nguyen; | arxiv-cs.CL | 2024-03-23 |
1120 | LlamBERT: Large-scale Low-cost Data Annotation in NLP Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present LlamBERT, a hybrid approach that leverages LLMs to annotate a small subset of large, unlabeled databases and uses the results for fine-tuning transformer encoders like BERT and RoBERTa. |
Bálint Csanády; Lajos Muzsai; Péter Vedres; Zoltán Nádasdy; András Lukács; | arxiv-cs.CL | 2024-03-23 |
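A condensed sketch of this two-step recipe, with a toy `llm_label` stub standing in for the LLM annotation call and Hugging Face transformers for the fine-tuning step; the data and hyperparameters are illustrative only:

```python
# Sketch of a LlamBERT-style recipe: (1) pseudo-label a small sample with an LLM,
# (2) fine-tune a BERT encoder on those pseudo-labels.
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

def llm_label(text: str) -> int:
    # Placeholder for an LLM annotation call returning a 0/1 label.
    return int("good" in text.lower())

texts = ["good product", "bad service", "good value", "bad quality"]
labels = [llm_label(t) for t in texts]            # step 1: pseudo-labels

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = tok(texts, truncation=True, padding=True, return_tensors="pt")

class DS(torch.utils.data.Dataset):
    def __len__(self): return len(labels)
    def __getitem__(self, i):
        return {k: v[i] for k, v in enc.items()} | {"labels": torch.tensor(labels[i])}

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased",
                                                           num_labels=2)
Trainer(model=model,
        args=TrainingArguments(output_dir="out", num_train_epochs=1,
                               per_device_train_batch_size=2),
        train_dataset=DS()).train()               # step 2: fine-tune BERT
```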
1121 | Technical Report: Masked Skeleton Sequence Modeling for Learning Larval Zebrafish Behavior Latent Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this report, we introduce a novel self-supervised learning method for extracting latent embeddings from behaviors of larval zebrafish. |
Lanxin Xu; Shuo Wang; | arxiv-cs.CV | 2024-03-22 |
1122 | Can Large Language Models Explore In-context? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making. |
Akshay Krishnamurthy; Keegan Harris; Dylan J. Foster; Cyril Zhang; Aleksandrs Slivkins; | arxiv-cs.LG | 2024-03-22 |
1123 | MasonTigers at SemEval-2024 Task 8: Performance Analysis of Transformer-based Models on Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the MasonTigers entry to the SemEval-2024 Task 8 – Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection. |
SADIYA SAYARA CHOWDHURY PUSPO et. al. | arxiv-cs.CL | 2024-03-22 |
1124 | Comprehensive Evaluation and Insights Into The Use of Large Language Models in The Automation of Behavior-Driven Development Acceptance Test Formulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this manuscript, we propose a novel approach to enhance BDD practices using large language models (LLMs) to automate acceptance test generation. |
SHANTHI KARPURAPU et. al. | arxiv-cs.SE | 2024-03-22 |
1125 | GPT-Connect: Interaction Between Text-Driven Human Motion Generator and 3D Scenes in A Training-free Manner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, training a separate scene-aware motion generator in a supervised way would require a large number of motion samples to be laboriously collected and annotated across many different 3D scenes. To handle this task in a more convenient manner, in this paper we propose the novel GPT-Connect framework. |
Haoxuan Qu; Ziyan Guo; Jun Liu; | arxiv-cs.CV | 2024-03-22 |
1126 | On Zero-Shot Counterspeech Generation By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Counterspeech generation is one such key task where efforts are made to develop generative models by fine-tuning LLMs with hatespeech-counterspeech pairs, but none of these attempts explores the intrinsic properties of large language models in zero-shot settings. In this work, we present the first comprehensive analysis of the performance of four LLMs, namely GPT-2, DialoGPT, ChatGPT and FlanT5, in zero-shot settings for counterspeech generation. |
Punyajoy Saha; Aalok Agrawal; Abhik Jana; Chris Biemann; Animesh Mukherjee; | arxiv-cs.CL | 2024-03-22 |
1127 | K-Act2Emo: Korean Commonsense Knowledge Graph for Indirect Emotional Expression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In many literary texts, emotions are indirectly conveyed through descriptions of actions, facial expressions, and appearances, necessitating emotion inference for narrative understanding. In this paper, we introduce K-Act2Emo, a Korean commonsense knowledge graph (CSKG) comprising 1,900 indirect emotional expressions and the emotions inferable from them. |
Kyuhee Kim; Surin Lee; Sangah Lee; | arxiv-cs.CL | 2024-03-21 |
1128 | LLM-based Extraction of Contradictions from Patents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper goes one step further, as it presents a method to extract TRIZ contradictions from patent texts based on Prompt Engineering using a generative Large Language Model (LLM), namely OpenAI’s GPT-4. |
Stefan Trapp; Joachim Warschat; | arxiv-cs.CL | 2024-03-21 |
1129 | Ax-to-Grind Urdu: Benchmark Dataset for Urdu Fake News Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we curate and contribute the first largest publicly available dataset for Urdu FND, Ax-to-Grind Urdu, to bridge the identified gaps and limitations of existing Urdu datasets in the literature. |
Sheetal Harris; Jinshuo Liu; Hassan Jalil Hadi; Yue Cao; | arxiv-cs.CL | 2024-03-20 |
1130 | Extracting Emotion Phrases from Tweets Using BART Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we applied an approach to sentiment analysis based on a question-answering framework. |
Mahdi Rezapour; | arxiv-cs.CL | 2024-03-20 |
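The question-answering framing can be illustrated with the Hugging Face `pipeline` API; the default extractive QA model below stands in for the fine-tuned BART used in the paper:

```python
# Sketch of the QA framing for emotion-phrase extraction: ask an extractive QA
# model which span of the tweet expresses the emotion.
from transformers import pipeline

qa = pipeline("question-answering")   # default model stands in for the paper's BART
tweet = "Missed my flight and spent six hours at the gate, absolutely furious."
out = qa(question="What phrase expresses the author's anger?", context=tweet)
print(out["answer"])                  # an extracted span such as "absolutely furious"
```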
1131 | Open Access NAO (OAN): A ROS2-based Software Framework for HRI Applications with The NAO Robot Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new software framework for HRI experimentation with the sixth version of the common NAO robot produced by the United Robotics Group. |
Antonio Bono; Kenji Brameld; Luigi D’Alfonso; Giuseppe Fedele; | arxiv-cs.RO | 2024-03-20 |
1132 | AUD-TGN: Advancing Action Unit Detection with Temporal Convolution and GPT-2 in Wild Audiovisual Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional methods for integrating such multimodal information often stumble, leading to less-than-ideal outcomes in the task of facial action unit detection. To overcome these shortcomings, we propose a novel approach utilizing audio-visual multimodal data. |
JUN YU et. al. | arxiv-cs.CV | 2024-03-20 |
1133 | Retina Vision Transformer (RetinaViT): Introducing Scaled Patches Into Vision Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Humans see low and high spatial frequency components at the same time, and combine the information from both to form a visual scene. Drawing on this neuroscientific inspiration, we propose an altered Vision Transformer architecture where patches from scaled down versions of the input image are added to the input of the first Transformer Encoder layer. |
Yuyang Shu; Michael E. Bain; | arxiv-cs.CV | 2024-03-20 |
1134 | Generating Automatic Feedback on UI Mockups with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the potential of using large language models for automatic feedback. |
Peitong Duan; Jeremy Warner; Yang Li; Bjoern Hartmann; | arxiv-cs.HC | 2024-03-19 |
1135 | Automated Data Curation for Robust Language Model Fine-Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CLEAR (Confidence-based LLM Evaluation And Rectification), an automated data curation pipeline for instruction tuning datasets that can be used with any LLM and fine-tuning procedure. |
Jiuhai Chen; Jonas Mueller; | arxiv-cs.CL | 2024-03-19 |
1136 | A Hyperspectral Unmixing Model Using Convolutional Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Sreejam Muraleedhara Bhakthan; L. Agilandeeswari; | Earth Sci. Informatics | 2024-03-19 |
1137 | Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a new configuration for encoder-decoder models that improves efficiency on structured output and decomposable tasks where multiple outputs are required for a single shared input. |
BO-RU LU et. al. | arxiv-cs.CL | 2024-03-19 |
1138 | TT-BLIP: Enhancing Fake News Detection Using BLIP and Tri-Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an end-to-end model called TT-BLIP that applies the bootstrapping language-image pretraining for unified vision-language understanding and generation (BLIP) for three types of information: BERT and BLIP-Txt for text, ResNet and BLIP-Img for images, and bidirectional BLIP encoders for multimodal information. |
Eunjee Choi; Jong-Kook Kim; | arxiv-cs.LG | 2024-03-19 |
1139 | LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The challenge is that information entropy may be a suboptimal compression metric: (i) it only leverages unidirectional context and may fail to capture all essential information needed for prompt compression; (ii) it is not aligned with the prompt compression objective. To address these issues, we propose a data distillation procedure to derive knowledge from an LLM to compress prompts without losing crucial information, and meantime, introduce an extractive text compression dataset. |
ZHUOSHI PAN et. al. | arxiv-cs.CL | 2024-03-19 |
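Assuming the authors' published `llmlingua` package keeps the interface documented in its README, usage looks roughly like this (the model name and keyword arguments may have changed since writing):

```python
# Hedged usage sketch of LLMLingua-2 via the llmlingua package.
from llmlingua import PromptCompressor

compressor = PromptCompressor(
    model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank",
    use_llmlingua2=True,   # select the LLMLingua-2 token-classification compressor
)
long_prompt = "You are a helpful assistant. " * 40   # stand-in for a long prompt
result = compressor.compress_prompt(long_prompt, rate=0.33)
print(result["compressed_prompt"])
```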
1140 | Navigating Compiler Errors with AI Assistance — A Study of GPT Hints in An Introductory Programming Course Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We examined the efficacy of AI-assisted learning in an introductory programming course at the university level by using a GPT-4 model to generate personalized hints for compiler errors within a platform for automated assessment of programming assignments. |
Maciej Pankiewicz; Ryan S. Baker; | arxiv-cs.SE | 2024-03-19 |
1141 | Navigating Compiler Errors with AI Assistance – A Study of GPT Hints in An Introductory Programming Course Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We examined the efficacy of AI-assisted learning in an introductory programming course at the university level by using a GPT-4 model to generate personalized hints for compiler … |
Maciej Pankiewicz; Ryan S. Baker; | Proceedings of the 2024 on Innovation and Technology in … | 2024-03-19 |
1142 | End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a neuro-symbolic framework for jointly learning structured states and symbolic policies, whose key idea is to distill the vision foundation model into an efficient perception module and refine it during policy learning. |
LIRUI LUO et. al. | arxiv-cs.AI | 2024-03-19 |
1143 | CICLe: Conformal In-Context Learning for Largescale Multi-Class Food Risk Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We describe the dataset and benchmark naive, traditional, and Transformer models. |
Korbinian Randl; John Pavlopoulos; Aron Henriksson; Tony Lindgren; | arxiv-cs.CL | 2024-03-18 |
1144 | GPT-4 As Evaluator: Evaluating Large Language Models on Pest Management in Agriculture Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the rapidly evolving field of artificial intelligence (AI), the application of large language models (LLMs) in agriculture, particularly in pest management, remains nascent. We … |
SHANGLONG YANG et. al. | ArXiv | 2024-03-18 |
1145 | Shifting The Lens: Detecting Malicious Npm Packages Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The goal of this study is to assist security analysts in detecting malicious packages through the empirical study of using Large Language Models (LLMs) to detect malicious code in the npm ecosystem. |
Nusrat Zahan; Philipp Burckhardt; Mikola Lysenko; Feross Aboukhadijeh; Laurie Williams; | arxiv-cs.CR | 2024-03-18 |
1146 | Collage Prompting: Budget-Friendly Visual Recognition with GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite GPT-4V’s impressive capabilities, the financial cost associated with its inference presents a substantial barrier to its wide use. To address this challenge, our work introduces Collage Prompting, a budget-friendly prompting approach that concatenates multiple images into a single visual input. |
Siyu Xu; Yunke Wang; Daochang Liu; Chang Xu; | arxiv-cs.CV | 2024-03-18 |
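The core collage operation is simple image tiling; here is a minimal PIL sketch that packs several images onto one canvas for a single GPT-4V call (the paper additionally optimizes the arrangement, which this sketch does not attempt):

```python
# Minimal sketch of the collage idea: tile several images onto one canvas so a
# single multimodal call can cover all of them. Grid layout is illustrative.
from PIL import Image

def collage(paths: list[str], cols: int = 2, cell: int = 224) -> Image.Image:
    rows = -(-len(paths) // cols)                     # ceiling division
    canvas = Image.new("RGB", (cols * cell, rows * cell), "white")
    for i, p in enumerate(paths):
        img = Image.open(p).convert("RGB").resize((cell, cell))
        canvas.paste(img, ((i % cols) * cell, (i // cols) * cell))
    return canvas

collage(["a.jpg", "b.jpg", "c.jpg", "d.jpg"]).save("collage.jpg")
```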
1147 | Prompt-based and Fine-tuned GPT Models for Context-Dependent and -Independent Deductive Coding in Social Annotation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: GPT has demonstrated impressive capabilities in executing various natural language processing (NLP) and reasoning tasks, showcasing its potential for deductive coding in social … |
CHENYU HOU et. al. | Proceedings of the 14th Learning Analytics and Knowledge … | 2024-03-18 |
1148 | How Far Are We on The Decision-Making of LLMs? Evaluating LLMs’ Gaming Ability in Multi-Agent Environments IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce GAMA(γ)-Bench, a new framework for evaluating LLMs’ Gaming Ability in Multi-Agent environments. |
JEN-TSE HUANG et. al. | arxiv-cs.AI | 2024-03-18 |
1149 | Evaluating Named Entity Recognition: A Comparative Analysis of Mono- and Multilingual Transformer Models on A Novel Brazilian Corporate Earnings Call Transcripts Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our study aimed to evaluate their performance on a financial Named Entity Recognition (NER) task and determine the computational requirements for fine-tuning and inference. |
Ramon Abilio; Guilherme Palermo Coelho; Ana Estela Antunes da Silva; | arxiv-cs.CL | 2024-03-18 |
1150 | AI-Generated Text Detector for Arabic Language Using Encoder-Based Transformer Architecture Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The effectiveness of existing AI detectors is notably hampered when processing Arabic texts. This study introduces a novel AI text classifier designed specifically for Arabic, … |
Hamed Alshammari; Ahmed El-Sayed; Khaled Elleithy; | Big Data Cogn. Comput. | 2024-03-18 |
1151 | An Empirical Study on JIT Defect Prediction Based on BERT-style Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we perform a systematic empirical study to understand the impact of the settings of the fine-tuning process on BERT-style pre-trained model for JIT defect prediction. |
Yuxiang Guo; Xiaopeng Gao; Bo Jiang; | arxiv-cs.SE | 2024-03-17 |
1152 | Embracing The Generative AI Revolution: Advancing Tertiary Education in Cybersecurity with GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigated the impact of GPTs, specifically ChatGPT, on tertiary education in cybersecurity, and provided recommendations for universities to adapt their curricula to meet the evolving needs of the industry. |
Raza Nowrozy; David Jam; | arxiv-cs.CY | 2024-03-17 |
1153 | Reasoning in Transformers – Mitigating Spurious Correlations and Reasoning Shortcuts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we investigate to what extent transformers can be trained to a) approximate reasoning in propositional logic while b) avoiding known reasoning shortcuts via spurious correlations in the training data. |
Daniel Enström; Viktor Kjellberg; Moa Johansson; | arxiv-cs.LG | 2024-03-17 |
1154 | Using An LLM to Turn Sign Spottings Into Spoken Language Sentences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a hybrid SLT approach, Spotter+GPT, that utilizes a sign spotter and a powerful Large Language Model (LLM) to improve SLT performance. |
Ozge Mercanoglu Sincan; Necati Cihan Camgoz; Richard Bowden; | arxiv-cs.CV | 2024-03-15 |
1155 | ATOM: Asynchronous Training of Massive Models for Deep Learning in A Decentralized Environment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce ATOM, a resilient distributed training framework designed for asynchronous training of vast models in a decentralized setting using cost-effective hardware, including consumer-grade GPUs and Ethernet. |
Xiaofeng Wu; Jia Rao; Wei Chen; | arxiv-cs.DC | 2024-03-15 |
1156 | Evaluating LLMs for Gender Disparities in Notable Persons Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study examines the use of Large Language Models (LLMs) for retrieving factual information, addressing concerns over their propensity to produce factually incorrect, hallucinated responses or to decline to answer a prompt at all. |
Lauren Rhue; Sofie Goethals; Arun Sundararajan; | arxiv-cs.CL | 2024-03-14 |
1157 | AI on AI: Exploring The Utility of GPT As An Expert Annotator of AI Publications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our results indicate that with effective prompt engineering, chatbots can be used as reliable data annotators even where subject-area expertise is required. To evaluate the utility of chatbot-annotated datasets on downstream classification tasks, we train a new classifier on GPT-labeled data and compare its performance to the arXiv-trained model. |
Autumn Toney-Wails; Christian Schoeberl; James Dunham; | arxiv-cs.CL | 2024-03-14 |
1158 | Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Targeting VL PEFT tasks, we propose a family of operations, called routing functions, to enhance VL alignment in the low-rank bottlenecks. |
Tingyu Qu; Tinne Tuytelaars; Marie-Francine Moens; | arxiv-cs.CV | 2024-03-14 |
1159 | Sabiá-2: A New Generation of Portuguese Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Sabiá-2, a family of large language models trained on Portuguese texts. |
Thales Sales Almeida; Hugo Abonizio; Rodrigo Nogueira; Ramon Pires; | arxiv-cs.CL | 2024-03-14 |
1160 | ViTCN: Vision Transformer Contrastive Network For Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Zhang et al. proposed a dataset called RAVEN that can be used to test the abstract reasoning ability of machine learning models. In this paper, we propose the Vision Transformer Contrastive Network (ViTCN), which builds on the Contrastive Perceptual Inference network (CoPiNet), a benchmark for permutation-invariant models on Raven Progressive Matrices that incorporates contrast effects from psychology, cognition, and education, and extends this foundation by leveraging the cutting-edge Vision Transformer architecture. |
Bo Song; Yuanhao Xu; Yichao Wu; | arxiv-cs.CV | 2024-03-14 |
1161 | FBPT: A Fully Binary Point Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel Fully Binary Point Cloud Transformer (FBPT) model which has the potential to be widely applied and expanded in the fields of robotics and mobile devices. |
Zhixing Hou; Yuzhang Shang; Yan Yan; | arxiv-cs.CV | 2024-03-14 |
1162 | Evaluating The Application of Large Language Models to Generate Feedback in Programming Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study investigates the application of large language models, specifically GPT-4, to enhance programming education. The research outlines the design of a web application that … |
Sven Jacobs; Steffen Jaschke; | 2024 IEEE Global Engineering Education Conference (EDUCON) | 2024-03-13 |
1163 | A Continued Pretrained LLM Approach for Automatic Medical Note Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce HEAL, the first continuously trained 13B LLaMA2-based LLM that is purpose-built for medical conversations and measured on automated scribing. |
DONG YUAN et. al. | arxiv-cs.CL | 2024-03-13 |
1164 | GPT, Ontology, and CAABAC: A Tripartite Personalized Access Control Model Anchored By Compliance, Context and Attribute Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents the GPT-Onto-CAABAC framework, integrating Generative Pretrained Transformer (GPT), medical-legal ontologies and Context-Aware Attribute-Based Access Control (CAABAC) to enhance EHR access security. |
Raza Nowrozy; Khandakar Ahmed; Hua Wang; | arxiv-cs.CY | 2024-03-13 |
1165 | Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare four of the currently most relevant large, web-crawled corpora (CC100, MaCoCu, mC4 and OSCAR) across eleven lower-resourced European languages. |
RIK VAN NOORD et. al. | arxiv-cs.CL | 2024-03-13 |
1166 | Pre-trained Low-light Image Enhancement Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Low‐light image enhancement is a longstanding challenge in low‐level vision, as images captured in low‐light conditions often suffer from significant aesthetic quality flaws. … |
Jingyao Zhang; Shijie Hao; Yuan Rao; | IET Image Process. | 2024-03-12 |
1167 | Pose Pattern Mining Using Transformer for Motion Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Seo-El Lee; Hyun Yoo; Kyungyong Chung; | Appl. Intell. | 2024-03-12 |
1168 | The Future of Document Indexing: GPT and Donut Revolutionize Table of Content Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Industrial projects rely heavily on lengthy, complex specification documents, making tedious manual extraction of structured information a major bottleneck. This paper introduces an innovative approach to automate this process, leveraging the capabilities of two cutting-edge AI models: Donut, a model that extracts information directly from scanned documents without OCR, and OpenAI GPT-3.5 Turbo, a robust large language model. |
Degaga Wolde Feyisa; Haylemicheal Berihun; Amanuel Zewdu; Mahsa Najimoghadam; Marzieh Zare; | arxiv-cs.IR | 2024-03-12 |
1169 | SIFiD: Reassess Summary Factual Inconsistency Detection with LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we reassess summary inconsistency detection with LLMs, comparing the performances of GPT-3.5 and GPT-4. |
JIUDING YANG et. al. | arxiv-cs.CL | 2024-03-12 |
1170 | In-context Learning Enables Multimodal Large Language Models to Classify Cancer Pathology Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In language processing, in-context learning provides an alternative, where models learn from within prompts, bypassing the need for parameter updates. |
DYKE FERBER et. al. | arxiv-cs.CV | 2024-03-12 |
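A hedged sketch of image-based in-context learning with a multimodal chat API: labelled example images precede the query image in one prompt, so the model picks up the classification rule without any parameter updates. The model name and message format follow the OpenAI chat completions API and are illustrative, not the paper's exact setup:

```python
# Few-shot image classification in context: example images plus labels, then the
# query image, all in a single multimodal prompt.
from openai import OpenAI

client = OpenAI()

def fewshot_classify(example_urls: list[str], example_labels: list[str],
                     query_url: str) -> str:
    content = []
    for url, label in zip(example_urls, example_labels):
        content += [{"type": "image_url", "image_url": {"url": url}},
                    {"type": "text", "text": f"Label: {label}"}]
    content += [{"type": "image_url", "image_url": {"url": query_url}},
                {"type": "text", "text": "Label:"}]
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": content}])
    return resp.choices[0].message.content
```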
1171 | GPT-generated Text Detection: Benchmark Dataset and Tensor-based Detection Method Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present GPT Reddit Dataset (GRiD), a novel Generative Pretrained Transformer (GPT)-generated text detection dataset designed to assess the performance of detection models in identifying generated responses from ChatGPT. |
Zubair Qazi; William Shiao; Evangelos E. Papalexakis; | arxiv-cs.CL | 2024-03-12 |
1172 | Rethinking ASTE: A Minimalist Tagging Scheme Alongside Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing approaches to ASTE often complicate the task with additional structures or external data. In this research, we propose a novel tagging scheme and employ a contrastive learning approach to mitigate these challenges. |
Qiao Sun; Liujia Yang; Minghao Ma; Nanyang Ye; Qinying Gu; | arxiv-cs.CL | 2024-03-12 |
1173 | Language, Camera, Autonomy! Prompt-engineered Robot Control for Rapidly Evolving Deployment Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The Context-observant LLM-Enabled Autonomous Robots (CLEAR) platform offers a general solution for large language model (LLM)-enabled robot autonomy. CLEAR-controlled robots use … |
JACOB P. MACDONALD et. al. | Companion of the 2024 ACM/IEEE International Conference on … | 2024-03-11 |
1174 | Stealing Part of A Production Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI’s ChatGPT or Google’s PaLM-2. |
NICHOLAS CARLINI et. al. | arxiv-cs.CR | 2024-03-11 |
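One core observation behind such attacks is that logit vectors from a model with hidden size h live in an h-dimensional subspace of the vocabulary space, so their singular values collapse after index h. The toy numpy simulation below illustrates this; a real attack would query the production API instead of a simulated projection:

```python
# Toy illustration: the rank of a matrix of logit vectors reveals the hidden size.
import numpy as np

rng = np.random.default_rng(0)
h, V, n = 64, 1000, 200                  # hidden dim, vocab size, num queries
W = rng.normal(size=(V, h))              # unknown output projection
hidden_states = rng.normal(size=(n, h))  # one hidden state per "query"
logits = hidden_states @ W.T             # what the API would return

s = np.linalg.svd(logits, compute_uv=False)
est_h = int((s > 1e-6 * s[0]).sum())     # count non-negligible singular values
print(f"estimated hidden size: {est_h}") # -> 64
```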
1175 | Development of A Reliable and Accessible Caregiving Language Model (CaLM) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we focused on caregivers of individuals with Alzheimer’s Disease Related Dementias. |
BAMBANG PARMANTO et. al. | arxiv-cs.CL | 2024-03-11 |
1176 | Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: These are then fed into another set of transformer encoder layers to learn inter-chunk representations. We analyze the adaptability of Large Language Models (LLMs) with multi-billion parameters (GPT-Neo and GPT-J) with the hierarchical framework of MESc and compare them with their standalone performance on legal texts. |
Nishchal Prasad; Mohand Boughanem; Taoufiq Dkaki; | arxiv-cs.CL | 2024-03-11 |
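A sketch of the hierarchical encoding pattern the highlight alludes to: encode each chunk of a long document with BERT, then run a small transformer encoder over the sequence of chunk embeddings to capture inter-chunk structure. Sizes and layer counts here are illustrative, not MESc's actual configuration:

```python
# Hierarchical encoding sketch: per-chunk BERT embeddings, then an inter-chunk
# transformer encoder over the chunk sequence.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def encode_chunks(document: str, chunk_words: int = 100) -> torch.Tensor:
    words = document.split()
    chunks = [" ".join(words[i:i + chunk_words])
              for i in range(0, len(words), chunk_words)]
    enc = tok(chunks, truncation=True, padding=True, return_tensors="pt")
    with torch.no_grad():
        cls = bert(**enc).last_hidden_state[:, 0]   # one [CLS] vector per chunk
    return cls.unsqueeze(0)                          # (1, num_chunks, 768)

inter_chunk = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=768, nhead=8, batch_first=True),
    num_layers=2)
doc_repr = inter_chunk(encode_chunks("long legal text " * 400)).mean(dim=1)
```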
1177 | QuantTune: Optimizing Model Quantization with Adaptive Outlier-Driven Fine Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, significant challenges arise during post-training linear quantization, leading to noticeable reductions in inference accuracy. Our study focuses on uncovering the underlying causes of these accuracy drops and proposing a quantization-friendly fine-tuning method, QuantTune. |
JIUN-MAN CHEN et. al. | arxiv-cs.CV | 2024-03-11 |
1178 | JayBot — Aiding University Students and Admission with An LLM-based Chatbot Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This demo paper presents JayBot, an LLM-based chatbot system aimed at enhancing the user experience of prospective and current students, faculty, and staff at a UK university. The … |
Julius Odede; Ingo Frommholz; | Proceedings of the 2024 Conference on Human Information … | 2024-03-10 |
1179 | LLMs Still Can’t Avoid Instanceof: An Investigation Into GPT-3.5, GPT-4 and Bard’s Capacity to Handle Object-Oriented Programming Assignments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we experimented with three prominent LLMs – GPT-3.5, GPT-4, and Bard – to solve real-world OOP exercises used in educational settings, subsequently validating their solutions using an Automatic Assessment Tool (AAT). |
Bruno Pereira Cipriano; Pedro Alves; | arxiv-cs.SE | 2024-03-10 |
1180 | S3L: Spectrum Transformer for Self-Supervised Learning in Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the realm of Earth observation and remote sensing data analysis, the advancement of hyperspectral imaging (HSI) classification technology is of paramount importance. … |
Hufeng Guo; Wenyi Liu; | Remote. Sens. | 2024-03-10 |
1181 | Enhancing Human Annotation: Leveraging Large Language Models and Efficient Batch Processing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) are capable of assessing document and query characteristics, including relevance, and are now being used for a variety of different classification … |
Oleg Zendel; J. Culpepper; Falk Scholer; Paul Thomas; | Proceedings of the 2024 Conference on Human Information … | 2024-03-10 |
1182 | GPT As Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Multimodal large language models (MLLMs) are designed to process and integrate information from multiple sources, such as text, speech, images, and videos. |
HAO LU et. al. | arxiv-cs.CV | 2024-03-09 |
1183 | A Dataset and Benchmark for Hospital Course Summarization with Adapted Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel pre-processed dataset, the MIMIC-IV-BHC, encapsulating clinical note and brief hospital course (BHC) pairs to adapt LLMs for BHC synthesis. |
ASAD AALI et. al. | arxiv-cs.CL | 2024-03-08 |
1184 | How Well Do Multi-modal LLMs Interpret CT Scans? An Auto-Evaluation Framework for Analyses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this is challenging mainly due to the scarcity of adequate datasets and reference standards for evaluation. This study aims to bridge this gap by introducing a novel evaluation framework, named “GPTRadScore”. |
QINGQING ZHU et. al. | arxiv-cs.AI | 2024-03-08 |
1185 | To Err Is Human, But Llamas Can Learn It Too Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study explores enhancing grammatical error correction (GEC) through artificial error generation (AEG) using language models (LMs). |
Agnes Luhtaru; Taido Purason; Martin Vainikko; Maksym Del; Mark Fishel; | arxiv-cs.CL | 2024-03-08 |
1186 | Electron Density-based GPT for Optimization and Suggestion of Host–guest Binders Related Papers Related Patents Related Grants Related Venues Related Experts View |
JUAN MANUEL PARRILLA GUTIERREZ et. al. | Nature Computational Science | 2024-03-08 |
1187 | Will GPT-4 Run DOOM? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that GPT-4’s reasoning and planning capabilities extend to the 1993 first-person shooter Doom. |
Adrian de Wynter; | arxiv-cs.CL | 2024-03-08 |
1188 | RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We explore how iteratively revising a chain of thoughts with the help of information retrieval significantly improves large language models’ reasoning and generation ability in long-horizon generation tasks, while hugely mitigating hallucination. |
ZIHAO WANG et. al. | arxiv-cs.CL | 2024-03-08 |
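Schematically, retrieval-augmented thought revision alternates drafting and evidence-conditioned rewriting. In the stub-based sketch below, `llm` and `retrieve` are placeholders for any chat API and search index, and the prompting protocol is illustrative rather than the paper's:

```python
# Schematic sketch of retrieval-augmented thought revision in the spirit of RAT.
def llm(prompt: str) -> str:
    return f"[model output for: {prompt[:40]}...]"   # stub for a chat API

def retrieve(query: str, k: int = 3) -> list[str]:
    return [f"[doc about {query[:30]}]"] * k         # stub for a search index

def rat(task: str, steps: int = 3) -> str:
    draft = llm(f"Think step by step and draft a plan for: {task}")
    for _ in range(steps):                           # revise the draft iteratively
        evidence = "\n".join(retrieve(draft))
        draft = llm(f"Task: {task}\nDraft: {draft}\n"
                    f"Evidence:\n{evidence}\nRevise the draft using the evidence.")
    return draft

print(rat("write a tutorial on binary search"))
```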
1189 | The Impact of Quantization on The Robustness of Transformer-based Text Classifiers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the effect of quantization on the robustness of Transformer-based models. |
Seyed Parsa Neshaei; Yasaman Boreshban; Gholamreza Ghassem-Sani; Seyed Abolghasem Mirroshandel; | arxiv-cs.CL | 2024-03-08 |
1190 | An In-depth Evaluation of GPT-4 in Sentence Simplification with Error-based Human Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We design an error-based human annotation framework to assess the GPT-4’s simplification capabilities. |
Xuanxin Wu; Yuki Arase; | arxiv-cs.CL | 2024-03-07 |
1191 | Using GPT-4 to Provide Tiered, Formative Code Feedback Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) have shown promise in generating sensible code explanation and feedback in programming exercises. In this experience report, we discuss the process of … |
Ha Nguyen; Vicki Allan; | Proceedings of the 55th ACM Technical Symposium on Computer … | 2024-03-07 |
1192 | A Large Scale RCT on Effective Error Messages in CS1 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, we evaluate the most effective error message types through a large-scale randomized controlled trial conducted in an open-access, online introductory computer … |
Sierra Wang; John C. Mitchell; C. Piech; | Proceedings of the 55th ACM Technical Symposium on Computer … | 2024-03-07 |
1193 | Feedback-Generation for Programming Exercises With GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Ever since Large Language Models (LLMs) and related applications have become broadly available, several studies investigated their potential for assisting educators and supporting students in higher education. |
Imen Azaiz; Natalie Kiesler; Sven Strickroth; | arxiv-cs.AI | 2024-03-07 |
1194 | Federated Recommendation Via Hybrid Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose GPT-FedRec, a federated recommendation framework leveraging ChatGPT and a novel hybrid Retrieval Augmented Generation (RAG) mechanism. |
Huimin Zeng; Zhenrui Yue; Qian Jiang; Dong Wang; | arxiv-cs.IR | 2024-03-07 |
1195 | Rapidly Developing High-quality Instruction Data and Evaluation Benchmark for Large Language Models with Minimal Human Effort: A Case Study on Japanese Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Instead of following the popular practice of directly translating existing English resources into Japanese (e.g., Japanese-Alpaca), we propose an efficient self-instruct method based on GPT-4. |
YIKUN SUN et. al. | arxiv-cs.CL | 2024-03-06 |
1196 | Assessing The Aesthetic Evaluation Capabilities of GPT-4 with Vision: Insights from Group and Individual Assessments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, it has been recognized that large language models demonstrate high performance on various intellectual tasks. |
Yoshia Abe; Tatsuya Daikoku; Yasuo Kuniyoshi; | arxiv-cs.AI | 2024-03-06 |
1197 | Whodunit: Classifying Code As Human Authored or GPT-4 Generated — A Case Study on CodeChef Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study shows that code stylometry is a promising approach for distinguishing between GPT-4 generated code and human-authored code. |
Oseremen Joy Idialu; Noble Saji Mathews; Rungroj Maipradit; Joanne M. Atlee; Mei Nagappan; | arxiv-cs.SE | 2024-03-06 |
1198 | Probabilistic Topic Modelling with Transformer Representations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose the Transformer-Representation Neural Topic Model (TNTM), which combines the benefits of topic representations in transformer-based embedding spaces and probabilistic modelling. |
Arik Reuter; Anton Thielmann; Christoph Weisser; Benjamin Säfken; Thomas Kneib; | arxiv-cs.LG | 2024-03-06 |
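A deliberately simplified analogue of topic modelling in a transformer embedding space: embed documents with a sentence encoder and fit a Gaussian mixture so that each component plays the role of a topic. TNTM itself uses a more elaborate probabilistic model; this sketch only conveys the embedding-space intuition:

```python
# Simplified analogue of embedding-space topic modelling (not TNTM's exact model).
from sentence_transformers import SentenceTransformer
from sklearn.mixture import GaussianMixture

docs = ["the striker scored twice", "parliament passed the budget",
        "the goalkeeper saved a penalty", "the senate debated the tax bill",
        "the coach praised the defence", "the minister proposed new spending"]
emb = SentenceTransformer("all-MiniLM-L6-v2").encode(docs)

# Each mixture component acts as a soft "topic" in embedding space.
gmm = GaussianMixture(n_components=2, covariance_type="diag",
                      random_state=0).fit(emb)
for doc, topic in zip(docs, gmm.predict(emb)):
    print(topic, doc)   # documents split into sports vs. politics clusters
```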
1199 | Designing Informative Metrics for Few-Shot Example Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a complexity-based prompt selection approach for sequence tagging tasks. |
Rishabh Adiga; Lakshminarayanan Subramanian; Varun Chandrasekaran; | arxiv-cs.CL | 2024-03-06 |
1200 | Can Large Language Models Do Analytical Reasoning? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the analytical reasoning abilities of cutting-edge Large Language Models on sports data. |
YEBOWEN HU et. al. | arxiv-cs.CL | 2024-03-06 |
1201 | Japanese-English Sentence Translation Exercises Dataset for Automatic Grading Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the task of automatic assessment of Sentence Translation Exercises (STEs), which have been used in the early stage of L2 language learning. |
NAOKI MIURA et. al. | arxiv-cs.CL | 2024-03-05 |
1202 | AI Insights: A Case Study on Utilizing ChatGPT Intelligence for Research Paper Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper discusses the effectiveness of leveraging Chatbot: Generative Pre-trained Transformer (ChatGPT) versions 3.5 and 4 for analyzing research papers for effective writing of scientific literature surveys. |
Anjalee De Silva; Janaka L. Wijekoon; Rashini Liyanarachchi; Rrubaa Panchendrarajan; Weranga Rajapaksha; | arxiv-cs.AI | 2024-03-05 |
1203 | An Empirical Study of LLM-as-a-Judge for LLM Evaluation: Fine-tuned Judge Model Is Not A General Substitute for GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, there has been a growing trend of utilizing Large Language Model (LLM) to evaluate the quality of other LLMs. |
HUI HUANG et. al. | arxiv-cs.CL | 2024-03-05 |
1204 | Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled By GPT-4 for Enhanced Interpretability and Public Engagement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, inquiring into and understanding socio-cultural and institutional factors requires complex techniques, which often hinders the public’s understanding of flood risks. To overcome these challenges, our study introduces an innovative solution: a customized AI Assistant powered by the GPT-4 Large Language Model. |
Rafaela Martelo; Ruo-Qian Wang; | arxiv-cs.AI | 2024-03-05 |
1205 | Design2Code: How Far Are We From Automating Front-End Engineering? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This can enable a new paradigm of front-end development, in which multimodal LLMs might directly convert visual designs into code implementations. In this work, we formalize this as a Design2Code task and conduct comprehensive benchmarking. |
Chenglei Si; Yanzhe Zhang; Zhengyuan Yang; Ruibo Liu; Diyi Yang; | arxiv-cs.CL | 2024-03-05 |
1206 | InjectTST: A Transformer Method of Injecting Global Information Into Independent Channels for Long Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this problem, this paper proposes InjectTST, a method for injecting global information into channel-independent Transformers for long time series forecasting. |
CE CHI et. al. | arxiv-cs.LG | 2024-03-05 |
1207 | JMI at SemEval 2024 Task 3: Two-step Approach for Multimodal ECAC Using In-context Learning with GPT and Instruction-tuned Llama Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents our system development for SemEval-2024 Task 3: The Competition of Multimodal Emotion Cause Analysis in Conversations. |
Mohammed Abbas Ansari; Chandni Saxena; Tanvir Ahmad; | arxiv-cs.CL | 2024-03-05 |
1208 | Evolution Transformer: In-Context Evolutionary Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: An alternative promising approach is to leverage data and directly discover powerful optimization principles via meta-optimization. In this work, we follow such a paradigm and introduce Evolution Transformer, a causal Transformer architecture, which can flexibly characterize a family of Evolution Strategies. |
Robert Tjarko Lange; Yingtao Tian; Yujin Tang; | arxiv-cs.AI | 2024-03-05 |
1209 | Predicting Learning Performance with Large Language Models: A Study in Adult Literacy Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Intelligent Tutoring Systems (ITSs) have significantly enhanced adult literacy training, a key factor for societal participation, employment opportunities, and lifelong learning. … |
LIANG ZHANG et. al. | ArXiv | 2024-03-04 |
1210 | Using LLMs for The Extraction and Normalization of Product Attribute Values Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Web Data Commons – Product Attribute Value Extraction (WDC-PAVE) benchmark dataset for our experiments. |
Alexander Brinkmann; Nick Baumann; Christian Bizer; | arxiv-cs.CL | 2024-03-04 |
1211 | What Is Missing in Multilingual Visual Reasoning and How to Fix It Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: NLP models today strive for supporting multiple languages and modalities, improving accessibility for diverse users. In this paper, we evaluate their multilingual, multimodal … |
Yueqi Song; Simran Khanuja; Graham Neubig; | ArXiv | 2024-03-03 |
1212 | An LPDDR-based CXL-PNM Platform for TCO-efficient Inference of Transformer-based Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer-based large language models (LLMs) such as Generative Pre-trained Transformer (GPT) have become popular due to their remarkable performance across diverse … |
SANGSOO PARK et. al. | 2024 IEEE International Symposium on High-Performance … | 2024-03-02 |
1213 | Analysis of Privacy Leakage in Federated Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: With the rapid adoption of Federated Learning (FL) as the training and tuning protocol for applications utilizing Large Language Models (LLMs), recent research highlights the need … |
Minh N. Vu; Truc D. T. Nguyen; Tre’ R. Jeter; My T. Thai; | International Conference on Artificial Intelligence and … | 2024-03-02 |
1214 | LM4OPT: Unveiling The Potential of Large Language Models in Formulating Mathematical Optimization Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the rapidly evolving field of natural language processing, the translation of linguistic descriptions into mathematical formulations of optimization problems presents a formidable challenge, demanding intricate understanding and processing capabilities from Large Language Models (LLMs). This study compares prominent LLMs, including GPT-3.5, GPT-4, and Llama-2-7b, in zero-shot and one-shot settings for this task. |
Tasnim Ahmed; Salimur Choudhury; | arxiv-cs.CL | 2024-03-02 |
1215 | LAB: Large-Scale Alignment for ChatBots IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces LAB (Large-scale Alignment for chatBots), a novel methodology designed to overcome the scalability challenges in the instruction-tuning phase of large language model (LLM) training. |
SHIVCHANDER SUDALAIRAJ et. al. | arxiv-cs.CL | 2024-03-01 |
1216 | WaterFormer: A Global–Local Transformer for Underwater Image Enhancement With Environment Adaptor Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Underwater image enhancement (UIE) is crucial for high-level vision in underwater robotics. While convolutional neural networks (CNNs) have made significant achievements in UIE, … |
JUNJIE WEN et. al. | IEEE Robotics & Automation Magazine | 2024-03-01 |
1217 | Spikeformer: Training High-performance Spiking Neural Network with Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yudong Li; Yunlin Lei; Xu Yang; | Neurocomputing | 2024-03-01 |
1218 | Multi-modal Person Re-identification Based on Transformer Relational Regularization Related Papers Related Patents Related Grants Related Venues Related Experts View |
XIANGTIAN ZHENG et. al. | Inf. Fusion | 2024-03-01 |
1219 | Driver Distraction Detection Using Semi-supervised Lightweight Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Adam A.Q. Mohammed; Xin Geng; Jing Wang; Zafar Ali; | Eng. Appl. Artif. Intell. | 2024-03-01 |
1220 | Transformer Based on The Prediction of Psoriasis Severity Treatment Response Related Papers Related Patents Related Grants Related Venues Related Experts View |
Cho-I Moon; Eun Bin Kim; Yoosang Baek; Onesok Lee; | Biomed. Signal Process. Control. | 2024-03-01 |
1221 | MGCoT: Multi-Grained Contextual Transformer for Table-based Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xianjie Mo; Yang Xiang; Youcheng Pan; Yongshuai Hou; Ping Luo; | Expert Syst. Appl. | 2024-03-01 |
1222 | DGFormer: Dynamic Graph Transformer for 3D Human Pose Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhan-Heng Chen; Ju Dai; Junxuan Bai; Junjun Pan; | Pattern Recognit. | 2024-03-01 |
1223 | LCDFormer: Long-term Correlations Dual-graph Transformer for Traffic Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jiongbiao Cai; Chia-Hung Wang; Kun Hu; | Expert Syst. Appl. | 2024-03-01 |
1224 | An Efficient Hybrid CNN-Transformer Approach for Remote Sensing Super-Resolution Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer models have great potential in the field of remote sensing super-resolution (SR) due to their excellent self-attention mechanisms. However, transformer models are … |
WENJIAN ZHANG et. al. | Remote. Sens. | 2024-03-01 |
1225 | Surveying The Dead Minds: Historical-Psychological Text Analysis with Contextualized Construct Representation (CCR) for Classical Chinese Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we develop a pipeline for historical-psychological text analysis in classical Chinese. |
Yuqi Chen; Sixuan Li; Ying Li; Mohammad Atari; | arxiv-cs.CL | 2024-03-01 |
1226 | PWDformer: Deformable Transformer for Long-term Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zheng Wang; Haowei Ran; Jinchang Ren; Meijun Sun; | Pattern Recognit. | 2024-03-01 |
1227 | T3SRS: Tensor Train Transformer for Compressing Sequential Recommender Systems Related Papers Related Patents Related Grants Related Venues Related Experts View |
HAO LI et. al. | Expert Syst. Appl. | 2024-03-01 |
1228 | Comparing Large Language Models and Human Programmers for Generating Programming Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In most LeetCode and GeeksforGeeks coding contests evaluated in this study, GPT-4 employing the optimal prompt strategy outperforms 85 percent of human participants. |
Wenpin Hou; Zhicheng Ji; | arxiv-cs.SE | 2024-03-01 |
1229 | A Systematic Evaluation of Large Language Models for Generating Programming Code Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We systematically evaluated the performance of seven large language models in generating programming code using various prompt strategies, programming languages, and task … |
Wenpin Hou; Zhicheng Ji; | ArXiv | 2024-03-01 |
1230 | K-NN Attention-based Video Vision Transformer for Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
Weirong Sun; Yujun Ma; Ruili Wang; | Neurocomputing | 2024-03-01 |
1231 | A Novel Full-convolution UNet-transformer for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Tianyou Zhu; Derui Ding; Feng Wang; Wei Liang; Bo Wang; | Biomed. Signal Process. Control. | 2024-03-01 |
1232 | Transformer Based Multiple Instance Learning for WSI Breast Cancer Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
CHENGYANG GAO et. al. | Biomed. Signal Process. Control. | 2024-03-01 |
1233 | Attention Combined Pyramid Vision Transformer for Polyp Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xiaogang Liu; Shuang Song; | Biomed. Signal Process. Control. | 2024-03-01 |
1234 | Here’s A Free Lunch: Sanitizing Backdoored Models with Model Merge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Compared to multiple advanced defensive approaches, our method offers an effective and efficient inference-stage defense against backdoor attacks on classification and instruction-tuned tasks without additional resources or specific knowledge. |
ANSH ARORA et. al. | arxiv-cs.CL | 2024-02-29 |
1235 | PeLLE: Encoder-based Language Models for Brazilian Portuguese Based on Open Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we present PeLLE, a family of large language models based on the RoBERTa architecture, for Brazilian Portuguese, trained on curated, open data from the Carolina corpus. |
GUILHERME LAMARTINE DE MELLO et. al. | arxiv-cs.CL | 2024-02-29 |
1236 | PROC2PDDL: Open-Domain Planning Representations from Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent approaches have used language models to predict a planning domain definition (e.g., PDDL) but have only been evaluated in closed-domain simulated environments. To address this, we present Proc2PDDL, the first dataset containing open-domain procedural texts paired with expert-annotated PDDL representations. |
TIANYI ZHANG et. al. | arxiv-cs.CL | 2024-02-29 |
1237 | Can GPT Improve The State of Prior Authorization Via Guideline Based Automated Question Answering? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate whether GPT can validate numerous key factors, in turn helping health plans reach a decision drastically faster. |
Shubham Vatsal; Ayush Singh; Shabnam Tafreshi; | arxiv-cs.CL | 2024-02-28 |
1238 | H3D-Transformer: A Heterogeneous 3D (H3D) Computing Platform for Transformer Model Acceleration on Edge Devices Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Prior hardware accelerator designs primarily focused on single-chip solutions for 10 MB-class computer vision models. The GB-class transformer models for natural language … |
Yandong Luo; Shimeng Yu; | ACM Transactions on Design Automation of Electronic Systems | 2024-02-28 |
1239 | A Language Model Based Framework for New Concept Placement in Ontologies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In all steps, we propose to leverage neural methods: we apply embedding-based methods and contrastive learning with Pre-trained Language Models (PLMs) such as BERT for edge search, and adapt a fine-tuned BERT-based multi-label Edge-Cross-encoder as well as Large Language Models (LLMs) such as the GPT series, FLAN-T5, and Llama 2 for edge selection. |
Hang Dong; Jiaoyan Chen; Yuan He; Yongsheng Gao; Ian Horrocks; | arxiv-cs.CL | 2024-02-27 |
1240 | Benchmarking GPT-4 on Algorithmic Problems: A Systematic Evaluation of Prompting Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we offer a systematic benchmarking of GPT-4, one of the most advanced LLMs available, on three algorithmic tasks characterized by the possibility to control the problem difficulty with two parameters. |
Flavio Petruzzellis; Alberto Testolin; Alessandro Sperduti; | arxiv-cs.CL | 2024-02-27 |
1241 | Can GPT-4 Identify Propaganda? Annotation and Detection of Propaganda Spans in News Articles IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The majority of the recent initiatives targeting medium to low-resource languages produced relatively small annotated datasets, with a skewed distribution, posing challenges for the development of sophisticated propaganda detection models. To address this challenge, we carefully develop the largest propaganda dataset to date, ArPro, comprised of 8K paragraphs from newspaper articles, labeled at the text span level following a taxonomy of 23 propagandistic techniques. |
Maram Hasanain; Fatema Ahmed; Firoj Alam; | arxiv-cs.CL | 2024-02-27 |
1242 | CAPT: Category-level Articulation Estimation from A Single Point Cloud Using Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose CAPT: category-level articulation estimation from a point cloud using Transformer. |
Lian Fu; Ryoichi Ishikawa; Yoshihiro Sato; Takeshi Oishi; | arxiv-cs.CV | 2024-02-27 |
1243 | If in A Crowdsourced Data Annotation Pipeline, A GPT-4 IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies indicated GPT-4 outperforms online crowd workers in data labeling accuracy, notably workers from Amazon Mechanical Turk (MTurk). However, these studies were … |
Zeyu He; Chieh-Yang Huang; Chien-Kuang Cornelia Ding; Shaurya Rohatgi; Ting-Hao ‘Kenneth’ Huang; | ArXiv | 2024-02-26 |
1244 | MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large Language Models (LLMs) and Large Multimodal Models (LMMs) exhibit impressive problem-solving skills in many tasks and domains, but their ability in mathematical reasoning in visual contexts has not been systematically studied. To bridge this gap, we present MathVista, a benchmark designed to combine challenges from diverse mathematical and visual tasks. |
PAN LU et. al. | iclr | 2024-02-26 |
1245 | Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Generative pre-trained models have demonstrated remarkable effectiveness in language and vision domains by learning useful representations. In this paper, we extend the scope of this effectiveness by showing that visual robot manipulation can significantly benefit from large-scale video generative pre-training. |
HONGTAO WU et. al. | iclr | 2024-02-26 |
1246 | Looped Transformers Are Better at Learning Learning Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the absence of an inherent iterative structure in the transformer architecture presents a challenge in emulating the iterative algorithms, which are commonly employed in traditional machine learning methods. To address this, we propose the utilization of looped transformer architecture and its associated training methodology, with the aim of incorporating iterative characteristics into the transformer architectures. |
Liu Yang; Kangwook Lee; Robert D Nowak; Dimitris Papailiopoulos; | iclr | 2024-02-26 |
1247 | AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Given that vector graphics are typically encoded using low-level graphics primitives, generating them directly is difficult. To address this, we propose the use of TikZ, a well-known abstract graphics language that can be compiled to vector graphics, as an intermediate representation of scientific figures. |
Jonas Belouadi; Anne Lauscher; Steffen Eger; | iclr | 2024-02-26 |
1248 | Masked Distillation Advances Self-Supervised Transformer Architecture Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this paper proposes a masked image modelling (MIM) based self-supervised neural architecture search method specifically designed for vision transformers, termed as MaskTAS, which completely avoids the expensive costs of data labeling inherited from supervised learning. |
CAIXIA YAN et. al. | iclr | 2024-02-26 |
1249 | Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the effect of code on enhancing LLMs’ reasoning capability by introducing different constraints on the Code Usage Frequency of GPT-4 Code Interpreter. |
AOJUN ZHOU et. al. | iclr | 2024-02-26 |
1250 | Transformer-VQ: Linear-Time Transformers Via Vector Quantization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Transformer-VQ, a decoder-only transformer computing softmax-based dense self-attention in linear time. |
Lucas Dax Lingle; | iclr | 2024-02-26 |
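The core idea here, replacing the T distinct keys with a small shared codebook so attention cost stops growing with sequence length, can be illustrated in isolation. Below is a minimal PyTorch sketch of just the key-quantization step; the paper's full linear-time factorization is omitted, and the function name is our own.

```python
import torch

def quantize_keys(keys, codebook):
    # keys: (T, d) per-token key vectors; codebook: (C, d) with C << T.
    # Each key is snapped to its nearest codebook entry, so attention
    # can be computed against C shared codes instead of T distinct keys.
    dists = torch.cdist(keys, codebook)   # (T, C) Euclidean distances
    codes = dists.argmin(dim=-1)          # nearest-code index per token
    return codebook[codes], codes
```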
1251 | Graph Transformers on EHRs: Better Representation Improves Downstream Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose GT-BEHRT, a new approach that leverages temporal visit embeddings extracted from a graph transformer and uses a BERT-based model to obtain more robust patient representations, especially on longer EHR sequences. |
Raphael Poulain; Rahmatollah Beheshti; | iclr | 2024-02-26 |
1252 | The Devil Is in The Neurons: Interpreting and Mitigating Social Biases in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we try to unveil the mystery of social bias inside language models by introducing the concept of {\sc Social Bias Neurons}. |
YAN LIU et. al. | iclr | 2024-02-26 |
1253 | The Hedgehog & The Porcupine: Expressive Linear Attentions with Softmax Mimicry IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We thus propose Hedgehog, a learnable linear attention that retains the spiky and monotonic properties of softmax attention while maintaining linear complexity. |
Michael Zhang; Kush Bhatia; Hermann Kumbong; Christopher Re; | iclr | 2024-02-26 |
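For context, a generic (non-causal) linear attention of the kind Hedgehog builds on looks like the sketch below: a feature map phi replaces the softmax so the key-value product can be aggregated once in O(T). Hedgehog's actual contribution is learning phi to mimic softmax; the common elu(x)+1 map below is only a stand-in assumption.

```python
import torch

def linear_attention(q, k, v, phi=lambda t: torch.nn.functional.elu(t) + 1):
    # q, k: (B, T, d); v: (B, T, e). softmax(QK^T)V is replaced by
    # phi(Q) (phi(K)^T V), normalized per query, in O(T) time and memory.
    q, k = phi(q), phi(k)
    kv = torch.einsum("btd,bte->bde", k, v)              # sum_t phi(k_t) v_t^T
    norm = torch.einsum("btd,bd->bt", q, k.sum(dim=1))   # per-query normalizer
    return torch.einsum("btd,bde->bte", q, kv) / (norm.unsqueeze(-1) + 1e-6)
```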
1254 | Neuroformer: Multimodal and Multitask Generative Pretraining for Brain Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We used an ablation study to show that joint training on neuronal responses and behavior boosted performance, highlighting the model’s ability to associate behavioral and neural representations in an unsupervised manner. |
Antonis Antoniades; Yiyi Yu; Joe S Canzano; William Yang Wang; Spencer Smith; | iclr | 2024-02-26 |
1255 | NOLA: Compressing LoRA Using Linear Combination of Random Basis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce NOLA, which overcomes the rank one lower bound present in LoRA. |
Soroush Abbasi Koohpayegani; Navaneet K L; Parsa Nooralinejad; Soheil Kolouri; Hamed Pirsiavash; | iclr | 2024-02-26 |
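A hedged reading of the mechanism: instead of training LoRA's low-rank factors directly, the update trains only the coefficients of a linear combination of frozen random basis matrices, decoupling the trainable-parameter count from rank and layer shape. The sketch below is our own minimal rendering of that idea, not the authors' code; the names and defaults (rank=4, k=64) are illustrative.

```python
import torch
import torch.nn as nn

class NOLADelta(nn.Module):
    # Low-rank weight update whose factors are linear combinations of
    # frozen random bases; only the 2*k mixing coefficients are trained.
    def __init__(self, d_in, d_out, rank=4, k=64):
        super().__init__()
        self.register_buffer("A_basis", torch.randn(k, d_out, rank))
        self.register_buffer("B_basis", torch.randn(k, rank, d_in))
        self.alpha = nn.Parameter(torch.zeros(k))  # trainable
        self.beta = nn.Parameter(torch.zeros(k))   # trainable

    def forward(self, x):
        A = torch.einsum("k,kor->or", self.alpha, self.A_basis)  # (d_out, rank)
        B = torch.einsum("k,kri->ri", self.beta, self.B_basis)   # (rank, d_in)
        return x @ (A @ B).T  # apply the delta-W update to the input
```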
1256 | Xformer: Hybrid X-Shaped Transformer for Image Denoising IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a hybrid X-shaped vision Transformer, named Xformer, which performs notably on image denoising tasks. |
JIALE ZHANG et. al. | iclr | 2024-02-26 |
1257 | DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Conventional training-based methods have limitations in flexibility, particularly when adapting to new domains, and they often lack explanatory power. To address this gap, we propose a novel training-free detection strategy called Divergent N-Gram Analysis (DNA-GPT). |
XIANJUN YANG et. al. | iclr | 2024-02-26 |
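As a rough illustration of the divergent n-gram idea: truncate a candidate text, have the model regenerate the continuation several times, and measure n-gram overlap between the original tail and the regenerations, where high overlap suggests the original was itself model-generated. The scorer below is a simplified stand-in that assumes whitespace tokenization; the paper's actual statistic and thresholds differ.

```python
from collections import Counter

def ngrams(text, n):
    toks = text.split()
    return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

def divergence_score(original_tail, regenerated_tails, n=4):
    # Mean fraction of each regeneration's n-grams that also occur in the
    # original continuation; higher values hint at machine-generated text.
    ref = ngrams(original_tail, n)
    scores = []
    for tail in regenerated_tails:
        cand = ngrams(tail, n)
        overlap = sum((ref & cand).values())
        scores.append(overlap / max(1, sum(cand.values())))
    return sum(scores) / max(1, len(scores))
```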
1258 | The Reversal Curse: LLMs Trained on “A Is B” Fail to Learn “B Is A” Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It is worth noting, however, that if “_A_ is _B_” appears _in-context_, models can deduce the reverse relationship. We provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on fictitious statements such as “Uriah Hawthorne is the composer of _Abyssal Melodies_” and showing that they fail to correctly answer “Who composed _Abyssal Melodies_?” |
LUKAS BERGLUND et. al. | iclr | 2024-02-26 |
1259 | A Multi-Level Framework for Accelerating Training Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by a set of key observations of inter- and intra-layer similarities among feature maps and attentions that can be identified from typical training processes, we propose a multi-level framework for training acceleration. |
Longwei Zou; Han Zhang; Yangdong Deng; | iclr | 2024-02-26 |
1260 | Quantum Linear Algebra Is All You Need for Transformer Architectures Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative machine learning methods such as large-language models are revolutionizing the creation of text and images. While these models are powerful they also harness a large … |
Naixu Guo; Zhan Yu; Aman Agrawal; P. Rebentrost; | ArXiv | 2024-02-26 |
1261 | Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring The Design of Next-generation Neuromorphic Chips IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a general Transformer-based SNN architecture, termed as “Meta-SpikeFormer”, whose goals are: (1) *Lower-power*, supports the spike-driven paradigm that there is only sparse addition in the network; (2) *Versatility*, handles various vision tasks; (3) *High-performance*, shows overwhelming performance advantages over CNN-based SNNs; (4) *Meta-architecture*, provides inspiration for future next-generation Transformer-based neuromorphic chip designs. |
MAN YAO et. al. | iclr | 2024-02-26 |
1262 | Is Self-Repair A Silver Bullet for Code Generation? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we analyze Code Llama, GPT-3.5 and GPT-4’s ability to perform self-repair on problems taken from HumanEval and APPS. |
Theo X. Olausson; Jeevana Priya Inala; Chenglong Wang; Jianfeng Gao; Armando Solar-Lezama; | iclr | 2024-02-26 |
1263 | Large Language Model Cascades with Mixture of Thought Representations for Cost-Efficient Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we are motivated to study building an LLM cascade to save the cost of using LLMs, particularly for performing (e.g., mathematical, causal) reasoning tasks. |
Murong Yue; Jie Zhao; Min Zhang; Liang Du; Ziyu Yao; | iclr | 2024-02-26 |
1264 | MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We believe that the enhanced multi-modal generation capabilities of GPT-4 stem from the utilization of sophisticated large language models (LLM). To examine this phenomenon, we present MiniGPT-4, which aligns a frozen visual encoder with a frozen advanced LLM, Vicuna, using one projection layer. |
Deyao Zhu; Jun Chen; Xiaoqian Shen; Xiang Li; Mohamed Elhoseiny; | iclr | 2024-02-26 |
1265 | If in A Crowdsourced Data Annotation Pipeline, A GPT-4 IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies indicated GPT-4 outperforms online crowd workers in data labeling accuracy, notably workers from Amazon Mechanical Turk (MTurk). However, these studies were … |
Zeyu He; Chieh-Yang Huang; Chien-Kuang Cornelia Ding; Shaurya Rohatgi; Ting-Hao ‘Kenneth’ Huang; | arxiv-cs.HC | 2024-02-26 |
1266 | Massive Editing for Large Language Models Via Meta Learning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To mitigate the problem, we propose the MAssive Language Model Editing Network (MALMEN), which formulates the parameter shift aggregation as the least square problem, subsequently updating the LM parameter using the normal equation. |
Chenmien Tan; Ge Zhang; Jie Fu; | iclr | 2024-02-26 |
1267 | CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we design a special Transformer, i.e., **C**hannel **A**ligned **R**obust Blen**d** Transformer (CARD for short), that addresses key shortcomings of CI type Transformer in time series forecasting. |
XUE WANG et. al. | iclr | 2024-02-26 |
1268 | GeoLLM: Extracting Geospatial Knowledge from Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here we explore the question of whether the vast amounts of knowledge found in Internet language corpora, now compressed within large language models (LLMs), can be leveraged for geospatial prediction tasks. |
ROHIN MANVI et. al. | iclr | 2024-02-26 |
1269 | VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Video temporal grounding (VTG) aims to locate specific temporal segments from an untrimmed video based on a linguistic query. Most existing VTG models are trained on extensive … |
Yifang Xu; Yunzhuo Sun; Zien Xie; Benxiang Zhai; Sidan Du; | ArXiv | 2024-02-25 |
1270 | HPE Transformer: Learning to Optimize Multi-Group Multicast Beamforming Under Nonconvex QoS Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To facilitate real-time implementations, this paper proposes a deep learning-based approach, which consists of a beamforming structure assisted problem transformation and a customized neural network architecture named hierarchical permutation equivariance (HPE) transformer. |
Yang Li; Ya-Feng Liu; | arxiv-cs.IT | 2024-02-25 |
1271 | From Text to Transformation: A Comprehensive Review of Large Language Models’ Versatility IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This groundbreaking study explores the expanse of Large Language Models (LLMs), such as Generative Pre-Trained Transformer (GPT) and Bidirectional Encoder Representations from Transformers (BERT) across varied domains ranging from technology, finance, healthcare to education. |
PRAVNEET KAUR et. al. | arxiv-cs.CL | 2024-02-25 |
1272 | Deep Learning Approaches for Improving Question Answering Systems in Hepatocellular Carcinoma Research IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Models such as BERT and GPT-3, trained on vast amounts of data, have revolutionized language understanding and generation. These pre-trained models serve as robust bases for various tasks including semantic understanding, intelligent writing, and reasoning, paving the way for a more generalized form of artificial intelligence. |
Shuning Huo; Yafei Xiang; Hanyi Yu; Mengran Zhu; Yulu Gong; | arxiv-cs.CL | 2024-02-25 |
1273 | SemEval-2024 Task 8: Weighted Layer Averaging RoBERTa for Black-Box Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we lay out how using weighted averages of RoBERTa layers lets us capture information about text that is relevant to machine-generated text detection. |
Ayan Datta; Aryan Chandramania; Radhika Mamidi; | arxiv-cs.CL | 2024-02-24 |
1274 | TV-SAM: Increasing Zero-Shot Segmentation Performance on Multimodal Medical Images Using GPT-4 Generated Descriptive Prompts Without Human Annotation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study presents a novel multimodal medical image zero-shot segmentation algorithm named the text-visual-prompt segment anything model (TV-SAM) without any manual annotations. |
ZEKUN JIANG et. al. | arxiv-cs.CV | 2024-02-24 |
1275 | Weighted Layer Averaging RoBERTa for Black-Box Machine-Generated Text Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We propose a novel approach for machine-generated text detection using a RoBERTa model with weighted layer averaging and AdaLoRA for parameter-efficient fine-tuning. Our method … |
Ayan Datta; Aryan Chandramania; Radhika Mamidi; | ArXiv | 2024-02-24 |
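The weighted-layer-averaging component in the two entries above (1273 and 1275) is simple enough to sketch. Assuming hidden states from every encoder layer are available (e.g., a Hugging Face model called with output_hidden_states=True), a softmax over learnable per-layer logits gives the mixing weights; the AdaLoRA fine-tuning and classifier head from the papers are omitted.

```python
import torch
import torch.nn as nn

class WeightedLayerPooling(nn.Module):
    # Softmax-weighted average over the per-layer hidden states of an
    # encoder such as RoBERTa; the weights are learned with the task.
    def __init__(self, num_layers):
        super().__init__()
        self.layer_logits = nn.Parameter(torch.zeros(num_layers))

    def forward(self, hidden_states):
        # hidden_states: sequence of num_layers tensors, each (B, T, H)
        stacked = torch.stack(tuple(hidden_states), dim=0)  # (L, B, T, H)
        weights = torch.softmax(self.layer_logits, dim=0)
        return (weights.view(-1, 1, 1, 1) * stacked).sum(dim=0)
```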
1276 | ArabianGPT: Native Arabic GPT-based Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, there is a theoretical and practical imperative for developing LLMs predominantly focused on Arabic linguistic elements. To address this gap, this paper proposes ArabianGPT, a series of transformer-based models within the ArabianLLM suite designed explicitly for Arabic. |
Anis Koubaa; Adel Ammar; Lahouari Ghouti; Omar Najar; Serry Sibaee; | arxiv-cs.CL | 2024-02-23 |
1277 | Advancing Parameter Efficiency in Fine-tuning Via Representation Editing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing PEFT methods pose challenges in hyperparameter selection, such as choosing the rank for LoRA or Adapter, or specifying the length of soft prompts. To address these challenges, we propose a novel fine-tuning approach for neural models, named Representation EDiting (RED), which modifies the representations generated at some layers through the application of scaling and biasing operations. |
MULING WU et. al. | arxiv-cs.LG | 2024-02-23 |
1278 | Towards Efficient Active Learning in NLP Via Pretrained Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning Large Language Models (LLMs) is now a common approach for text classification in a wide range of applications. |
Artem Vysogorets; Achintya Gopal; | arxiv-cs.LG | 2024-02-23 |
1279 | Multimodal Transformer With A Low-Computational-Cost Guarantee Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, multimodal Transformers significantly suffer from a quadratic complexity of the multi-head attention with the input sequence length, especially as the number of modalities increases. To address this, we introduce Low-Cost Multimodal Transformer (LoCoMT), a novel multimodal attention mechanism that aims to reduce computational cost during training and inference with minimal performance loss. |
Sungjin Park; Edward Choi; | arxiv-cs.LG | 2024-02-23 |
1280 | A First Look at GPT Apps: Landscape and Vulnerability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, given its debut, there is a lack of sufficient understanding of this new ecosystem. To fill this gap, this paper presents a first comprehensive longitudinal (5-month) study of the evolution, landscape, and vulnerability of the emerging LLM app ecosystem, focusing on two GPT app stores: _GPTStore.AI_ and the official _OpenAI GPT Store_. |
ZEJUN ZHANG et. al. | arxiv-cs.CR | 2024-02-23 |
1281 | Self-Supervised Pre-Training for Table Structure Recognition Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we resolve the issue by proposing a self-supervised pre-training (SSP) method for TSR transformers. |
ShengYun Peng; Seongmin Lee; Xiaojing Wang; Rajarajeswari Balasubramaniyan; Duen Horng Chau; | arxiv-cs.CV | 2024-02-23 |
1282 | Whose LLM Is It Anyway? Linguistic Comparison and LLM Attribution for GPT-3.5, GPT-4 and Bard Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through a comprehensive linguistic analysis, we compare the vocabulary, Part-Of-Speech (POS) distribution, dependency distribution, and sentiment of texts generated by three of the most popular LLMs today (GPT-3.5, GPT-4, and Bard) in response to diverse inputs. |
Ariel Rosenfeld; Teddy Lazebnik; | arxiv-cs.CL | 2024-02-22 |
1283 | Tokenization Counts: The Impact of Tokenization on Arithmetic in Frontier LLMs IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Tokenization, the division of input text into input tokens, is an often overlooked aspect of the large language model (LLM) pipeline and could be the source of useful or harmful … |
Aaditya K. Singh; DJ Strouse; | ArXiv | 2024-02-22 |
1284 | OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, open-source models often lack the execution capabilities and iterative refinement of advanced systems like the GPT-4 Code Interpreter. To address this, we introduce OpenCodeInterpreter, a family of open-source code systems designed for generating, executing, and iteratively refining code. |
TIANYU ZHENG et. al. | arxiv-cs.SE | 2024-02-22 |
1285 | Hallucinations or Attention Misdirection? The Path to Strategic Value Extraction in Business Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper highlights the best practices of the PGI (Persona, Grouping, and Intelligence) method, a strategic framework that achieved a remarkable error rate of only 3.15 percent across 4,000 responses generated by GPT in response to a real business challenge. |
Aline Ioste; | arxiv-cs.CL | 2024-02-21 |
1286 | Knowledge Graph Enhanced Large Language Model Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing editing methods struggle to track and incorporate changes in knowledge associated with edits, which limits the generalization ability of post-edit LLMs in processing edited knowledge. To tackle these problems, we propose a novel model editing method that leverages knowledge graphs for enhancing LLM editing, namely GLAME. |
MENGQI ZHANG et. al. | arxiv-cs.CL | 2024-02-21 |
1287 | TransGOP: Transformer-Based Gaze Object Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, this paper introduces Transformer into the fields of gaze object prediction and proposes an end-to-end Transformer-based gaze object prediction method named TransGOP. |
Binglu Wang; Chenxi Guo; Yang Jin; Haisheng Xia; Nian Liu; | arxiv-cs.CV | 2024-02-21 |
1288 | On The Expressive Power of A Variant of The Looped Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide theoretical evidence of the expressive power of the AlgoFormer in solving some challenging problems, mirroring human-designed algorithms. |
YIHANG GAO et. al. | arxiv-cs.LG | 2024-02-21 |
1289 | Towards Understanding Counseling Conversations: Domain Knowledge and Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a systematic approach to examine the efficacy of domain knowledge and large language models (LLMs) in better representing conversations between a crisis counselor and a help seeker. |
Younghun Lee; Dan Goldwasser; Laura Schwab Reese; | arxiv-cs.CL | 2024-02-21 |
1290 | An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research paper, we present an optimized, fine-tuned transformer-based DistilBERT model designed for the detection of phishing emails. |
Mohammad Amaz Uddin; Iqbal H. Sarker; | arxiv-cs.LG | 2024-02-21 |
1291 | SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we leverage 100B+ GPT variants to act as synthetic feedback experts offering expert-level edit feedback, which is used to reduce hallucinations and align weaker (<10B parameter) LLMs with medical facts using two distinct alignment algorithms (DPO & SALT), endeavoring to narrow the divide between AI-generated content and factual accuracy. |
PRAKAMYA MISHRA et. al. | arxiv-cs.CL | 2024-02-21 |
1292 | Towards Equipping Transformer with The Ability of Systematic Compositionality Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We tentatively provide a successful implementation of a multi-layer CAT on the basis of the especially popular BERT. |
Chen Huang; Peixin Qin; Wenqiang Lei; Jiancheng Lv; | aaai | 2024-02-20 |
1293 | DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency Via Efficient Data Sampling and Routing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we present DeepSpeed Data Efficiency, a framework that makes better use of data, increases training efficiency, and improves model quality. |
CONGLONG LI et. al. | aaai | 2024-02-20 |
1294 | Translate Meanings, Not Just Words: IdiomKB’s Role in Optimizing Idiomatic Translation with Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a multilingual idiom KB (IdiomKB) developed using large LMs to address this. |
SHUANG LI et. al. | aaai | 2024-02-20 |
1295 | Are ELECTRA’s Sentence Embeddings Beyond Repair? The Case of Semantic Textual Similarity Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We notice a significant drop in performance for the ELECTRA discriminator’s last layer in comparison to prior layers. We explore this drop and propose a way to repair the embeddings using a novel truncated model fine-tuning (TMFT) method. |
Ivan Rep; David Dukić; Jan Šnajder; | arxiv-cs.CL | 2024-02-20 |
1296 | SentinelLMs: Encrypted Input Adaptation and Fine-Tuning of Language Models for Private and Secure Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, this introduces two fundamental risks: (a) the transmission of user inputs to the server via the network gives rise to interception vulnerabilities, and (b) privacy concerns emerge as organizations that deploy such models store user data with restricted context. To address this, we propose a novel method to adapt and fine-tune transformer-based language models on passkey-encrypted user-specific text. |
Abhijit Mishra; Mingda Li; Soham Deo; | aaai | 2024-02-20 |
1297 | CAR-Transformer: Cross-Attention Reinforcement Transformer for Cross-Lingual Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Cross-Attention Reinforcement (CAR) module and incorporate the module into the transformer backbone to formulate the CAR-Transformer. |
Yuang Cai; Yuyu Yuan; | aaai | 2024-02-20 |
1298 | SCTNet: Single-Branch CNN with Transformer Semantic Information for Real-Time Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the additional branch incurs undesirable computational overhead and slows inference speed. To eliminate this dilemma, we propose SCTNet, a single branch CNN with transformer semantic information for real-time segmentation. |
ZHENGZE XU et. al. | aaai | 2024-02-20 |
1299 | Span Graph Transformer for Document-Level Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, due to the length limit for input text, these models typically consider text at the sentence-level and cannot capture the long-range contextual dependency within a document. To address this issue, we propose a novel Span Graph Transformer (SGT) method for document-level NER, which constructs long-range contextual dependencies at both the token and span levels. |
Hongli Mao; Xian-Ling Mao; Hanlin Tang; Yu-Ming Shang; Heyan Huang; | aaai | 2024-02-20 |
1300 | S2WAT: Image Style Transfer Via Hierarchical Vision Transformer Using Strips Window Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces Strips Window Attention Transformer (S2WAT), a novel hierarchical vision transformer designed for style transfer. |
Chiyu Zhang; Xiaogang Xu; Lei Wang; Zaiyan Dai; Jun Yang; | aaai | 2024-02-20 |
1301 | Dual-Window Multiscale Transformer for Hyperspectral Snapshot Compressive Imaging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Transformer-based HSI reconstruction method called dual-window multiscale Transformer (DWMT), which is a coarse-to-fine process, reconstructing the global properties of HSI with the long-range dependencies. |
Fulin Luo; Xi Chen; Xiuwen Gong; Weiwen Wu; Tan Guo; | aaai | 2024-02-20 |
1302 | Transformer Tricks: Precomputing The First Layer Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: This micro-paper describes a trick to speed up inference of transformers with RoPE (such as LLaMA, Mistral, PaLM, and Gemma). For these models, a large portion of the first … |
Nils Graef; | arxiv-cs.LG | 2024-02-20 |
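The trick, as we read the abstract: in RoPE models nothing position-dependent is added to the input embeddings, so the first layer's Q/K/V projections depend only on the token's vocabulary id and can be precomputed as lookup tables, with RoPE's rotation applied to the looked-up Q/K per position at inference. A minimal sketch follows, ignoring the pre-attention normalization (also token-only) that would fold in the same way.

```python
import torch

def precompute_first_layer_qkv(embed, w_q, w_k, w_v):
    # embed: (V, d) token-embedding table; w_*: (d, d) projection weights.
    # Because RoPE adds no positional term to the input, these rows can be
    # looked up by token id at inference instead of recomputed per token.
    return embed @ w_q, embed @ w_k, embed @ w_v
```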
1303 | Equity-Transformer: Solving NP-Hard Min-Max Routing Problems As Sequential Generation with Equity Context Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes Equity-Transformer to solve large-scale min-max routing problems. |
Jiwoo Son; Minsu Kim; Sanghyeok Choi; Hyeonah Kim; Jinkyoo Park; | aaai | 2024-02-20 |
1304 | Referred By Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose MUTR, a Multi-modal Unified Temporal transformer for Referring video object segmentation. |
SHILIN YAN et. al. | aaai | 2024-02-20 |
1305 | Can Large Language Models Be Used to Provide Psychological Counselling? An Analysis of GPT-4-Generated Responses Using Role-play Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For this study, we collected counseling dialogue data via role-playing scenarios involving expert counselors, and the utterances were annotated with the intentions of the counselors. |
Michimasa Inaba; Mariko Ukiyo; Keiko Takamizo; | arxiv-cs.CL | 2024-02-20 |
1306 | How Easy Is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The remarkable advancements in Multimodal Large Language Models (MLLMs) have not rendered them immune to challenges, particularly in the context of handling deceptive information in prompts, thus producing hallucinated responses under such conditions. To quantitatively assess this vulnerability, we present MAD-Bench, a carefully curated benchmark that contains 1000 test samples divided into 5 categories, such as non-existent objects, count of objects, and spatial relationship. |
Yusu Qian; Haotian Zhang; Yinfei Yang; Zhe Gan; | arxiv-cs.CV | 2024-02-20 |
1307 | Fairness-Aware Structured Pruning in Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: WARNING: This work uses language that is offensive in nature. |
Abdelrahman Zayed; Gonçalo Mordido; Samira Shabanian; Ioana Baldini; Sarath Chandar; | aaai | 2024-02-20 |
1308 | Proxyformer: Nyström-Based Linear Transformer with Trainable Proxy Tokens Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel Nyström method-based transformer, called Proxyformer. |
Sangho Lee; Hayun Lee; Dongkun Shin; | aaai | 2024-02-20 |
1309 | Generalized Planning in PDDL Domains with Pretrained Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We evaluate this approach in seven PDDL domains and compare it to four ablations and four baselines. |
TOM SILVER et. al. | aaai | 2024-02-20 |
1310 | RhythmFormer: Extracting RPPG Signals Based on Hierarchical Temporal Periodic Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose RhythmFormer, a fully end-to-end transformer-based method for extracting rPPG signals by explicitly leveraging the quasi-periodic nature of rPPG. |
Bochao Zou; Zizheng Guo; Jiansheng Chen; Huimin Ma; | arxiv-cs.CV | 2024-02-20 |
1311 | Advancing GenAI Assisted Programming–A Comparative Study on Prompt Efficiency and Code Quality Between GPT-4 and GLM-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to explore the best practices for utilizing GenAI as a programming tool, through a comparative analysis between GPT-4 and GLM-4. |
Angus Yang; Zehan Li; Jie Li; | arxiv-cs.SE | 2024-02-20 |
1312 | Your Large Language Model Is Secretly A Fairness Proponent and You Should Prompt It Like One Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response to this, we validate that prompting LLMs with specific roles can allow LLMs to express diverse viewpoints. Building on this insight and observation, we develop FairThinking, a pipeline designed to automatically generate roles that enable LLMs to articulate diverse perspectives for fair expressions. |
TIANLIN LI et. al. | arxiv-cs.CL | 2024-02-19 |
1313 | Enabling Weak LLMs to Judge Response Reliability Via Meta Ranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, to enable weak LLMs to effectively assess the reliability of LLM responses, we propose a novel cross-query-comparison-based method called _Meta Ranking_ (MR). |
ZIJUN LIU et. al. | arxiv-cs.CL | 2024-02-19 |
1314 | Query-Based Adversarial Prompt Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We improve on prior work with a query-based attack that leverages API access to a remote language model to construct adversarial examples that cause the model to emit harmful strings with (much) higher probability than with transfer-only attacks. |
Jonathan Hayase; Ema Borevkovic; Nicholas Carlini; Florian Tramèr; Milad Nasr; | arxiv-cs.CL | 2024-02-19 |
1315 | Evaluation of ChatGPT’s Smart Contract Auditing Capabilities Based on Chain of Thought Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the potential of enhancing smart contract security audits using the GPT-4 model. |
Yuying Du; Xueyan Tang; | arxiv-cs.CR | 2024-02-19 |
1316 | Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a circuit discovery framework alternative to activation patching. |
ZHENGFU HE et. al. | arxiv-cs.LG | 2024-02-19 |
1317 | Shallow Synthesis of Knowledge in GPT-Generated Texts: A Case Study in Automatic Related Work Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an analysis of AI-assisted scholarly writing generated with ScholaCite, a tool we built that is designed for organizing literature and composing Related Work sections for academic papers. |
Anna Martin-Boyle; Aahan Tyagi; Marti A. Hearst; Dongyeop Kang; | arxiv-cs.CL | 2024-02-19 |
1318 | Tree-Planted Transformers: Unidirectional Transformer Language Models with Implicit Syntactic Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new method dubbed tree-planting: instead of explicitly generating syntactic structures, we plant trees into attention weights of unidirectional Transformer LMs to implicitly reflect syntactic structures of natural language. |
Ryo Yoshida; Taiga Someya; Yohei Oseki; | arxiv-cs.CL | 2024-02-19 |
1319 | A Critical Evaluation of AI Feedback for Aligning Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that the improvements of the RL step are virtually entirely due to the widespread practice of using a weaker teacher model (e.g. GPT-3.5) for SFT data collection than the critic (e.g., GPT-4) used for AI feedback generation. |
ARCHIT SHARMA et. al. | arxiv-cs.LG | 2024-02-19 |
1320 | FinBen: A Holistic Financial Benchmark for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce FinBen, the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks, covering seven critical aspects: information extraction (IE), textual analysis, question answering (QA), text generation, risk management, forecasting, and decision-making. |
QIANQIAN XIE et. al. | arxiv-cs.CL | 2024-02-19 |
1321 | Creating A Fine Grained Entity Type Taxonomy Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the potential of GPT-4 and its advanced iteration, GPT-4 Turbo, in autonomously developing a detailed entity type taxonomy. |
Michael Gunn; Dohyun Park; Nidhish Kamath; | arxiv-cs.CL | 2024-02-19 |
1322 | Enhancing Large Language Models for Text-to-Testcase Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: In this paper, we introduce a text-to-testcase generation approach based on a large language model (GPT-3.5) that is fine-tuned on our curated dataset with an effective prompt design. |
Saranya Alagarsamy; Chakkrit Tantithamthavorn; Chetan Arora; Aldeida Aleti; | arxiv-cs.SE | 2024-02-19 |
1323 | Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Conclusion: In this paper, we show that while GPT-4 is superior to open-source models in zero-shot report labeling, the implementation of few-shot prompting can bring open-source models on par with GPT-4. |
FELIX J. DORFNER et. al. | arxiv-cs.CL | 2024-02-19 |
1324 | LongAgent: Scaling Language Models to 128k Context Through Multi-Agent Collaboration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose LongAgent, a method based on multi-agent collaboration, which scales LLMs (e.g., LLaMA) to a context of 128K and demonstrates potential superiority in long-text processing compared to GPT-4. |
JUN ZHAO et. al. | arxiv-cs.CL | 2024-02-18 |
1325 | Decoding News Narratives: A Critical Analysis of Large Language Models in Framing Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive analysis of GPT-4, GPT-3.5 Turbo, and FLAN-T5 models in detecting framing in news headlines. |
Valeria Pastorino; Jasivan A. Sivakumar; Nafise Sadat Moosavi; | arxiv-cs.CL | 2024-02-18 |
1326 | A Curious Case of Searching for The Correlation Between Training Data and Adversarial Robustness of Transformer Textual Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we want to prove that there is also a strong correlation between training data and model robustness. |
Cuong Dang; Dung D. Le; Thai Le; | arxiv-cs.LG | 2024-02-18 |
1327 | Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, we propose a two-stage instruction tuning framework, in which VLMs are firstly finetuned on Vision-Flan and further tuned on GPT-4 synthesized data. We find this two-stage tuning framework significantly outperforms the traditional single-stage visual instruction tuning framework and achieves the state-of-the-art performance across a wide range of multi-modal evaluation benchmarks. |
ZHIYANG XU et. al. | arxiv-cs.CL | 2024-02-18 |
1328 | Exploring ChatGPT for Next-generation Information Retrieval: Opportunities and Challenges Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike the traditional supervised learning approach in IR tasks, ChatGPT challenges existing paradigms, bringing forth new challenges and opportunities regarding text quality assurance, model bias, and efficiency. This paper seeks to examine the impact of ChatGPT on IR tasks and offer insights into its potential future developments. |
Yizheng Huang; Jimmy Huang; | arxiv-cs.IR | 2024-02-17 |
1329 | Reasoning Before Comparison: LLM-Enhanced Semantic Similarity Metrics for Domain Specialized Text Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we leverage LLM to enhance the semantic analysis and develop similarity metrics for texts, addressing the limitations of traditional unsupervised NLP metrics like ROUGE and BLEU. |
SHAOCHEN XU et. al. | arxiv-cs.CL | 2024-02-17 |
1330 | Human-object Interaction Detection Based on Cascade Multi-scale Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Limin Xia; Xiaoyue Ding; | Appl. Intell. | 2024-02-16 |
1331 | Can Separators Improve Chain-of-Thought Prompting? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by human cognition, we introduce COT-SEP, a method that strategically employs separators at the end of each exemplar in CoT prompting. |
Yoonjeong Park; Hyunjin Kim; Chanyeol Choi; Junseong Kim; Jy-yong Sohn; | arxiv-cs.CL | 2024-02-16 |
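Mechanically, the method amounts to inserting an explicit separator after each chain-of-thought exemplar when assembling the prompt. A minimal sketch follows; the "###" separator is purely an illustrative choice, since the paper's point is that the separator's form and placement matter.

```python
def build_cot_prompt(exemplars, question, sep="\n###\n"):
    # Join few-shot CoT exemplars, terminating each with a separator,
    # then append the target question.
    parts = [e.strip() for e in exemplars] + [question.strip()]
    return sep.join(parts)
```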
1332 | In Search of Needles in A 11M Haystack: Recurrent Memory Finds What LLMs Miss IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To evaluate different approaches, we introduce BABILong, a new benchmark designed to assess model capabilities in extracting and processing distributed facts within extensive texts. |
YURI KURATOV et. al. | arxiv-cs.CL | 2024-02-16 |
1333 | Enhancing ESG Impact Type Identification Through Early Fusion and Multilingual Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the evolving landscape of Environmental, Social, and Corporate Governance (ESG) impact assessment, the ML-ESG-2 shared task proposes identifying ESG impact types. To address this challenge, we present a comprehensive system leveraging ensemble learning techniques, capitalizing on early and late fusion approaches. |
Hariram Veeramani; Surendrabikram Thapa; Usman Naseem; | arxiv-cs.CL | 2024-02-16 |
1334 | WilKE: Wise-Layer Knowledge Editor for Lifelong Knowledge Editing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a knowledge editing approach named Wise-Layer Knowledge Editor (WilKE), which selects editing layer based on the pattern matching degree of editing knowledge across different layers in language models. |
Chenhui Hu; Pengfei Cao; Yubo Chen; Kang Liu; Jun Zhao; | arxiv-cs.CL | 2024-02-16 |
1335 | Large Language Models Fall Short: Understanding Complex Relationships in Detective Narratives Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing datasets for narrative understanding often fail to represent the complexity and uncertainty of relationships in real-life social scenarios. To address this gap, we introduce a new benchmark, Conan, designed for extracting and analysing intricate character relation graphs from detective narratives. |
RUNCONG ZHAO et. al. | arxiv-cs.CL | 2024-02-16 |
1336 | Fine-tuning Large Language Model (LLM) Artificial Intelligence Chatbots in Ophthalmology and LLM-based Evaluation Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Notably, qualitative analysis and the glaucoma sub-analysis revealed clinical inaccuracies in the LLM-generated responses, which were appropriately identified by the GPT-4 evaluation. |
TING FANG TAN et. al. | arxiv-cs.AI | 2024-02-15 |
1337 | GPT-4’s Assessment of Its Performance in A USMLE-based Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates GPT-4’s assessment of its performance in healthcare applications. |
UTTAM DHAKAL et. al. | arxiv-cs.AI | 2024-02-14 |
1338 | Utilizing BERT for Information Retrieval: Survey, Applications, Resources, and Challenges Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent years have witnessed a substantial increase in the use of deep learning to solve various natural language processing (NLP) problems. Early deep learning models were … |
JIAJIA WANG et. al. | ACM Computing Surveys | 2024-02-14 |
1339 | Research and Application of Transformer Based Anomaly Detection Model: A Literature Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To inspire research on Transformer-based anomaly detection, this review offers a fresh perspective on the concept of anomaly detection. |
Mingrui Ma; Lansheng Han; Chunjie Zhou; | arxiv-cs.LG | 2024-02-14 |
1340 | An Analysis of Language Frequency and Error Correction for Esperanto Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current Grammar Error Correction (GEC) initiatives tend to focus on major languages, with less attention given to low-resource languages like Esperanto. In this article, we begin to bridge this gap by first conducting a comprehensive frequency analysis using the Eo-GP dataset, created explicitly for this purpose. |
Junhong Liang; | arxiv-cs.CL | 2024-02-14 |
1341 | Changes By Butterflies: Farsighted Forecasting with Group Reservoir Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Group Reservoir Transformer to predict long-term events more accurately and robustly by overcoming two challenges in Chaos: (1) the extensive historical sequences and (2) the sensitivity to initial conditions. |
Md Kowsher; Abdul Rafae Khan; Jia Xu; | arxiv-cs.LG | 2024-02-14 |
1342 | Predictive Analytics in Mental Health Leveraging LLM Embeddings and Machine Learning Models for Social Media Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The prevalence of stress-related disorders has increased significantly in recent years, necessitating scalable methods to identify affected individuals. This paper proposes a … |
AHMAD RADWAN et. al. | Int. J. Web Serv. Res. | 2024-02-14 |
1343 | API Pack: A Massive Multi-Programming Language Dataset for API Call Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce API Pack, a massive multi-programming language dataset containing more than 1 million instruction-API call pairs to improve the API call generation capabilities of large language models. |
Zhen Guo; Adriana Meza Soria; Wei Sun; Yikang Shen; Rameswar Panda; | arxiv-cs.CL | 2024-02-14 |
1344 | L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a language agent with chain-of-3D-thoughts (L3GO), an inference-time approach that can reason about part-based 3D mesh generation of unconventional objects that current data-driven diffusion models struggle with. |
YUTARO YAMADA et. al. | arxiv-cs.AI | 2024-02-14 |
1345 | Leveraging Large Language Models for Enhanced NLP Task Performance Through Knowledge Distillation and Optimized Training Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our approach presents a scalable methodology that reduces manual annotation costs and increases efficiency, making it especially pertinent in resource-limited and closed-network environments. |
Yining Huang; Keke Tang; Meilian Chen; | arxiv-cs.CL | 2024-02-14 |
1346 | Measuring and Controlling Instruction (In)Stability in Language Model Dialogs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To combat attention decay and instruction drift, we propose a lightweight method called split-softmax, which compares favorably against two strong baselines. |
KENNETH LI et. al. | arxiv-cs.CL | 2024-02-13 |
1347 | Combining Insights From Multiple Large Language Models Improves Diagnostic Accuracy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Background: Large language models (LLMs) such as OpenAI’s GPT-4 or Google’s PaLM 2 are proposed as viable diagnostic support tools or even spoken of as replacements for curbside consults. |
Gioele Barabucci; Victor Shia; Eugene Chu; Benjamin Harack; Nathan Fu; | arxiv-cs.AI | 2024-02-13 |
1348 | Addressing Cognitive Bias in Medical Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we developed BiasMedQA, a benchmark for evaluating cognitive biases in LLMs applied to medical tasks. |
SAMUEL SCHMIDGALL et. al. | arxiv-cs.CL | 2024-02-12 |
1349 | Investigating The Impact of Data Contamination of Large Language Models in Text-to-SQL Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the impact of Data Contamination on the performance of GPT-3.5 in the Text-to-SQL code-generating tasks. |
FEDERICO RANALDI et. al. | arxiv-cs.CL | 2024-02-12 |
1350 | Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We next present the M2-BERT retrieval encoder, an 80M parameter state-space encoder model built from the Monarch Mixer architecture, capable of scaling to documents up to 32K tokens long. We describe a pretraining data mixture which allows this encoder to process both short and long context sequences, and a finetuning approach that adapts this base model to retrieval with only single-sample batches. |
Jon Saad-Falcon; Daniel Y. Fu; Simran Arora; Neel Guha; Christopher Ré; | arxiv-cs.IR | 2024-02-12 |
1351 | Lissard: Long and Simple Sequential Reasoning Datasets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Lissard, a benchmark comprising seven tasks whose goal is to assess the ability of models to process and generate wide-range sequence lengths, requiring repetitive procedural execution. |
Mirelle Bueno; Roberto Lotufo; Rodrigo Nogueira; | arxiv-cs.CL | 2024-02-12 |
1352 | CyberMetric: A Benchmark Dataset Based on Retrieval-Augmented Generation for Evaluating LLMs in Cybersecurity Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To accurately test the general knowledge of LLMs in cybersecurity, the research community needs a diverse, accurate, and up-to-date dataset. To address this gap, we present CyberMetric-80, CyberMetric-500, CyberMetric-2000, and CyberMetric-10000, which are multiple-choice Q&A benchmark datasets comprising 80, 500, 2000, and 10,000 questions respectively. |
Norbert Tihanyi; Mohamed Amine Ferrag; Ridhi Jain; Tamas Bisztray; Merouane Debbah; | arxiv-cs.AI | 2024-02-12 |
1353 | Transformer Mechanisms Mimic Frontostriatal Gating Operations When Trained on Human Working Memory Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we analyze the mechanisms that emerge within a vanilla attention-only Transformer trained on a simple sequence modeling task inspired by a task explicitly designed to study working memory gating in computational cognitive neuroscience. |
Aaron Traylor; Jack Merullo; Michael J. Frank; Ellie Pavlick; | arxiv-cs.AI | 2024-02-12 |
1354 | Enhancing Programming Error Messages in Real Time with Generative AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We extend this work by implementing feedback from ChatGPT for all programs submitted to our automated assessment tool, Athene, providing help for compiler, run-time, and logic errors. |
BAILEY KIMMEL et. al. | arxiv-cs.HC | 2024-02-12 |
1355 | Enhancing Multi-Criteria Decision Analysis with AI: Integrating Analytic Hierarchy Process and GPT-4 for Automated Decision Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study presents a new framework that incorporates the Analytic Hierarchy Process (AHP) and Generative Pre-trained Transformer 4 (GPT-4) large language model (LLM), bringing novel approaches to cybersecurity Multiple-criteria Decision Making (MCDA). |
Igor Svoboda; Dmytro Lande; | arxiv-cs.AI | 2024-02-11 |
1356 | Leveraging AI to Advance Science and Computing Education Across Africa: Challenges, Progress and Opportunities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this chapter, we discuss challenges with using AI to advance education across Africa. |
George Boateng; | arxiv-cs.CY | 2024-02-11 |
1357 | Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection Via Retrieval-Augmented GPT-4 and LLaMA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study details our approach for the CASE 2024 Shared Task on Climate Activism Stance and Hate Event Detection, focusing on Hate Speech Detection, Hate Speech Target Identification, and Stance Detection as classification challenges. |
MAREK ŠUPPA et. al. | arxiv-cs.CL | 2024-02-09 |
1358 | Sparse-VQ Transformer: An FFN-Free Framework with Vector Quantization for Enhanced Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, time series data are uniquely challenging due to significant distribution shifts and intrinsic noise levels. To address these two challenges, we introduce the Sparse Vector Quantized FFN-Free Transformer (Sparse-VQ). |
YANJUN ZHAO et. al. | arxiv-cs.LG | 2024-02-08 |
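For readers unfamiliar with vector quantization, the PyTorch sketch below shows a generic VQ bottleneck with a straight-through gradient estimator; it is background for the idea only, not the paper's exact FFN-free Sparse-VQ design.

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    """Generic VQ bottleneck: snap each input vector to its nearest codebook entry."""
    def __init__(self, num_codes: int = 64, dim: int = 32):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, x):  # x: (batch, seq, dim)
        flat = x.reshape(-1, x.shape[-1])                 # (batch*seq, dim)
        dists = torch.cdist(flat, self.codebook.weight)   # distances to every code
        codes = dists.argmin(dim=-1)                      # nearest code index
        quantized = self.codebook(codes).view_as(x)
        # Straight-through estimator: gradients flow around the argmin.
        return x + (quantized - x).detach()

vq = VectorQuantizer()
print(vq(torch.randn(2, 10, 32)).shape)  # torch.Size([2, 10, 32])
```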
1359 | FACT-GPT: Fact-Checking Augmentation Via Claim Matching with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the societal challenge, we introduce FACT-GPT, a system leveraging Large Language Models (LLMs) to automate the claim matching stage of fact-checking. |
Eun Cheol Choi; Emilio Ferrara; | arxiv-cs.CL | 2024-02-08 |
1360 | Efficient Models for The Detection of Hate, Abuse and Profanity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This is unacceptable in civil discourse. The detection of Hate, Abuse and Profanity in text is a vital component of creating civil and unbiased LLMs, which is needed not only for English, but for all languages. In this article, we briefly describe the creation of HAP detectors and various ways of using them to make models civil and acceptable in the output they generate. |
Christoph Tillmann; Aashka Trivedi; Bishwaranjan Bhattacharjee; | arxiv-cs.CL | 2024-02-08 |
1361 | Traditional Machine Learning Models and Bidirectional Encoder Representations From Transformer (BERT)-Based Automatic Classification of Tweets About Eating Disorders: Algorithm Development and Validation Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: Our goal was to identify efficient machine learning models for categorizing tweets related to eating disorders. |
José Alberto Benítez-Andrades; José-Manuel Alija-Pérez; Maria-Esther Vidal; Rafael Pastor-Vargas; María Teresa García-Ordás; | arxiv-cs.CL | 2024-02-08 |
1362 | Generative Pre-Trained Transformer (GPT) in Research: A Systematic Review on Data Augmentation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: GPT (Generative Pre-trained Transformer) represents advanced language models that have significantly reshaped the academic writing landscape. These sophisticated language models … |
F. Sufi; | Inf. | 2024-02-08 |
1363 | Limits of Transformer Language Models on Learning to Compose Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we evaluate training LLaMA models and prompting GPT-4 and Gemini on four tasks that require learning a composition of several discrete sub-tasks. |
JONATHAN THOMM et. al. | arxiv-cs.LG | 2024-02-08 |
1364 | Exploring Visual Culture Awareness in GPT-4V: A Comprehensive Probing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we introduced three vision-related tasks, i.e., caption classification, pairwise captioning, and culture tag selection, to systematically delve into fine-grained visual cultural evaluation. |
YONG CAO et. al. | arxiv-cs.CL | 2024-02-08 |
1365 | Named Entity Recognition for Address Extraction in Speech-to-Text Transcriptions Using Synthetic Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an approach for building a Named Entity Recognition (NER) model built upon a Bidirectional Encoder Representations from Transformers (BERT) architecture, specifically utilizing the SlovakBERT model. |
Bibiána Lajčinová; Patrik Valábek; Michal Spišiak; | arxiv-cs.CL | 2024-02-08 |
1366 | Opening The AI Black Box: Program Synthesis Via Mechanistic Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present MIPS, a novel method for program synthesis based on automated mechanistic interpretability of neural networks trained to perform the desired task, auto-distilling the learned algorithm into Python code. |
ERIC J. MICHAUD et. al. | arxiv-cs.LG | 2024-02-07 |
1367 | Improving Cross-Domain Low-Resource Text Generation Through LLM Post-Editing: A Programmer-Interpreter Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, the editing strategies in these methods are not optimally designed for text-generation tasks. To address these limitations, we propose a neural programmer-interpreter approach that preserves the domain generalization ability of LLMs when editing their output. |
Zhuang Li; Levon Haroutunian; Raj Tumuluri; Philip Cohen; Gholamreza Haffari; | arxiv-cs.CL | 2024-02-07 |
1368 | Spiking-PhysFormer: Camera-Based Remote Photoplethysmography with Parallel Spike-driven Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To the best of our knowledge, we are the first to introduce SNNs into the realm of rPPG, proposing a hybrid neural network (HNN) model, the Spiking-PhysFormer, aimed at reducing power consumption. |
MINGXUAN LIU et. al. | arxiv-cs.CV | 2024-02-07 |
1369 | The Use of A Large Language Model for Cyberbullying Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Several machine learning (ML) algorithms have been proposed for this purpose. |
Bayode Ogunleye; Babitha Dharmaraj; | arxiv-cs.CL | 2024-02-06 |
1370 | The Hedgehog & The Porcupine: Expressive Linear Attentions with Softmax Mimicry IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We thus propose Hedgehog, a learnable linear attention that retains the spiky and monotonic properties of softmax attention while maintaining linear complexity. |
Michael Zhang; Kush Bhatia; Hermann Kumbong; Christopher Ré; | arxiv-cs.LG | 2024-02-06 |
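Linear attention, the setting of the entry above, replaces the softmax score matrix with feature-mapped queries and keys so cost grows linearly in sequence length. Below is a minimal sketch using the common elu(x)+1 feature map as a stand-in; Hedgehog itself learns a spikier, softmax-mimicking map.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v):
    """Linear attention in O(n * d^2). The feature map here is the elu(x)+1
    stand-in from earlier linear-attention work, not Hedgehog's learned map."""
    phi = lambda t: F.elu(t) + 1                       # keeps features positive
    q, k = phi(q), phi(k)
    kv = torch.einsum("bnd,bne->bde", k, v)            # aggregate keys/values once
    num = torch.einsum("bnd,bde->bne", q, kv)          # per-query numerator
    den = torch.einsum("bnd,bd->bn", q, k.sum(dim=1))  # per-query normalizer
    return num / den.unsqueeze(-1).clamp(min=1e-6)

out = linear_attention(torch.randn(2, 128, 16), torch.randn(2, 128, 16),
                       torch.randn(2, 128, 16))
print(out.shape)  # torch.Size([2, 128, 16])
```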
1371 | Chinese Cyberbullying Detection Using XLNet and Deep Bi-LSTM Hybrid Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The popularization of the internet and the widespread use of smartphones have led to a rapid growth in the number of social media users. While information technology has brought … |
Shifeng Chen; Jialin Wang; Ketai He; | Inf. | 2024-02-06 |
1372 | CEHR-GPT: Generating Electronic Health Records with Chronological Patient Timelines Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on synthetic data generation and demonstrate the capability of training a GPT model using a particular patient representation derived from CEHR-BERT, enabling us to generate patient sequences that can be seamlessly converted to the Observational Medical Outcomes Partnership (OMOP) data format. |
CHAO PANG et. al. | arxiv-cs.LG | 2024-02-06 |
1373 | Behind The Screen: Investigating ChatGPT’s Dark Personality Traits and Conspiracy Beliefs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: ChatGPT is notorious for its opaque behavior. This paper aims to shed light on it, providing an in-depth analysis of the dark personality traits and conspiracy beliefs of GPT-3.5 and GPT-4. |
Erik Weber; Jérôme Rutinowski; Markus Pauly; | arxiv-cs.CL | 2024-02-06 |
1374 | MobilityGPT: Enhanced Human Mobility Modeling with A GPT Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these issues, we reformat human mobility modeling as an autoregressive generation task, leveraging Generative Pre-trained Transformer (GPT). To ensure its controllable generation to alleviate the above challenges, we propose a geospatially-aware generative model, MobilityGPT. |
Ammar Haydari; Dongjie Chen; Zhengfeng Lai; Chen-Nee Chuah; | arxiv-cs.LG | 2024-02-05 |
1375 | Reconstruct Your Previous Conversations! Comprehensively Investigating Privacy Leakage Risks in Conversations with GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a straightforward yet potent Conversation Reconstruction Attack. |
Junjie Chu; Zeyang Sha; Michael Backes; Yang Zhang; | arxiv-cs.CR | 2024-02-05 |
1376 | Self-Discover: Large Language Models Self-Compose Reasoning Structures IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the task-intrinsic reasoning structures to tackle complex reasoning problems that are challenging for typical prompting methods. |
PEI ZHOU et. al. | arxiv-cs.AI | 2024-02-05 |
1377 | Pard: Permutation-Invariant Autoregressive Diffusion for Graph Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce PARD, a Permutation-invariant Auto Regressive Diffusion model that integrates diffusion models with autoregressive methods. |
Lingxiao Zhao; Xueying Ding; Leman Akoglu; | arxiv-cs.LG | 2024-02-05 |
1378 | Comparing Abstraction in Humans and Large Language Models Using Multimodal Serial Reproduction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To investigate the effect of language on the formation of abstractions, we implement a novel multimodal serial reproduction framework by asking people who receive a visual stimulus to reproduce it in a linguistic format, and vice versa. |
SREEJAN KUMAR et. al. | arxiv-cs.AI | 2024-02-05 |
1379 | UniMem: Towards A Unified View of Long-Context Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We re-formulate 16 existing methods based on UniMem and cast four representative methods, Transformer-XL, Memorizing Transformer, RMT, and Longformer, into equivalent UniMem forms to reveal their design principles and strengths. Based on these analyses, we propose UniMix, an innovative approach that integrates the strengths of these algorithms. |
JUNJIE FANG et. al. | arxiv-cs.CL | 2024-02-05 |
1380 | A Survey on Transformer Compression IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer plays a vital role in the realms of natural language processing (NLP) and computer vision (CV), especially for constructing large language models (LLM) and large vision … |
YEHUI TANG et. al. | ArXiv | 2024-02-05 |
1381 | Illuminate: A Novel Approach for Depression Detection with Explainable Analysis and Proactive Therapy Using Prompt Engineering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper introduces a novel paradigm for depression detection and treatment using advanced Large Language Models (LLMs): Generative Pre-trained Transformer 4 (GPT-4), Llama 2 … |
Aryan Agrawal; | ArXiv | 2024-02-05 |
1382 | SWAG: Storytelling With Action Guidance Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Storytelling With Action Guidance (SWAG), a novel approach to storytelling with LLMs. |
Zeeshan Patel; Karim El-Refai; Jonathan Pei; Tianle Li; | arxiv-cs.CL | 2024-02-05 |
1383 | Harnessing PubMed User Query Logs for Post Hoc Explanations of Recommended Similar Articles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our major contribution is building PubCLogs by repurposing 5.6 million pairs of coclicked articles from PubMed’s user query logs. |
Ashley Shin; Qiao Jin; James Anibal; Zhiyong Lu; | arxiv-cs.IR | 2024-02-05 |
1384 | Identifying Reasons for Contraceptive Switching from Real-World Data Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We demonstrate that GPT-4 can accurately extract reasons for contraceptive switching, outperforming baseline BERT-based models with microF1 scores of 0.849 and 0.881 for contraceptive start and stop extraction, respectively. |
BRENDA Y. MIAO et. al. | arxiv-cs.CL | 2024-02-05 |
1385 | Evaluating Large Language Models in Analysing Classroom Dialogue Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the application of Large Language Models (LLMs), specifically GPT-4, in the analysis of classroom dialogue, a crucial research task for both teaching diagnosis and quality improvement. |
Yun Long; Haifeng Luo; Yu Zhang; | arxiv-cs.CL | 2024-02-04 |
1386 | DenseFormer: Enhancing Information Flow in Transformers Via Depth Weighted Averaging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The transformer architecture by Vaswani et al. (2017) is now ubiquitous across application domains, from natural language processing to speech processing and image understanding. We propose DenseFormer, a simple modification to the standard architecture that improves the perplexity of the model without increasing its size, adding only a few thousand parameters for large-scale models in the 100B-parameter range. |
Matteo Pagliardini; Amirkeivan Mohtashami; Francois Fleuret; Martin Jaggi; | arxiv-cs.CL | 2024-02-04 |
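A minimal sketch of depth-weighted averaging follows, under the assumption that each block's output is mixed with all earlier representations through a handful of learned scalars; the authors' exact wiring may differ.

```python
import torch
import torch.nn as nn

class DepthWeightedStack(nn.Module):
    """Sketch of DenseFormer-style depth-weighted averaging (details assumed):
    after each block, mix all representations seen so far with learned scalars."""
    def __init__(self, num_blocks: int, dim: int):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
            for _ in range(num_blocks)
        )
        # One scalar per (block, earlier representation) pair: only a few
        # extra parameters regardless of model width.
        self.alphas = nn.ParameterList(
            nn.Parameter(torch.ones(i + 2) / (i + 2)) for i in range(num_blocks)
        )

    def forward(self, x):
        history = [x]
        for block, alpha in zip(self.blocks, self.alphas):
            history.append(block(history[-1]))
            stacked = torch.stack(history, dim=0)          # (depth, batch, seq, dim)
            history[-1] = (alpha.view(-1, 1, 1, 1) * stacked).sum(0)
        return history[-1]

model = DepthWeightedStack(num_blocks=2, dim=32)
print(model(torch.randn(2, 8, 32)).shape)  # torch.Size([2, 8, 32])
```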
1387 | Improving Assessment of Tutoring Practices Using Retrieval-Augmented Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: One-on-one tutoring is an effective instructional method for enhancing learning, yet its efficacy hinges on tutor competencies. Novice math tutors often prioritize … |
ZIFEI HAN et. al. | ArXiv | 2024-02-04 |
1388 | Spin: An Efficient Secure Computation Framework with GPU Acceleration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose optimized protocols for non-linear functions that are critical for machine learning, as well as several novel optimizations specific to attention that is the fundamental unit of Transformer models, allowing Spin to perform non-trivial CNNs training and Transformer inference without sacrificing security. |
WUXUAN JIANG et. al. | arxiv-cs.CR | 2024-02-03 |
1389 | GPT-4V As Traffic Assistant: An In-depth Look at Vision Language Model on Complex Traffic Events Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The advent of large vision-language models (VLMs) such as GPT-4V has introduced innovative approaches to addressing this issue. In this paper, we explore the ability of GPT-4V with a set of representative traffic incident videos and delve into the model’s capacity for understanding these complex traffic situations. |
Xingcheng Zhou; Alois C. Knoll; | arxiv-cs.CV | 2024-02-03 |
1390 | User Intent Recognition and Satisfaction with Large Language Models: A User Study with ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on a fine-grained intent taxonomy and intent-based prompt reformulations, we analyze (1) the quality of intent recognition and (2) user satisfaction with answers from intent-based prompt reformulations for two recent ChatGPT models, GPT-3.5 Turbo and GPT-4 Turbo. |
Anna Bodonhelyi; Efe Bozkir; Shuo Yang; Enkelejda Kasneci; Gjergji Kasneci; | arxiv-cs.HC | 2024-02-03 |
1391 | Data Quality Matters: Suicide Intention Detection on Social Media Posts Using A RoBERTa-CNN Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on identifying suicidal intentions in SuicideWatch Reddit posts and present a novel approach to suicide detection using the cutting-edge RoBERTa-CNN model, a variant of RoBERTa (Robustly optimized BERT approach). |
Emily Lin; Jian Sun; Hsingyu Chen; Mohammad H. Mahoor; | arxiv-cs.CL | 2024-02-03 |
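One plausible reading of a RoBERTa-CNN is a 1-D convolutional head pooled over RoBERTa's token representations; the sketch below assumes that layout, and the kernel size, pooling, and classifier head are illustrative choices rather than the authors' exact architecture.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class RobertaCNN(nn.Module):
    """Assumed RoBERTa-CNN layout: Conv1d + global max pooling over
    RoBERTa token embeddings, then a linear classifier."""
    def __init__(self, num_labels: int = 2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("roberta-base")
        hidden = self.encoder.config.hidden_size
        self.conv = nn.Conv1d(hidden, 128, kernel_size=3, padding=1)
        self.classifier = nn.Linear(128, num_labels)

    def forward(self, **inputs):
        states = self.encoder(**inputs).last_hidden_state      # (batch, seq, hidden)
        feats = torch.relu(self.conv(states.transpose(1, 2)))  # (batch, 128, seq)
        pooled = feats.max(dim=-1).values                      # global max pool
        return self.classifier(pooled)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
batch = tokenizer(["I feel hopeless lately."], return_tensors="pt")
print(RobertaCNN()(**batch).shape)  # torch.Size([1, 2])
```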
1392 | MasonPerplexity at Multimodal Hate Speech Event Detection 2024: Hate Speech and Target Detection Using Transformer Ensembles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the MasonPerplexity submission for the Shared Task on Multimodal Hate Speech Event Detection at CASE 2024 at EACL 2024. |
AMRITA GANGULY et. al. | arxiv-cs.CL | 2024-02-02 |
1393 | ScribFormer: Transformer Makes CNN Work Better for Scribble-based Medical Image Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite its multiple benefits, this framework generally captures only short-range feature dependencies, because convolutional layers have local receptive fields, which makes it difficult to learn global shape information from the limited information provided by scribble annotations. To address this issue, this paper proposes a new CNN-Transformer hybrid solution for scribble-supervised medical image segmentation called ScribFormer. |
ZIHAN LI et. al. | arxiv-cs.CV | 2024-02-02 |
1394 | LLM-Detector: Improving AI-Generated Chinese Text Detection with Open-Source LLM Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: ChatGPT and other general large language models (LLMs) have achieved remarkable success, but they have also raised concerns about the misuse of AI-generated texts. |
RONGSHENG WANG et. al. | arxiv-cs.CL | 2024-02-02 |
1395 | COMET: Generating Commit Messages Using Delta Graph Context Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Comet (Context-Aware Commit Message Generation), a novel approach that captures context of code changes using a graph-based representation and leverages a transformer-based model to generate high-quality commit messages. |
Abhinav Reddy Mandli; Saurabhsingh Rajput; Tushar Sharma; | arxiv-cs.SE | 2024-02-02 |
1396 | Faster Inference of Integer SWIN Transformer By Removing The GELU Activation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we improve upon the inference latency of the state-of-the-art methods by removing the floating-point operations, which are associated with the GELU activation in Swin Transformer. |
Mohammadreza Tayaranian; Seyyed Hasan Mozafari; James J. Clark; Brett Meyer; Warren Gross; | arxiv-cs.CV | 2024-02-02 |
1397 | Generation, Distillation and Evaluation of Motivational Interviewing-Style Reflections with A Foundational Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a method for distilling the generation of reflections from a Foundational Language Model (GPT-4) into smaller models. |
ANDREW BROWN et. al. | arxiv-cs.CL | 2024-02-01 |
1398 | A Transformer-CNN Parallel Network for Image Guided Depth Completion Related Papers Related Patents Related Grants Related Venues Related Experts View |
Tao Li; Xiucheng Dong; Jie Lin; Yonghong Peng; | Pattern Recognit. | 2024-02-01 |
1399 | Ultra Fast Transformers on FPGAs for Particle Physics Experiments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we have implemented critical components of a transformer model, such as multi-head attention and softmax layers. |
ZHIXING JIANG et. al. | arxiv-cs.LG | 2024-02-01 |
1400 | Self-Supervised Contrastive Pre-Training for Multivariate Point Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a new paradigm for self-supervised learning for multivariate point processes using a transformer encoder. |
Xiao Shou; Dharmashankar Subramanian; Debarun Bhattacharjya; Tian Gao; Kristin P. Bennet; | arxiv-cs.LG | 2024-02-01 |
1401 | Rail Surface Defect Detection Using A Transformer-based Network Related Papers Related Patents Related Grants Related Venues Related Experts View |
Feng Guo; Jian Liu; Yu Qian; Quanyi Xie; | J. Ind. Inf. Integr. | 2024-02-01 |
1402 | Spatiotemporal Fusion Transformer for Large-scale Traffic Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
ZHENGHONG WANG et. al. | Inf. Fusion | 2024-02-01 |
1403 | Efficient Image Analysis with Triple Attention Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
GeHui Li; Tongtong Zhao; | Pattern Recognit. | 2024-02-01 |
1404 | Intelligent Fault Diagnosis of Consumer Electronics Sensor in IoE Via Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The IoE era is arriving with the development of information and communication technology. As a typical representative of the intelligent IoE era, consumer electronics products have … |
Wen-Chieh Lin; | IEEE Transactions on Consumer Electronics | 2024-02-01 |
1405 | Comparative Study of Large Language Model Architectures on Frontier Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Employing the same materials science text corpus and a comprehensive end-to-end pipeline, we conduct a comparative analysis of their training and downstream performance. |
Junqi Yin; Avishek Bose; Guojing Cong; Isaac Lyngaas; Quentin Anthony; | arxiv-cs.DC | 2024-02-01 |
1406 | Understanding The Expressive Power and Mechanisms of Transformer for Sequence Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct a systematic study of the approximation properties of Transformer for sequence modeling with long, sparse and complicated memory. |
Mingze Wang; Weinan E; | arxiv-cs.LG | 2024-02-01 |
1407 | Transformer-based Sensor Failure Prediction and Classification Framework for UAVs Related Papers Related Patents Related Grants Related Venues Related Experts View |
MUHAMMAD WAQAS AHMAD et. al. | Expert Syst. Appl. | 2024-02-01 |
1408 | Masked Siamese Prompt Tuning for Few-Shot Natural Language Understanding Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recently, prompt-based learning has shown excellent performance on few-shot scenarios. Using frozen language models to tune trainable continuous prompt embeddings has become a … |
Shiwen Ni; Hung-Yu Kao; | IEEE Transactions on Artificial Intelligence | 2024-02-01 |
1409 | TFMFT: Transformer-based Multiple Fish Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View |
Weiran Li; Yeqiang Liu; Wenxu Wang; Zhenbo Li; Jun Yue; | Comput. Electron. Agric. | 2024-02-01 |
1410 | RobinNet: A Multimodal Speech Emotion Recognition System With Speaker Recognition for Social Interactions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: It is essential to understand the underlying emotions that are imparted through speech in order to study social communications as well as to generate seamless human–computer … |
Yash Khurana; Swamita Gupta; R. Sathyaraj; S. Raja; | IEEE Transactions on Computational Social Systems | 2024-02-01 |
1411 | Identification Method of Interturn Short Circuit Fault for Distribution Transformer Based on Power Loss Variation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Interturn short circuit faults in distribution transformer windings occur frequently and are difficult to monitor accurately in real time, which seriously affects the reliability of … |
R. XIAN et. al. | IEEE Transactions on Industrial Informatics | 2024-02-01 |
1412 | Dendritic Learning-Incorporated Vision Transformer for Image Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View |
Zhiming Zhang; Zhenyu Lei; M. Omura; Hideyuki Hasegawa; Shangce Gao; | IEEE CAA J. Autom. Sinica | 2024-02-01 |
1413 | STFormer: A Dual-stage Transformer Model Utilizing Spatio-temporal Graph Embedding for Multivariate Time Series Forecasting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Multivariate Time Series (MTS) forecasting has gained significant importance in diverse domains. Although Recurrent Neural Network (RNN)-based approaches have made notable … |
Yuteng Xiao; Zhaoyang Liu; Hongsheng Yin; Xingang Wang; Yudong Zhang; | J. Intell. Fuzzy Syst. | 2024-02-01 |
1414 | HARDSEA: Hybrid Analog-ReRAM Clustering and Digital-SRAM In-Memory Computing Accelerator for Dynamic Sparse Self-Attention in Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Self-attention-based transformers have outperformed recurrent and convolutional neural networks (RNNs/CNNs) in many applications. Despite the effectiveness, calculating … |
SHIWEI LIU et. al. | IEEE Transactions on Very Large Scale Integration (VLSI) … | 2024-02-01 |
1415 | COA-GPT: Generative Pre-trained Transformers for Accelerated Course of Action Development in Military Operations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The development of Courses of Action (COAs) in military operations is traditionally a time-consuming and intricate process. Addressing this challenge, this study introduces COA-GPT, a novel algorithm employing Large Language Models (LLMs) for rapid and efficient generation of valid COAs. |
Vinicius G. Goecks; Nicholas Waytowich; | arxiv-cs.AI | 2024-02-01 |
1416 | Lightweight Transformer Image Feature Extraction Network IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In recent years, the image feature extraction method based on Transformer has become a research hotspot. However, when using Transformer for image feature extraction, the model’s … |
Wenfeng Zheng; Siyu Lu; Youshuai Yang; Zhengtong Yin; Lirong Yin; | PeerJ Comput. Sci. | 2024-01-31 |
1417 | Global-Liar: Factuality of LLMs Over Time and Geographic Regions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce ‘Global-Liar,’ a dataset uniquely balanced in terms of geographic and temporal representation, facilitating a more nuanced evaluation of LLM biases. |
Shujaat Mirza; Bruno Coelho; Yuyuan Cui; Christina Pöpper; Damon McCoy; | arxiv-cs.CL | 2024-01-31 |
1418 | Paramanu: A Family of Novel Efficient Indic Generative Foundation Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We present Gyan AI Paramanu (atom), a family of novel language models for Indian languages. It is a collection of auto-regressive monolingual, bilingual, and multilingual Indic … |
Mitodru Niyogi; Arnab Bhattacharya; | ArXiv | 2024-01-31 |
1419 | Program Code Generation with Generative AIs Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Our paper compares the correctness, efficiency, and maintainability of human-generated and AI-generated program code. For that, we analyzed the computational resources of AI- and … |
Baskhad Idrisov; Tim Schlippe; | Algorithms | 2024-01-31 |
1420 | Evaluating The Effectiveness of GPT-4 Turbo in Creating Defeaters for Assurance Cases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Identifying defeaters, arguments that refute these ACs, is essential for improving the robustness and confidence in ACs. To automate this task, we introduce a novel method that leverages the capabilities of GPT-4 Turbo, an advanced Large Language Model (LLM) developed by OpenAI, to identify defeaters within ACs formalized using the Eliminative Argumentation (EA) notation. |
KIMYA KHAKZAD SHAHANDASHTI et. al. | arxiv-cs.SE | 2024-01-31 |
1421 | Fine-Tuning and Prompt Engineering for Large Language Models-based Code Review Automation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: We aim to investigate the performance of LLMs-based code review automation based on two contexts, i.e., when LLMs are leveraged by fine-tuning and prompting. |
Chanathip Pornprasit; Chakkrit Tantithamthavorn; | arxiv-cs.SE | 2024-01-31 |
1422 | Human-mediated Large Language Models for Robotic Intervention in Children with Autism Spectrum Disorders Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This practice restricts the use of robots to limited, pre-mediated instructional curricula. In this paper, we increase robot autonomy in one such robotic intervention for children with ASD by implementing perspective-taking teaching. |
Ruchik Mishra; Karla Conn Welch; Dan O Popa; | arxiv-cs.RO | 2024-01-31 |
1423 | Spatial-Spectral BERT for Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Several deep learning and transformer models have been recommended in previous research to deal with the classification of hyperspectral images (HSIs). Among them, one of the most … |
MAHMOOD ASHRAF et. al. | Remote. Sens. | 2024-01-31 |
1424 | Towards AI-Assisted Synthesis of Verified Dafny Methods IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we demonstrate how to improve two pretrained models’ proficiency in the Dafny verification-aware language. |
Md Rakib Hossain Misu; Cristina V. Lopes; Iris Ma; James Noble; | arxiv-cs.SE | 2024-01-31 |
1425 | ConSmax: Hardware-Friendly Alternative Softmax with Learnable Parameters Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Apart from the non-linearity, the low arithmetic intensity greatly reduces the processing parallelism, which becomes the bottleneck especially when dealing with a longer context. To address this challenge, we propose Constant Softmax (ConSmax), a software-hardware co-design as an efficient Softmax alternative. |
SHIWEI LIU et. al. | arxiv-cs.AR | 2024-01-31 |
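Going by the description above, a softmax alternative with learnable parameters could replace the per-row max and sum reductions, which serialize hardware pipelines, with trained scalars. The sketch below encodes that assumption and should not be read as the paper's precise ConSmax formulation.

```python
import torch
import torch.nn as nn

class ConSmaxSketch(nn.Module):
    """Assumed softmax alternative: learnable constants stand in for the
    per-row max (beta) and the normalizing denominator (gamma), removing
    the two reduction passes that ordinary softmax requires."""
    def __init__(self):
        super().__init__()
        self.beta = nn.Parameter(torch.zeros(1))   # trained stand-in for the max
        self.gamma = nn.Parameter(torch.ones(1))   # trained stand-in for the sum

    def forward(self, scores):
        return torch.exp(scores - self.beta) / self.gamma

attn = ConSmaxSketch()(torch.randn(2, 4, 4))
print(attn.shape)  # rows need not sum exactly to 1, unlike true softmax
```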
1426 | BurstGPT: A Real-world Workload Dataset to Optimize LLM Serving Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work presents BurstGPT, an LLM serving workload with 5.29 million traces from regional Azure OpenAI GPT services over 121 days. |
YUXIN WANG et. al. | arxiv-cs.DC | 2024-01-31 |
1427 | Scavenging Hyena: Distilling Transformers Into Long Convolution Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a pioneering approach to address the efficiency concerns associated with LLM pre-training, proposing the use of knowledge distillation for cross-architecture transfer. |
Tokiniaina Raharison Ralambomihanta; Shahrad Mohammadzadeh; Mohammad Sami Nur Islam; Wassim Jabbour; Laurence Liang; | arxiv-cs.CL | 2024-01-30 |
1428 | Arabic Tweet Act: A Weighted Ensemble Pre-Trained Transformer Model for Classifying Arabic Speech Acts on Twitter Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a Twitter dialectal Arabic speech act classification approach based on a transformer deep learning neural network. |
Khadejaa Alshehri; Areej Alhothali; Nahed Alowidi; | arxiv-cs.CL | 2024-01-30 |
1429 | SAL-PIM: A Subarray-level Processing-in-Memory Architecture with LUT-based Linear Interpolation for Transformer-based Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SAL-PIM, a subarray-level, HBM-based processing-in-memory architecture for the end-to-end acceleration of transformer-based text generation. |
Wontak Han; Hyunjun Cho; Donghyuk Kim; Joo-Young Kim; | arxiv-cs.AR | 2024-01-30 |
1430 | Enhancing Product Design Through AI-Driven Sentiment Analysis of Amazon Reviews Using BERT Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Understanding customer emotions and preferences is paramount for success in the dynamic product design landscape. This paper presents a study to develop a prediction pipeline to … |
Mahammad Khalid Shaik Vadla; Mahima Agumbe Suresh; V. Viswanathan; | Algorithms | 2024-01-30 |
1431 | Fine-tuning Transformer-based Encoder for Turkish Language Understanding Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we provide a Transformer-based model and a baseline benchmark for the Turkish Language. |
Savas Yildirim; | arxiv-cs.CL | 2024-01-30 |
1432 | 3DG: A Framework for Using Generative AI for Handling Sparse Learner Performance Data From Intelligent Tutoring Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Learning performance data (e.g., quiz scores and attempts) is significant for understanding learner engagement and knowledge mastery level. However, the learning performance data … |
Liang Zhang; Jionghao Lin; Conrad Borchers; Meng Cao; Xiangen Hu; | ArXiv | 2024-01-29 |
1433 | More Than Meets The AI: Evaluating The Performance of GPT-4 on Computer Graphics Assessment Questions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies have showcased the exceptional performance of LLMs (Large Language Models) on assessment questions across various discipline areas. This can be helpful if used to … |
Tony Haoran Feng; Paul Denny; Burkhard C. Wünsche; Andrew Luxton-Reilly; Steffan Hooper; | Proceedings of the 26th Australasian Computing Education … | 2024-01-29 |
1434 | Response Generation for Cognitive Behavioral Therapy with Large Language Models: Comparative Study with Socratic Questioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we construct dialogue modules based on a CBT scenario focused on conventional Socratic questioning using two kinds of LLMs: a Transformer-based dialogue model further trained with a social media empathetic counseling dataset, provided by Osaka Prefecture (OsakaED), and GPT-4, a state-of-the-art LLM created by OpenAI. |
KENTA IZUMI et. al. | arxiv-cs.CL | 2024-01-29 |
1435 | TQCompressor: Improving Tensor Decomposition Methods in Neural Networks Via Permutations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce TQCompressor, a novel method for neural network model compression with improved tensor decompositions. |
V. ABRONIN et. al. | arxiv-cs.LG | 2024-01-29 |
1436 | Breaking Free Transformer Models: Task-specific Context Attribution Promises Improved Generalizability Without Fine-tuning Pre-trained LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a framework that allows for maintaining generalizability, and enhances the performance on the downstream task by utilizing task-specific context attribution. |
Stepan Tytarenko; Mohammad Ruhul Amin; | arxiv-cs.CL | 2024-01-29 |
1437 | You Tell Me: A Dataset of GPT-4-Based Behaviour Change Support Conversations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Conversational agents are increasingly used to address emotional needs on top of information needs. One use case of increasing interest are counselling-style mental health and … |
Selina Meyer; David Elsweiler; | arxiv-cs.HC | 2024-01-29 |
1438 | Leveraging Professional Radiologists’ Expertise to Enhance LLMs’ Evaluation for Radiology Reports Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In radiology, Artificial Intelligence (AI) has significantly advanced report generation, but automatic evaluation of these AI-produced reports remains challenging. Current … |
QINGQING ZHU et. al. | ArXiv | 2024-01-29 |
1439 | Leveraging Professional Radiologists’ Expertise to Enhance LLMs’ Evaluation for Radiology Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Experimental results show that our Detailed GPT-4 (5-shot) model achieves a 0.48 score, outperforming the METEOR metric by 0.19, while our Regressed GPT-4 model shows even greater alignment with expert evaluations, exceeding the best existing metric by a 0.35 margin. |
QINGQING ZHU et. al. | arxiv-cs.CL | 2024-01-29 |
1440 | An Insight Into Security Code Review with LLMs: Capabilities, Obstacles and Influential Factors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we conducted an empirical study to explore the potential of LLMs in detecting security defects during code review. |
JIAXIN YU et. al. | arxiv-cs.SE | 2024-01-29 |
1441 | Evaluating LLM – Generated Multimodal Diagnosis from Medical Images and Symptom Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) constitute a breakthrough state-of-the-art Artificial Intelligence technology which is rapidly evolving and promises to aid in medical diagnosis. … |
Dimitrios P. Panagoulias; M. Virvou; G. Tsihrintzis; | ArXiv | 2024-01-28 |
1442 | Contrastive Learning and Mixture of Experts Enables Precise Vector Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We apply a novel Mixture of Experts (MoE) extension pipeline to pretrained BERT models, where every multi-layer perceptron section is enlarged and copied into multiple distinct experts. |
Logan Hallee; Rohan Kapur; Arjun Patel; Jason P. Gleghorn; Bohdan Khomtchouk; | arxiv-cs.LG | 2024-01-28 |
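The copy-into-experts step described above can be illustrated with a small mixture-of-experts layer: a pretrained feed-forward block is cloned into several experts and combined by a learned router. The dense softmax router below is a common choice assumed for illustration, not necessarily the authors' routing scheme.

```python
import copy
import torch
import torch.nn as nn

class MoEFFN(nn.Module):
    """Sketch (details assumed): clone a pretrained feed-forward block into
    several experts and mix their outputs with a learned softmax router."""
    def __init__(self, ffn: nn.Module, num_experts: int = 4, dim: int = 768):
        super().__init__()
        self.experts = nn.ModuleList(copy.deepcopy(ffn) for _ in range(num_experts))
        self.router = nn.Linear(dim, num_experts)

    def forward(self, x):  # x: (batch, seq, dim)
        weights = self.router(x).softmax(dim=-1)                  # (batch, seq, E)
        outputs = torch.stack([e(x) for e in self.experts], -1)   # (batch, seq, dim, E)
        return (outputs * weights.unsqueeze(2)).sum(-1)

ffn = nn.Sequential(nn.Linear(768, 3072), nn.GELU(), nn.Linear(3072, 768))
print(MoEFFN(ffn)(torch.randn(2, 5, 768)).shape)  # torch.Size([2, 5, 768])
```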
1443 | Identifying and Improving Disability Bias in GPT-Based Resume Screening Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, without examining the potential of bias, this may negatively impact marginalized populations, including people with disabilities. To address this important concern, we present a resume audit study, in which we ask ChatGPT (specifically, GPT-4) to rank a resume against the same resume enhanced with an additional leadership award, scholarship, panel presentation, and membership that are disability related. |
Kate Glazko; Yusuf Mohammed; Ben Kosa; Venkatesh Potluri; Jennifer Mankoff; | arxiv-cs.CY | 2024-01-28 |
1444 | UnMASKed: Quantifying Gender Biases in Masked Language Models Through Linguistically Informed Job Market Prompts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluated six prominent models: BERT, RoBERTa, DistilBERT, BERT-multilingual, XLM-RoBERTa, and DistilBERT-multilingual. |
Iñigo Parra; | arxiv-cs.CL | 2024-01-28 |
1445 | Semantics of Multiword Expressions in Transformer-Based Models: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Addressing this gap, we provide the first in-depth survey of MWE processing with transformer models. We overall find that they capture MWE semantics inconsistently, as shown by reliance on surface patterns and memorized information. |
Filip Miletić; Sabine Schulte im Walde; | arxiv-cs.CL | 2024-01-27 |
1446 | A New Method for Vehicle Logo Recognition Based on Swin Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we implement real-time VLR using Swin Transformer and fine-tune it for optimal performance. |
Yang Li; Doudou Zhang; Jianli Xiao; | arxiv-cs.CV | 2024-01-27 |
1447 | Large Language Model for Vulnerability Detection: Emerging Results and Future Directions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the effectiveness of LLMs in detecting software vulnerabilities is largely unexplored. This paper aims to bridge this gap by exploring how LLMs perform with various prompts, particularly focusing on two state-of-the-art LLMs: GPT-3.5 and GPT-4. |
Xin Zhou; Ting Zhang; David Lo; | arxiv-cs.SE | 2024-01-27 |
1448 | Scalable Qualitative Coding with LLMs: Chain-of-Thought Reasoning Matches Human Performance in Some Hermeneutic Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Importantly, we find that coding fidelity improves considerably when the LLM is prompted to give rationale justifying its coding decisions (chain-of-thought reasoning). We present these and other findings along with a set of best practices for adapting traditional codebooks for LLMs. |
Zackary Okun Dunivin; | arxiv-cs.CL | 2024-01-26 |
1449 | Dynamic Transformer Architecture for Continual Learning of Multimodal Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a transformer-based CL framework focusing on learning tasks that involve both vision and language, known as Vision-and-Language (VaL) tasks. |
Yuliang Cai; Mohammad Rostami; | arxiv-cs.CV | 2024-01-26 |
1450 | From GPT-4 to Gemini and Beyond: Assessing The Landscape of MLLMs on Generalizability, Trustworthiness and Causality Through Four Modalities IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Multi-modal Large Language Models (MLLMs) have shown impressive abilities in generating reasonable responses with respect to multi-modal contents. |
CHAOCHAO LU et. al. | arxiv-cs.CV | 2024-01-26 |
1451 | (Chat)GPT V BERT: Dawn of Justice for Semantic Change Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we specifically focus on the temporal problem of semantic change, and evaluate their ability to solve two diachronic extensions of the Word-in-Context (WiC) task: TempoWiC and HistoWiC. |
Francesco Periti; Haim Dubossarsky; Nina Tahmasebi; | arxiv-cs.CL | 2024-01-25 |
1452 | Investigate-Consolidate-Exploit: A General Strategy for Inter-Task Agent Self-Evolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Investigate-Consolidate-Exploit (ICE), a novel strategy for enhancing the adaptability and flexibility of AI agents through inter-task self-evolution. |
CHENG QIAN et. al. | arxiv-cs.CL | 2024-01-25 |
1453 | Chat GPT for Professional English Course Development Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The digitalization of all spheres of life is a reality of modern world development. Global digitization creates a powerful information environment, and navigating it requires serious … |
I. KOSTIKOVA et. al. | Int. J. Interact. Mob. Technol. | 2024-01-25 |
1454 | Relative Value Biases in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Studies of reinforcement learning in humans and animals have demonstrated a preference for options that yielded relatively better outcomes in the past, even when those options are associated with lower absolute reward. |
William M. Hayes; Nicolas Yax; Stefano Palminteri; | arxiv-cs.CL | 2024-01-25 |
1455 | MEDs for PETs: Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We train a multilingual transformer model (XLM-RoBERTa) to disambiguate potentially euphemistic terms (PETs) in multilingual and cross-lingual settings. |
PATRICK LEE et. al. | arxiv-cs.CL | 2024-01-25 |
1456 | Evaluating GPT-3.5’s Awareness and Summarization Abilities for European Constitutional Texts with Shared Topics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, using the renowned GPT-3.5, we leverage generative large language models to understand constitutional passages that transcend national boundaries. |
Candida M. Greco; A. Tagarelli; | arxiv-cs.CL | 2024-01-25 |
1457 | When Geoscience Meets Generative AI and Large Language Models: Foundations, Trends, and Future Challenges Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative Artificial Intelligence (GAI) represents an emerging field that promises the creation of synthetic data and outputs in different modalities. GAI has recently shown … |
A. Hadid; Tanujit Chakraborty; Daniel Busby; | ArXiv | 2024-01-25 |
1458 | An In-Depth Review of ChatGPT’s Pros and Cons for Learning and Teaching in Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: As technology progresses, there has been an increasing interest in using Chatbot GPT (Generative Pre-trained Transformer) in education. Chatbot GPT, or ChatGPT, gained one million … |
A. Samala; Xiaoming Zhai; Kumiko Aoki; Ljubiša Bojić; Simona Žikić; | Int. J. Interact. Mob. Technol. | 2024-01-25 |
1459 | Unmasking and Quantifying Racial Bias of Large Language Models in Medical Report Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite few attempts in the past, the precise impact and extent of these biases remain uncertain. Through both qualitative and quantitative analyses, we find that these models tend to project higher costs and longer hospitalizations for White populations and exhibit optimistic views in challenging medical scenarios with much higher survival rates. |
Yifan Yang; Xiaoyu Liu; Qiao Jin; Furong Huang; Zhiyong Lu; | arxiv-cs.CL | 2024-01-24 |
1460 | A Comparative Study of Zero-shot Inference with Large Language Models and Supervised Modeling in Breast Cancer Pathology Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explored whether recent LLMs can reduce the need for large-scale data annotations. |
MADHUMITA SUSHIL et. al. | arxiv-cs.CL | 2024-01-24 |
1461 | Automated Root Causing of Cloud Incidents Using In-Context Learning with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the high cost of fine-tuning LLM, we propose an in-context learning approach for automated root causing, which eliminates the need for fine-tuning. |
XUCHAO ZHANG et. al. | arxiv-cs.CL | 2024-01-24 |
1462 | Learning Daily Human Mobility with A Transformer-Based Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The generation and prediction of daily human mobility patterns have raised significant interest in many scientific disciplines. Using various data sources, previous studies have … |
Weiying Wang; T. Osaragi; | ISPRS Int. J. Geo Inf. | 2024-01-24 |
1463 | Discovering Mathematical Formulas from Data Via GPT-guided Monte Carlo Tree Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To optimize the trade-off between efficiency and versatility, we introduce SR-GPT, a novel algorithm for symbolic regression that integrates Monte Carlo Tree Search (MCTS) with a Generative Pre-Trained Transformer (GPT). |
YANJIE LI et. al. | arxiv-cs.LG | 2024-01-24 |
1464 | Can GPT-3.5 Generate and Code Discharge Summaries? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We report micro- and macro-F1 scores on the full codeset, generation codes, and their families. |
MATÚŠ FALIS et. al. | arxiv-cs.CL | 2024-01-24 |
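For readers comparing the two averaging schemes named in the highlight: micro-F1 pools every individual decision into one contingency table, while macro-F1 averages per-class scores, so rare codes weigh more heavily under the macro view. A short scikit-learn example on toy labels:

```python
from sklearn.metrics import f1_score

# Toy multi-class predictions illustrating the two averaging schemes.
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 1, 1, 1]

print(f1_score(y_true, y_pred, average="micro"))  # pools all decisions equally
print(f1_score(y_true, y_pred, average="macro"))  # averages per-class F1 scores
```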
1465 | ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce ConTextual, a novel dataset featuring human-crafted instructions that require context-sensitive reasoning for text-rich images. |
Rohan Wadhawan; Hritik Bansal; Kai-Wei Chang; Nanyun Peng; | arxiv-cs.CV | 2024-01-24 |
1466 | Convolutional Initialization for Data-Efficient Vision Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In contrast, convolutional neural networks (CNNs) can achieve state-of-the-art performance by leveraging their architectural inductive bias. In this paper, we investigate whether this inductive bias can be reinterpreted as an initialization bias within a vision transformer network. |
Jianqiao Zheng; Xueqian Li; Simon Lucey; | arxiv-cs.CV | 2024-01-23 |
1467 | TAT-LLM: A Specialized Language Model for Discrete Reasoning Over Tabular and Textual Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address question answering (QA) over a hybrid of tabular and textual data that are very common content on the Web (e.g. SEC filings), where discrete reasoning capabilities are often required. |
FENGBIN ZHU et. al. | arxiv-cs.CL | 2024-01-23 |
1468 | Contrastive Learning in Distilled Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these models have yet to perform well on Semantic Textual Similarity, and may be too large to be deployed as lightweight edge applications. We seek to apply a suitable contrastive learning method based on the SimCSE paper to a model architecture adapted from a knowledge distillation based model, DistilBERT, to address these two issues. |
Valerie Lim; Kai Wen Ng; Kenneth Lim; | arxiv-cs.CL | 2024-01-22 |
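The SimCSE recipe referenced above treats two dropout-perturbed encodings of the same sentence as a positive pair and trains with an InfoNCE objective. The sketch below applies the core loss computation to DistilBERT; the first-token pooling and the temperature value are assumed choices, not the authors' settings.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
encoder = AutoModel.from_pretrained("distilbert-base-uncased")
encoder.train()  # keep dropout active: the two passes then differ only by dropout

batch = tokenizer(["a cat sits", "stocks fell today"],
                  return_tensors="pt", padding=True)
z1 = encoder(**batch).last_hidden_state[:, 0]  # first-token embeddings, view 1
z2 = encoder(**batch).last_hidden_state[:, 0]  # view 2 (different dropout mask)

# InfoNCE: each sentence should match its own second view, not the others'.
sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / 0.05
loss = F.cross_entropy(sim, torch.arange(len(sim)))
print(loss.item())
```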
1469 | Enhancing In-context Learning Via Linear Probe Calibration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This approach uses prompts that include in-context demonstrations to generate the corresponding output for a new query input. |
MOMIN ABBAS et. al. | arxiv-cs.CL | 2024-01-22 |
1470 | Fine-tuning Large Language Models for Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on Subtask A & B. Each subtask is supported by three datasets for training, development, and testing. |
FENG XIONG et. al. | arxiv-cs.CL | 2024-01-22 |
1471 | Freely Long-Thinking Transformer (FraiLT) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Freely Long-Thinking Transformer (FraiLT) is an improved transformer model designed to enhance processing capabilities without scaling up size. It utilizes a recursive approach, iterating over a subset of layers multiple times, and introduces iteration encodings to maintain awareness across these cycles. |
Akbay Tabak; | arxiv-cs.LG | 2024-01-21 |
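Based on the description above, reusing a shared layer subset across several passes with iteration encodings might look like the following sketch; exactly where FraiLT injects the encodings is an assumption here.

```python
import torch
import torch.nn as nn

class RecursiveBlockSketch(nn.Module):
    """Assumed FraiLT-style recursion: one shared layer is applied several
    times, with an iteration embedding telling it which pass it is on."""
    def __init__(self, dim: int = 32, num_iters: int = 3):
        super().__init__()
        self.shared = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.iter_embed = nn.Embedding(num_iters, dim)
        self.num_iters = num_iters

    def forward(self, x):
        for i in range(self.num_iters):
            # Same weights each pass; only the iteration encoding changes.
            x = self.shared(x + self.iter_embed.weight[i])
        return x

print(RecursiveBlockSketch()(torch.randn(2, 8, 32)).shape)  # torch.Size([2, 8, 32])
```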
1472 | Revolutionizing Finance with LLMs: An Overview of Applications and Insights IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we provide a comprehensive overview of the emerging integration of LLMs into various financial tasks. |
HUAQIN ZHAO et. al. | arxiv-cs.CL | 2024-01-21 |
1473 | CheX-GPT: Harnessing Large Language Models for Enhanced Chest X-ray Report Labeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Free-text radiology reports present a rich data source for various medical tasks, but effectively labeling these texts remains challenging. |
JAWOOK GU et. al. | arxiv-cs.CL | 2024-01-21 |
1474 | Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a fault protection mechanism that incurs zero space cost. |
BINGBING LI et. al. | arxiv-cs.LG | 2024-01-21 |
1475 | Unfair TOS: An Automated Approach Using Customized BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research paper, we present SOTA (state-of-the-art) results on unfair clause detection from ToS documents, based on unprecedented custom BERT fine-tuning in conjunction with an SVC (Support Vector Classifier). |
Bathini Sai Akash; Akshara Kupireddy; Lalita Bhanu Murthy; | arxiv-cs.CL | 2024-01-20 |
1476 | Visualization Generation with Large Language Models: An Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we evaluate the capability of a large language model to generate visualization specifications on the task of natural language to visualization (NL2VIS). |
GUOZHENG LI et. al. | arxiv-cs.HC | 2024-01-20 |
1477 | Evaluating and Enhancing Large Language Models Performance in Domain-specific Medicine: Osteoarthritis Management with DocOA Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The efficacy of large language models (LLMs) in domain-specific medicine, particularly for managing complex diseases such as osteoarthritis (OA), remains largely unexplored. This … |
XI CHEN et. al. | ArXiv | 2024-01-20 |
1478 | Mining Experimental Data from Materials Science Literature with Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study is dedicated to assessing the capabilities of large language models (LLMs) such as GPT-3.5-Turbo, GPT-4, and GPT-4-Turbo in extracting structured information from … |
Luca Foppiano; Guillaume Lambard; Toshiyuki Amagasa; Masashi Ishii; | ArXiv | 2024-01-19 |
1479 | Cross-lingual Editing in Multilingual Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For more comprehensive information, the dataset used in this research and the associated code are publicly available at the following URL: https://github.com/lingo-iitgn/XME |
Himanshu Beniwal; Kowsik Nandagopan D; Mayank Singh; | arxiv-cs.CL | 2024-01-19 |
1480 | Mining Experimental Data from Materials Science Literature with Large Language Models: An Evaluation Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel methodology for the comparative analysis of intricate material expressions, emphasising the standardisation of chemical formulas to tackle the complexities inherent in materials science information assessment. |
Luca Foppiano; Guillaume Lambard; Toshiyuki Amagasa; Masashi Ishii; | arxiv-cs.CL | 2024-01-19 |
1481 | Emotion Classification In Software Engineering Texts: A Comparative Analysis of Pre-trained Transformers Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a comparative analysis of state-of-the-art Pre-trained Language Models (PTMs) for fine-grained emotion classification on two benchmark datasets from GitHub and Stack Overflow. |
Mia Mohammad Imran; | arxiv-cs.SE | 2024-01-19 |
1482 | Speech Swin-Transformer: Exploring A Hierarchical Transformer with Shifted Windows for Speech Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In speech signals, emotional information is distributed across different scales of speech features, e.g., word, phrase, and utterance. Drawing on this inspiration, this paper presents a hierarchical speech Transformer with shifted windows to aggregate multi-scale emotion features for speech emotion recognition (SER), called Speech Swin-Transformer. |
YONG WANG et. al. | arxiv-cs.CL | 2024-01-19 |
1483 | DB-GPT: Large Language Model Meets Database IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xuanhe Zhou; Zhaoyan Sun; Guoliang Li; | Data Sci. Eng. | 2024-01-19 |
1484 | Custom Developer GPT for Ethical AI Solutions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main goal of this project is to create a new software artefact: a custom Generative Pre-trained Transformer (GPT) for developers to discuss and solve ethical issues through AI engineering. |
Lauren Olson; | arxiv-cs.SE | 2024-01-19 |
1485 | Image Recoloring for Color Vision Deficiency Compensation Using Swin Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
LIGENG CHEN et. al. | Neural Comput. Appl. | 2024-01-18 |
1486 | Improving The Accuracy of Analog-Based In-Memory Computing Accelerators Post-Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose two Post-Training (PT) optimization methods to improve accuracy after training is performed. |
COREY LAMMIE et. al. | arxiv-cs.ET | 2024-01-18 |
1487 | ChatQA: Surpassing GPT-4 on Conversational QA and RAG IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce ChatQA, a suite of models that outperform GPT-4 on retrieval-augmented generation (RAG) and conversational question answering (QA). |
ZIHAN LIU et. al. | arxiv-cs.CL | 2024-01-18 |
1488 | Gender Bias in Machine Translation and The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This chapter examines the role of Machine Translation in perpetuating gender bias, highlighting the challenges posed by cross-linguistic settings and statistical dependencies. |
Eva Vanmassenhove; | arxiv-cs.CL | 2024-01-18 |
1489 | GPT in Sheep’s Clothing: The Risk of Customized GPTs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We aim to raise awareness of the fact that GPTs can be used maliciously, posing privacy and security risks to their users. |
SAGIV ANTEBI et. al. | arxiv-cs.CR | 2024-01-17 |
1490 | Machines Do See Color: A Guideline to Classify Different Forms of Racist Discourse in Large Corpora Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article provides a step-by-step generalizable guideline to identify and classify different forms of racist discourse in large corpora. In our approach, we start by conceptualizing racism and its different manifestations. |
Diana Davila Gordillo; Joan Timoneda; Sebastian Vallejo Vera; | arxiv-cs.CL | 2024-01-17 |
1491 | Efficient Slot Labelling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a lightweight method which performs on par with or better than the state-of-the-art PLM-based methods, while having almost 10x fewer trainable parameters. |
Vladimir Vlasov; | arxiv-cs.CL | 2024-01-17 |
1492 | Land Cover Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We compare convolutional neural networks (CNN) against transformer-based methods, showcasing their applications and advantages in LC studies. |
Antonio Rangel; Juan Terven; Diana M. Cordova-Esparza; E. A. Chavez-Urbiola; | arxiv-cs.CV | 2024-01-17 |
1493 | Human Vs. LMMs: Exploring The Discrepancy in Emoji Interpretation and Usage in Digital Communication Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Leveraging Large Multimodal Models (LMMs) to simulate human behaviors when processing multimodal information, especially in the context of social media, has garnered immense interest due to its broad potential and far-reaching implications. |
Hanjia Lyu; Weihong Qi; Zhongyu Wei; Jiebo Luo; | arxiv-cs.CV | 2024-01-16 |
1494 | Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This communication bottleneck exacerbates the already complex computational landscape, hindering the efficient utilization of high-performance computing resources. In this paper, we propose a lightweight optimization technique called ExFlow to substantially accelerate the inference of these MoE models. |
Jinghan Yao; Quentin Anthony; Aamir Shafi; Hari Subramoni; Dhabaleswar K.; | arxiv-cs.LG | 2024-01-16 |
1495 | Hidden Flaws Behind Expert-level Accuracy of Multimodal GPT-4 Vision in Medicine Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we discovered that GPT-4V frequently presents flawed rationales in cases where it makes the correct final choices (35.5%), most prominently in image comprehension (27.2%). Despite GPT-4V’s high accuracy on multiple-choice questions, our findings emphasize the necessity for further in-depth evaluation of its rationales before integrating such multimodal AI models into clinical workflows. |
QIAO JIN et. al. | arxiv-cs.CV | 2024-01-16 |
1496 | Enhancing Robustness of LLM-Synthetic Text Detectors for Academic Writing: A Comprehensive Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a comprehensive analysis of the impact of prompts on the text generated by LLMs and highlight the potential lack of robustness in one of the current state-of-the-art GPT detectors. |
Zhicheng Dou; Yuchen Guo; Ching-Chun Chang; Huy H. Nguyen; Isao Echizen; | arxiv-cs.CL | 2024-01-15 |
1497 | Cascaded Cross-Modal Transformer for Audio-Textual Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To attain superior classification performance, we propose to harness the inherent value of multimodal representations by transcribing speech using automatic speech recognition (ASR) models and translating the transcripts into different languages via pretrained translation models (a minimal sketch of this cascade appears after the table). |
Nicolae-Catalin Ristea; Andrei Anghel; Radu Tudor Ionescu; | arxiv-cs.CL | 2024-01-15 |
1498 | Towards Efficient Methods in Medical Question Answering Using Knowledge Graph Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, in-domain pre-training is expensive in terms of time and resources. In this paper, we propose a resource-efficient approach for injecting domain knowledge into a model without relying on such domain-specific pre-training. |
Saptarshi Sengupta; Connor Heaton; Suhan Cui; Soumalya Sarkar; Prasenjit Mitra; | arxiv-cs.CL | 2024-01-15 |
1499 | Exploiting GPT-4 Vision for Zero-shot Point Cloud Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we tackle the challenge of classifying the object category in point clouds, which previous works like PointCLIP struggle to address due to the inherent limitations of the CLIP architecture. |
Qi Sun; Xiao Cui; Wengang Zhou; Houqiang Li; | arxiv-cs.CV | 2024-01-15 |
1500 | Interference-Robust Millimeter-Wave Radar-Based Dynamic Hand Gesture Recognition Using 2-D CNN-Transformer Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Dynamic gesture recognition using millimeter-wave radar has a broad application prospect in the industrial Internet of Things (IoT) field. However, the existing methods in the … |
Biao Jin; Xiao Ma; Zhenkai Zhang; Zhuxian Lian; Biao Wang; | IEEE Internet of Things Journal | 2024-01-15 |
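For entry 1468, a minimal sketch of the unsupervised SimCSE objective applied to DistilBERT, as the highlight describes. This is not the authors' code: the checkpoint, mean pooling, and batch are assumptions, and the temperature value follows the SimCSE paper.

```python
# Minimal sketch: unsupervised SimCSE-style contrastive loss on DistilBERT.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
encoder = AutoModel.from_pretrained("distilbert-base-uncased")
encoder.train()  # keep dropout active: it supplies the two "views" of each sentence

def embed(batch):
    # Mean-pool token states into one vector per sentence.
    out = encoder(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    return (out * mask).sum(1) / mask.sum(1)

sentences = ["a man is playing guitar", "a woman reads a book"]  # toy batch
batch = tokenizer(sentences, padding=True, return_tensors="pt")

# Two forward passes; dropout makes z1 != z2 for the same sentence.
z1, z2 = embed(batch), embed(batch)

# InfoNCE: each sentence's second view is its positive, the rest are negatives.
temperature = 0.05  # value used in the SimCSE paper
sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / temperature
loss = F.cross_entropy(sim, torch.arange(len(sentences)))
loss.backward()
```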
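For entry 1471, a hypothetical sketch of the recursion the FraiLT highlight describes: a shared block of layers applied several times, with a learned iteration encoding added on each pass. The block size, iteration count, and the additive form of the encoding are assumptions, not FraiLT's actual design.

```python
# Hypothetical sketch: recursive layer reuse with per-iteration encodings.
import torch
import torch.nn as nn

class RecurrentBlock(nn.Module):
    def __init__(self, d_model=256, n_heads=4, n_layers=2, n_iters=3):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.shared = nn.TransformerEncoder(layer, num_layers=n_layers)
        # One learned vector per iteration so the block "knows" which pass it is on.
        self.iter_enc = nn.Embedding(n_iters, d_model)
        self.n_iters = n_iters

    def forward(self, x):
        for i in range(self.n_iters):
            x = x + self.iter_enc.weight[i]  # broadcast over batch and sequence
            x = self.shared(x)               # same parameters reused each pass
        return x

x = torch.randn(8, 16, 256)                 # (batch, seq, d_model)
print(RecurrentBlock()(x).shape)            # torch.Size([8, 16, 256])
```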
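For entry 1475, a hypothetical sketch of the pipeline the highlight names: sentence embeddings from a BERT encoder used as features for an SVC. The checkpoint, the [CLS] pooling choice, and the toy labels are assumptions; the paper's setup fine-tunes a customized BERT first.

```python
# Hypothetical sketch: BERT features feeding a Support Vector Classifier.
import torch
from sklearn.svm import SVC
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased").eval()

clauses = [
    "The provider may terminate your account at any time without notice.",
    "You may cancel your subscription from the account settings page.",
]
labels = [1, 0]  # 1 = unfair clause, 0 = fair (toy labels for illustration)

with torch.no_grad():
    batch = tokenizer(clauses, padding=True, truncation=True, return_tensors="pt")
    feats = encoder(**batch).last_hidden_state[:, 0]  # [CLS] token embedding

clf = SVC(kernel="rbf").fit(feats.numpy(), labels)
print(clf.predict(feats.numpy()))
```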
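For entry 1497, a minimal sketch of the front end of the cascade the highlight describes: transcribe speech with an ASR model, then translate the transcript with a pretrained translation model. The checkpoints, the target language, and the audio file name are assumptions; the paper's classifier consuming these representations is not shown.

```python
# Hypothetical sketch: ASR transcription followed by pretrained translation.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
translate = pipeline("translation_en_to_fr", model="t5-small")

transcript = asr("sample_call.wav")["text"]             # speech -> English text (placeholder file)
french = translate(transcript)[0]["translation_text"]   # English -> French
print(transcript, "->", french)
```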