Paper Digest: Recent Papers on Transformer
The Paper Digest Team extracted all recent Transformer (NLP) related papers on our radar and generated a highlight sentence for each. The results are sorted by relevance and date. In addition to this ‘static’ page, we also provide a real-time version of this article, which offers broader coverage and is continuously updated with the latest work on this topic.
This list is created by the Paper Digest Team.
TABLE 1: Paper Digest: Recent Papers on Transformer
# | Paper | Author(s) | Source | Date |
---|---|---|---|---|
1 | GPT Versus Humans: Uncovering Ethical Concerns in Conversational Generative AI-empowered Multi-Robot Systems. Highlight: The objectives of the study were to examine novel ethical issues arising from the application of LLMs in multi-robot systems. | REBEKAH ROUSI et al. | arxiv-cs.RO | 2024-11-21 |
2 | Evaluating The Robustness of Analogical Reasoning in Large Language Models. Highlight: On digit-matrix problems, we find a similar pattern but only on one out of the two types of variants we tested. | Martha Lewis; Melanie Mitchell; | arxiv-cs.CL | 2024-11-21 |
3 | BERT-Based Approach for Automating Course Articulation Matrix Construction with Explainable AI. Highlight: In this study, we experiment with four models from the BERT family: BERT Base, DistilBERT, ALBERT, and RoBERTa, and use multiclass classification to assess the alignment between CO and PO/PSO pairs. | Natenaile Asmamaw Shiferaw; Simpenzwe Honore Leandre; Aman Sinha; Dillip Rout; | arxiv-cs.LG | 2024-11-21 |
4 | Exploring Large Language Models for Climate Forecasting. Highlight: In this study, we investigate the capability of GPT-4 in predicting rainfall at short-term (15-day) and long-term (12-month) scales. | Yang Wang; Hassan A. Karimi; | arxiv-cs.LG | 2024-11-20 |
5 | SynEHRgy: Synthesizing Mixed-Type Structured Electronic Health Records Using Decoder-Only Transformers. Highlight: We propose a novel tokenization strategy tailored for structured EHR data, which encompasses diverse data types such as covariates, ICD codes, and irregularly sampled time series. | Hojjat Karami; David Atienza; Anisoara Ionescu; | arxiv-cs.LG | 2024-11-20 |
6 | Topkima-Former: Low-energy, Low-Latency Inference for Transformers Using Top-k In-memory ADC. Highlight: Hence, we propose innovations at the circuit, architecture, and algorithm levels to accelerate the transformer. | SHUAI DONG et al. | arxiv-cs.AR | 2024-11-20 |
7 | AI-Driven Agents with Prompts Designed for High Agreeableness Increase The Likelihood of Being Mistaken for A Human in The Turing Test. Highlight: Various explanations in the literature address why these GPT agents were perceived as human, including psychological frameworks for understanding anthropomorphism. These findings highlight the importance of personality engineering as an emerging discipline in artificial intelligence, calling for collaboration with psychology to develop ergonomic psychological models that enhance system adaptability in collaborative activities. | U. LEÓN-DOMÍNGUEZ et al. | arxiv-cs.AI | 2024-11-20 |
8 | Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension. Highlight: In this paper, we propose Video Retrieval-Augmented Generation (Video-RAG), a training-free and cost-effective pipeline that employs visually-aligned auxiliary texts to help facilitate cross-modality alignment while providing additional information beyond the visual content. | YONGDONG LUO et al. | arxiv-cs.CV | 2024-11-20 |
9 | Transformer-Based Contextualized Language Models Joint with Neural Networks for Natural Language Inference in Vietnamese. Highlight: This article aims to introduce a novel approach or model that attains improved performance for Vietnamese NLI. | Dat Van-Thanh Nguyen; Tin Van Huynh; Kiet Van Nguyen; Ngan Luu-Thuy Nguyen; | arxiv-cs.CL | 2024-11-20 |
10 | Explaining GPT-4’s Schema of Depression Using Machine Behavior Analysis. Highlight: In this work, we leveraged contemporary measurement theory to decode how GPT-4 interrelates depressive symptoms to inform both clinical utility and theoretical understanding. | ADITHYA V GANESAN et al. | arxiv-cs.CL | 2024-11-20 |
11 | Benchmarking GPT-4 Against Human Translators: A Comprehensive Evaluation Across Languages, Domains, and Expertise Levels. Highlight: This study presents a comprehensive evaluation of GPT-4’s translation capabilities compared to human translators of varying expertise levels. | JIANHAO YAN et al. | arxiv-cs.CL | 2024-11-20 |
12 | Evaluating Tokenizer Performance of Large Language Models Across Official Indian Languages. Highlight: This paper presents a comprehensive evaluation of tokenizers used by 12 LLMs across all 22 official languages of India, with a focus on comparing the efficiency of their tokenization processes. | S. Tamang; D. J. Bora; | arxiv-cs.CL | 2024-11-19 |
13 | Transformer Neural Processes — Kernel Regression. Highlight: We introduce the Transformer Neural Process – Kernel Regression (TNP-KR), a new architecture that incorporates a novel transformer block we call a Kernel Regression Block (KRBlock), which reduces the computational complexity of attention in transformer-based Neural Processes (TNPs) from $\mathcal{O}((n_C+n_T)^2)$ to $\mathcal{O}(n_C^2+n_C n_T)$ by eliminating masked computations, where $n_C$ is the number of context points and $n_T$ is the number of test points, and a fast attention variant that further reduces all attention calculations to $\mathcal{O}(n_C)$ in space and time complexity. | DANIEL JENSON et al. | arxiv-cs.LG | 2024-11-19 |
14 | Leveraging Virtual Reality and AI Tutoring for Language Learning: A Case Study of A Virtual Campus Environment with OpenAI GPT Integration with Unity 3D. Highlight: This paper presents a new approach to multiple language learning, with Hindi as the target language in our case, by integrating virtual reality environments with AI-enabled tutoring systems using OpenAI's GPT API calls. | Adithya TG; Abhinavaram N; Gowri Srinivasa; | arxiv-cs.HC | 2024-11-19 |
15 | Enhancing Multi-Class Disease Classification: Neoplasms, Cardiovascular, Nervous System, and Digestive Disorders Using Advanced LLMs. Highlight: In this research, we explored the improvement in terms of multi-class disease classification via pre-trained language models over Medical-Abstracts-TC-Corpus that spans five medical conditions. | Ahmed Akib Jawad Karim; Muhammad Zawad Mahmud; Samiha Islam; Aznur Azam; | arxiv-cs.CL | 2024-11-19 |
16 | Strengthening Fake News Detection: Leveraging SVM and Sophisticated Text Vectorization Techniques. Defying BERT? Highlight: We employ three distinct text vectorization methods for SVM: Term Frequency Inverse Document Frequency (TF-IDF), Word2Vec, and Bag of Words (BoW), evaluating their effectiveness in distinguishing between genuine and fake news. | Ahmed Akib Jawad Karim; Kazi Hafiz Md Asad; Aznur Azam; | arxiv-cs.CL | 2024-11-19 |
17 | Chapter 7 Review of Data-Driven Generative AI Models for Knowledge Extraction from Scientific Literature in Healthcare. Highlight: This review examines the development of abstractive NLP-based text summarization approaches and compares them to existing techniques for extractive summarization. | Leon Kopitar; Primoz Kocbek; Lucija Gosak; Gregor Stiglic; | arxiv-cs.CL | 2024-11-18 |
18 | Multi-Grained Preference Enhanced Transformer for Multi-Behavior Sequential Recommendation. Highlight: On the other hand, the dynamic multi-grained behavior-aware preference is hard to capture in interaction sequences, which reflects interaction-aware sequential patterns. To tackle these challenges, we propose a Multi-Grained Preference enhanced Transformer framework (M-GPT). | CHUAN HE et al. | arxiv-cs.IR | 2024-11-18 |
19 | CNMBert: A Model For Hanyu Pinyin Abbreviation to Character Conversion Task. Highlight: This task is typically one of text-length alignment; however, due to the limited informational content of pinyin abbreviations, achieving accurate conversion is challenging. In this paper, we propose CNMBert, which stands for zh-CN Pinyin Multi-mask Bert Model, as a solution to this issue. | Zishuo Feng; Feng Cao; | arxiv-cs.CL | 2024-11-18 |
20 | Automatic A-C. Network Switching Units. Abstract: The desirable characteristics of automatic switching units designed for application in secondary a-c. distribution networks are discussed in this paper. Descriptions are given of … | G. G. Grissinger; | Journal of the A.I.E.E. | |
21 | A Combined Encoder and Transformer Approach for Coherent and High-Quality Text Generation. Highlight: This research introduces a novel text generation model that combines BERT’s semantic interpretation strengths with GPT-4’s generative capabilities, establishing a high standard in generating coherent, contextually accurate language. | JIAJING CHEN et al. | arxiv-cs.CL | 2024-11-18 |
22 | Knowledge-enhanced Transformer for Multivariate Long Sequence Time-series Forecasting. Highlight: We introduce a novel approach that encapsulates conceptual relationships among variables within a well-defined knowledge graph, forming dynamic and learnable KGEs for seamless integration into the transformer architecture. | Shubham Tanaji Kakde; Rony Mitra; Jasashwi Mandal; Manoj Kumar Tiwari; | arxiv-cs.LG | 2024-11-17 |
23 | Does Prompt Formatting Have Any Impact on LLM Performance? Highlight: Although previous research has explored aspects like rephrasing prompt contexts, using various prompting techniques (like in-context learning and chain-of-thought), and ordering few-shot examples, our understanding of LLM sensitivity to prompt templates remains limited. Therefore, this paper examines the impact of different prompt templates on LLM performance. | JIA HE et al. | arxiv-cs.CL | 2024-11-15 |
24 | Brain-inspired Action Generation with Spiking Transformer Diffusion Policy Model. Highlight: Especially in the Can task, we achieved an improvement of 8%. | Qianhao Wang; Yinqian Sun; Enmeng Lu; Qian Zhang; Yi Zeng; | arxiv-cs.RO | 2024-11-15 |
25 | KuaiFormer: Transformer-Based Retrieval at Kuaishou. Highlight: In this paper, we introduce KuaiFormer, a novel transformer-based retrieval framework deployed in a large-scale content recommendation system. | CHI LIU et al. | arxiv-cs.IR | 2024-11-15 |
26 | CMATH: Cross-Modality Augmented Transformer with Hierarchical Variational Distillation for Multimodal Emotion Recognition in Conversation. Highlight: In this paper, we propose a novel Cross-Modality Augmented Transformer with Hierarchical Variational Distillation, called CMATH, which consists of two major components, i.e., Multimodal Interaction Fusion and Hierarchical Variational Distillation. | XIAOFEI ZHU et al. | arxiv-cs.MM | 2024-11-15 |
27 | Re-Parameterization of Lightweight Transformer for On-Device Speech Emotion Recognition. Highlight: Although many model compression approaches have been explored, they often suffer from notorious performance degradation. To address this issue, we introduce a new method, namely Transformer Re-parameterization, to boost the performance of lightweight Transformer models. | Zixing Zhang; Zhongren Dong; Weixiang Xu; Jing Han; | arxiv-cs.SD | 2024-11-14 |
28 | Adopting RAG for LLM-Aided Future Vehicle Design. Highlight: In this paper, we explore the integration of Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) to enhance automated design and software development in the automotive industry. | Vahid Zolfaghari; Nenad Petrovic; Fengjunjie Pan; Krzysztof Lebioda; Alois Knoll; | arxiv-cs.SE | 2024-11-14 |
29 | BabyLM Challenge: Exploring The Effect of Variation Sets on Language Model Training Efficiency. Highlight: In the context of the BabyLM Challenge, we focus on Variation Sets (VSs), sets of consecutive utterances expressing a similar intent with slightly different words and structures, which are ubiquitous in CDS. | Akari Haga; Akiyo Fukatsu; Miyu Oba; Arianna Bisazza; Yohei Oseki; | arxiv-cs.CL | 2024-11-14 |
30 | LoRA-LiteE: A Computationally Efficient Framework for Chatbot Preference-Tuning. Highlight: However, RLHF methods are often computationally intensive and resource-demanding, limiting their scalability and accessibility for broader applications. To address these challenges, this study introduces LoRA-Lite Ensemble (LoRA-LiteE), an innovative framework that combines Supervised Fine-tuning (SFT) with Low-Rank Adaptation (LoRA) and Ensemble Learning techniques to effectively aggregate predictions of lightweight models, aiming to balance performance and computational cost. | Yahe Yang; Chunliang Tao; Xiaojing Fan; | arxiv-cs.CL | 2024-11-14 |
31 | Towards Optimizing A Retrieval Augmented Generation Using Large Language Model on Academic Data. Highlight: Given the growing trend of many organizations integrating Retrieval Augmented Generation (RAG) into their operations, we assess RAG on domain-specific data and test state-of-the-art models across various optimization techniques. | Anum Afzal; Juraj Vladika; Gentrit Fazlija; Andrei Staradubets; Florian Matthes; | arxiv-cs.AI | 2024-11-13 |
32 | Evaluating World Models with LLM for Decision Making. Highlight: In this work, we propose a comprehensive evaluation of the world models with LLMs from the decision making perspective. | Chang Yang; Xinrun Wang; Junzhe Jiang; Qinggang Zhang; Xiao Huang; | arxiv-cs.AI | 2024-11-13 |
33 | LSH-MoE: Communication-efficient MoE Training Via Locality-Sensitive Hashing. Highlight: In this paper, we propose LSH-MoE, a communication-efficient MoE training framework using locality-sensitive hashing (LSH). | XIAONAN NIE et al. | arxiv-cs.DC | 2024-11-13 |
34 | CamemBERT 2.0: A Smarter French Language Model Aged to Perfection. Highlight: This issue emphasizes the need for updated models that reflect current linguistic trends. In this paper, we introduce two new versions of the CamemBERT base model, CamemBERTav2 and CamemBERTv2, designed to address these challenges. | WISSAM ANTOUN et al. | arxiv-cs.CL | 2024-11-13 |
35 | TRACE: Transformer-based Risk Assessment for Clinical Evaluation. Highlight: We present TRACE (Transformer-based Risk Assessment for Clinical Evaluation), a novel method for clinical risk assessment based on clinical data, leveraging the self-attention mechanism for enhanced feature interaction and result interpretation. | Dionysis Christopoulos; Sotiris Spanos; Valsamis Ntouskos; Konstantinos Karantzalos; | arxiv-cs.CV | 2024-11-13 |
36 | Circuit Complexity Bounds for RoPE-based Transformer Architecture. Highlight: In this work, we establish a tighter circuit complexity bound for Transformers with $\mathsf{RoPE}$ attention. | BO CHEN et al. | arxiv-cs.LG | 2024-11-12 |
37 | Derivational Morphology Reveals Analogical Generalization in Large Language Models. Highlight: We introduce a new method for investigating linguistic generalization in LLMs: focusing on GPT-J, we fit cognitive models that instantiate rule-based and analogical learning to the LLM training data and compare their predictions on a set of nonce adjectives with those of the LLM, allowing us to draw direct conclusions regarding underlying mechanisms. | Valentin Hofmann; Leonie Weissweiler; David Mortensen; Hinrich Schütze; Janet Pierrehumbert; | arxiv-cs.CL | 2024-11-12 |
38 | Responsible AI in Construction Safety: Systematic Evaluation of Large Language Models and Prompt Engineering. Highlight: Using 385 questions spanning seven safety knowledge areas, the study analyzes the models’ accuracy, consistency, and reliability. | Farouq Sammour; Jia Xu; Xi Wang; Mo Hu; Zhenyu Zhang; | arxiv-cs.AI | 2024-11-12 |
39 | Fine-Grained Detection of Solidarity for Women and Migrants in 155 Years of German Parliamentary Debates. Highlight: In this study, we investigate the frequency of (anti-)solidarity towards women and migrants in German parliamentary debates between 1867 and 2022. | AIDA KOSTIKOVA et al. | emnlp | 2024-11-11 |
40 | Advancing Semantic Textual Similarity Modeling: A Regression Framework with Translated ReLU and Smooth K2 Loss. Highlight: Despite its efficiency, Sentence-BERT tackles STS tasks from a classification perspective, overlooking the progressive nature of semantic relationships, which results in suboptimal performance. To bridge this gap, this paper presents an innovative regression framework and proposes two simple yet effective loss functions: Translated ReLU and Smooth K2 Loss. | Bowen Zhang; Chunping Li; | emnlp | 2024-11-11 |
41 | Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination. Highlight: We present a large-scale study of linguistic bias exhibited by ChatGPT covering ten dialects of English (Standard American English, Standard British English, and eight widely spoken non-standard varieties from around the world). | EVE FLEISIG et al. | emnlp | 2024-11-11 |
42 | A Unified Multi-Task Learning Architecture for Hate Detection Leveraging User-Based Information. Highlight: Most existing hate speech detection solutions have utilized the features by treating each post as an isolated input instance for the classification. This paper addresses this issue by introducing a unique model that improves hate speech identification for the English language by utilising intra-user and inter-user-based information. | Prashant Kapil; Asif Ekbal; | arxiv-cs.CL | 2024-11-11 |
43 | Can LLMs Replace Neil DeGrasse Tyson? Evaluating The Reliability of LLMs As Science Communicators. Highlight: In this work, we focus on evaluating the reliability of current LLMs as science communicators. | Prasoon Bajpai; Niladri Chatterjee; Subhabrata Dutta; Tanmoy Chakraborty; | emnlp | 2024-11-11 |
44 | TempCharBERT: Keystroke Dynamics for Continuous Access Control Based on Pre-trained Language Models. Highlight: A particular interest lies in keystroke dynamics (KD), which refers to the task of recognizing individuals’ identity based on their unique typing style. In this work, we propose the use of pre-trained language models (PLMs) to recognize such patterns. | Matheus Simão; Fabiano Prado; Omar Abdul Wahab; Anderson Avila; | arxiv-cs.CR | 2024-11-11 |
45 | Comparing A BERT Classifier and A GPT Classifier for Detecting Connective Language Across Multiple Social Media. Highlight: This study presents an approach for detecting connective language, defined as language that facilitates engagement, understanding, and conversation, from social media discussions. | Josephine Lukito; Bin Chen; Gina M. Masullo; Natalie Jomini Stroud; | emnlp | 2024-11-11 |
46 | SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization. Highlight: So in this work, we leverage 100B+ GPT variants to act as synthetic feedback experts offering expert-level edit feedback, which is used to reduce hallucinations and align weaker (<10B parameter) LLMs with medical facts using two distinct alignment algorithms (DPO & SALT), endeavoring to narrow the divide between AI-generated content and factual accuracy. | PRAKAMYA MISHRA et al. | emnlp | 2024-11-11 |
47 | GPT Vs RETRO: Exploring The Intersection of Retrieval and Parameter-Efficient Fine-Tuning. Highlight: In this paper, we apply PEFT methods (P-tuning, Adapters, and LoRA) to a modified Retrieval-Enhanced Transformer (RETRO) and a baseline GPT model across several sizes, ranging from 823 million to 48 billion parameters. | Aleksander Ficek; Jiaqi Zeng; Oleksii Kuchaiev; | emnlp | 2024-11-11 |
48 | BiasWipe: Mitigating Unintended Bias in Text Classifiers Through Model Interpretability. Highlight: In this work, we present a robust and generalizable technique BiasWipe to mitigate unintended bias in language models. | Mamta Mamta; Rishikant Chigrupaatii; Asif Ekbal; | emnlp | 2024-11-11 |
49 | DAMRO: Dive Into The Attention Mechanism of LVLM to Reduce Object Hallucination. Highlight: To address the issue, we propose DAMRO, a novel training-free strategy that **D**ives into the **A**ttention **M**echanism of LVLM to **R**educe **O**bject Hallucination. | Xuan Gong; Tianshi Ming; Xinpeng Wang; Zhihua Wei; | emnlp | 2024-11-11 |
50 | SparseGrad: A Selective Method for Efficient Fine-tuning of MLP Layers. Highlight: We propose a new selective PEFT method, namely SparseGrad, that performs well on MLP blocks. | VIKTORIIA A. CHEKALINA et al. | emnlp | 2024-11-11 |
51 | White-Box Diffusion Transformer for Single-cell RNA-seq Generation. Highlight: However, the process of data acquisition is often constrained by high cost and limited sample availability. To overcome these limitations, we propose a hybrid model based on Diffusion model and White-Box transformer that aims to generate synthetic and biologically plausible scRNA-seq data. | Zhuorui Cui; Shengze Dong; Ding Liu; | arxiv-cs.LG | 2024-11-11 |
52 | Unraveling The Gradient Descent Dynamics of Transformers. Highlight: While the Transformer architecture has achieved remarkable success across various domains, a thorough theoretical foundation explaining its optimization dynamics is yet to be fully developed. In this study, we aim to bridge this understanding gap by answering the following two core questions: (1) Which types of Transformer architectures allow Gradient Descent (GD) to achieve guaranteed convergence? | Bingqing Song; Boran Han; Shuai Zhang; Jie Ding; Mingyi Hong; | arxiv-cs.LG | 2024-11-11 |
53 | On Training Data Influence of GPT Models. Highlight: This paper presents GPTfluence, a novel approach that leverages a featurized simulation to assess the impact of training examples on the training dynamics of GPT models. | YEKUN CHAI et al. | emnlp | 2024-11-11 |
54 | BeeManc at The PLABA Track of TAC-2024: RoBERTa for Task 1 — LLaMA3.1 and GPT-4o for Task 2. Highlight: In task one, we applied fine-tuned RoBERTa-Base models to identify and classify the difficult terms, jargon and acronyms in the biomedical abstracts and reported the F1 score. | Zhidong Ling; Zihao Li; Pablo Romero; Lifeng Han; Goran Nenadic; | arxiv-cs.CL | 2024-11-11 |
55 | On The Reliability of Psychological Scales on Large Language Models. Highlight: Our study aims to determine the reliability of applying personality assessments to LLMs, explicitly investigating whether LLMs demonstrate consistent personality traits. | JEN-TSE HUANG et al. | emnlp | 2024-11-11 |
56 | TreeCoders: Trees of Transformers. Highlight: In this paper, we introduce TreeCoders, a novel family of transformer trees. | Pierre Colonna D’Istria; Abdulrahman Altahhan; | arxiv-cs.CL | 2024-11-11 |
57 | Pron Vs Prompt: Can Large Language Models Already Challenge A World-Class Fiction Author at Creative Text Writing? Abstract: Are LLMs ready to compete in creative writing skills with a top (rather than average) novelist? To provide an initial answer for this question, we have carried out a contest … | Guillermo Marco; Julio Gonzalo; M. Teresa Mateo-Girona; Ramón Del Castillo Santos; | emnlp | 2024-11-11 |
58 | BudgetMLAgent: A Cost-Effective LLM Multi-Agent System for Automating Machine Learning Tasks. Highlight: With the motivation of developing a cost-efficient LLM based solution for solving ML tasks, we propose an LLM Multi-Agent based system which leverages a combination of experts using profiling, efficient retrieval of past observations, LLM cascades, and ask-the-expert calls. | Shubham Gandhi; Manasi Patwardhan; Lovekesh Vig; Gautam Shroff; | arxiv-cs.MA | 2024-11-11 |
59 | Generalizing Clinical De-identification Models By Privacy-safe Data Augmentation Using GPT-4. Highlight: Additionally, labeling standards and the formats of patient records vary across different institutions. Our study addresses these issues by exploiting GPT-4 for data augmentation through one-shot and zero-shot prompts. | Woojin Kim; Sungeun Hahm; Jaejin Lee; | emnlp | 2024-11-11 |
60 | MTLS: Making Texts Into Linguistic Symbols. Highlight: In this paper, we shift the focus to the symbolic properties and introduce MTLS: a pre-training method to improve the multilingual capability of models by Making Texts into Linguistic Symbols. | Wenlong Fei; Xiaohua Wang; Min Hu; Qingyu Zhang; Hongbo Li; | emnlp | 2024-11-11 |
61 | Knowledge Graph Enhanced Large Language Model Editing. Highlight: However, existing editing methods struggle to track and incorporate changes in knowledge associated with edits, which limits the generalization ability of post-edit LLMs in processing edited knowledge. To tackle these problems, we propose a novel model editing method that leverages knowledge graphs for enhancing LLM editing, namely GLAME. | MENGQI ZHANG et al. | emnlp | 2024-11-11 |
62 | Will LLMs Replace The Encoder-Only Models in Temporal Relation Classification? Highlight: In this work, we investigate LLMs’ performance and decision process in the Temporal Relation Classification task. | Gabriel Roccabruna; Massimo Rizzoli; Giuseppe Riccardi; | emnlp | 2024-11-11 |
63 | Still Not Quite There! Evaluating Large Language Models for Comorbid Mental Health Diagnosis. Highlight: In this study, we introduce ANGST, a novel, first-of-its-kind benchmark for depression-anxiety comorbidity classification from social media posts. | AMEY HENGLE et al. | emnlp | 2024-11-11 |
64 | Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models. Highlight: We extend this research by analyzing and comparing circuits for similar sequence continuation tasks, which include increasing sequences of Arabic numerals, number words, and months. | Michael Lan; Philip Torr; Fazl Barez; | emnlp | 2024-11-11 |
65 | Surveying The Dead Minds: Historical-Psychological Text Analysis with Contextualized Construct Representation (CCR) for Classical Chinese. Highlight: In this work, we develop a pipeline for historical-psychological text analysis in classical Chinese. | Yuqi Chen; Sixuan Li; Ying Li; Mohammad Atari; | emnlp | 2024-11-11 |
66 | Leveraging Pre-trained Language Models for Linguistic Analysis: A Case of Argument Structure Constructions. Highlight: This study evaluates the effectiveness of pre-trained language models in identifying argument structure constructions, important for modeling both first and second language learning. | Hakyung Sung; Kristopher Kyle; | emnlp | 2024-11-11 |
67 | ConvMixFormer- A Resource-efficient Convolution Mixer for Transformer-based Dynamic Hand Gesture Recognition. Highlight: Transformer models have demonstrated remarkable success in many domains such as natural language processing (NLP) and computer vision. | Mallika Garg; Debashis Ghosh; Pyari Mohan Pradhan; | arxiv-cs.CV | 2024-11-11 |
68 | FOOL ME IF YOU CAN! An Adversarial Dataset to Investigate The Robustness of LMs in Word Sense Disambiguation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these models still struggle with recognizing semantic boundaries and often misclassify homonyms in adversarial context. Therefore, we propose FOOL: FOur-fold Obscure Lexical, a new coarse-grained WSD dataset, which includes four different test sets designed to assess the robustness of language models in WSD tasks. |
MOHAMAD BALLOUT et. al. | emnlp | 2024-11-11 |
69 | Evaluating ChatGPT-3.5 Efficiency in Solving Coding Problems of Different Complexity Levels: An Empirical Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We assess the performance of ChatGPT’s GPT-3.5-turbo model on LeetCode, a popular platform with algorithmic coding challenges for technical interview practice, across three difficulty levels: easy, medium, and hard. |
Minda Li; Bhaskar Krishnamachari; | arxiv-cs.SE | 2024-11-11 |
70 | Foundational Autoraters: Taming Large Language Models for Better Automatic Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As large language models (LLMs) evolve, evaluating their output reliably becomes increasingly difficult due to the high cost of human evaluation. To address this, we introduce FLAMe, a family of Foundational Large Autorater Models. |
TU VU et. al. | emnlp | 2024-11-11 |
71 | Reconstruct Your Previous Conversations! Comprehensively Investigating Privacy Leakage Risks in Conversations with GPT Models Highlight: In this paper, we introduce a straightforward yet potent Conversation Reconstruction Attack. |
Junjie Chu; Zeyang Sha; Michael Backes; Yang Zhang; | emnlp | 2024-11-11 |
72 | High-Fidelity Cellular Network Control-Plane Traffic Generation Without Domain Knowledge Highlight: In this work, we study the feasibility of developing a high-fidelity MCN control plane traffic generator by leveraging generative ML models. |
Z. Jonny Kong; Nathan Hu; Y. Charlie Hu; Jiayi Meng; Yaron Koral; | arxiv-cs.NI | 2024-11-11 |
73 | Split and Merge: Aligning Position Biases in LLM-based Evaluators Highlight: However, LLM-based evaluators exhibit position bias, or inconsistency, when used to evaluate candidate answers in pairwise comparisons, favoring either the first or second answer regardless of content. To address this limitation, we propose PORTIA, an alignment-based system designed to mimic human comparison strategies to calibrate position bias in a lightweight yet effective manner. |
ZONGJIE LI et al. | emnlp | 2024-11-11 |
74 | Using Language Models to Disambiguate Lexical Choices in Translation Highlight: We work with native speakers of nine languages to create DTAiLS, a dataset of 1,377 sentence pairs that exhibit cross-lingual concept variation when translating from English. |
Josh Barua; Sanjay Subramanian; Kayo Yin; Alane Suhr; | emnlp | 2024-11-11 |
75 | Is Child-Directed Speech Effective Training Data for Language Models? Highlight: What are the features of the data they receive, and how do these features support language modeling objectives? To investigate this question, we train GPT-2 and RoBERTa models on 29M words of English child-directed speech and a new matched, synthetic dataset (TinyDialogues), comparing to OpenSubtitles, Wikipedia, and a heterogeneous blend of datasets from the BabyLM challenge. |
Steven Y. Feng; Noah Goodman; Michael Frank; | emnlp | 2024-11-11 |
76 | GPT-4 Jailbreaks Itself with Near-Perfect Success Using Self-Explanation Highlight: In this paper, we introduce Iterative Refinement Induced Self-Jailbreak (IRIS), a novel approach that leverages the reflective capabilities of LLMs for jailbreaking with only black-box access. |
Govind Ramesh; Yao Dou; Wei Xu; | emnlp | 2024-11-11 |
77 | Subword Segmentation in LLMs: Looking at Inflection and Consistency Highlight: We study two criteria: (i) adherence to morpheme boundaries and (ii) the segmentation consistency of the different inflected forms of a lemma. |
Marion Di Marco; Alexander Fraser; | emnlp | 2024-11-11 |
78 | MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents IF:3 Highlight: In this work, we show how to build small fact-checking models that have GPT-4-level performance but for 400x lower cost. |
Liyan Tang; Philippe Laban; Greg Durrett; | emnlp | 2024-11-11 |
79 | Evaluating Psychological Safety of Large Language Models IF:3 Highlight: In this work, we designed unbiased prompts to systematically evaluate the psychological safety of large language models (LLMs). |
Xingxuan Li; Yutong Li; Lin Qiu; Shafiq Joty; Lidong Bing; | emnlp | 2024-11-11 |
80 | DA3: A Distribution-Aware Adversarial Attack Against Language Models Highlight: Consequently, they are easy to detect using straightforward detection methods, diminishing the efficacy of such attacks. To address this issue, we propose a Distribution-Aware Adversarial Attack (DA3) method. |
Yibo Wang; Xiangjue Dong; James Caverlee; Philip S. Yu; | emnlp | 2024-11-11 |
81 | Universal Response and Emergence of Induction in LLMs Highlight: By applying our method, we observe signatures of induction behavior within the residual stream of Gemma-2-2B, Llama-3.2-3B, and GPT-2-XL. Across all models, we find that these induction signatures gradually emerge within intermediate layers and identify the relevant model sections composing this behavior. |
Niclas Luick; | arxiv-cs.LG | 2024-11-11 |
82 | Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models IF:3 Highlight: Additionally, they do not possess the ability to evaluate based on custom evaluation criteria, focusing instead on general attributes like helpfulness and harmlessness. To address these issues, we introduce Prometheus 2, a more powerful evaluator LM than its predecessor that closely mirrors human and GPT-4 judgements. |
SEUNGONE KIM et al. | emnlp | 2024-11-11 |
83 | Annotation Alignment: Comparing LLM and Human Annotations of Conversational Safety Highlight: We show that larger datasets are needed to resolve whether GPT-4 exhibits disparities in how well it correlates with different demographic groups. |
Rajiv Movva; Pang Wei Koh; Emma Pierson; | emnlp | 2024-11-11 |
84 | Ambient AI Scribing Support: Comparing The Performance of Specialized AI Agentic Architecture to Leading Foundational Models Highlight: This study compares Sporo Health’s AI Scribe, a proprietary model fine-tuned for medical scribing, with various LLMs (GPT-4o, GPT-3.5, Gemma-9B, and Llama-3.2-3B) in clinical documentation. |
Chanseo Lee; Sonu Kumar; Kimon A. Vogt; Sam Meraj; | arxiv-cs.AI | 2024-11-10 |
85 | LProtector: An LLM-driven Vulnerability Detection System Highlight: This paper presents LProtector, an automated vulnerability detection system for C/C++ codebases driven by the large language model (LLM) GPT-4o and Retrieval-Augmented Generation (RAG). |
ZE SHENG et al. | arxiv-cs.CR | 2024-11-10 |
86 | Prompt-Efficient Fine-Tuning for GPT-like Deep Models to Reduce Hallucination and to Improve Reproducibility in Scientific Text Generation Using Stochastic Optimisation Techniques Highlight: This thesis introduces a Parameter-Efficient Fine-Tuning (PEFT) approach tailored for GPT-like models, aiming to mitigate hallucinations and enhance reproducibility, particularly in the computational domain of mass spectrometry. |
Daniil Sulimov; | arxiv-cs.CL | 2024-11-10 |
87 | Golden Touchstone: A Comprehensive Bilingual Benchmark for Evaluating Financial Large Language Models Highlight: However, existing finance benchmarks often suffer from limited language and task coverage, as well as challenges such as low-quality datasets and inadequate adaptability for LLM evaluation. To address these limitations, we propose Golden Touchstone, the first comprehensive bilingual benchmark for financial LLMs, which incorporates representative datasets from both Chinese and English across eight core financial NLP tasks. |
XIAOJUN WU et al. | arxiv-cs.CL | 2024-11-09 |
88 | AI’s Spatial Intelligence: Evaluating AI’s Understanding of Spatial Transformations in PSVT:R and Augmented Reality Highlight: Recent studies show Artificial Intelligence (AI) with language and vision capabilities still faces limitations in spatial reasoning. In this paper, we have studied generative AI’s spatial capabilities of understanding rotations of objects utilizing its image and language processing features. |
Uttamasha Monjoree; Wei Yan; | arxiv-cs.AI | 2024-11-09 |
89 | GPT Semantic Cache: Reducing LLM Costs and Latency Via Semantic Embedding Caching Highlight: In this paper, we introduce GPT Semantic Cache, a method that leverages semantic caching of query embeddings in in-memory storage (Redis). |
Sajal Regmi; Chetan Phakami Pun; | arxiv-cs.LG | 2024-11-07 |
90 | Evaluating GPT-4 at Grading Handwritten Solutions in Math Exams Highlight: In this work, we leverage state-of-the-art multi-modal AI models, in particular GPT-4o, to automatically grade handwritten responses to college-level math exams. |
Adriana Caraeni; Alexander Scarlatos; Andrew Lan; | arxiv-cs.CY | 2024-11-07 |
91 | FineTuneBench: How Well Do Commercial Fine-tuning APIs Infuse Knowledge Into LLMs? Highlight: In this study, we introduce FineTuneBench, an evaluation framework and dataset for understanding how well commercial fine-tuning APIs can successfully learn new and updated knowledge. |
Eric Wu; Kevin Wu; James Zou; | arxiv-cs.CL | 2024-11-07 |
92 | High Entropy Alloy Property Predictions Using Transformer-based Language Model Highlight: This study introduces a language transformer-based machine learning model to predict key mechanical properties of high-entropy alloys (HEAs), addressing the challenges due to their complex, multi-principal element compositions and limited experimental data. |
Spyros Kamnis; Konstantinos Delibasis; | arxiv-cs.CE | 2024-11-07 |
93 | Can Custom Models Learn In-Context? An Exploration of Hybrid Architecture Performance on In-Context Learning Tasks Highlight: We examine implications of architectural differences between GPT-2 and LLaMa as well as LLaMa and Mamba. |
RYAN CAMPBELL et al. | arxiv-cs.LG | 2024-11-06 |
94 | A Comparative Study of Recent Large Language Models on Generating Hospital Discharge Summaries for Lung Cancer Patients Highlight: This research aims to explore how LLMs can alleviate the burden of manual summarization, streamline workflow efficiencies, and support informed decision-making in healthcare settings. |
YIMING LI et al. | arxiv-cs.CL | 2024-11-06 |
95 | Understanding The Effects of Human-written Paraphrases in LLM-generated Text Detection Highlight: In this study, we devise a new data collection strategy to collect Human & LLM Paraphrase Collection (HLPC), a first-of-its-kind dataset that incorporates human-written texts and paraphrases, as well as LLM-generated texts and paraphrases. |
Hiu Ting Lau; Arkaitz Zubiaga; | arxiv-cs.CL | 2024-11-06 |
96 | Rethinking Decoders for Transformer-based Semantic Segmentation: Compression Is All You Need Highlight: In this paper, we argue that there are fundamental connections between semantic segmentation and compression, especially between the Transformer decoders and Principal Component Analysis (PCA). |
Qishuai Wen; Chun-Guang Li; | arxiv-cs.CV | 2024-11-05 |
97 | Towards Scalable Automated Grading: Leveraging Large Language Models for Conceptual Question Evaluation in Engineering Highlight: This study explores the feasibility of using large language models (LLMs), specifically GPT-4o (ChatGPT), for automated grading of conceptual questions in an undergraduate Mechanical Engineering course. |
RUJUN GAO et al. | arxiv-cs.CY | 2024-11-05 |
98 | Enhancing Transformer Training Efficiency with Dynamic Dropout Highlight: We introduce Dynamic Dropout, a novel regularization technique designed to enhance the training efficiency of Transformer models by dynamically adjusting the dropout rate based on training epochs or validation loss improvements. |
Hanrui Yan; Dan Shao; | arxiv-cs.LG | 2024-11-05 |
99 | From Medprompt to O1: Exploration of Run-Time Strategies for Medical Challenge Problems and Beyond Highlight: Following on the Medprompt study with GPT-4, we systematically evaluate the o1-preview model across various medical benchmarks. |
HARSHA NORI et al. | arxiv-cs.CL | 2024-11-05 |
100 | Automatic Generation of Question Hints for Mathematics Problems Using Large Language Models in Educational Technology Highlight: We present here the study of several dimensions: 1) identifying error patterns made by simulated students on secondary-level math exercises; 2) developing various prompts for GPT-4o as a teacher and evaluating their effectiveness in generating hints that enable simulated students to self-correct; and 3) testing the best-performing prompts, based on their ability to produce relevant hints and facilitate error correction, with Llama-3-8B-Instruct as the teacher, allowing for a performance comparison with GPT-4o. |
Junior Cedric Tonga; Benjamin Clement; Pierre-Yves Oudeyer; | arxiv-cs.CL | 2024-11-05 |
101 | Evaluating The Ability of Large Language Models to Generate Verifiable Specifications in VeriFast Highlight: However, prior work has not explored how well LLMs can perform specification generation for specifications based in an ownership logic, such as separation logic. To address this gap, this paper explores the effectiveness of large language models (LLMs), specifically OpenAI’s GPT models, in generating fully correct specifications based on separation logic for static verification of human-written programs in VeriFast. |
MARILYN REGO et al. | arxiv-cs.SE | 2024-11-04 |
102 | Seq-VCR: Preventing Collapse in Intermediate Transformer Representations for Enhanced Reasoning Highlight: In this work, we identify representation collapse in the model’s intermediate layers as a key factor limiting its reasoning capabilities. |
MD RIFAT AREFIN et al. | arxiv-cs.LG | 2024-11-04 |
103 | Wave Network: An Ultra-Small Language Model Highlight: We propose an innovative token representation and update method in a new ultra-small language model: the Wave network. |
Xin Zhang; Victor S. Sheng; | arxiv-cs.CL | 2024-11-04 |
104 | Advancements and Limitations of LLMs in Replicating Human Color-word Associations Highlight: We compared multiple generations of LLMs (from GPT-3 to GPT-4o) against human color-word associations using data collected from over 10,000 Japanese participants, involving 17 colors and words from eight categories in Japanese. |
Makoto Fukushima; Shusuke Eshita; Hiroshige Fukuhara; | arxiv-cs.CL | 2024-11-04 |
105 | Ask, and It Shall Be Given: Turing Completeness of Prompting Highlight: Here, we present the first theoretical study on the LLM prompting paradigm to the best of our knowledge. In this work, we show that prompting is in fact Turing-complete: there exists a finite-size Transformer such that for any computable function, there exists a corresponding prompt following which the Transformer computes the function. |
Ruizhong Qiu; Zhe Xu; Wenxuan Bao; Hanghang Tong; | arxiv-cs.LG | 2024-11-04 |
106 | Enriching Tabular Data with Contextual LLM Embeddings: A Comprehensive Ablation Study for Ensemble Classifiers Highlight: Leveraging advancements in natural language processing, this study presents a systematic approach to enrich tabular datasets with features derived from large language model embeddings. |
Gjergji Kasneci; Enkelejda Kasneci; | arxiv-cs.LG | 2024-11-03 |
107 | Can Large Language Model Predict Employee Attrition? Highlight: Machine learning (ML) advancements offer more scalable and accurate solutions, but large language models (LLMs) introduce new potential in human resource management by interpreting nuanced employee communication and detecting subtle turnover cues. |
Xiaoye Ma; Weiheng Liu; Changyi Zhao; Liliya R. Tukhvatulina; | arxiv-cs.LG | 2024-11-02 |
108 | Enhancing Neural Network Interpretability with Feature-Aligned Sparse Autoencoders Highlight: We propose Mutual Feature Regularization (MFR), a regularization technique for improving feature learning by encouraging SAEs trained in parallel to learn similar features. |
Luke Marks; Alasdair Paren; David Krueger; Fazl Barez; | arxiv-cs.LG | 2024-11-02 |
109 | Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement Highlight: Consequently, we introduce the Lingma SWE-GPT series, comprising Lingma SWE-GPT 7B and 72B. |
YINGWEI MA et al. | arxiv-cs.SE | 2024-11-01 |
110 | Unsupervised Graph Transformer With Augmentation-Free Contrastive Learning Abstract: Transformers, having the superior ability to capture both adjacent and long-range dependencies, have been applied to the graph representation learning field. Existing methods are … |
Han Zhao; Xu Yang; Kun-Juan Wei; Cheng Deng; Dacheng Tao; | IEEE Transactions on Knowledge and Data Engineering | 2024-11-01 |
111 | GameGen-X: Interactive Open-world Game Video Generation Highlight: We introduce GameGen-X, the first diffusion transformer model specifically designed for both generating and interactively controlling open-world game videos. |
Haoxuan Che; Xuanhua He; Quande Liu; Cheng Jin; Hao Chen; | arxiv-cs.CV | 2024-11-01 |
112 | LLMs: A Game-Changer for Software Engineers? Highlight: Through a critical analysis of technical strengths, limitations, real-world case studies, and future research directions, this paper argues that LLMs are not just reshaping how software is developed but are redefining the role of developers. |
Md Asraful Haque; | arxiv-cs.SE | 2024-11-01 |
113 | Trans-VNet: Transformer-based Tooth Semantic Segmentation in CBCT Images |
Chen Wang; Jingyu Yang; Baoyu Wu; Ruijun Liu; Peng Yu; | Biomed. Signal Process. Control. | 2024-11-01 |
114 | Infant Agent: A Tool-Integrated, Logic-Driven Agent with Cost-Effective API Usage Highlight: II: They remain challenged in reasoning through complex logic problems. To address these challenges, we developed the Infant Agent, integrating task-aware functions, operators, a hierarchical management system, and a memory retrieval mechanism. |
BIN LEI et al. | arxiv-cs.AI | 2024-11-01 |
115 | Transformer-CNN for Small Image Object Detection |
Yan-Lin Chen; Chun-Liang Lin; Yu-Chen Lin; Tzu-Chun Chen; | Signal Process. Image Commun. | 2024-11-01 |
116 | GPT for Games: An Updated Scoping Review (2020-2024) Highlight: This review aims to illustrate the state of the art in innovative GPT applications in games, offering a foundation to enrich game development and enhance player experiences through cutting-edge AI innovations. |
Daijin Yang; Erica Kleinman; Casper Harteveld; | arxiv-cs.AI | 2024-10-31 |
117 | Exploring Vision Language Models for Facial Attribute Recognition: Emotion, Race, Gender, and Age Highlight: In this paper, we propose to utilize vision language models (VLMs) such as generative pre-trained transformer (GPT), GEMINI, large language and vision assistant (LLAVA), PaliGemma, and Microsoft Florence2 to recognize facial attributes such as race, gender, age, and emotion from images with human faces. |
Nouar AlDahoul; Myles Joshua Toledo Tan; Harishwar Reddy Kasireddy; Yasir Zaki; | arxiv-cs.CV | 2024-10-31 |
118 | GPT or BERT: Why Not Both? Highlight: We present a simple way to merge masked language modeling with causal language modeling. |
Lucas Georges Gabriel Charpentier; David Samuel; | arxiv-cs.CL | 2024-10-31 |
119 | Handwriting Recognition in Historical Documents with Multimodal LLM Highlight: In this paper, I evaluate the accuracy of handwritten document transcriptions generated by Gemini against the current state-of-the-art Transformer-based methods. |
Lucian Li; | arxiv-cs.CV | 2024-10-31 |
120 | IO Transformer: Evaluating SwinV2-Based Reward Models for Computer Vision Highlight: This paper examines SwinV2-based reward models, called the Input-Output Transformer (IO Transformer) and the Output Transformer. |
Maxwell Meyer; Jack Spruyt; | arxiv-cs.CV | 2024-10-31 |
121 | Desert Camels and Oil Sheikhs: Arab-Centric Red Teaming of Frontier LLMs Highlight: Large language models (LLMs) are widely used but raise ethical concerns due to embedded social biases. |
MUHAMMED SAEED et al. | arxiv-cs.CL | 2024-10-31 |
122 | EDT: An Efficient Diffusion Transformer Framework Inspired By Human-like Sketching Highlight: To reduce the computation budget of transformer-based DPMs, this work proposes the Efficient Diffusion Transformer (EDT) framework. |
Xinwang Chen; Ning Liu; Yichen Zhu; Feifei Feng; Jian Tang; | arxiv-cs.CV | 2024-10-31 |
123 | Aerial Flood Scene Classification Using Fine-Tuned Attention-based Architecture for Flood-Prone Countries in South Asia Highlight: For the classification, we propose a fine-tuned Compact Convolutional Transformer (CCT) based approach and some other cutting-edge transformer-based and Convolutional Neural Network (CNN)-based architectures. |
IBNE HASSAN et al. | arxiv-cs.CV | 2024-10-31 |
124 | An Empirical Analysis of GPT-4V’s Performance on Fashion Aesthetic Evaluation Highlight: Fashion aesthetic evaluation is the task of estimating how well the outfits worn by individuals in images suit them. In this work, we examine the zero-shot performance of GPT-4V on this task for the first time. |
YUKI HIRAKAWA et al. | arxiv-cs.CV | 2024-10-31 |
125 | LoFLAT: Local Feature Matching Using Focused Linear Attention Transformer Highlight: In order to enhance representations of attention mechanisms while preserving low computational complexity, we propose LoFLAT, a novel Local Feature matching method using a Focused Linear Attention Transformer. |
Naijian Cao; Renjie He; Yuchao Dai; Mingyi He; | arxiv-cs.CV | 2024-10-30 |
126 | ETO: Efficient Transformer-based Local Feature Matching By Organizing Multiple Homography Hypotheses Highlight: We propose an efficient transformer-based network architecture for local feature matching. |
JUNJIE NI et al. | arxiv-cs.CV | 2024-10-30 |
127 | Automated Personnel Selection for Software Engineers Using LLM-Based Profile Evaluation Highlight: This work presents a fresh dataset and technique, and shows how transformer models could improve recruiting procedures. |
Ahmed Akib Jawad Karim; Shahria Hoque; Md. Golam Rabiul Alam; Md. Zia Uddin; | arxiv-cs.SE | 2024-10-30 |
128 | ProTransformer: Robustify Transformers Via Plug-and-Play Paradigm Highlight: In this paper, we introduce a novel robust attention mechanism designed to enhance the resilience of transformer-based architectures. |
Zhichao Hou; Weizhi Gao; Yuchen Shen; Feiyi Wang; Xiaorui Liu; | arxiv-cs.LG | 2024-10-30 |
129 | EvoCodeBench: An Evolving Code Generation Benchmark with Domain-Specific Evaluations Highlight: The former hurts the fairness of benchmarks, and the latter hinders practitioners from selecting superior LLMs for specific programming domains. To address these two limitations, we propose a new benchmark – EvoCodeBench, which has the following advances: (1) Evolving data. |
JIA LI et al. | arxiv-cs.CL | 2024-10-30 |
130 | GPT-4o Reads The Mind in The Eyes Highlight: Using two versions of a widely used theory of mind test, the Reading the Mind in the Eyes Test and the Multiracial Reading the Mind in the Eyes Test, we found that GPT-4o outperformed humans in interpreting mental states from upright faces but underperformed humans when faces were inverted. |
JAMES W. A. STRACHAN et al. | arxiv-cs.HC | 2024-10-29 |
131 | Attention Speaks Volumes: Localizing and Mitigating Bias in Language Models Highlight: We explore the internal mechanisms of how bias emerges in large language models (LLMs) when provided with ambiguous comparative prompts: inputs that compare or enforce choosing between two or more entities without providing clear context for preference. |
Rishabh Adiga; Besmira Nushi; Varun Chandrasekaran; | arxiv-cs.CL | 2024-10-29 |
132 | AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts Highlight: Recent work, AmpleGCG (Liao et al., 2024), demonstrates that a generative model can quickly produce numerous customizable gibberish adversarial suffixes for any harmful query, exposing a range of alignment gaps in out-of-distribution (OOD) language spaces. To bring more attention to this area, we introduce AmpleGCG-Plus, an enhanced version that achieves better performance in fewer attempts. |
Vishal Kumar; Zeyi Liao; Jaylen Jones; Huan Sun; | arxiv-cs.CL | 2024-10-29 |
133 | Benchmarking OpenAI O1 in Cyber Security Highlight: We evaluate OpenAI’s o1-preview and o1-mini models, benchmarking their performance against the earlier GPT-4o model. |
Dan Ristea; Vasilios Mavroudis; Chris Hicks; | arxiv-cs.CR | 2024-10-29 |
134 | Is GPT-4 Less Politically Biased Than GPT-3.5? A Renewed Investigation of ChatGPT’s Political Biases Highlight: This work investigates the political biases and personality traits of ChatGPT, specifically comparing GPT-3.5 to GPT-4. |
Erik Weber; Jérôme Rutinowski; Niklas Jost; Markus Pauly; | arxiv-cs.CL | 2024-10-28 |
135 | SepMamba: State-space Models for Speaker Separation Using Mamba Highlight: We propose SepMamba, a U-Net-based architecture composed primarily of bidirectional Mamba layers. |
THOR HØJHUS AVENSTRUP et al. | arxiv-cs.SD | 2024-10-28 |
136 | KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation Highlight: In this work, we present KD-LoRA, a novel fine-tuning method that combines LoRA with KD. |
Rambod Azimi; Rishav Rishav; Marek Teichmann; Samira Ebrahimi Kahou; | arxiv-cs.CL | 2024-10-28 |
137 | Deep Learning for Medical Text Processing: BERT Model Fine-Tuning and Comparative Study Highlight: This paper proposes a medical literature summary generation method based on the BERT model to address the challenges brought by the current explosion of medical information. |
JIACHENG HU et al. | arxiv-cs.CL | 2024-10-28 |
138 | Gender Bias in LLM-generated Interview Responses Highlight: Our findings reveal that gender bias is consistent, and closely aligned with gender stereotypes and the dominance of jobs. Overall, this study contributes to the systematic examination of gender bias in LLM-generated interview responses, highlighting the need for a mindful approach to mitigate such biases in related applications. |
Haein Kong; Yongsu Ahn; Sangyub Lee; Yunho Maeng; | arxiv-cs.CL | 2024-10-28 |
139 | UOttawa at LegalLens-2024: Transformer-based Classification Experiments Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents the methods used for LegalLens-2024 shared task, which focused on detecting legal violations within unstructured textual data and associating these violations with potentially affected individuals. |
Nima Meghdadi; Diana Inkpen; | arxiv-cs.CL | 2024-10-28 |
140 | Sequential Choice in Ordered Bundles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate several predictive models, including two custom Transformers using decoder-only and encoder-decoder architectures, fine-tuned GPT-3, a custom LSTM model, a reinforcement learning model, two Markov models, and a zero-order model. |
Rajeev Kohli; Kriste Krstovski; Hengyu Kuang; Hengxu Lin; | arxiv-cs.LG | 2024-10-28 |
141 | A Simple Yet Effective Corpus Construction Framework for Indonesian Grammatical Error Correction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: How to efficiently construct high-quality evaluation corpora for GEC in low-resource languages has become a significant challenge. To fill these gaps, in this paper, we present a framework for constructing GEC corpora. |
NANKAI LIN et. al. | arxiv-cs.CL | 2024-10-28 |
142 | Fine-tuned Large Language Models (LLMs): Improved Prompt Injection Attacks Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This project explores the security vulnerabilities in relation to prompt injection attacks. |
Md Abdur Rahman; Fan Wu; Alfredo Cuzzocrea; Sheikh Iqbal Ahamed; | arxiv-cs.CL | 2024-10-27 |
143 | SeisGPT: A Physics-Informed Data-Driven Large Model for Real-Time Seismic Response Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional methods, which rely on complex finite element models, often struggle to balance computational efficiency and accuracy. To address this challenge, we introduce SeisGPT, a data-driven, large physics-informed model that leverages deep neural networks based on the Generative Pre-trained Transformer (GPT) architecture. |
SHIQIAO MENG et. al. | arxiv-cs.CE | 2024-10-26 |
144 | Sequential Large Language Model-Based Hyper-Parameter Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study introduces SLLMBO, an innovative framework that leverages Large Language Models (LLMs) for hyperparameter optimization (HPO), incorporating dynamic search space adaptability, enhanced parameter landscape exploitation, and a hybrid, novel LLM-Tree-structured Parzen Estimator (LLM-TPE) sampler. |
Kanan Mahammadli; | arxiv-cs.LG | 2024-10-26 |
145 | Notes on The Mathematical Structure of GPT LLM Architectures Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: An exposition of the mathematics underpinning the neural network architecture of a GPT-3-style LLM. … |
Spencer Becker-Kahn; | arxiv-cs.LG | 2024-10-25 |
146 | No Argument Left Behind: Overlapping Chunks for Faster Processing of Arbitrarily Long Legal Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce uBERT, a hybrid model that combines Transformer and Recurrent Neural Network architectures to effectively handle long legal texts. |
ISRAEL FAMA et. al. | arxiv-cs.CL | 2024-10-24 |
147 | Integrating Large Language Models with Internet of Things Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper identifies and analyzes applications in which Large Language Models (LLMs) can make Internet of Things (IoT) networks more intelligent and responsive through three case studies from critical topics: DDoS attack detection, macroprogramming over IoT systems, and sensor data processing. |
Mingyu Zong; Arvin Hekmati; Michael Guastalla; Yiyi Li; Bhaskar Krishnamachari; | arxiv-cs.AI | 2024-10-24 |
148 | GPT-Signal: Generative AI for Semi-automated Feature Engineering in The Alpha Research Process Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With the recent development of Generative Artificial Intelligence(Gen AI) and Large Language Models (LLMs), we present a novel way of leveraging GPT-4 to generate new return-predictive formulaic alphas, making alpha mining a semi-automated process, and saving time and energy for investors and traders. |
Yining Wang; Jinman Zhao; Yuri Lawryshyn; | arxiv-cs.CE | 2024-10-24 |
149 | Scaling Up Masked Diffusion Models on Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Fully leveraging the probabilistic formulation of MDMs, we propose a simple yet effective unsupervised classifier-free guidance that effectively exploits large-scale unpaired data, boosting performance for conditional inference. |
SHEN NIE et. al. | arxiv-cs.AI | 2024-10-24 |
150 | Probing Ranking LLMs: Mechanistic Interpretability in Information Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we delve into the mechanistic workings of state-of-the-art, fine-tuning-based passage-reranking transformer networks. |
Tanya Chowdhury; James Allan; | arxiv-cs.IR | 2024-10-24 |
151 | Lightweight Neural App Control Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel mobile phone control architecture, termed “app agents”, for efficient interactions and controls across various Android apps. |
FILIPPOS CHRISTIANOS et. al. | arxiv-cs.AI | 2024-10-23 |
152 | Striking A New Chord: Neural Networks in Music Information Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, we compare LSTM, Transformer, and GPT models against a widely-used markov model to predict a chord event following a sequence of chords. |
Farshad Jafari; Claire Arthur; | arxiv-cs.IT | 2024-10-23 |
153 | Locating Information in Large Language Models Via Random Matrix Theory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we analyze the weight matrices of pretrained transformer models — specifically BERT and Llama — using random matrix theory (RMT) as a zero-information hypothesis. |
Max Staats; Matthias Thamm; Bernd Rosenow; | arxiv-cs.LG | 2024-10-23 |
154 | An Eye for An AI: Evaluating GPT-4o’s Visual Perception Skills and Geometric Reasoning Skills Using Computer Graphics Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find that although GPT-4o exhibits great potential in solving questions with visual information independently, major limitations still exist to the accuracy and quality of the generated results. We propose several novel approaches for CG educators to incorporate GenAI into CG teaching despite these limitations. |
Tony Haoran Feng; Paul Denny; Burkhard C. Wünsche; Andrew Luxton-Reilly; Jacqueline Whalley; | arxiv-cs.AI | 2024-10-22 |
155 | In Context Learning and Reasoning for Symbolic Regression with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we explore the potential of LLMs to perform symbolic regression — a machine-learning method for finding simple and accurate equations from datasets. |
Samiha Sharlin; Tyler R. Josephson; | arxiv-cs.CL | 2024-10-22 |
156 | Representation Shattering in Transformers: A Synthetic Study with Knowledge Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through evaluations of edited models and analysis of extracted representations, we show that KE inadvertently affects representations of entities beyond the targeted one, distorting relevant structures that allow a model to infer unseen knowledge about an entity. |
KENTO NISHI et. al. | arxiv-cs.LG | 2024-10-22 |
157 | GeoCode-GPT: A Large Language Model for Geospatial Code Generation Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although large language models (LLMs) have demonstrated potential in code generation tasks, they often encounter issues such as refusal to code or hallucination in geospatial code generation due to a lack of domain-specific knowledge and code corpora. To address these challenges, this paper presents and open-sources the GeoCode-PT and GeoCode-SFT corpora, along with the GeoCode-Eval evaluation dataset. |
SHUYANG HOU et. al. | arxiv-cs.SE | 2024-10-22 |
158 | Interpreting Affine Recurrence Learning in GPT-style Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In-context learning allows transformers to generalize during inference without modifying their weights, yet the precise operations driving this capability remain largely opaque. This paper presents an investigation into the mechanistic interpretability of these transformers, focusing specifically on their ability to learn and predict affine recurrences as an ICL task. |
Samarth Bhargav; Alexander Gu; | arxiv-cs.LG | 2024-10-22 |
159 | Graph Transformers Dream of Electric Flow Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The input to the Transformer is simply the graph incidence matrix; no other explicit positional encoding information is provided. We present explicit weight configurations for implementing each such graph algorithm, and we bound the errors of the constructed Transformers by the errors of the underlying algorithms. |
Xiang Cheng; Lawrence Carin; Suvrit Sra; | arxiv-cs.LG | 2024-10-22 |
160 | Using GPT Models for Qualitative and Quantitative News Analytics in The 2024 US Presidental Election Process Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The paper considers an approach of using Google Search API and GPT-4o model for qualitative and quantitative analyses of news through retrieval-augmented generation (RAG). |
Bohdan M. Pavlyshenko; | arxiv-cs.CL | 2024-10-21 |
161 | Exploring Pretraining Via Active Forgetting for Improving Cross Lingual Transfer for Decoder Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a pretraining strategy that uses active forgetting to achieve similar cross lingual transfer in decoder-only LLMs. |
Divyanshu Aggarwal; Ashutosh Sathe; Sunayana Sitaram; | arxiv-cs.CL | 2024-10-21 |
162 | Learning to Differentiate Pairwise-Argument Representations for Implicit Discourse Relation Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enable encoders to produce clearly distinguishable representations, we propose a joint learning framework. |
ZHIPANG WANG et. al. | cikm | 2024-10-21 |
163 | BART-based Hierarchical Attentional Network for Sentence Ordering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel BART-based Hierarchical Attentional Ordering Network (BHAONet), aiming to address the coherence modeling challenge within paragraphs, which stands as a cornerstone in comprehension, generation, and reasoning tasks. |
Yiping Yang; Baiyun Cui; Yingming Li; | cikm | 2024-10-21 |
164 | Comparative Study of Multilingual Idioms and Similes in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study addresses the gap in the literature concerning the comparative performance of LLMs in interpreting different types of figurative language across multiple languages. |
PARIA KHOSHTAB et. al. | arxiv-cs.CL | 2024-10-21 |
165 | Inferring Visualization Intent from Conversation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We consider a conversational approach to visualization, where users specify their needs at each step in natural language, with a visualization being returned in turn. |
Haotian Li; Nithin Chalapathi; Huamin Qu; Alvin Cheung; Aditya G. Parameswaran; | cikm | 2024-10-21 |
166 | Improving Neuron-level Interpretability with White-box Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our study, we introduce a white-box transformer-like architecture named Coding RAte TransformEr (CRATE), explicitly engineered to capture sparse, low-dimensional structures within data distributions. |
Hao Bai; Yi Ma; | arxiv-cs.CL | 2024-10-21 |
167 | Application of Large Language Models in Chemistry Reaction Data Extraction and Cleaning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a paradigm that leverages prompt-tuning, fine-tuning techniques, and a verifier to check the extracted information. |
XIAOBAO HUANG et. al. | cikm | 2024-10-21 |
168 | Does ChatGPT Have A Poetic Style? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that the GPT models, especially GPT-4, can successfully produce poems in a range of both common and uncommon English-language forms in superficial yet noteworthy ways, such as by producing poems of appropriate lengths for sonnets (14 lines), villanelles (19 lines), and sestinas (39 lines). |
Melanie Walsh; Anna Preus; Elizabeth Gronski; | arxiv-cs.CL | 2024-10-20 |
169 | Exploring Social Desirability Response Bias in Large Language Models: Evidence from GPT-4 Simulations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) are employed to simulate human-like responses in social surveys, yet it remains unclear if they develop biases like social desirability response (SDR) bias. |
Sanguk Lee; Kai-Qi Yang; Tai-Quan Peng; Ruth Heo; Hui Liu; | arxiv-cs.AI | 2024-10-20 |
170 | BERTtime Stories: Investigating The Role of Synthetic Story Data in Language Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe our contribution to the Strict and Strict-Small tracks of the 2nd iteration of the BabyLM Challenge. |
Nikitas Theodoropoulos; Giorgos Filandrianos; Vassilis Lyberatos; Maria Lymperaiou; Giorgos Stamou; | arxiv-cs.CL | 2024-10-20 |
171 | DTPPO: Dual-Transformer Encoder-based Proximal Policy Optimization for Multi-UAV Navigation in Unseen Complex Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing multi-agent deep reinforcement learning (MADRL) methods for multi-UAV navigation face challenges in generalization, particularly when applied to unseen complex environments. To address these limitations, we propose a Dual-Transformer Encoder-based Proximal Policy Optimization (DTPPO) method. |
Anning Wei; Jintao Liang; Kaiyuan Lin; Ziyue Li; Rui Zhao; | arxiv-cs.MA | 2024-10-19 |
172 | Bias Amplification: Language Models As Increasingly Biased Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address the gap in understanding the bias amplification of LLMs with four main contributions. Firstly, we propose a theoretical framework, defining the necessary and sufficient conditions for its occurrence, and emphasizing that it occurs independently of model collapse. |
ZE WANG et. al. | arxiv-cs.AI | 2024-10-19 |
173 | Medical-GAT: Cancer Document Classification Leveraging Graph-Based Residual Network for Scenarios with Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This scarcity of annotated data impedes the development of effective machine learning models for cancer document classification. To address this challenge, we present a curated dataset of 1,874 biomedical abstracts, categorized into thyroid cancer, colon cancer, lung cancer, and generic topics. |
ELIAS HOSSAIN et. al. | arxiv-cs.AI | 2024-10-19 |
174 | From Solitary Directives to Interactive Encouragement! LLM Secure Code Generation By Natural Language Prompting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces SecCode, a framework that leverages an innovative interactive encouragement prompting (EP) technique for secure code generation with only NL prompts. |
SHIGANG LIU et. al. | arxiv-cs.CR | 2024-10-18 |
175 | Automated Genre-Aware Article Scoring and Feedback Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on the development of an advanced intelligent article scoring system that not only assesses the overall quality of written work but also offers detailed feature-based scoring tailored to various article genres. |
CHIHANG WANG et. al. | arxiv-cs.CL | 2024-10-18 |
176 | XPerT: Extended Persistence Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel transformer architecture called the Extended Persistence Transformer (xPerT), which is more scalable than Persformer, an existing transformer for persistence diagrams. |
Sehun Kim; | arxiv-cs.LG | 2024-10-18 |
177 | Harmony: A Home Agent for Responsive Management and Action Optimization with A Locally Deployed Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to optimize the privacy and economy of data processing while maintaining the powerful functions of LLMs, we propose Harmony, a smart home assistant framework that uses a locally deployable small-scale LLM. |
Ziqi Yin; Mingxin Zhang; Daisuke Kawahara; | arxiv-cs.HC | 2024-10-18 |
178 | Paths-over-Graph: Knowledge Graph Empowered Large Language Model Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing KG-based LLM reasoning methods face challenges like handling multi-hop reasoning, multi-entity questions, and effectively utilizing graph structures. To address these issues, we propose Paths-over-Graph (PoG), a novel method that enhances LLM reasoning by integrating knowledge reasoning paths from KGs, improving the interpretability and faithfulness of LLM outputs. |
XINGYU TAN et. al. | arxiv-cs.CL | 2024-10-18 |
179 | SBI-RAG: Enhancing Math Word Problem Solving for Students Through Schema-Based Instruction and Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Schema-based instruction (SBI) is an evidence-based strategy that helps students categorize problems based on their structure, improving problem-solving accuracy. Building on this, we propose a Schema-Based Instruction Retrieval-Augmented Generation (SBI-RAG) framework that incorporates a large language model (LLM). |
Prakhar Dixit; Tim Oates; | arxiv-cs.LG | 2024-10-17 |
180 | FaithBench: A Diverse Hallucination Benchmark for Summarization By Modern LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces FaithBench, a summarization hallucination benchmark comprising challenging hallucinations made by 10 modern LLMs from 8 different families, with ground truth annotations by human experts. |
FORREST SHENG BAO et. al. | arxiv-cs.CL | 2024-10-17 |
181 | Transfer Learning on Transformers for Building Energy Consumption Forecasting — A Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the application of Transfer Learning (TL) on Transformer architectures to enhance building energy consumption forecasting. |
Robert Spencer; Surangika Ranathunga; Mikael Boulic; Andries van Heerden; Teo Susnjak; | arxiv-cs.LG | 2024-10-17 |
182 | Metacognitive Monitoring: A Human Ability Beyond Generative Artificial Intelligence Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We employed a cross-agent prediction model to compare the metacognitive performance of humans and ChatGPT in a language-based memory task involving garden-path sentences preceded by either fitting or unfitting context sentences. |
Markus Huff; Elanur Ulakçı; | arxiv-cs.CL | 2024-10-17 |
183 | Linguistically Grounded Analysis of Language Models Using Shapley Head Values Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the processing of morphosyntactic phenomena, by leveraging a recently proposed method for probing language models via Shapley Head Values (SHVs). |
Marcell Fekete; Johannes Bjerva; | arxiv-cs.CL | 2024-10-17 |
184 | Measuring and Modifying The Readability of English Texts with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Then, in a pre-registered human experiment (N = 59), we ask whether Turbo can reliably make text easier or harder to read. We find evidence to support this hypothesis, though considerable variance in human judgments remains unexplained. |
Sean Trott; Pamela D. Rivière; | arxiv-cs.CL | 2024-10-17 |
185 | Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Scaling up autoregressive models in vision has not proven as beneficial as in large language models. In this work, we investigate this scaling problem in the context of text-to-image generation, focusing on two critical factors: whether models use discrete or continuous tokens, and whether tokens are generated in a random or fixed raster order using BERT- or GPT-like transformer architectures. |
LIJIE FAN et. al. | arxiv-cs.CV | 2024-10-17 |
186 | Transformer-Based Approaches for Sensor-Based Human Activity Recognition: Opportunities and Challenges Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We observe that transformer-based solutions pose higher computational demands, consistently yield inferior performance, and experience significant performance degradation when quantized to accommodate resource-constrained devices. |
Clayton Souza Leite; Henry Mauranen; Aziza Zhanabatyrova; Yu Xiao; | arxiv-cs.LG | 2024-10-17 |
187 | Detecting AI-Generated Texts in Cross-Domains Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing tools to detect text generated by a large language model (LLM) have met with certain success, but their performance can drop when dealing with texts in new domains. To tackle this issue, we train a ranking classifier called RoBERTa-Ranker, a modified version of RoBERTa, as a baseline model using a dataset we constructed that includes a wider variety of texts written by humans and generated by various LLMs. |
You Zhou; Jie Wang; | arxiv-cs.CL | 2024-10-17 |
188 | Context-Scaling Versus Task-Scaling in In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Transformers exhibit In-Context Learning (ICL), where these models solve new tasks by using examples in the prompt without additional training. |
Amirhesam Abedsoltan; Adityanarayanan Radhakrishnan; Jingfeng Wu; Mikhail Belkin; | arxiv-cs.LG | 2024-10-16 |
189 | With A Grain of SALT: Are LLMs Fair Across Social Dimensions? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents an analysis of biases in open-source Large Language Models (LLMs) across various genders, religions, and races. |
Samee Arif; Zohaib Khan; Agha Ali Raza; Awais Athar; | arxiv-cs.CL | 2024-10-16 |
190 | Stabilize The Latent Space for Image Autoregressive Modeling: A Unified Perspective Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This finding contrasts sharply with the field of NLP, where the autoregressive model GPT has established a commanding presence. To address this discrepancy, we introduce a unified perspective on the relationship between latent space and generative models, emphasizing the stability of latent space in image generative modeling. |
YONGXIN ZHU et. al. | arxiv-cs.CV | 2024-10-16 |
191 | Reconstruction of Differentially Private Text Sanitization Via Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose two attacks (black-box and white-box) based on the accessibility to LLMs and show that LLMs could connect the pair of DP-sanitized text and the corresponding private training data of LLMs by giving sample text pairs as instructions (in the black-box attacks) or fine-tuning data (in the white-box attacks). |
SHUCHAO PANG et. al. | arxiv-cs.CR | 2024-10-16 |
192 | SELF-BART : A Transformer-based Molecular Representation Model Using SELFIES Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we develop an encoder-decoder model based on BART that is capable of learning molecular representations and generating new molecules. |
INDRA PRIYADARSINI et. al. | arxiv-cs.CE | 2024-10-16 |
193 | When Not to Answer: Evaluating Prompts on GPT Models for Effective Abstention in Unanswerable Math Word Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate whether GPTs can appropriately respond to unanswerable math word problems by applying prompts typically used in solvable mathematical scenarios. |
Asir Saadat; Tasmia Binte Sogir; Md Taukir Azam Chowdhury; Syem Aziz; | arxiv-cs.CL | 2024-10-16 |
194 | Unifying Economic and Language Models for Enhanced Sentiment Analysis of The Oil Market Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these LMs often have difficulty with domain-specific terminology, limiting their effectiveness in the crude oil sector. Addressing this gap, we introduce CrudeBERT, a fine-tuned LM specifically for the crude oil market. |
Himmet Kaplan; Ralf-Peter Mundani; Heiko Rölke; Albert Weichselbraun; Martin Tschudy; | arxiv-cs.IR | 2024-10-16 |
195 | Better to Ask in English: Evaluation of Large Language Models on English, Low-resource and Cross-Lingual Settings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, a few multi-lingual LLMs have emerged, but their performance in low-resource languages, especially the most spoken languages in South Asia, is less explored. To address this gap, in this study, we evaluate LLMs such as GPT-4, Llama 2, and Gemini to analyze their effectiveness in English compared to other low-resource languages from South Asia (e.g., Bangla, Hindi, and Urdu). |
Krishno Dey; Prerona Tarannum; Md. Arid Hasan; Imran Razzak; Usman Naseem; | arxiv-cs.CL | 2024-10-16 |
196 | Jigsaw Puzzles: Splitting Harmful Questions to Jailbreak Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Jigsaw Puzzles (JSP), a straightforward yet effective multi-turn jailbreak strategy against advanced LLMs. |
Hao Yang; Lizhen Qu; Ehsan Shareghi; Gholamreza Haffari; | arxiv-cs.CL | 2024-10-15 |
197 | Table-LLM-Specialist: Language Model Specialists for Tables Using Iterative Generator-Validator Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Table-LLM-Specialist, or Table-Specialist for short, as a new self-trained fine-tuning paradigm specifically designed for table tasks. |
JUNJIE XING et. al. | arxiv-cs.CL | 2024-10-15 |
198 | In-Context Learning for Long-Context Sentiment Analysis on Infrastructure Project Opinions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) have achieved impressive results across various tasks. |
Alireza Shamshiri; Kyeong Rok Ryu; June Young Park; | arxiv-cs.CL | 2024-10-15 |
199 | TraM : Enhancing User Sleep Prediction with Transformer-based Multivariate Time Series Modeling and Machine Learning Ensembles Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a novel approach that leverages Transformer-based multivariate time series model and Machine Learning Ensembles to predict the quality of human sleep, emotional states, and stress levels. |
Jinjae Kim; Minjeong Ma; Eunjee Choi; Keunhee Cho; Chanwoo Lee; | arxiv-cs.LG | 2024-10-15 |
200 | De-jargonizing Science for Journalists with GPT-4: A Pilot Study Highlight: This study offers an initial evaluation of a human-in-the-loop system leveraging GPT-4 (a large language model or LLM), and Retrieval-Augmented Generation (RAG) to identify and define jargon terms in scientific abstracts, based on readers’ self-reported knowledge. |
Sachita Nishal; Eric Lee; Nicholas Diakopoulos; | arxiv-cs.CL | 2024-10-15 |
201 | Embedding Self-Correction As An Inherent Ability in Large Language Models for Enhanced Mathematical Reasoning Highlight: However, LLMs often encounter difficulties in certain aspects of mathematical reasoning, leading to flawed reasoning and erroneous results. To mitigate these issues, we introduce a novel mechanism, the Chain of Self-Correction (CoSC), specifically designed to embed self-correction as an inherent ability in LLMs, enabling them to validate and rectify their own results. |
Kuofeng Gao; Huanqia Cai; Qingyao Shuai; Dihong Gong; Zhifeng Li; | arxiv-cs.AI | 2024-10-14 |
202 | Rethinking Legal Judgement Prediction in A Realistic Scenario in The Era of Large Language Models Highlight: The LLMs also provide explanations for their predictions. To evaluate the quality of these predictions and explanations, we introduce two human evaluation metrics: Clarity and Linking. |
Shubham Kumar Nigam; Aniket Deroy; Subhankar Maity; Arnab Bhattacharya; | arxiv-cs.CL | 2024-10-14 |
203 | Performance in A Dialectal Profiling Task of LLMs for Varieties of Brazilian Portuguese Highlight: The results offer sociolinguistic contributions for an equity fluent NLP technology. |
Raquel Meister Ko Freitag; Túlio Sousa de Gois; | arxiv-cs.CL | 2024-10-14 |
204 | RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates Highlight: We propose RoCoFT, a parameter-efficient fine-tuning method for large-scale language models (LMs) based on updating only a few rows and columns of the weight matrices in transformers. |
Md Kowsher; Tara Esmaeilbeig; Chun-Nam Yu; Mojtaba Soltanalian; Niloofar Yousefi; | arxiv-cs.CL | 2024-10-13 |
205 | Evaluating Gender Bias of LLMs in Making Morality Judgements Highlight: This work investigates whether current closed and open-source LLMs possess gender bias, especially when asked to give moral opinions. To evaluate these models, we curate and introduce a new dataset GenMO (Gender-bias in Morality Opinions) comprising parallel short stories featuring male and female characters respectively. |
Divij Bajaj; Yuanyuan Lei; Jonathan Tong; Ruihong Huang; | arxiv-cs.CL | 2024-10-13 |
206 | Transformer-based Language Models for Reasoning in The Description Logic ALCQ Highlight: In this way, we systematically investigate the logical reasoning capabilities of a supervised fine-tuned DeBERTa-based model and two large language models (GPT-3.5, GPT-4) with few-shot prompting. |
Angelos Poulis; Eleni Tsalapati; Manolis Koubarakis; | arxiv-cs.CL | 2024-10-12 |
207 | \llinstruct: An Instruction-tuned Model for English Language Proficiency Assessments Highlight: We present \llinstruct: An 8B instruction-tuned model that is designed to generate content for English Language Proficiency Assessments (ELPA) and related applications. |
Debanjan Ghosh; Sophia Chan; | arxiv-cs.CL | 2024-10-11 |
208 | Improving Legal Entity Recognition Using A Hybrid Transformer Model and Semantic Filtering Approach Highlight: This paper proposes a novel hybrid model that enhances the accuracy and precision of Legal-BERT, a transformer model fine-tuned for legal text processing, by introducing a semantic similarity-based filtering mechanism. |
Duraimurugan Rajamanickam; | arxiv-cs.CL | 2024-10-11 |
209 | Synth-SONAR: Sonar Image Synthesis with Enhanced Diversity and Realism Via Dual Diffusion Models and GPT Prompting Highlight: Traditional methods often rely on extensive and costly data collection using sonar sensors, jeopardizing data quality and diversity. To overcome these limitations, this study proposes a new sonar image synthesis framework, Synth-SONAR, leveraging diffusion models and GPT prompting. |
Purushothaman Natarajan; Kamal Basha; Athira Nambiar; | arxiv-cs.CV | 2024-10-11 |
210 | Fine-Tuning In-House Large Language Models to Infer Differential Diagnosis from Radiology Reports Highlight: This study introduces a pipeline for developing in-house LLMs tailored to identify differential diagnoses from radiology reports. |
LUOYAO CHEN et al. | arxiv-cs.CL | 2024-10-11 |
211 | Extra Global Attention Designation Using Keyword Detection in Sparse Transformer Architectures Highlight: In this paper, we propose an extension to Longformer Encoder-Decoder, a popular sparse transformer architecture. |
Evan Lucas; Dylan Kangas; Timothy C Havens; | arxiv-cs.CL | 2024-10-11 |
212 | Hypothesis-only Biases in Large Language Model-Elicited Natural Language Inference Highlight: Our analysis provides empirical evidence that well-attested biases in NLI can persist in LLM-generated data. |
Grace Proebsting; Adam Poliak; | arxiv-cs.CL | 2024-10-11 |
213 | AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation Highlight: For instance, attacks tend to be less effective when models pay more attention to system prompts designed to ensure LLM safety alignment. Building on this discovery, we introduce an enhanced method that manipulates models’ attention scores to facilitate LLM jailbreaking, which we term AttnGCG. |
ZIJUN WANG et al. | arxiv-cs.CL | 2024-10-11 |
214 | HorGait: A Hybrid Model for Accurate Gait Recognition in LiDAR Point Cloud Planar Projections Highlight: This paper proposes a method named HorGait, which utilizes a hybrid model with a Transformer architecture for gait recognition on the planar projection of 3D point clouds from LiDAR. |
JIAXING HAO et al. | arxiv-cs.CV | 2024-10-10 |
215 | Fine-Tuning Language Models for Ethical Ambiguity: A Comparative Study of Alignment with Human Responses Highlight: While morally clear scenarios are more discernible to LLMs, greater difficulty is encountered in morally ambiguous contexts. In this investigation, we explored LLM calibration to show that human and LLM judgments are poorly aligned in such scenarios. |
PRANAV SENTHILKUMAR et al. | arxiv-cs.CL | 2024-10-10 |
216 | Evaluating Transformer Models for Suicide Risk Detection on Social Media Highlight: This paper presents a study on leveraging state-of-the-art natural language processing solutions for identifying suicide risk in social media posts as a submission for the IEEE BigData 2024 Cup: Detection of Suicide Risk on Social Media conducted by the kubapok team. |
Jakub Pokrywka; Jeremi I. Kaczmarek; Edward J. Gorzelańczyk; | arxiv-cs.CL | 2024-10-10 |
217 | VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models Highlight: We introduce VibeCheck, a system for automatically comparing a pair of LLMs by discovering identifying traits of a model (vibes) that are well-defined, differentiating, and user-aligned. |
Lisa Dunlap; Krishna Mandal; Trevor Darrell; Jacob Steinhardt; Joseph E Gonzalez; | arxiv-cs.CL | 2024-10-10 |
218 | Robust AI-Generated Text Detection By Restricted Embeddings Highlight: In this work, we focus on the robustness of classifier-based detectors of AI-generated text, namely their ability to transfer to unseen generators or semantic domains. |
KRISTIAN KUZNETSOV et al. | arxiv-cs.CL | 2024-10-10 |
219 | The Rise of AI-Generated Content in Wikipedia Highlight: We use GPTZero, a proprietary AI detector, and Binoculars, an open-source alternative, to establish lower bounds on the presence of AI-generated content in recently created Wikipedia pages. |
Creston Brooks; Samuel Eggert; Denis Peskoff; | arxiv-cs.CL | 2024-10-10 |
220 | Optimized Spatial Architecture Mapping Flow for Transformer Accelerators Highlight: However, the design process for existing spatial architectures is predominantly manual, and it often involves time-consuming redesigns for new applications and new problem dimensions, which greatly limits the development of optimally designed accelerators for Transformer models. To address these challenges, we propose SAMT (Spatial Architecture Mapping for Transformers), a comprehensive framework designed to optimize the dataflow mapping of Transformer inference workloads onto spatial accelerators. |
HAOCHENG XU et al. | arxiv-cs.AR | 2024-10-09 |
221 | SparseGrad: A Selective Method for Efficient Fine-tuning of MLP Layers Highlight: We propose a new selective PEFT method, namely SparseGrad, that performs well on MLP blocks. |
VIKTORIIA CHEKALINA et al. | arxiv-cs.CL | 2024-10-09 |
222 | Stanceformer: Target-Aware Transformer for Stance Detection Highlight: Consequently, these models yield similar performance regardless of whether we utilize or disregard target information, undermining the task’s significance. To address this challenge, we introduce Stanceformer, a target-aware transformer model that incorporates enhanced attention towards the targets during both training and inference. |
Krishna Garg; Cornelia Caragea; | arxiv-cs.CL | 2024-10-09 |
223 | InAttention: Linear Context Scaling for Transformers Highlight: In this paper we modify the decoder-only transformer, replacing self-attention with InAttention, which scales linearly with context length during inference by having tokens attend only to initial states. |
Joseph Eisner; | arxiv-cs.LG | 2024-10-09 |
224 | SWE-Bench+: Enhanced Coding Benchmark for LLMs Highlight: However, a systematic evaluation of the quality of SWE-bench remains missing. In this paper, we addressed this gap by presenting an empirical analysis of the SWE-bench dataset. |
REEM ALEITHAN et al. | arxiv-cs.SE | 2024-10-09 |
225 | SC-Bench: A Large-Scale Dataset for Smart Contract Auditing Highlight: We present SC-Bench, the first dataset for automated smart-contract auditing research. |
Shihao Xia; Mengting He; Linhai Song; Yiying Zhang; | arxiv-cs.CR | 2024-10-08 |
226 | Solving Multi-Goal Robotic Tasks with Decision Transformer Highlight: Yet, no existing methods effectively combine offline training, multi-goal learning, and transformer-based architectures. In this paper, we address these challenges by introducing a novel adaptation of the decision transformer architecture for offline multi-goal reinforcement learning in robotics. |
Paul Gajewski; Dominik Żurek; Marcin Pietroń; Kamil Faber; | arxiv-cs.RO | 2024-10-08 |
227 | Honesty to Subterfuge: In-Context Reinforcement Learning Can Make Honest Models Reward Hack Highlight: We show that gpt-4o, gpt-4o-mini, o1-preview, and o1-mini – frontier models trained to be helpful, harmless, and honest – can engage in specification gaming without training on a curriculum of tasks, purely from in-context iterative reflection (which we call in-context reinforcement learning, ICRL). |
Leo McKee-Reid; Christoph Sträter; Maria Angelica Martinez; Joe Needham; Mikita Balesni; | arxiv-cs.AI | 2024-10-08 |
228 | A GPT-based Decision Transformer for Multi-Vehicle Coordination at Unsignalized Intersections Highlight: In this paper, we explore the application of the Decision Transformer, a decision-making algorithm based on the Generative Pre-trained Transformer (GPT) architecture, to multi-vehicle coordination at unsignalized intersections. |
Eunjae Lee; Minhee Kang; Yoojin Choi; Heejin Ahn; | arxiv-cs.RO | 2024-10-08 |
229 | Unveiling Transformer Perception By Exploring Input Manifolds Highlight: This paper introduces a general method for the exploration of equivalence classes in the input space of Transformer models. |
Alessandro Benfenati; Alfio Ferrara; Alessio Marta; Davide Riva; Elisabetta Rocchetti; | arxiv-cs.LG | 2024-10-08 |
230 | A Comparative Study of Hybrid Models in Health Misinformation Text Classification Highlight: This study evaluates the effectiveness of machine learning (ML) and deep learning (DL) models in detecting COVID-19-related misinformation on online social networks (OSNs), aiming to develop more effective tools for countering the spread of health misinformation during the pandemic. |
Mkululi Sikosana; Oluwaseun Ajao; Sean Maudsley-Barton; | arxiv-cs.IR | 2024-10-08 |
231 | A Critical Evaluation of AI Feedback for Aligning Large Language Models IF:3 Highlight: We show that the improvements of the RL step are virtually entirely due to the widespread practice of using a weaker teacher model (e.g. GPT-3.5) for SFT data collection than the critic (e.g., GPT-4) used for AI feedback generation. |
ARCHIT SHARMA et al. | nips | 2024-10-07 |
232 | Weak-to-Strong Search: Align Large Language Models Via Searching Over Small Language Models IF:3 Highlight: In this work, we introduce *weak-to-strong search*, framing the alignment of a large language model as a test-time greedy search to maximize the log-likelihood difference between small tuned and untuned models while sampling from the frozen large model. |
ZHANHUI ZHOU et al. | nips | 2024-10-07 |
233 | Seshat Global History Databank Text Dataset and Benchmark of Large Language Models’ History Knowledge Highlight: This benchmarking is particularly challenging, given that human knowledge of history is inherently unbalanced, with more information available on Western history and recent periods. To address this challenge, we introduce a curated sample of the Seshat Global History Databank, which provides a structured representation of human historical knowledge, containing 36,000 data points across 600 historical societies and over 600 scholarly references. |
JAKOB HAUSER et al. | nips | 2024-10-07 |
234 | SAND: Smooth Imputation of Sparse and Noisy Functional Data with Transformer Networks Highlight: Although the transformer architecture has come to dominate other models for text and image data, its application to irregularly-spaced longitudinal data has been limited. We introduce a variant of the transformer that enables it to more smoothly impute such functional data. |
Ju-Sheng Hong; Junwen Yao; Jonas Mueller; Jane-Ling Wang; | nips | 2024-10-07 |
235 | Query-Based Adversarial Prompt Generation IF:3 Highlight: We improve on prior work with a query-based attack that leverages API access to a remote language model to construct adversarial examples that cause the model to emit harmful strings with (much) higher probability than with transfer-only attacks. |
Jonathan Hayase; Ema Borevković; Nicholas Carlini; Florian Tramer; Milad Nasr; | nips | 2024-10-07 |
236 | Timer-XL: Long-Context Transformers for Unified Time Series Forecasting Highlight: We present Timer-XL, a generative Transformer for unified time series forecasting. |
Yong Liu; Guo Qin; Xiangdong Huang; Jianmin Wang; Mingsheng Long; | arxiv-cs.LG | 2024-10-07 |
237 | Achieving Efficient Alignment Through Learned Correction Highlight: In this paper, we introduce *Aligner*, a novel and simple alignment paradigm that learns the correctional residuals between preferred and dispreferred answers using a small model. |
JIAMING JI et al. | nips | 2024-10-07 |
238 | HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models Highlight: Identifying optimal model configurations under specific hardware constraints is becoming essential but remains challenging due to the computational load of exhaustive training and evaluation on multiple devices. To address this, we introduce HW-GPT-Bench, a hardware-aware benchmark that utilizes surrogate predictions to approximate various hardware metrics across 13 devices of architectures in the GPT-2 family, with architectures containing up to 774M parameters. |
RHEA SUKTHANKER et al. | nips | 2024-10-07 |
239 | Permutation-Invariant Autoregressive Diffusion for Graph Generation Highlight: We introduce PARD, a Permutation-invariant Auto Regressive Diffusion model that integrates diffusion models with autoregressive methods. |
Lingxiao Zhao; Xueying Ding; Leman Akoglu; | nips | 2024-10-07 |
240 | Finding Transformer Circuits With Edge Pruning Highlight: In this paper, we frame circuit discovery as an optimization problem and propose _Edge Pruning_ as an effective and scalable solution. |
Adithya Bhaskar; Alexander Wettig; Dan Friedman; Danqi Chen; | nips | 2024-10-07 |
241 | In-Context Learning State Vector with Inner and Momentum Optimization Highlight: However, the working mechanisms and optimization of these vectors are yet to be thoroughly explored. In this paper, we address this gap by presenting a comprehensive analysis of these compressed vectors, drawing parallels to the parameters trained with gradient descent, and introducing the concept of state vector. |
DONGFANG LI et al. | nips | 2024-10-07 |
242 | VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation Highlight: In this paper, we introduce a novel approach to reduce vision compute by leveraging redundant vision tokens “skipping layers” rather than decreasing the number of vision tokens. |
SHIWEI WU et al. | nips | 2024-10-07 |
243 | $M^3$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation Highlight: This paper presents $M^3$GPT, an advanced Multimodal, Multitask framework for Motion comprehension and generation. |
MINGSHUANG LUO et al. | nips | 2024-10-07 |
244 | Transforming Vision Transformer: Towards Efficient Multi-Task Asynchronous Learner Highlight: In this work, we propose Efficient Multi-Task Learning (EMTAL), a novel approach that transforms a pre-trained Transformer into an efficient multi-task learner during training, and reparameterizes the knowledge back to the original Transformer for efficient inference. |
Hanwen Zhong; Jiaxin Chen; Yutong Zhang; Di Huang; Yunhong Wang; | nips | 2024-10-07 |
245 | APIGen: Automated PIpeline for Generating Verifiable and Diverse Function-Calling Datasets IF:3 Highlight: This paper presents APIGen, an automated data generation pipeline designed to produce verifiable high-quality datasets for function-calling applications. |
ZUXIN LIU et al. | nips | 2024-10-07 |
246 | FinBen: An Holistic Financial Benchmark for Large Language Models Highlight: In this paper, we introduce FinBen, the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks, covering seven critical aspects: information extraction (IE), textual analysis, question answering (QA), text generation, risk management, forecasting, and decision-making. |
QIANQIAN XIE et al. | nips | 2024-10-07 |
247 | ETO: Efficient Transformer-based Local Feature Matching By Organizing Multiple Homography Hypotheses Highlight: We propose an efficient transformer-based network architecture for local feature matching. |
JUNJIE NI et al. | nips | 2024-10-07 |
248 | Limits of Transformer Language Models on Learning to Compose Algorithms Highlight: To this end, we evaluate training LLaMA models and prompting GPT-4 and Gemini on four tasks demanding to learn a composition of several discrete sub-tasks. |
JONATHAN THOMM et al. | nips | 2024-10-07 |
249 | Does RoBERTa Perform Better Than BERT in Continual Learning: An Attention Sink Perspective Highlight: In this paper, we observe that pre-trained models may allocate high attention scores to some ‘sink’ tokens, such as [SEP] tokens, which are ubiquitous across various tasks. |
Xueying Bai; Yifan Sun; Niranjan Balasubramanian; | arxiv-cs.LG | 2024-10-07 |
250 | The Mamba in The Llama: Distilling and Accelerating Hybrid Models Highlight: Recent research suggests that state-space models (SSMs) like Mamba can be competitive with Transformer models for language modeling with advantageous deployment characteristics. Given the focus and expertise on training large-scale Transformer models, we consider the challenge of converting these pretrained models into SSMs for deployment. |
Junxiong Wang; Daniele Paliotta; Avner May; Alexander Rush; Tri Dao; | nips | 2024-10-07 |
251 | SELF-DISCOVER: Large Language Models Self-Compose Reasoning Structures IF:3 Highlight: We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the task-intrinsic reasoning structures to tackle complex reasoning problems that are challenging for typical prompting methods. |
PEI ZHOU et al. | nips | 2024-10-07 |
252 | Perception of Knowledge Boundary for Large Language Models Through Semi-open-ended Question Answering Highlight: In this paper, we perceive the LLMs’ knowledge boundary with semi-open-ended questions by discovering more ambiguous answers. |
ZHIHUA WEN et al. | nips | 2024-10-07 |
253 | LSH-MoE: Communication-efficient MoE Training Via Locality-Sensitive Hashing Highlight: In this paper, we propose LSH-MoE, a communication-efficient MoE training framework using locality-sensitive hashing (LSH). |
XIAONAN NIE et al. | nips | 2024-10-07 |
254 | SemCoder: Training Code Language Models with Comprehensive Semantics Highlight: This paper aims to bridge the gap between Code LLMs’ reliance on static text data and the need for thorough semantic understanding for complex tasks like debugging and program repair. |
YANGRUIBO DING et al. | nips | 2024-10-07 |
255 | Make-it-Real: Unleashing Large Multimodal Model for Painting 3D Objects with Realistic Materials Highlight: In this paper, we exploit advancements in Multimodal Large Language Models (MLLMs), particularly GPT-4V, to present a novel approach, Make-it-Real: 1) We demonstrate that GPT-4V can effectively recognize and describe materials, allowing the construction of a detailed material library. |
YE FANG et al. | nips | 2024-10-07 |
256 | Alleviating Distortion in Image Generation Via Multi-Resolution Diffusion Models Highlight: This paper presents innovative enhancements to diffusion models by integrating a novel multi-resolution network and time-dependent layer normalization. |
QIHAO LIU et al. | nips | 2024-10-07 |
257 | Accelerating Transformers with Spectrum-Preserving Token Merging Highlight: Prior work has proposed algorithms based on Bipartite Soft Matching (BSM), which divides tokens into distinct sets and merges the top $k$ similar tokens. |
CHAU TRAN et al. | nips | 2024-10-07 |
258 | LLaVA Needs More Knowledge: Retrieval Augmented Natural Language Generation with Knowledge Graph for Explaining Thoracic Pathologies Highlight: The KG datastore is designed as a plug-and-play module, allowing for seamless integration with various model architectures. We introduce and evaluate three distinct frameworks within this paradigm: KG-LLaVA, which integrates the pre-trained LLaVA model with KG-RAG; Med-XPT, a custom framework combining MedCLIP, a transformer-based projector, and GPT-2; and Bio-LLaVA, which adapts LLaVA by incorporating the Bio-ViT-L vision model. |
Ameer Hamza; Yong Hyun Ahn; Sungyoung Lee; Seong Tae Kim; | arxiv-cs.CV | 2024-10-07 |
259 | Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models Highlight: In this work, we present a method that is able to distill a pre-trained Transformer architecture into alternative architectures such as state space models (SSMs). |
Aviv Bick; Kevin Li; Eric Xing; J. Zico Kolter; Albert Gu; | nips | 2024-10-07 |
260 | MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution IF:3 Highlight: Motivated by the empirical findings, we propose a novel LLM-based **M**ulti-**A**gent framework for **G**itHub **I**ssue re**S**olution, **MAGIS**, consisting of four agents customized for software evolution: Manager, Repository Custodian, Developer, and Quality Assurance Engineer agents. |
WEI TAO et al. | nips | 2024-10-07 |
261 | RL-GPT: Integrating Reinforcement Learning and Code-as-policy Highlight: To seamlessly integrate both modalities, we introduce a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent. |
SHAOTENG LIU et al. | nips | 2024-10-07 |
262 | SpikedAttention: Training-Free and Fully Spike-Driven Transformer-to-SNN Conversion with Winner-Oriented Spike Shift for Softmax Operation Highlight: In this work, we propose a novel transformer-to-SNN conversion method that outputs an end-to-end spike-based transformer, named SpikedAttention. |
Sangwoo Hwang; Seunghyun Lee; Dahoon Park; Donghun Lee; Jaeha Kung; | nips | 2024-10-07 |
263 | Narrative-of-Thought: Improving Temporal Reasoning of Large Language Models Via Recounted Narratives Highlight: We propose a new prompting technique tailored for temporal reasoning, Narrative-of-Thought (NoT), that first converts the events set to a Python class, then prompts a small model to generate a temporally grounded narrative, guiding the final generation of a temporal graph. |
Xinliang Frederick Zhang; Nick Beauchamp; Lu Wang; | arxiv-cs.CL | 2024-10-07 |
264 | Understanding Transformers Via N-Gram Statistics Highlight: This paper takes a first step in this direction by considering families of functions (i.e. rules) formed out of simple N-gram based statistics of the training data. By studying how well these rulesets approximate transformer predictions, we obtain a variety of novel discoveries: a simple method to detect overfitting during training without using a holdout set, a quantitative measure of how transformers progress from learning simple to more complex statistical rules over the course of training, a model-variance criterion governing when transformer predictions tend to be described by N-gram rules, and insights into how well transformers can be approximated by N-gram rulesets in the limit where these rulesets become increasingly complex. |
Timothy Nguyen; | nips | 2024-10-07 |
265 | InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD IF:3 Highlight: Specifically, this research advances the patch division paradigm by introducing a novel extension: dynamic resolution with automatic patch configuration. |
XIAOYI DONG et al. | nips | 2024-10-07 |
266 | Approximation Rate of The Transformer Architecture for Sequence Modeling Highlight: In this work, we investigate the approximation rate for single-layer Transformers with one head. |
Haotian Jiang; Qianxiao Li; | nips | 2024-10-07 |
267 | Differential Transformer Highlight: In this work, we introduce Diff Transformer, which amplifies attention to the relevant context while canceling noise. |
TIANZHU YE et al. | arxiv-cs.CL | 2024-10-07 |
268 | OccamLLM: Fast and Exact Language Model Arithmetic in A Single Step Highlight: We propose a framework that enables exact arithmetic in a single autoregressive step, providing faster, more secure, and more interpretable LLM systems with arithmetic capabilities. |
OWEN DUGAN et al. | nips | 2024-10-07 |
269 | Visual Autoregressive Modeling: Scalable Image Generation Via Next-Scale Prediction IF:3 Highlight: We present Visual AutoRegressive modeling (VAR), a new generation paradigm that redefines the autoregressive learning on images as coarse-to-fine next-scale prediction or next-resolution prediction, diverging from the standard raster-scan next-token prediction. |
Keyu Tian; Yi Jiang; Zehuan Yuan; BINGYUE PENG; Liwei Wang; | nips | 2024-10-07 |
270 | Can Large Language Models Explore In-context? IF:3 Highlight: We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making. |
Akshay Krishnamurthy; Keegan Harris; Dylan J Foster; Cyril Zhang; Aleksandrs Slivkins; | nips | 2024-10-07 |
271 | UrbanKGent: A Unified Large Language Model Agent Framework for Urban Knowledge Graph Construction Highlight: This paper presents UrbanKGent, a unified large language model agent framework, for urban knowledge graph construction. |
Yansong Ning; Hao Liu; | nips | 2024-10-07 |
272 | JiuZhang3.0: Efficiently Improving Mathematical Reasoning By Training Small Data Synthesis Models IF:3 Highlight: To reduce the cost, based on open-source available texts, we propose an efficient way that trains a small LLM for math problem synthesis, to efficiently generate sufficient high-quality pre-training data. |
KUN ZHOU et al. | nips | 2024-10-07 |
273 | DenseFormer: Enhancing Information Flow in Transformers Via Depth Weighted Averaging Highlight: The transformer architecture by Vaswani et al. (2017) is now ubiquitous across application domains, from natural language processing to speech processing and image understanding. We propose DenseFormer, a simple modification to the standard architecture that improves the perplexity of the model without increasing its size, adding a few thousand parameters for large-scale models in the 100B parameters range. |
Matteo Pagliardini; Amirkeivan Mohtashami; François Fleuret; Martin Jaggi; | nips | 2024-10-07 |
274 | Leveraging Free Energy in Pretraining Model Selection for Improved Fine-tuning Highlight: We introduce a Bayesian model selection criterion, called the downstream free energy, which quantifies a checkpoint’s adaptability by measuring the concentration of nearby favorable parameters for the downstream task. |
Michael Munn; Susan Wei; | arxiv-cs.LG | 2024-10-07 |
275 | ProtocoLLM: Automatic Evaluation Framework of LLMs on Domain-Specific Scientific Protocol Formulation Tasks Highlight: Here, we propose a flexible, automatic framework to evaluate LLM’s capability on SPFT: ProtocoLLM. |
Seungjun Yi; Jaeyoung Lim; Juyong Yoon; | arxiv-cs.CL | 2024-10-06 |
276 | Selective Transformer for Hyperspectral Image Classification Highlight: However, existing Transformer models face two key challenges when dealing with HSI scenes characterized by diverse land cover types and rich spectral information: (1) fixed receptive field representation overlooks effective contextual information; (2) redundant self-attention feature representation. To address these limitations, we propose a novel Selective Transformer (SFormer) for HSI classification. |
Yichu Xu; Di Wang; Lefei Zhang; Liangpei Zhang; | arxiv-cs.CV | 2024-10-04 |
277 | How Language Models Prioritize Contextual Grammatical Cues? Highlight: In this paper, we investigate how language models handle gender agreement when multiple gender cue words are present, each capable of independently disambiguating a target gender pronoun. |
Hamidreza Amirzadeh; Afra Alishahi; Hosein Mohebbi; | arxiv-cs.CL | 2024-10-04 |
278 | Still Not Quite There! Evaluating Large Language Models for Comorbid Mental Health Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce ANGST, a novel, first-of-its-kind benchmark for depression-anxiety comorbidity classification from social media posts. |
AMEY HENGLE et. al. | arxiv-cs.CL | 2024-10-04 |
279 | Learning Semantic Structure Through First-Order-Logic Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study whether transformer-based language models can extract predicate argument structure from simple sentences. |
Akshay Chaturvedi; Nicholas Asher; | arxiv-cs.CL | 2024-10-04 |
280 | Dynamic Diffusion Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, we introduce a Timestep-wise Dynamic Width (TDW) approach that adapts model width conditioned on the generation timesteps. |
WANGBO ZHAO et. al. | arxiv-cs.CV | 2024-10-04 |
281 | Towards Linguistically-Aware and Language-Independent Tokenization for Large Language Models (LLMs) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive study on the tokenization techniques employed by state-of-the-art large language models (LLMs) and their implications on the cost and availability of services across different languages, especially low resource languages. |
Abrar Rahman; Garry Bowlin; Binit Mohanty; Sean McGunigal; | arxiv-cs.CL | 2024-10-04 |
282 | IndicSentEval: How Effectively Do Multilingual Transformer Models Encode Linguistic Properties for Indic Languages? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate similar questions regarding encoding capability and robustness for 8 linguistic properties across 13 different perturbations in 6 Indic languages, using 9 multilingual Transformer models (7 universal and 2 Indic-specific). |
Akhilesh Aravapalli; Mounika Marreddy; Subba Reddy Oota; Radhika Mamidi; Manish Gupta; | arxiv-cs.CL | 2024-10-03 |
283 | SIEVE: General Purpose Data Filtering System Matching GPT-4o Accuracy at 1% The Cost Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes SIEVE, a lightweight alternative that matches GPT-4o accuracy at a fraction of the cost. |
Jifan Zhang; Robert Nowak; | arxiv-cs.CL | 2024-10-03 |
284 | CulturalBench: A Robust, Diverse and Challenging Benchmark on Measuring The (Lack Of) Cultural Knowledge of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CulturalBench: a set of 1,227 human-written and human-verified questions for effectively assessing LLMs’ cultural knowledge, covering 45 global regions including the underrepresented ones like Bangladesh, Zimbabwe, and Peru. |
YU YING CHIU et. al. | arxiv-cs.CL | 2024-10-03 |
285 | AlphaIntegrator: Transformer Action Search for Symbolic Integration Proofs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first correct-by-construction learning-based system for step-by-step mathematical integration. |
Mert Ünsal; Timon Gehr; Martin Vechev; | arxiv-cs.LG | 2024-10-03 |
286 | Intrinsic Evaluation of RAG Systems for Deep-Logic Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the Overall Performance Index (OPI), an intrinsic metric to evaluate retrieval-augmented generation (RAG) mechanisms for applications involving deep-logic queries. |
Junyi Hu; You Zhou; Jie Wang; | arxiv-cs.AI | 2024-10-03 |
287 | AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose AutoDAN-Turbo, a black-box jailbreak method that can automatically discover as many jailbreak strategies as possible from scratch, without any human intervention or predefined scopes (e.g., specified candidate strategies), and use them for red-teaming. |
XIAOGENG LIU et. al. | arxiv-cs.CR | 2024-10-03 |
288 | Coal Mining Question Answering with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel approach to coal mining question answering (QA) using large language models (LLMs) combined with tailored prompt engineering techniques. |
Antonio Carlos Rivera; Anthony Moore; Steven Robinson; | arxiv-cs.CL | 2024-10-03 |
289 | TSOTSALearning at LLMs4OL Tasks A and B : Combining Rules to Large Language Model for Ontology Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents our contribution to the Large Language Model For Ontology Learning (LLMs4OL) challenge hosted by ISWC conference. The challenge involves extracting and … |
Carick Appolinaire Atezong Ymele; Azanzi Jiomekong; | LLMs4OL@ISWC | 2024-10-02 |
290 | Automatic Deductive Coding in Discourse Analysis: An Application of Large Language Models in Learning Analytics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To evaluate the usefulness of large language models in automatic deductive coding, we employed three different classification methods driven by different artificial intelligence technologies: the traditional text classification method with text feature engineering, a BERT-like pretrained language model, and a GPT-like pretrained large language model (LLM). We applied these methods to two different datasets and explored the potential of GPT and prompt engineering in automatic deductive coding. |
Lishan Zhang; Han Wu; Xiaoshan Huang; Tengfei Duan; Hanxiang Du; | arxiv-cs.CL | 2024-10-02 |
291 | A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work tackles the information loss bottleneck of vector-quantization (VQ) autoregressive image generation by introducing a novel model architecture called the 2-Dimensional Autoregression (DnD) Transformer. |
LIANG CHEN et. al. | arxiv-cs.CV | 2024-10-02 |
292 | Emotion-Aware Response Generation Using Affect-Enriched Embeddings with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel framework that integrates multiple emotion lexicons, including NRC Emotion Lexicon, VADER, WordNet, and SentiWordNet, with state-of-the-art LLMs such as LLAMA 2, Flan-T5, ChatGPT 3.0, and ChatGPT 4.0. |
Abdur Rasool; Muhammad Irfan Shahzad; Hafsa Aslam; Vincent Chan; | arxiv-cs.CL | 2024-10-02 |
293 | ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, even state-of-the-art vision-language models (VLMs), such as GPT-4o, still fall short of human-level performance, particularly in intricate web environments and long-horizon tasks. To address these limitations, we present ExACT, an approach to combine test-time search and self-learning to build o1-like models for agentic applications. |
XIAO YU et. al. | arxiv-cs.CL | 2024-10-02 |
294 | On The Adaptation of Unlimiformer for Decoder-Only Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, its main limitation is incompatibility with decoder-only transformers out of the box. In this work, we explore practical considerations of adapting Unlimiformer to decoder-only transformers and introduce a series of modifications to overcome this limitation. |
KIAN AHRABIAN et. al. | arxiv-cs.CL | 2024-10-02 |
295 | Financial Sentiment Analysis on News and Reports Using Large Language Models and FinBERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the application of LLMs and FinBERT for FSA, comparing their performance on news articles, financial reports and company announcements. |
Yanxin Shen; Pulin Kirin Zhang; | arxiv-cs.IR | 2024-10-02 |
296 | Creative and Context-Aware Translation of East Asian Idioms with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, compiling a dictionary of candidate translations demands much time and creativity even for expert translators. To alleviate such burden, we evaluate if GPT-4 can help generate high-quality translations. |
Kenan Tang; Peiyang Song; Yao Qin; Xifeng Yan; | arxiv-cs.CL | 2024-10-01 |
297 | Ayaka: A Versatile Transformer Accelerator With Low-Rank Estimation and Heterogeneous Dataflow Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer model has demonstrated outstanding performance in the field of artificial intelligence. However, its remarkable performance comes at the cost of substantial … |
YUBIN QIN et. al. | IEEE Journal of Solid-State Circuits | 2024-10-01 |
298 | SIGMA: Secure GPT Inference with Function Secret Sharing IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Secure 2-party computation (2PC) enables secure inference that offers protection for both proprietary machine learning (ML) models and sensitive inputs to them. However, the … |
KANAV GUPTA et. al. | Proc. Priv. Enhancing Technol. | 2024-10-01 |
299 | When Transformer Meets Large Graphs: An Expressive and Efficient Two-View Architecture Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The successes of applying Transformer to graphs have been witnessed on small graphs (e.g., molecular graphs), yet two barriers prevent its adoption on large graphs (e.g., citation … |
Weirui Kuang; Zhen Wang; Zhewei Wei; Yaliang Li; Bolin Ding; | IEEE Transactions on Knowledge and Data Engineering | 2024-10-01 |
300 | Vision Transformer Promotes Cancer Diagnosis: A Comprehensive Review Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xiaoyan Jiang; Shuihua Wang; Eugene Yu-Dong Zhang; | Expert Syst. Appl. | 2024-10-01 |
301 | MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone’s Potential with Masked Autoregressive Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on this analysis, we propose Masked Autoregressive Pretraining (MAP) to pretrain a hybrid Mamba-Transformer vision backbone network. |
Yunze Liu; Li Yi; | arxiv-cs.CV | 2024-10-01 |
302 | Optimizing Factorized Encoder Models: Time and Memory Reduction for Scalable and Efficient Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the challenges posed by the substantial training time and memory consumption associated with video transformers, focusing on the ViViT (Video Vision Transformer) model, in particular the Factorised Encoder version, as our baseline for action recognition tasks. |
Shreyank N Gowda; Anurag Arnab; Jonathan Huang; | eccv | 2024-09-30 |
303 | Evaluating The Fairness of Task-adaptive Pretraining on Unlabeled Test Data Before Few-shot Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Few-shot learning benchmarks are critical for evaluating modern NLP techniques. |
Kush Dubey; | arxiv-cs.CL | 2024-09-30 |
304 | Analysis-by-Synthesis Transformer for Single-View 3D Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing methods face limitations in both shape reconstruction and texture generation. This paper introduces an innovative Analysis-by-Synthesis Transformer that addresses these limitations in a unified framework by effectively modeling pixel-to-shape and pixel-to-texture relationships. |
DIAN JIA et. al. | eccv | 2024-09-30 |
305 | TaskComplexity: A Dataset for Task Complexity Classification with In-Context Learning, FLAN-T5 and GPT-4o Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper addresses the challenge of classifying and assigning programming tasks to experts, a process that typically requires significant effort, time, and cost. |
Areeg Fahad Rasheed; M. Zarkoosh; Safa F. Abbas; Sana Sabah Al-Azzawi; | arxiv-cs.CL | 2024-09-30 |
306 | AutoEval-Video: An Automatic Benchmark for Assessing Large Vision Language Models in Open-Ended Video Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel and challenging benchmark, AutoEval-Video, to comprehensively evaluate large vision-language models in open-ended video question answering. |
Weiran Huang; Xiuyuan Chen; Yuan Lin; Yuchen Zhang; | eccv | 2024-09-30 |
307 | GENIXER: Empowering Multimodal Large Language Models As A Powerful Data Generator Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Genixer, a comprehensive data generation pipeline consisting of four key steps: (i) instruction data collection, (ii) instruction template design, (iii) empowering MLLMs, and (iv) data generation and filtering. |
Henry Hengyuan Zhao; Pan Zhou; Mike Zheng Shou; | eccv | 2024-09-30 |
308 | OccWorld: Learning A 3D Occupancy World Model for Autonomous Driving IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore a new framework of learning a world model, OccWorld, in the 3D occupancy space to simultaneously predict the movement of the ego car and the evolution of the surrounding scenes. |
WENZHAO ZHENG et. al. | eccv | 2024-09-30 |
309 | MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce MaskMamba, a novel hybrid model that combines Mamba and Transformer architectures, utilizing Masked Image Modeling for non-autoregressive image synthesis. |
Wenchao Chen; Liqiang Niu; Ziyao Lu; Fandong Meng; Jie Zhou; | arxiv-cs.CV | 2024-09-30 |
310 | HyTAS: A Hyperspectral Image Transformer Architecture Search Benchmark and Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Transformer architectures have significantly enhanced HSI task performance, while advancements in Transformer Architecture Search (TAS) have improved model discovery. To harness these advancements for HSI classification, we make the following contributions: i) We propose HyTAS, the first benchmark on transformer architecture search for Hyperspectral imaging, ii) We comprehensively evaluate 12 different methods to identify the optimal transformer over 5 different datasets, iii) We perform an extensive factor analysis on the Hyperspectral transformer search performance, greatly motivating future research in this direction. |
Fangqin Zhou; Mert Kilickaya; Joaquin Vanschoren; Ran Piao; | eccv | 2024-09-30 |
311 | MART: MultiscAle Relational Transformer Networks for Multi-agent Trajectory Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the hypergraph transformer-based method for trajectory prediction is yet to be explored. Therefore, we present a MultiscAle Relational Transformer (MART) network for multi-agent trajectory prediction. |
Seongju Lee; Junseok Lee; Yeonguk Yu; Taeri Kim; Kyoobin Lee; | eccv | 2024-09-30 |
312 | GiT: Towards Generalist Vision Transformer Through Universal Language Interface Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a simple, yet effective framework, called GiT, simultaneously applicable for various vision tasks only with a vanilla ViT. Interestingly, our GiT builds a new benchmark in generalist performance, and fosters mutual enhancement across tasks, leading to significant improvements compared to isolated training. |
HAIYANG WANG et. al. | eccv | 2024-09-30 |
313 | ACE: All-round Creator and Editor Following Instructions Via Diffusion Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose ACE, an All-round Creator and Editor, which achieves comparable performance compared to those expert models in a wide range of visual generation tasks. |
ZHEN HAN et. al. | arxiv-cs.CV | 2024-09-30 |
314 | An Explainable Vision Question Answer Model Via Diffusion Chain-of-Thought Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This means that generating explanations solely for the answer can lead to a semantic discrepancy between the content of the explanation and the question-answering content. To address this, we propose a step-by-step reasoning approach to reduce such semantic discrepancies. |
Chunhao LU; Qiang Lu; Jake Luo; | eccv | 2024-09-30 |
315 | Sparse Attention Decomposition Applied to Circuit Tracing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we seek to isolate and identify the features used to effect communication and coordination among attention heads in GPT-2 small. |
Gabriel Franco; Mark Crovella; | arxiv-cs.LG | 2024-09-30 |
316 | Depression Detection in Social Media Posts Using Transformer-based Models and Auxiliary Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing studies have explored various approaches to this problem but often fall short in terms of accuracy and robustness. To address these limitations, this research proposes a neural network architecture leveraging transformer-based models combined with metadata and linguistic markers. |
Marios Kerasiotis; Loukas Ilias; Dimitris Askounis; | arxiv-cs.CL | 2024-09-30 |
317 | LingoQA: Video Question Answering for Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce LingoQA, a novel dataset and benchmark for visual question answering in autonomous driving. We release our dataset and benchmark as an evaluation platform for vision-language models in autonomous driving. |
ANA-MARIA MARCU et. al. | eccv | 2024-09-30 |
318 | Multimodal Misinformation Detection By Learning from Synthetic Data with Multimodal LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the generalizability of detectors trained on synthetic data to real-world scenarios remains unclear due to the distribution gap. To address this, we propose learning from synthetic data for detecting real-world multimodal misinformation through two model-agnostic data selection methods that match synthetic and real-world data distributions. |
Fengzhu Zeng; Wenqian Li; Wei Gao; Yan Pang; | arxiv-cs.CL | 2024-09-29 |
319 | 3D-CT-GPT: Generating 3D Radiology Reports Through Integration of Large Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model specifically designed for generating radiology reports from 3D CT scans, particularly chest CTs. |
HAO CHEN et. al. | arxiv-cs.CV | 2024-09-28 |
320 | Efficient Federated Intrusion Detection in 5G Ecosystem Using Optimized BERT-based Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a robust intrusion detection system (IDS) using federated learning and large language models (LLMs). |
Frederic Adjewa; Moez Esseghir; Leila Merghem-Boulahia; | arxiv-cs.CR | 2024-09-28 |
321 | INSIGHTBUDDY-AI: Medication Extraction and Entity Linking Using Large Language Models and Ensemble Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate state-of-the-art LLMs in text mining tasks on medications and their related attributes such as dosage, route, strength, and adverse effects. |
Pablo Romero; Lifeng Han; Goran Nenadic; | arxiv-cs.CL | 2024-09-28 |
322 | INC-Math: Integrating Natural Language and Code for Enhanced Mathematical Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore fundamental questions related to solving mathematical reasoning problems using natural language and code with state-of-the-art LLMs, including GPT-4o-mini and LLama-3.1-8b-Turbo. |
Xuyuan Xiong; Simeng Han; Ziyue Zhou; Arman Cohan; | arxiv-cs.CL | 2024-09-28 |
323 | FoodMLLM-JP: Leveraging Multimodal Large Language Models for Japanese Recipe Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Research on food image understanding using recipe data has been a long-standing focus due to the diversity and complexity of the data. |
Yuki Imajuku; Yoko Yamakata; Kiyoharu Aizawa; | arxiv-cs.CV | 2024-09-27 |
324 | Cottention: Linear Transformers With Cosine Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Cottention, a novel attention mechanism that replaces the softmax operation with cosine similarity. |
Gabriel Mongaras; Trevor Dohm; Eric C. Larson; | arxiv-cs.LG | 2024-09-27 |
325 | Experimental Evaluation of Machine Learning Models for Goal-oriented Customer Service Chatbot with Pipeline Architecture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a tailored experimental evaluation approach for goal-oriented customer service chatbots with pipeline architecture, focusing on three key components: Natural Language Understanding (NLU), dialogue management (DM), and Natural Language Generation (NLG). |
Nurul Ain Nabilah Mohd Isa; Siti Nuraishah Agos Jawaddi; Azlan Ismail; | arxiv-cs.AI | 2024-09-27 |
326 | Suicide Phenotyping from Clinical Notes in Safety-Net Psychiatric Hospital Using Multi-Label Classification with Pre-Trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Pre-trained language models offer promise for identifying suicidality from unstructured clinical narratives. |
ZEHAN LI et. al. | arxiv-cs.CL | 2024-09-27 |
327 | Predicting Anchored Text from Translation Memories for Machine Translation Using Deep Learning Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we show that for a large part of those words which are anchored, we can use other techniques that are based on machine learning approaches such as Word2Vec. |
Richard Yue; John E. Ortega; | arxiv-cs.CL | 2024-09-26 |
328 | Retrospective Comparative Analysis of Prostate Cancer In-Basket Messages: Responses from Closed-Domain LLM Vs. Clinical Teams Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, responding to these patients’ inquiries has become a significant burden on healthcare workflows, consuming considerable time for clinical care teams. To address this, we introduce RadOnc-GPT, a specialized Large Language Model (LLM) powered by GPT-4 that has been designed with a focus on radiotherapeutic treatment of prostate cancer with advanced prompt engineering, and specifically designed to assist in generating responses. |
YUEXING HAO et. al. | arxiv-cs.AI | 2024-09-26 |
329 | The Application of GPT-4 in Grading Design University Students’ Assignment and Providing Feedback: An Exploratory Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to investigate whether GPT-4 can effectively grade assignments for design university students and provide useful feedback. |
Qian Huang; Thijs Willems; King Wang Poon; | arxiv-cs.AI | 2024-09-26 |
330 | Assessing The Level of Toxicity Against Distinct Groups in Bangla Social Media Comments: A Comprehensive Investigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study focuses on identifying toxic comments in the Bengali language targeting three specific groups: transgender people, indigenous people, and migrant people, from multiple social media sources. |
Mukaffi Bin Moin; Pronay Debnath; Usafa Akther Rifa; Rijeet Bin Anis; | arxiv-cs.CL | 2024-09-25 |
331 | Beyond Turing Test: Can GPT-4 Sway Experts’ Decisions? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the post-Turing era, evaluating large language models (LLMs) involves assessing generated text based on readers’ reactions rather than merely its indistinguishability from human-produced content. |
Takehiro Takayanagi; Hiroya Takamura; Kiyoshi Izumi; Chung-Chi Chen; | arxiv-cs.CE | 2024-09-25 |
332 | Reducing and Exploiting Data Augmentation Noise Through Meta Reweighting Contrastive Learning for Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To boost deep learning models’ performance given augmented data/samples in text classification tasks, we propose a novel framework, which leverages both meta learning and contrastive learning techniques as parts of our design for reweighting the augmented samples and refining their feature representations based on their quality. |
Guanyi Mou; Yichuan Li; Kyumin Lee; | arxiv-cs.CL | 2024-09-25 |
333 | Comparing Unidirectional, Bidirectional, and Word2vec Models for Discovering Vulnerabilities in Compiled Lifted Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using a dataset of LLVM functions, we trained a GPT-2 model to generate embeddings, which were subsequently used to build LSTM neural networks to differentiate between vulnerable and non-vulnerable code. |
Gary A. McCully; John D. Hastings; Shengjie Xu; Adam Fortier; | arxiv-cs.CR | 2024-09-25 |
334 | SynChart: Synthesizing Charts from Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We construct a large-scale chart dataset, SynChart, which contains approximately 4 million diverse chart images with over 75 million dense annotations, including data tables, code, descriptions, and question-answer sets. We trained a 4.2B chart-expert model using this dataset and achieve near-GPT-4o performance on the ChartQA task, surpassing GPT-4V. |
MENGCHEN LIU et. al. | arxiv-cs.AI | 2024-09-24 |
335 | Unveiling Language Competence Neurons: A Psycholinguistic Approach to Model Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As large language models (LLMs) advance in their linguistic capacity, understanding how they capture aspects of language competence remains a significant challenge. This study therefore employs psycholinguistic paradigms, which are well-suited for probing deeper cognitive aspects of language processing, to explore neuron-level representations in language models across three tasks: sound-shape association, sound-gender association, and implicit causality. |
Xufeng Duan; Xinyu Zhou; Bei Xiao; Zhenguang G. Cai; | arxiv-cs.CL | 2024-09-24 |
336 | MonoFormer: One Transformer for Both Diffusion and Autoregression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose to study a simple idea: share one transformer for both autoregression and diffusion. |
CHUYANG ZHAO et. al. | arxiv-cs.CV | 2024-09-24 |
337 | GPT-4 As A Homework Tutor Can Improve Student Engagement and Learning Outcomes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work contributes to the scarce empirical literature on LLM-based interactive homework in real-world educational settings and offers a practical, scalable solution for improving homework in schools. |
Alessandro Vanzo; Sankalan Pal Chowdhury; Mrinmaya Sachan; | arxiv-cs.CY | 2024-09-24 |
338 | SOFI: Multi-Scale Deformable Transformer for Camera Calibration with Enhanced Line Queries Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce multi-Scale defOrmable transFormer for camera calibratIon with enhanced line queries, SOFI. |
Sebastian Janampa; Marios Pattichis; | arxiv-cs.CV | 2024-09-23 |
339 | Towards A Realistic Long-Term Benchmark for Open-Web Research Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present initial results of a forthcoming benchmark for evaluating LLM agents on white-collar tasks of economic value. |
Peter Mühlbacher; Nikos I. Bosse; Lawrence Phillips; | arxiv-cs.CL | 2024-09-23 |
340 | Improving Academic Skills Assessment with NLP and Ensemble Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study addresses the critical challenges of assessing foundational academic skills by leveraging advancements in natural language processing (NLP). |
Xinyi Huang; Yingyi Wu; Danyang Zhang; Jiacheng Hu; Yujian Long; | arxiv-cs.CL | 2024-09-23 |
341 | SDBA: A Stealthy and Long-Lasting Durable Backdoor Attack in Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces SDBA, a novel backdoor attack mechanism designed for NLP tasks in FL environments. |
Minyeong Choe; Cheolhee Park; Changho Seo; Hyunil Kim; | arxiv-cs.LG | 2024-09-23 |
342 | Can Pre-trained Language Models Generate Titles for Research Papers? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we fine-tune pre-trained language models to generate titles of papers from their abstracts. |
Tohida Rehman; Debarshi Kumar Sanyal; Samiran Chattopadhyay; | arxiv-cs.CL | 2024-09-22 |
343 | Evaluating The Quality of Code Comments Generated By Large Language Models for Novice Programmers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) show promise in generating code comments for novice programmers, but their educational effectiveness remains under-evaluated. |
Aysa Xuemo Fan; Arun Balajiee Lekshmi Narayanan; Mohammad Hassany; Jiaze Ke; | arxiv-cs.SE | 2024-09-22 |
344 | The Use of GPT-4o and Other Large Language Models for The Improvement and Design of Self-Assessment Scales for Measurement of Interpersonal Communication Skills Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: OpenAI’s ChatGPT (GPT-4 and GPT-4o) and other Large Language Models (LLMs) like Microsoft’s Copilot, Google’s Gemini 1.5 Pro, and Anthropic’s Claude 3.5 Sonnet can be effectively used in various phases of scientific research. |
Goran Bubaš; | arxiv-cs.AI | 2024-09-21 |
345 | Normalized Narrow Jump To Conclusions: Normalized Narrow Shortcuts for Parameter Efficient Early Exit Transformer Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Narrow Jump to Conclusions (NJTC) and Normalized Narrow Jump to Conclusions (N-NJTC) – parameter efficient alternatives to standard linear shortcutting that reduce shortcut parameter count by over 97%. |
Amrit Diggavi Seshadri; | arxiv-cs.AI | 2024-09-21 |
346 | AI Assistants for Spaceflight Procedures: Combining Generative Pre-Trained Transformer and Retrieval-Augmented Generation on Knowledge Graphs With Augmented Reality Cues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes the capabilities and potential of the intelligent personal assistant (IPA) CORE (Checklist Organizer for Research and Exploration), designed to support astronauts during procedures onboard the International Space Station (ISS), the Lunar Gateway station, and beyond. |
OLIVER BENSCH et. al. | arxiv-cs.AI | 2024-09-21 |
347 | Loop Neural Networks for Parameter Sharing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel Loop Neural Network, which achieves better performance by utilizing longer computational time without increasing the model size. |
Kei-Sing Ng; Qingchen Wang; | arxiv-cs.AI | 2024-09-21 |
348 | QMOS: Enhancing LLMs for Telecommunication with Question Masked Loss and Option Shuffling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces QMOS, an innovative approach which uses a Question-Masked loss and Option Shuffling trick to enhance the performance of LLMs in answering Multiple-Choice Questions in the telecommunications domain. |
Blessed Guda; Gabrial Zencha A.; Lawrence Francis; Carlee Joe-Wong; | arxiv-cs.CL | 2024-09-21 |
349 | On Importance of Pruning and Distillation for Efficient Low Resource NLP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explore the case of the low-resource Indic language Marathi. |
AISHWARYA MIRASHI et. al. | arxiv-cs.CL | 2024-09-21 |
350 | HUT: A More Computation Efficient Fine-Tuning Method With Hadamard Updated Transformation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This approach ensures that the correlation between the original and updated parameters is preserved, leveraging the semantic features learned during pre-training. Building on this paradigm, we present the Hadamard Updated Transformation (HUT) method. |
Geyuan Zhang; Xiaofei Zhou; Chuheng Chen; | arxiv-cs.CL | 2024-09-20 |
351 | Prompting Large Language Models for Supporting The Differential Diagnosis of Anemia Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by clinical guidelines, our study aimed to develop diagnostic pathways similar to those found in clinical guidelines. |
Elisa Castagnari; Lillian Muyama; Adrien Coulet; | arxiv-cs.CL | 2024-09-20 |
352 | T2M-X: Learning Expressive Text-to-Motion Generation from Partially Annotated Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose T2M-X, a two-stage method that learns expressive text-to-motion generation from partially annotated data. |
Mingdian Liu; Yilin Liu; Gurunandan Krishnan; Karl S Bayer; Bing Zhou; | arxiv-cs.CV | 2024-09-20 |
353 | ‘Since Lawyers Are Males..’: Examining Implicit Gender Bias in Hindi Language Generation By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) are increasingly being used to generate text across various languages, for tasks such as translation, customer support, and education. |
Ishika Joshi; Ishita Gupta; Adrita Dey; Tapan Parikh; | arxiv-cs.CL | 2024-09-20 |
354 | Drift to Remember Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We hypothesize that representational drift can alleviate catastrophic forgetting in AI during new task acquisition. To test this, we introduce DriftNet, a network designed to constantly explore various local minima in the loss landscape while dynamically retrieving relevant tasks. |
JIN DU et. al. | arxiv-cs.AI | 2024-09-20 |
355 | Applying Pre-trained Multilingual BERT in Embeddings for Improved Malicious Prompt Injection Attacks Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) are renowned for their exceptional capabilities and are applied to a wide range of applications. |
Md Abdur Rahman; Hossain Shahriar; Fan Wu; Alfredo Cuzzocrea; | arxiv-cs.CL | 2024-09-20 |
356 | $\text{M}^\text{6}(\text{GPT})^\text{3}$: Generating Multitrack Modifiable Multi-Minute MIDI Music from Text Using Genetic Algorithms, Probabilistic Methods and GPT Models in Any Progression and Time Signature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a genetic algorithm for the generation of melodic elements. |
Jakub Poćwiardowski; Mateusz Modrzejewski; Marek S. Tatara; | arxiv-cs.SD | 2024-09-19 |
357 | TACO-RL: Task Aware Prompt Compression Optimization with Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing prompt compression techniques either rely on sub-optimal metrics such as information entropy or model it as a task-agnostic token classification problem that fails to capture task-specific information. To address these issues, we propose a novel and efficient reinforcement learning (RL) based task-aware prompt compression method. |
SHIVAM SHANDILYA et. al. | arxiv-cs.CL | 2024-09-19 |
358 | Introducing The Large Medical Model: State of The Art Healthcare Cost and Risk Prediction with Transformers Trained on Patient Event Sequences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces the Large Medical Model (LMM), a generative pre-trained transformer (GPT) designed to guide and predict the broad facets of patient care and healthcare administration. |
RICKY SAHU et. al. | arxiv-cs.LG | 2024-09-19 |
359 | 3DTopia-XL: Scaling High-quality 3D Asset Generation Via Primitive Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite recent advancements in 3D generative models, existing methods still face challenges with optimization speed, geometric fidelity, and the lack of assets for physically based rendering (PBR). In this paper, we introduce 3DTopia-XL, a scalable native 3D generative model designed to overcome these limitations. |
ZHAOXI CHEN et. al. | arxiv-cs.CV | 2024-09-19 |
360 | Program Slicing in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the application of large language models (LLMs) to both static and dynamic program slicing, with a focus on Java programs. |
Kimya Khakzad Shahandashti; Mohammad Mahdi Mohajer; Alvine Boaye Belle; Song Wang; Hadi Hemmati; | arxiv-cs.SE | 2024-09-18 |
361 | Recommendation with Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a taxonomy that categorizes DGMs into three types: ID-driven models, large language models (LLMs), and multimodal models. |
YASHAR DELDJOO et. al. | arxiv-cs.IR | 2024-09-18 |
362 | AlignBot: Aligning VLM-powered Customized Task Planning with User Reminders Through Fine-Tuning for Household Robots Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents AlignBot, a novel framework designed to optimize VLM-powered customized task planning for household robots by effectively aligning with user reminders. |
PENGAN CHEN et. al. | arxiv-cs.RO | 2024-09-18 |
363 | American Sign Language to Text Translation Using Transformer and Seq2Seq with LSTM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares the Transformer with the Sequence-to-Sequence (Seq2Seq) model in translating sign language to text. |
Gregorius Guntur Sunardi Putra; Adifa Widyadhani Chanda D’Layla; Dimas Wahono; Riyanarto Sarno; Agus Tri Haryono; | arxiv-cs.CL | 2024-09-17 |
364 | Small Language Models Can Outperform Humans in Short Creative Writing: A Study Comparing SLMs with Humans and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we evaluate the creative fiction writing abilities of a fine-tuned small language model (SLM), BART Large, and compare its performance to humans and two large language models (LLMs): GPT-3.5 and GPT-4o. |
Guillermo Marco; Luz Rello; Julio Gonzalo; | arxiv-cs.CL | 2024-09-17 |
365 | Adaptive Large Language Models By Layerwise Attention Shortcuts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they are based on simply stacking the same blocks in dozens of layers and processing information sequentially from one block to another. In this paper, we propose to challenge this and introduce adaptive computations for LLM-like setups, which allow the final layer to attend to all of the intermediate layers as it deems fit through the attention mechanism, thereby introducing computational \textbf{attention shortcuts}. |
Prateek Verma; Mert Pilanci; | arxiv-cs.CL | 2024-09-16 |
366 | Code Vulnerability Detection: A Comparative Analysis of Emerging Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study aims to shed light on the capabilities of LLMs in vulnerability detection, contributing to the enhancement of software security practices across diverse open-source repositories. |
Shaznin Sultana; Sadia Afreen; Nasir U. Eisty; | arxiv-cs.SE | 2024-09-16 |
367 | Can GPT-O1 Kill All Bugs? An Evaluation of GPT-Family LLMs on QuixBugs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, inspired by the recent public release of the GPT-o1 models, we conduct the first study to compare the effectiveness of different versions of the GPT-family models in APR. |
Haichuan Hu; Ye Shang; Guolin Xu; Congqing He; Quanjun Zhang; | arxiv-cs.SE | 2024-09-16 |
368 | LLMs for Clinical Risk Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares the efficacy of GPT-4 and clinalytix Medical AI in predicting the clinical risk of delirium development. |
Mohamed Rezk; Patricia Cabanillas Silva; Fried-Michael Dahlweid; | arxiv-cs.CL | 2024-09-16 |
369 | SelECT-SQL: Self-correcting Ensemble Chain-of-Thought for Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce SelECT-SQL, a novel in-context learning solution that uses an algorithmic combination of chain-of-thought (CoT) prompting, self-correction, and ensemble methods to yield a new state-of-the-art result on challenging Text-to-SQL benchmarks. |
Ke Shen; Mayank Kejriwal; | arxiv-cs.CL | 2024-09-16 |
370 | Investigating The Impact of Code Comment Inconsistency on Bug Introducing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our research investigates the impact of code-comment inconsistency on bug introduction using large language models, specifically GPT-3.5. |
Shiva Radmanesh; Aaron Imani; Iftekhar Ahmed; Mohammad Moshirpour; | arxiv-cs.SE | 2024-09-16 |
371 | CAT: Customized Transformer Accelerator Framework on Versal ACAP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It is far more flexible than GPU in hardware customization, and has a better and smaller design solution space than traditional FPGA. Therefore, this paper proposes the Customized Transformer Accelerator Framework (CAT). Through the CAT framework, a customized Transformer accelerator family can be derived on Versal ACAP; the framework's abstract accelerator architecture deconstructs and efficiently maps the Transformer into hardware, with a variety of customizable properties. |
Wenbo Zhang; Yiqi Liu; Zhenshan Bao; | arxiv-cs.AR | 2024-09-15 |
372 | Leveraging Open-Source Large Language Models for Native Language Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Native Language Identification (NLI) – the task of identifying the native language (L1) of a person based on their writing in the second language (L2) – has applications in forensics, marketing, and second language acquisition. Historically, conventional machine learning approaches that heavily rely on extensive feature engineering have outperformed transformer-based language models on this task. |
Yee Man Ng; Ilia Markov; | arxiv-cs.CL | 2024-09-15 |
373 | GP-GPT: Large Language Model for Gene-Phenotype Mapping Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the complex traits and heterogeneity of multi-sources genomics data pose significant challenges when adapting these models to the bioinformatics and biomedical field. To address these challenges, we present GP-GPT, the first specialized large language model for genetic-phenotype knowledge representation and genomics relation analysis. |
YANJUN LYU et. al. | arxiv-cs.CL | 2024-09-15 |
374 | Detection Made Easy: Potentials of Large Language Models for Solidity Vulnerabilities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive investigation of the use of large language models (LLMs) and their capabilities in detecting OWASP Top Ten vulnerabilities in Solidity. |
Md Tauseef Alam; Raju Halder; Abyayananda Maiti; | arxiv-cs.CR | 2024-09-15 |
375 | Enhancing LLM Problem Solving with REAP: Reflection, Explicit Problem Deconstruction, and Advanced Prompting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) have transformed natural language processing, yet improving their problem-solving capabilities, particularly for complex, reasoning-intensive tasks, … |
Ryan Lingo; Martin Arroyo; Rajeev Chhajer; | ArXiv | 2024-09-14 |
376 | Evaluating Authenticity and Quality of Image Captions Via Sentiment and Semantic Analyses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes an evaluation method focused on sentiment and semantic richness. |
Aleksei Krotov; Alison Tebo; Dylan K. Picart; Aaron Dean Algave; | arxiv-cs.CV | 2024-09-14 |
377 | Autoregressive + Chain of Thought = Recurrent: Recurrence’s Role in Language Models’ Computability and A Revisit of Recurrent Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we thoroughly investigate the influence of recurrent structures in neural models on their reasoning abilities and computability, contrasting the role autoregression plays in the neural models’ computational power. |
Xiang Zhang; Muhammad Abdul-Mageed; Laks V. S. Lakshmanan; | arxiv-cs.CL | 2024-09-13 |
378 | Undergrads Are All You Have Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The paper also demonstrates that GPT-UGRD is cheaper and easier to train and operate than transformer models. In this paper, we outline the implementation, application, multi-tenanting, and social implications of using this new model in research and other contexts. |
Ashe Neth; | arxiv-cs.CY | 2024-09-13 |
379 | Guiding Vision-Language Model Selection for Visual Question-Answering Across Tasks, Domains, and Knowledge Types Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces a comprehensive framework for evaluating VLMs tailored to VQA tasks in practical settings. |
Neelabh Sinha; Vinija Jain; Aman Chadha; | arxiv-cs.CV | 2024-09-13 |
380 | Enhanced Online Grooming Detection Employing Context Determination and Message-Level Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper’s contributions include improved detection methodologies and the potential for application in various scenarios, addressing gaps in current literature and practices. |
Jake Street; Isibor Ihianle; Funminiyi Olajide; Ahmad Lotfi; | arxiv-cs.LG | 2024-09-12 |
381 | SDformer: Efficient End-to-End Transformer for Depth Completion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a different window-based Transformer architecture for depth completion tasks named Sparse-to-Dense Transformer (SDformer). |
JIAN QIAN et. al. | arxiv-cs.CV | 2024-09-12 |
382 | A Novel Mathematical Framework for Objective Characterization of Ideas Through Vector Embeddings in LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This method suffers from limitations such as human judgment errors, bias, and oversight. Addressing this gap, our study introduces a comprehensive mathematical framework for automated analysis to objectively evaluate the plethora of ideas generated by CAI systems and/or humans. |
B. Sankar; Dibakar Sen; | arxiv-cs.AI | 2024-09-11 |
383 | Towards Fairer Health Recommendations: Finding Informative Unbiased Samples Via Word Sense Disambiguation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, some of these terms, especially those related to race and ethnicity, can carry different meanings (e.g., white matter of spinal cord). To address this issue, we propose the use of Word Sense Disambiguation models to refine dataset quality by removing irrelevant sentences. |
GAVIN BUTTS et. al. | arxiv-cs.CL | 2024-09-11 |
384 | A Fine-grained Sentiment Analysis of App Reviews Using Large Language Models: An Evaluation Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Analyzing user reviews for sentiment towards app features can provide valuable insights into users’ perceptions of app functionality and their evolving needs. |
Faiz Ali Shah; Ahmed Sabir; Rajesh Sharma; | arxiv-cs.CL | 2024-09-11 |
385 | GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we address the challenges of using LLM-as-a-Judge when evaluating grounded answers generated by RAG systems. |
Sacha Muller; António Loison; Bilel Omrani; Gautier Viaud; | arxiv-cs.CL | 2024-09-10 |
386 | FairHome: A Fair Housing and Fair Lending Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a Fair Housing and Fair Lending dataset (FairHome): A dataset with around 75,000 examples across 9 protected categories. |
Anusha Bagalkotkar; Aveek Karmakar; Gabriel Arnson; Ondrej Linda; | arxiv-cs.LG | 2024-09-09 |
387 | Harmonic Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) are becoming very popular and are used for many different purposes, including creative tasks in the arts. |
Anna Kruspe; | arxiv-cs.CL | 2024-09-09 |
388 | Retrofitting Temporal Graph Neural Networks with Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose TF-TGN, which uses Transformer decoder as the backbone model for TGNN to enjoy Transformer’s codebase for efficient training. |
QIANG HUANG et. al. | arxiv-cs.LG | 2024-09-09 |
389 | NOVI : Chatbot System for University Novice with BERT and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To mitigate the difficulties of university freshmen in adapting to university life, we developed NOVI, a chatbot system based on GPT-4o. |
Yoonji Nam; TaeWoong Seo; Gyeongcheol Shin; Sangji Lee; JaeEun Im; | arxiv-cs.CL | 2024-09-09 |
390 | Can Large Language Models Unlock Novel Scientific Research Ideas? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study explores the capability of LLMs in generating novel research ideas based on information from research papers. |
Sandeep Kumar; Tirthankar Ghosal; Vinayak Goyal; Asif Ekbal; | arxiv-cs.CL | 2024-09-09 |
391 | Identifying The Sources of Ideological Bias in GPT Models Through Linguistic Variation in Output Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we provide an original approach to identifying ideological bias in generative models, showing that bias can stem from both the training data and the filtering algorithm. |
Christina Walker; Joan C. Timoneda; | arxiv-cs.CL | 2024-09-09 |
392 | Low Latency Transformer Inference on FPGAs for Physics Applications with Hls4ml Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents an efficient implementation of transformer architectures in Field-Programmable Gate Arrays(FPGAs) using hls4ml. |
ZHIXING JIANG et. al. | arxiv-cs.LG | 2024-09-08 |
393 | TracrBench: Generating Interpretability Testbeds with Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Achieving a mechanistic understanding of transformer-based language models is an open challenge, especially due to their large number of parameters. Moreover, the lack of ground … |
Hannes Thurnherr; Jérémy Scheurer; | ArXiv | 2024-09-07 |
394 | You Can Remove GPT2’s LayerNorm By Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we show that it is possible to remove the LN layers from a pre-trained GPT2-small model by fine-tuning on a fraction (500M tokens) of the training data. |
Stefan Heimersheim; | arxiv-cs.CL | 2024-09-06 |
395 | The Emergence of Large Language Models (LLM) As A Tool in Literature Reviews: An LLM Automated Systematic Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: This study aims to summarize the usage of Large Language Models (LLMs) in the process of creating a scientific review. |
Dmitry Scherbakov; Nina Hubig; Vinita Jansari; Alexander Bakumenko; Leslie A. Lenert; | arxiv-cs.DL | 2024-09-06 |
396 | Hermes: Memory-Efficient Pipeline Inference for Large Models on Edge Devices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on PIPELOAD mechanism, we present Hermes, a framework optimized for large model inference on edge devices. |
XUEYUAN HAN et. al. | arxiv-cs.DC | 2024-09-06 |
397 | Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: A popular new method in mechanistic interpretability is to train high-dimensional sparse autoencoders (SAEs) on neuron activations and use SAE features as the atomic units of … |
Maheep Chaudhary; Atticus Geiger; | ArXiv | 2024-09-05 |
398 | CA-BERT: Leveraging Context Awareness for Enhanced Multi-Turn Chat Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional models often struggle with determining when additional context is necessary for generating appropriate responses. This paper introduces Context-Aware BERT (CA-BERT), a transformer-based model specifically fine-tuned to address this challenge. |
Minghao Liu; Mingxiu Sui; Yi Nan; Cangqing Wang; Zhijie Zhou; | arxiv-cs.CL | 2024-09-05 |
399 | LLM-based Multi-agent Poetry Generation in Non-cooperative Environments Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Under the rationale that the learning process of the poetry generation systems should be more human-like and their output more diverse and novel, we introduce a framework based on social learning where we emphasize non-cooperative interactions besides cooperative interactions to encourage diversity. |
Ran Zhang; Steffen Eger; | arxiv-cs.CL | 2024-09-05 |
400 | CACER: Clinical Concept Annotations for Cancer Events and Relations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Clinical Concept Annotations for Cancer Events and Relations (CACER), a novel corpus with fine-grained annotations for over 48,000 medical problems and drug events and 10,000 drug-problem and problem-problem relations. |
YUJUAN FU et. al. | arxiv-cs.CL | 2024-09-05 |
401 | Detecting Calls to Action in Multimodal Content: Analysis of The 2021 German Federal Election Campaign on Instagram Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the automated classification of Calls to Action (CTAs) within the 2021 German Instagram election campaign to advance the understanding of mobilization in social media contexts. |
Michael Achmann-Denkler; Jakob Fehle; Mario Haim; Christian Wolff; | arxiv-cs.SI | 2024-09-04 |
402 | OpenFact at CheckThat! 2024: Combining Multiple Attack Methods for Effective Adversarial Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the experiments and results for the CheckThat! |
WŁODZIMIERZ LEWONIEWSKI et. al. | arxiv-cs.CL | 2024-09-04 |
403 | MobileUNETR: A Lightweight End-To-End Hybrid Vision Transformer For Efficient Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Finally, many Transformer-based approaches rely primarily on CNN-based decoders, overlooking the benefits of Transformer-based decoding models. Recognizing these limitations, we address the need for efficient lightweight solutions by introducing MobileUNETR, which aims to overcome the performance constraints associated with both CNNs and Transformers while minimizing model size, presenting a promising stride towards efficient image segmentation. |
Shehan Perera; Yunus Erzurumlu; Deepak Gulati; Alper Yilmaz; | arxiv-cs.CV | 2024-09-04 |
404 | Dialogue You Can Trust: Human and AI Perspectives on Generated Conversations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Utilizing the GPT-4o API, we generated a diverse dataset of conversations and conducted a two-part experimental analysis. |
Ike Ebubechukwu; Johane Takeuchi; Antonello Ceravola; Frank Joublin; | arxiv-cs.CL | 2024-09-03 |
405 | LifeGPT: Topology-Agnostic Generative Pretrained Transformer Model for Cellular Automata Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Modeling and predicting such intricate behavior without explicit knowledge of the system’s underlying topology presents a significant challenge, motivating the development of algorithms that can generalize across various grid configurations and boundary conditions. We develop a decoder-only generative pretrained transformer (GPT) model to solve this problem, showing that our model can simulate Life on a toroidal grid with no prior knowledge on the size of the grid, or its periodic boundary conditions (LifeGPT). |
Jaime A. Berkovich; Markus J. Buehler; | arxiv-cs.AI | 2024-09-03 |
406 | How Privacy-Savvy Are Large Language Models? A Case Study on Compliance and Privacy Technical Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper seeks to address this gap by providing a comprehensive case study evaluating LLMs’ performance in privacy-related tasks such as privacy information extraction (PIE), legal and regulatory key point detection (KPD), and question answering (QA) with respect to privacy policies and data protection regulations. We introduce a Privacy Technical Review (PTR) framework, highlighting its role in mitigating privacy risks during the software development life-cycle. |
XICHOU ZHU et. al. | arxiv-cs.CL | 2024-09-03 |
407 | Beyond ChatGPT: Enhancing Software Quality Assurance Tasks with Diverse LLMs and Validation Techniques Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: There remains a gap in understanding the performance of various LLMs in this critical domain. This paper aims to address this gap by conducting a comprehensive investigation into the capabilities of several LLMs across two SQA tasks: fault localization and vulnerability detection. |
Ratnadira Widyasari; David Lo; Lizi Liao; | arxiv-cs.SE | 2024-09-02 |
408 | Identifying Influential Nodes in Complex Networks Via Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
LEIYANG CHEN et. al. | Inf. Process. Manag. | 2024-09-01 |
409 | Deep Expertise and Interest Personalized Transformer for Expert Finding Related Papers Related Patents Related Grants Related Venues Related Experts View |
YINGHUI WANG et. al. | Inf. Process. Manag. | 2024-09-01 |
410 | Towards Faster Graph Partitioning Via Pre-training and Inductive Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Following the IEEE HPEC Graph Challenge and recent advances in pre-training techniques (e.g., large-language models), we propose PR-GPT (Pre-trained & Refined Graph ParTitioning) based on a novel pre-training & refinement paradigm. |
MENG QIN et. al. | arxiv-cs.LG | 2024-09-01 |
411 | ST2SI: Image Style Transfer Via Vision Transformer Using Spatial Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View |
Wenshu Li; Yinliang Chen; Xiaoying Guo; Xiaoyu He; | Comput. Graph. | 2024-09-01 |
412 | LightingFormer: Transformer-CNN Hybrid Network for Low-light Image Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View |
Cong Bi; Wenhua Qian; Jinde Cao; Xue Wang; | Comput. Graph. | 2024-09-01 |
413 | Research on LLM Acceleration Using The High-Performance RISC-V Processor Xiangshan (Nanhu Version) Based on The Open-Source Matrix Instruction Set Extension (Vector Dot Product) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main contributions of this paper are as follows: For the characteristics of large language models, custom instructions were extended based on the RISC-V instruction set to perform vector dot product calculations, accelerating the computation of large language models on dedicated vector dot product acceleration hardware. |
XU-HAO CHEN et. al. | arxiv-cs.AR | 2024-09-01 |
414 | An Empirical Study on Information Extraction Using Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our results suggest a visible performance gap between GPT-4 and state-of-the-art (SOTA) IE methods. To alleviate this problem, considering the LLMs’ human-like characteristics, we propose and analyze the effects of a series of simple prompt-based methods, which can be generalized to other LLMs and NLP tasks. |
RIDONG HAN et. al. | arxiv-cs.CL | 2024-08-31 |
415 | From Text to Emotion: Unveiling The Emotion Annotation Capabilities of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the potential of Large Language Models (LLMs), specifically GPT4, in automating or assisting emotion annotation. |
Minxue Niu; Mimansa Jaiswal; Emily Mower Provost; | arxiv-cs.CL | 2024-08-30 |
416 | Finding Frames with BERT: A Transformer-based Approach to Generic News Frame Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The availability of digital data offers new possibilities for studying how specific aspects of social reality are made more salient in online communication but also raises challenges related to the scaling of framing analysis and its adoption to new research areas (e.g. studying the impact of artificial intelligence-powered systems on representation of societally relevant issues). To address these challenges, we introduce a transformer-based approach for generic news frame detection in Anglophone online content. |
Vihang Jumle; Mykola Makhortykh; Maryna Sydorova; Victoria Vziatysheva; | arxiv-cs.CL | 2024-08-30 |
417 | Can Large Language Models Address Open-Target Stance Detection? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Open-Target Stance Detection (OTSD), the most realistic task where targets are neither seen during training nor provided as input. |
Abu Ubaida Akash; Ahmed Fahmy; Amine Trabelsi; | arxiv-cs.CL | 2024-08-30 |
418 | Retrieval-Augmented Natural Language Reasoning for Explainable Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a new VQA-NLE model, ReRe (Retrieval-augmented natural language Reasoning), which leverages retrieved information from memory to aid in generating accurate answers and persuasive explanations without relying on complex networks and extra datasets. |
Su Hyeon Lim; Minkuk Kim; Hyeon Bae Kim; Seong Tae Kim; | arxiv-cs.CV | 2024-08-30 |
419 | Improving Extraction of Clinical Event Contextual Properties from Electronic Health Records: A Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study performs a comparative analysis of various natural language models for medical text classification. |
SHUBHAM AGARWAL et. al. | arxiv-cs.CL | 2024-08-30 |
420 | Assessing Generative Language Models in Classification Tasks: Performance and Self-Evaluation Capabilities in The Environmental and Climate Change Domain Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Through this research, we aim to contribute to the ongoing discussion on the utility and effectiveness of generative LMs in addressing some of the planet’s most urgent issues, highlighting their strengths and limitations in the context of ecology and CC. |
Francesca Grasso; Stefano Locci; | arxiv-cs.CL | 2024-08-30 |
421 | ProGRes: Prompted Generative Rescoring on ASR N-Best Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a novel method that uses instruction-tuned LLMs to dynamically expand the n-best speech recognition hypotheses with new hypotheses generated through appropriately-prompted LLMs. |
Ada Defne Tur; Adel Moumen; Mirco Ravanelli; | arxiv-cs.CL | 2024-08-30 |
422 | Assessing Large Language Models for Online Extremism Research: Identification, Explanation, and New Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates the performance of Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-Trained Transformers (GPT) in detecting and classifying online domestic extremist posts. We collected social media posts containing far-right and far-left ideological keywords and manually labeled them as extremist or non-extremist. |
Beidi Dong; Jin R. Lee; Ziwei Zhu; Balassubramanian Srinivasan; | arxiv-cs.CL | 2024-08-29 |
423 | MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Following current trends in machine learning, we have created a foundation model for the MAPF problems called MAPF-GPT. |
Anton Andreychuk; Konstantin Yakovlev; Aleksandr Panov; Alexey Skrynnik; | arxiv-cs.MA | 2024-08-29 |
424 | Training Ultra Long Context Language Model with Fully Pipelined Distributed Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Fully Pipelined Distributed Transformer (FPDT) for efficiently training long-context LLMs with extreme hardware efficiency. |
JINGHAN YAO et. al. | arxiv-cs.DC | 2024-08-29 |
425 | Can AI Replace Human Subjects? A Large-Scale Replication of Psychological Experiments with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find that GPT-4 successfully replicates 76.0 percent of main effects and 47.0 percent of interaction effects observed in the original studies, closely mirroring human responses in both direction and significance. |
Ziyan Cui; Ning Li; Huaikang Zhou; | arxiv-cs.CL | 2024-08-29 |
426 | Unleashing The Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an Audio-Language-Referenced SAM 2 (AL-Ref-SAM 2) pipeline to explore the training-free paradigm for audio and language-referenced video object segmentation, namely AVS and RVOS tasks. |
SHAOFEI HUANG et. al. | arxiv-cs.CV | 2024-08-28 |
427 | FRACTURED-SORRY-Bench: Framework for Revealing Attacks in Conversational Turns Undermining Refusal Efficacy and Defenses Over SORRY-Bench (Automated Multi-shot Jailbreaks) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces FRACTURED-SORRY-Bench, a framework for evaluating the safety of Large Language Models (LLMs) against multi-turn conversational attacks. |
Aman Priyanshu; Supriti Vijay; | arxiv-cs.CL | 2024-08-28 |
428 | Speech Recognition Transformers: Topological-lingualism Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The paper presents a comprehensive survey of transformer techniques oriented in speech modality. |
Shruti Singh; Muskaan Singh; Virender Kadyan; | arxiv-cs.CL | 2024-08-27 |
429 | The Mamba in The Llama: Distilling and Accelerating Hybrid Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Linear RNN architectures, like Mamba, can be competitive with Transformer models in language modeling while having advantageous deployment characteristics. Given the focus on training large-scale Transformer models, we consider the challenge of converting these pretrained models for deployment. |
Junxiong Wang; Daniele Paliotta; Avner May; Alexander M. Rush; Tri Dao; | arxiv-cs.LG | 2024-08-27 |
430 | Non-instructional Fine-tuning: Enabling Instruction-Following Capabilities in Pre-trained Language Models Without Instruction-Following Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Instruction fine-tuning is crucial for today’s large language models (LLMs) to learn to follow instructions and align with human preferences. Conventionally, supervised data, … |
Juncheng Xie; Shensian Syu; Hung-yi Lee; | ArXiv | 2024-08-27 |
431 | A Review of Transformer-Based Models for Computer Vision Tasks: Capturing Global Context and Spatial Relationships Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this review paper, we provide an extensive overview of various transformer architectures adapted for computer vision tasks. |
Gracile Astlin Pereira; Muhammad Hussain; | arxiv-cs.CV | 2024-08-27 |
432 | Evaluating Large Language Models on Spatial Tasks: A Multi-Task Benchmarking Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluated multiple models, including OpenAI’s gpt-3.5-turbo, gpt-4o, and ZhipuAI’s glm-4, through a two-phase testing approach. |
LIUCHANG XU et. al. | arxiv-cs.CL | 2024-08-26 |
433 | Beyond Detection: Leveraging Large Language Models for Cyber Attack Prediction in IoT Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite considerable efforts in attack detection, intrusion detection systems remain mostly reactive, responding to specific patterns or observed anomalies. This work proposes a proactive approach to anticipate and mitigate malicious activities before they cause damage. |
Alaeddine Diaf; Abdelaziz Amara Korba; Nour Elislem Karabadji; Yacine Ghamri-Doudane; | arxiv-cs.CR | 2024-08-26 |
434 | One-layer Transformers Fail to Solve The Induction Heads Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A simple communication complexity argument proves that no one-layer transformer can solve the induction heads task unless its size is exponentially larger than the size sufficient … |
Clayton Sanford; Daniel Hsu; Matus Telgarsky; | arxiv-cs.LG | 2024-08-26 |
435 | LowCLIP: Adapting The CLIP Model Architecture for Low-Resource Languages in Multimodal Image Retrieval Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address challenges in vision-language retrieval for low-resource languages, we integrated the CLIP model architecture and employed several techniques to balance computational efficiency with performance. |
Ali Asgarov; Samir Rustamov; | arxiv-cs.CV | 2024-08-25 |
436 | Bidirectional Awareness Induction in Autoregressive Seq2Seq Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Bidirectional Awareness Induction (BAI), a training method that leverages a subset of elements in the network, the Pivots, to perform bidirectional learning without breaking the autoregressive constraints. |
Jia Cheng Hu; Roberto Cavicchioli; Alessandro Capotondi; | arxiv-cs.CL | 2024-08-25 |
437 | Vision-Language and Large Language Model Performance in Gastroenterology: GPT, Claude, Llama, Phi, Mistral, Gemma, and Quantized Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Methods: We used 300 gastroenterology board exam-style multiple-choice questions, 138 of which contain images to systematically assess the impact of model configurations and parameters and prompt engineering strategies utilizing GPT-3.5. |
SEYED AMIR AHMAD SAFAVI-NAINI et. al. | arxiv-cs.CL | 2024-08-25 |
438 | Utilizing Large Language Models for Named Entity Recognition in Traditional Chinese Medicine Against COVID-19 Literature: Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: To explore and compare the performance of ChatGPT and other state-of-the-art LLMs on domain-specific NER tasks covering different entity types and domains in TCM against COVID-19 literature. Methods: We established a dataset of 389 articles on TCM against COVID-19, and manually annotated 48 of them with 6 types of entities belonging to 3 domains as the ground truth, against which the NER performance of LLMs can be assessed. |
XU TONG et. al. | arxiv-cs.CL | 2024-08-24 |
439 | Preliminary Investigations of A Multi-Faceted Robust and Synergistic Approach in Semiconductor Electron Micrograph Analysis: Integrating Vision Transformers with Large Language and Multimodal Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces an innovative architecture that leverages the generative capabilities of zero-shot prompting in Large Language Models (LLMs) such as GPT-4(language only), the predictive ability of few-shot (in-context) learning in Large Multimodal Models (LMMs) such as GPT-4(V)ision, and fuses knowledge across image-based and linguistic insights for accurate nanomaterial category prediction. |
Sakhinana Sagar Srinivas; Geethan Sannidhi; Sreeja Gangasani; Chidaksh Ravuru; Venkataramana Runkana; | arxiv-cs.CV | 2024-08-24 |
440 | CNN-Transformer Rectified Collaborative Learning for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this paper proposes a CNN-Transformer rectified collaborative learning (CTRCL) framework to learn stronger CNN-based and Transformer-based models for MIS tasks via the bi-directional knowledge transfer between them. |
LANHU WU et. al. | arxiv-cs.CV | 2024-08-24 |
441 | Enhancing Multi-hop Reasoning Through Knowledge Erasure in Large Language Model Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building on the validated hypothesis, we propose a novel knowledge editing method that incorporates a Knowledge Erasure mechanism for Large language model Editing (KELE). |
MENGQI ZHANG et. al. | arxiv-cs.CL | 2024-08-22 |
442 | Enhancing Automated Program Repair with Solution Design Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This raises a compelling question: How can we leverage DR scattered across the issue logs to efficiently enhance APR? To investigate this premise, we introduce DRCodePilot, an approach designed to augment GPT-4-Turbo’s APR capabilities by incorporating DR into the prompt instruction. |
JIUANG ZHAO et. al. | arxiv-cs.SE | 2024-08-21 |
443 | The Self-Contained Negation Test Set Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we build on Gubelmann and Handschuh (2022), which studies the modification of PLMs’ predictions as a function of the polarity of inputs, in English. |
David Kletz; Pascal Amsili; Marie Candito; | arxiv-cs.CL | 2024-08-21 |
444 | Hypformer: Exploring Efficient Transformer Fully in Hyperbolic Space Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This stems primarily from: (i) the absence of well-defined modules in hyperbolic space, including linear transformation layers, LayerNorm layers, activation functions, dropout operations, etc.; (ii) the quadratic time complexity of the existing hyperbolic self-attention module w.r.t. the number of input tokens, which hinders its scalability. To address these challenges, we propose Hypformer, a novel hyperbolic Transformer based on the Lorentz model of hyperbolic geometry. |
MENGLIN YANG et. al. | kdd | 2024-08-21 |
445 | Clinical Context-aware Radiology Report Generation from Medical Images Using Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the use of the transformer model for radiology report generation from chest X-rays. |
Sonit Singh; | arxiv-cs.CL | 2024-08-21 |
446 | BURExtract-Llama: An LLM for Clinical Concept Extraction in Breast Ultrasound Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents a pipeline for developing an in-house LLM to extract clinical information from radiology reports. |
YUXUAN CHEN et. al. | arxiv-cs.CL | 2024-08-21 |
447 | Mixed Sparsity Training: Achieving 4$\times$ FLOP Reduction for Transformer Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) have made significant strides in complex tasks, yet their widespread adoption is impeded by substantial computational demands. |
Pihe Hu; Shaolong Li; Longbo Huang; | arxiv-cs.LG | 2024-08-21 |
448 | Language Models Can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates whether LLMs can generalize from labeled examples of predefined tasks to novel tasks. |
Anwoy Chatterjee; Eshaan Tanwar; Subhabrata Dutta; Tanmoy Chakraborty; | acl | 2024-08-20 |
449 | CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the absence of a comprehensive benchmark impedes progress in this field. To bridge this gap, we introduce CharacterEval, a Chinese benchmark for comprehensive RPCA assessment, complemented by a tailored high-quality dataset. |
QUAN TU et. al. | acl | 2024-08-20 |
450 | Mission: Impossible Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we develop a set of synthetic impossible languages of differing complexity, each designed by systematically altering English data with unnatural word orders and grammar rules. |
Julie Kallini; Isabel Papadimitriou; Richard Futrell; Kyle Mahowald; Christopher Potts; | acl | 2024-08-20 |
451 | Selene: Pioneering Automated Proof in Software Verification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Selene in this paper, which is the first project-level automated proof benchmark constructed based on the real-world industrial-level operating system microkernel, seL4. |
Lichen Zhang; Shuai Lu; Nan Duan; | acl | 2024-08-20 |
452 | Language Modeling on Tabular Data: A Survey of Foundations, Techniques and Evolution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Subsequently, methods leveraging pre-trained language models like BERT have been developed, which require less data and yield enhanced performance. |
YUCHENG RUAN et. al. | arxiv-cs.CL | 2024-08-20 |
453 | Automated Detection of Algorithm Debt in Deep Learning Frameworks: An Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Aim: Our goal is to improve AD detection performance of various ML/DL models. |
Emmanuel Iko-Ojo Simon; Chirath Hettiarachchi; Alex Potanin; Hanna Suominen; Fatemeh Fard; | arxiv-cs.SE | 2024-08-20 |
454 | Acquiring Clean Language Models from Backdoor Poisoned Datasets By Downscaling Frequency Space Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate the learning mechanisms of backdoor LMs in the frequency space by Fourier analysis. |
Zongru Wu; Zhuosheng Zhang; Pengzhou Cheng; Gongshen Liu; | acl | 2024-08-20 |
455 | MELA: Multilingual Evaluation of Linguistic Acceptability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present the largest benchmark to date on linguistic acceptability: Multilingual Evaluation of Linguistic Acceptability (MELA), with 46K samples covering 10 languages from a diverse set of language families. |
ZIYIN ZHANG et. al. | acl | 2024-08-20 |
456 | ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose ChiMed-GPT, a new benchmark LLM designed explicitly for Chinese medical domain, and undergoes a comprehensive training regime with pre-training, SFT, and RLHF. |
Yuanhe Tian; Ruyi Gan; Yan Song; Jiaxing Zhang; Yongdong Zhang; | acl | 2024-08-20 |
457 | Self-Evolving GPT: A Lifelong Autonomous Experiential Learner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they rely on manual efforts to acquire and apply such experience for each task, which is not feasible for the growing demand for LLMs and the variety of user questions. To address this issue, we design a lifelong autonomous experiential learning framework based on LLMs to explore whether LLMs can imitate human ability for learning and utilizing experience. |
JINGLONG GAO et. al. | acl | 2024-08-20 |
458 | GPT Is Not An Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores whether an LLM (specifically, GPT-3. |
Virginia Felkner; Jennifer Thompson; Jonathan May; | acl | 2024-08-20 |
459 | CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Incorrect initial angles between Q and K can cause misestimation in modeling rotary position embedding of the closest tokens. To address this issue, we propose Collinear Constrained Attention mechanism, namely CoCA. |
SHIYI ZHU et. al. | acl | 2024-08-20 |
460 | Dependency Transformer Grammars: Integrating Dependency Structures Into Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Syntactic Transformer language models aim to achieve better generalization through simultaneously modeling syntax trees and sentences. |
Yida Zhao; Chao Lou; Kewei Tu; | acl | 2024-08-20 |
461 | The MERSA Dataset and A Transformer-Based Approach for Speech Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the Multimodal Emotion Recognition and Sentiment Analysis (MERSA) dataset, which includes both natural and scripted speech recordings, transcribed text, physiological data, and self-reported emotional surveys from 150 participants collected over a two-week period. |
Enshi Zhang; Rafael Trujillo; Christian Poellabauer; | acl | 2024-08-20 |
462 | MultiLegalPile: A 689GB Multilingual Legal Corpus IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, so far, few datasets are available for specialized critical domains such as law and the available ones are often small and only in English. To fill this gap, we curate and release MultiLegalPile, a 689GB corpus in 24 languages from 17 jurisdictions. |
Joel Niklaus; Veton Matoshi; Matthias Stürmer; Ilias Chalkidis; Daniel Ho; | acl | 2024-08-20 |
463 | Your Transformer Is Secretly Linear Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper reveals a novel linear characteristic exclusive to transformer decoders, including models like GPT, LLaMA, OPT, BLOOM and others. |
ANTON RAZZHIGAEV et. al. | acl | 2024-08-20 |
464 | MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a novel **map**-guided **GPT**-based agent, dubbed **MapGPT**, which introduces an online linguistic-formed map to encourage the global exploration. |
JIAQI CHEN et. al. | acl | 2024-08-20 |
465 | Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs By Sampling with People Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by methods from cognitive science, we propose an iterative method for simultaneously eliciting conversational tones and sentences, where participants alternate between two tasks: (1) one participant identifies the tone of a given sentence and (2) a different participant generates a sentence based on that tone. |
Dun-Ming Huang; Pol Van Rijn; Ilia Sucholutsky; Raja Marjieh; Nori Jacoby; | acl | 2024-08-20 |
466 | Advancing Parameter Efficiency in Fine-tuning Via Representation Editing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the promising performance of current PEFT methods, they present challenges in hyperparameter selection, such as determining the rank of LoRA or Adapter, or specifying the length of soft prompts. In addressing these challenges, we propose a novel approach to fine-tuning neural models, termed Representation EDiting (RED), which scales and biases the representation produced at each layer. |
MULING WU et. al. | acl | 2024-08-20 |
467 | An Empirical Analysis on Large Language Models in Debate Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we investigate the capabilities and inherent biases of advanced large language models (LLMs) such as GPT-3. |
Xinyi Liu; Pinxin Liu; Hangfeng He; | acl | 2024-08-20 |
468 | Linear Transformers with Learnable Kernel Functions Are Better In-Context Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Mirroring the Transformer's in-context adeptness, it became a strong contender in the field. In our work, we present a singular, elegant alteration to the Based kernel that amplifies its In-Context Learning abilities evaluated with the Multi-Query Associative Recall task and overall language modeling process, as demonstrated on the Pile dataset. |
YAROSLAV AKSENOV et. al. | acl | 2024-08-20 |
469 | Crafting Tomorrow’s Headlines: Neural News Generation and Detection in English, Turkish, Hungarian, and Persian Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we take a significant step by introducing a benchmark dataset designed for neural news detection in four languages: English, Turkish, Hungarian, and Persian. |
CEM ÜYÜK et. al. | arxiv-cs.CL | 2024-08-20 |
470 | D2LLM: Decomposed and Distilled Large Language Models for Semantic Search Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present D2LLMs (Decomposed and Distilled LLMs for semantic search), which combines the best of both worlds. |
Zihan Liao; Hang Yu; Jianguo Li; Jun Wang; Wei Zhang; | acl | 2024-08-20 |
471 | Demystifying The Communication Characteristics for Distributed Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use GPT-based language models as a case study of the transformer architecture due to their ubiquity. |
QUENTIN ANTHONY et. al. | arxiv-cs.DC | 2024-08-19 |
472 | Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a method that is able to distill a pretrained Transformer architecture into alternative architectures such as state space models (SSMs). |
Aviv Bick; Kevin Y. Li; Eric P. Xing; J. Zico Kolter; Albert Gu; | arxiv-cs.LG | 2024-08-19 |
473 | Rhyme-aware Chinese Lyric Generator Based on GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enhance the rhyming quality of generated lyrics, we incorporate integrated rhyme information into our model, thereby improving lyric generation performance. |
YIXIAO YUAN et. al. | arxiv-cs.CL | 2024-08-19 |
474 | How Well Do Large Language Models Serve As End-to-End Secure Code Producers? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a systematic investigation into LLMs’ inherent potential to generate code with fewer vulnerabilities. |
JIANIAN GONG et. al. | arxiv-cs.SE | 2024-08-19 |
475 | GPT-based Textile Pilling Classification Using 3D Point Cloud Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on PointGPT, the GPT-like big model of point cloud analysis, we incorporate the global features of the input point cloud extracted from the non-parametric network into it, thus proposing the PointGPT+NN model. |
Yu Lu; YuYu Chen; Gang Zhou; Zhenghua Lan; | arxiv-cs.CV | 2024-08-19 |
476 | STransformer: A Modular Approach for Extracting Inter-Sequential and Temporal Information for Time-Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we review and categorize existing Transformer-based models into two main types: (1) modifications to the model structure and (2) modifications to the input data. |
JIAHENG YIN et. al. | arxiv-cs.LG | 2024-08-19 |
477 | Active Learning for Identifying Disaster-Related Tweets: A Comparison with Keyword Filtering and Generic Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We compare a keyword filtering approach, a RoBERTa model fine-tuned with generic data from CrisisLex, a base RoBERTa model trained with AL and a fine-tuned RoBERTa model trained with AL regarding classification performance. |
David Hanny; Sebastian Schmidt; Bernd Resch; | arxiv-cs.CL | 2024-08-19 |
478 | Attention Is A Smoothed Cubic Spline Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We highlight a perhaps important but hitherto unobserved insight: The attention module in a transformer is a smoothed cubic spline. |
Zehua Lai; Lek-Heng Lim; Yucong Liu; | arxiv-cs.AI | 2024-08-18 |
479 | A Strategy to Combine 1stGen Transformers and Open LLMs for Automatic Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares three 1stTRs (BERT, RoBERTa, and BART) with two open LLMs (Llama 2 and Bloom) across 11 sentiment analysis datasets. |
CLAUDIO M. V. DE ANDRADE et. al. | arxiv-cs.CL | 2024-08-18 |
480 | From Specifications to Prompts: On The Future of Generative LLMs in Requirements Engineering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative LLMs, such as GPT, have the potential to revolutionize Requirements Engineering (RE) by automating tasks in new ways. This column explores the novelties and introduces … |
Andreas Vogelsang; | arxiv-cs.SE | 2024-08-17 |
481 | See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Designing tasks and finding LLMs’ limitations are becoming increasingly important. In this paper, we investigate the question of whether an LLM can discover its own limitations from the errors it makes. |
YULONG CHEN et. al. | arxiv-cs.CL | 2024-08-16 |
482 | MAT-SED: A Masked Audio Transformer with Masked-Reconstruction Based Pre-training for Sound Event Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a pure Transformer-based SED model with masked-reconstruction based pre-training, termed MAT-SED. |
Pengfei Cai; Yan Song; Kang Li; Haoyu Song; Ian McLoughlin; | arxiv-cs.SD | 2024-08-16 |
483 | Extracting Sentence Embeddings from Pretrained Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Methods: Given the 110M-parameter BERT’s hidden representations from multiple layers and multiple tokens, we tried various ways to extract optimal sentence representations. |
Lukas Stankevičius; Mantas Lukoševičius; | arxiv-cs.CL | 2024-08-15 |
484 | Leveraging Web-Crawled Data for High-Quality Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We argue that although the web-crawled data often has formatting errors causing semantic inaccuracies, it can still serve as a valuable source for high-quality supervised fine-tuning in specific domains without relying on advanced models like GPT-4. |
Jing Zhou; Chenglin Jiang; Wei Shen; Xiao Zhou; Xiaonan He; | arxiv-cs.CL | 2024-08-15 |
485 | Transformers and Large Language Models for Efficient Intrusion Detection Systems: A Comprehensive Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This survey paper provides a comprehensive analysis of the utilization of Transformers and LLMs in cyber-threat detection systems. |
Hamza Kheddar; | arxiv-cs.CR | 2024-08-14 |
486 | CodeMirage: Hallucinations in Code Generated By Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) have shown promising potentials in program generation and no-code automation. However, LLMs are prone to generate hallucinations, i.e., they generate … |
Vibhor Agarwal; Yulong Pei; Salwa Alamir; Xiaomo Liu; | ArXiv | 2024-08-14 |
487 | MultiSurf-GPT: Facilitating Context-Aware Reasoning with Large-Scale Language Models for Multimodal Surface Sensing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose MultiSurf-GPT, which utilizes the advanced capabilities of GPT-4o to process and interpret diverse modalities (radar, microscope and multispectral data) uniformly based on prompting strategies (zero-shot and few-shot prompting). |
YONGQUAN HU et. al. | arxiv-cs.HC | 2024-08-14 |
488 | Exploring Transformer Models for Sentiment Classification: A Comparison of BERT, RoBERTa, ALBERT, DistilBERT, and XLNet Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transfer learning models have proven superior to classical machine learning approaches in various text classification tasks, such as sentiment analysis, question answering, news … |
Ali Areshey; H. Mathkour; | Expert Syst. J. Knowl. Eng. | 2024-08-14 |
489 | Generative AI for Automatic Topic Labelling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes to assess the reliability of three LLMs, namely flan, GPT-4o, and GPT-4 mini for topic labelling. |
Diego Kozlowski; Carolina Pradier; Pierre Benz; | arxiv-cs.CL | 2024-08-13 |
490 | Evaluating Cultural Adaptability of A Large Language Model Via Simulation of Synthetic Personas Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our analysis shows that specifying a person’s country of residence improves GPT-3.5’s alignment with their responses. |
Louis Kwok; Michal Bravansky; Lewis D. Griffin; | arxiv-cs.CL | 2024-08-13 |
491 | Sumotosima: A Framework and Dataset for Classifying and Summarizing Otoscopic Images Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel resource efficient deep learning and transformer based framework, Sumotosima (Summarizer for otoscopic images), an end-to-end pipeline for classification followed by summarization. |
Eram Anwarul Khan; Anas Anwarul Haq Khan; | arxiv-cs.CV | 2024-08-13 |
492 | MGH Radiology Llama: A Llama 3 70B Model for Radiology Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In recent years, the field of radiology has increasingly harnessed the power of artificial intelligence (AI) to enhance diagnostic accuracy, streamline workflows, and improve … |
YUCHENG SHI et. al. | ArXiv | 2024-08-13 |
493 | Pragmatic Inference of Scalar Implicature By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates how Large Language Models (LLMs), particularly BERT (Devlin et al., 2019) and GPT-2 (Radford et al., 2019), engage in pragmatic inference of scalar implicature, such as some. |
Ye-eun Cho; Seong mook Kim; | arxiv-cs.CL | 2024-08-13 |
494 | A RAG-Based Question-Answering Solution for Cyber-Attack Investigation and Attribution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the constantly evolving field of cybersecurity, it is imperative for analysts to stay abreast of the latest attack trends and pertinent information that aids in the investigation and attribution of cyber-attacks. In this work, we introduce the first question-answering (QA) model and its application that provides information to the cybersecurity experts about cyber-attacks investigations and attribution. |
Sampath Rajapaksha; Ruby Rani; Erisa Karafili; | arxiv-cs.CR | 2024-08-12 |
495 | Body Transformer: Leveraging Robot Embodiment for Policy Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite notable evidence of successful deployment of this architecture in the context of robot learning, we claim that vanilla transformers do not fully exploit the structure of the robot learning problem. Therefore, we propose Body Transformer (BoT), an architecture that leverages the robot embodiment by providing an inductive bias that guides the learning process. |
Carmelo Sferrazza; Dun-Ming Huang; Fangchen Liu; Jongmin Lee; Pieter Abbeel; | arxiv-cs.RO | 2024-08-12 |
496 | A Perspective on Large Language Models, Intelligent Machines, and Knowledge Acquisition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there is a huge gap between LLM’s and human capabilities for understanding abstract concepts and reasoning. We discuss these issues in a larger philosophical context of human knowledge acquisition and the Turing test. |
Vladimir Cherkassky; Eng Hock Lee; | arxiv-cs.CL | 2024-08-12 |
497 | Large Language Models for Secure Code Assessment: A Multi-Language Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we evaluate the effectiveness of LLMs in detecting and classifying Common Weakness Enumerations (CWE) using different prompt and role strategies. |
Kohei Dozono; Tiago Espinha Gasiba; Andrea Stocco; | arxiv-cs.SE | 2024-08-12 |
498 | The Language of Trauma: Modeling Traumatic Event Descriptions Across Domains with Explainable AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, studies traditionally focus on a single aspect of trauma, often neglecting the transferability of findings across different scenarios. We address this gap by training language models with progressing complexity on trauma-related datasets, including genocide-related court data, a Reddit dataset on post-traumatic stress disorder (PTSD), counseling conversations, and Incel forum posts. |
Miriam Schirmer; Tobias Leemann; Gjergji Kasneci; Jürgen Pfeffer; David Jurgens; | arxiv-cs.CL | 2024-08-12 |
499 | Spacetime $E(n)$-Transformer: Equivariant Attention for Spatio-temporal Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an $E(n)$-equivariant Transformer architecture for spatio-temporal graph data. |
Sergio G. Charles; | arxiv-cs.LG | 2024-08-12 |
500 | Is It A Work or Leisure Travel? Applying Text Classification to Identify Work-related Travel on Social Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a model to predict whether a trip is leisure or work-related, utilizing state-of-the-art Automatic Text Classification (ATC) models such as BERT, RoBERTa, and BART to enhance the understanding of user travel purposes and improve recommendation accuracy in specific travel scenarios. |
Lucas Félix; Washington Cunha; Jussara Almeida; | arxiv-cs.SI | 2024-08-12 |
501 | Cross-Lingual Conversational Speech Summarization with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We build a baseline cascade-based system using open-source speech recognition and machine translation models. |
Max Nelson; Shannon Wotherspoon; Francis Keith; William Hartmann; Matthew Snover; | arxiv-cs.CL | 2024-08-12 |
502 | GPT-4 Emulates Average-Human Emotional Cognition from A Third-Person Perspective Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper extends recent investigations on the emotional reasoning abilities of Large Language Models (LLMs). Current research on LLMs has not directly evaluated the distinction … |
Ala Nekouvaght Tak; Jonathan Gratch; | ArXiv | 2024-08-11 |
503 | Chain of Condition: Construct, Verify and Solve Conditions for Conditional Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing approaches struggle with CQA due to two challenges: (1) precisely identifying necessary conditions and the logical relationship, and (2) verifying conditions to detect any that are missing. In this paper, we propose a novel prompting approach, Chain of condition, by first identifying all conditions and constructing their logical relationships explicitly according to the document, then verifying whether these conditions are satisfied, finally solving the logical expression to indicate any missing conditions and generating the answer accordingly. |
Jiuheng Lin; Yuxuan Lai; Yansong Feng; | arxiv-cs.CL | 2024-08-10 |
504 | Evaluating The Capability of Large Language Models to Personalize Science Texts for Diverse Middle-school-age Learners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, GPT-4 was used to profile student learning preferences based on choices made during a training session. |
Michael Vaccaro Jr; Mikayla Friday; Arash Zaghi; | arxiv-cs.HC | 2024-08-09 |
505 | From Text to Insight: Leveraging Large Language Models for Performance Evaluation in Management Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through comparative analyses across two studies, including various task performance outputs, we demonstrate that LLMs can serve as a reliable and even superior alternative to human raters in evaluating knowledge-based performance outputs, which are a key contribution of knowledge workers. |
Ning Li; Huaikang Zhou; Mingze Xu; | arxiv-cs.CL | 2024-08-09 |
506 | Retrieval-augmented Code Completion for Local Projects Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on using LLMs with around 160 million parameters that are suitable for local execution and augmentation with retrieval from local projects. |
Marko Hostnik; Marko Robnik-Šikonja; | arxiv-cs.SE | 2024-08-09 |
507 | Transformer Explainer: Interactive Learning of Text-Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Transformer Explainer, an interactive visualization tool designed for non-experts to learn about Transformers through the GPT-2 model. |
AEREE CHO et. al. | arxiv-cs.LG | 2024-08-08 |
508 | Multi-Class Intrusion Detection Based on Transformer for IoT Networks Using CIC-IoT-2023 Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study uses deep learning methods to explore the Internet of Things (IoT) network intrusion detection method based on the CIC-IoT-2023 dataset. This dataset contains extensive … |
Shu-Ming Tseng; Yan-Qi Wang; Yung-Chung Wang; | Future Internet | 2024-08-08 |
509 | Enhancing Journalism with AI: A Study of Contextualized Image Captioning for News Articles Using LLMs and LMMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores how LLMs and LMMs can assist journalistic practice by generating contextualised captions for images accompanying news articles. |
Aliki Anagnostopoulou; Thiago Gouvea; Daniel Sonntag; | arxiv-cs.CL | 2024-08-08 |
510 | Towards Explainable Network Intrusion Detection Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current state-of-the-art NIDS rely on artificial benchmarking datasets, resulting in skewed performance when applied to real-world networking environments. Therefore, we compare the GPT-4 and LLama3 models against traditional architectures and transformer-based models to assess their ability to detect malicious NetFlows without depending on artificially skewed datasets, but solely on their vast pre-trained acquired knowledge. |
Paul R. B. Houssel; Priyanka Singh; Siamak Layeghy; Marius Portmann; | arxiv-cs.CR | 2024-08-08 |
511 | SocFedGPT: Federated GPT-based Adaptive Content Filtering System Leveraging User Interactions in Social Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Our study presents a multifaceted approach to enhancing user interaction and content relevance in social media platforms through a federated learning framework. We introduce … |
Sai Puppala; Ismail Hossain; Md Jahangir Alam; Sajedul Talukder; | ArXiv | 2024-08-07 |
512 | Is Child-Directed Speech Effective Training Data for Language Models? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: What are the features of the data they receive, and how do these features support language modeling objectives? To investigate this question, we train GPT-2 and RoBERTa models on 29M words of English child-directed speech and a new matched, synthetic dataset (TinyDialogues), comparing to OpenSubtitles, Wikipedia, and a heterogeneous blend of datasets from the BabyLM challenge. |
Steven Y. Feng; Noah D. Goodman; Michael C. Frank; | arxiv-cs.CL | 2024-08-07 |
513 | A Comparison of LLM Finetuning Methods & Evaluation Metrics with Travel Chatbot Use Case Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We used two pretrained LLMs utilized for fine-tuning research: LLaMa 2 7B, and Mistral 7B. |
Sonia Meyer; Shreya Singh; Bertha Tam; Christopher Ton; Angel Ren; | arxiv-cs.CL | 2024-08-07 |
514 | Evaluating Source Code Quality with Large Language Models: A Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper aims to describe and analyze the results obtained using LLMs as a static analysis tool, evaluating the overall quality of code. |
Igor Regis da Silva Simões; Elaine Venson; | arxiv-cs.SE | 2024-08-07 |
515 | Image-to-LaTeX Converter for Mathematical Formulas and Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this project, we train a vision encoder-decoder model to generate LaTeX code from images of mathematical formulas and text. |
Daniil Gurgurov; Aleksey Morshnev; | arxiv-cs.CL | 2024-08-07 |
516 | FLASH: Federated Learning-Based LLMs for Advanced Query Processing in Social Networks Through RAG Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Our paper introduces a novel approach to social network information retrieval and user engagement through a personalized chatbot system empowered by Federated Learning GPT. The … |
Sai Puppala; Ismail Hossain; Md Jahangir Alam; Sajedul Talukder; | ArXiv | 2024-08-06 |
517 | Training LLMs to Recognize Hedges in Spontaneous Narratives Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: After an error analysis on the top performing approaches, we used an LLM-in-the-Loop approach to improve the gold standard coding, as well as to highlight cases in which hedges are ambiguous in linguistically interesting ways that will guide future research. |
Amie J. Paige; Adil Soubki; John Murzaku; Owen Rambow; Susan E. Brennan; | arxiv-cs.CL | 2024-08-06 |
518 | HeTraX: Energy Efficient 3D Heterogeneous Manycore Architecture for Transformer Acceleration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose the design of a three-dimensional heterogeneous architecture referred to as HeTraX specifically optimized to accelerate end-to-end transformer models. |
Pratyush Dhingra; Janardhan Rao Doppa; Partha Pratim Pande; | arxiv-cs.AR | 2024-08-06 |
519 | Accuracy and Consistency of LLMs in The Registered Dietitian Exam: The Impact of Prompt Engineering and Knowledge Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose to employ the Registered Dietitian (RD) exam to conduct a standard and comprehensive evaluation of state-of-the-art LLMs, GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro, assessing both accuracy and consistency in nutrition queries. |
Iman Azimi; Mohan Qi; Li Wang; Amir M. Rahmani; Youlin Li; | arxiv-cs.CL | 2024-08-06 |
520 | Evaluating The Translation Performance of Large Language Models Based on Euas-20 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the significant progress in translation performance achieved by large language models, machine translation still faces many challenges. Therefore, in this paper, we construct the dataset Euas-20 to evaluate, for researchers and developers, the performance of large language models on translation tasks, their translation ability across different languages, and the effect of pre-training data on the translation ability of LLMs. |
Yan Huang; Wei Liu; | arxiv-cs.CL | 2024-08-06 |
521 | Context Conquers Parameters: Outperforming Proprietary LLM in Commit Message Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the use of proprietary LLMs like GPT-4 in coding tasks raises privacy and sustainability concerns, which may hinder their industrial adoption. Considering that open-source LLMs have achieved competitive performance in developer tasks such as compiler validation, this study investigates whether they can be used to generate commit messages that are comparable with OMG. |
Aaron Imani; Iftekhar Ahmed; Mohammad Moshirpour; | arxiv-cs.SE | 2024-08-05 |
522 | PTM4Tag+: Tag Recommendation of Stack Overflow Posts with Pre-trained Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the recent success of pre-trained models (PTMs) in natural language processing (NLP), we present PTM4Tag+, a tag recommendation framework for Stack Overflow posts that utilizes PTMs in language modeling. |
JUNDA HE et. al. | arxiv-cs.SE | 2024-08-05 |
523 | Evaluating The Performance of Large Language Models for SDG Mapping (Technical Report) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we compare the performance of various language models on the Sustainable Development Goal (SDG) mapping task, using the output of GPT-4o as the baseline. |
Hui Yin; Amir Aryani; Nakul Nambiar; | arxiv-cs.LG | 2024-08-04 |
524 | X-Light: Cross-City Traffic Signal Control Using Transformer on Transformer As Meta Multi-Agent Reinforcement Learner Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, a persisting issue remains: how to obtain a multi-agent traffic signal control algorithm with remarkable transferability across diverse cities? In this paper, we propose a Transformer on Transformer (TonT) model for cross-city meta multi-agent traffic signal control, named as X-Light: We input the full Markov Decision Process trajectories, and the Lower Transformer aggregates the states, actions, rewards among the target intersection and its neighbors within a city, and the Upper Transformer learns the general decision trajectories across different cities. |
HAOYUAN JIANG et. al. | ijcai | 2024-08-03 |
525 | QFormer: An Efficient Quaternion Transformer for Image Denoising Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Secondly, the DCNNs or Transformer-based image denoising models usually have a large number of parameters, high computational complexity, and slow inference speed. To resolve these issues, this paper proposes a highly-efficient Quaternion Transformer (QFormer) for image denoising. |
Bo Jiang; Yao Lu; Guangming Lu; Bob Zhang; | ijcai | 2024-08-03 |
526 | Class-consistent Contrastive Learning Driven Cross-dimensional Transformer for 3D Medical Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer emerges as an active research topic in medical image analysis. Yet, three substantial challenges limit the effectiveness of both 2D and 3D Transformers in 3D medical … |
Qikui Zhu; Chuan Fu; Shuo Li; | ijcai | 2024-08-03 |
527 | MiniCPM-V: A GPT-4V Level MLLM on Your Phone IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present MiniCPM-V, a series of efficient MLLMs deployable on end-side devices. |
YUAN YAO et. al. | arxiv-cs.CV | 2024-08-03 |
528 | AVESFormer: Efficient Transformer Design for Real-Time Audio-Visual Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce AVESFormer, the first real-time Audio-Visual Efficient Segmentation transformer that achieves fast, efficient and light-weight simultaneously. |
ZILI WANG et. al. | arxiv-cs.CV | 2024-08-03 |
529 | Cross-Problem Learning for Solving Vehicle Routing Problems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes the cross-problem learning to assist heuristics training for different downstream VRP variants. |
ZHUOYI LIN et. al. | ijcai | 2024-08-03 |
530 | TCR-GPT: Integrating Autoregressive Model and Reinforcement Learning for T-Cell Receptor Repertoires Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce TCR-GPT, a probabilistic model built on a decoder-only transformer architecture, designed to uncover and replicate sequence patterns in TCR repertoires. |
Yicheng Lin; Dandan Zhang; Yun Liu; | arxiv-cs.LG | 2024-08-02 |
531 | Reconsidering Degeneration of Token Embeddings with Definitions for Encoder-based Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: On the basis of this analysis, we propose DefinitionEMB, a method that utilizes definitions to re-construct isotropically distributed and semantics-related token embeddings for encoder-based PLMs while maintaining original robustness during fine-tuning. |
Ying Zhang; Dongyuan Li; Manabu Okumura; | arxiv-cs.CL | 2024-08-02 |
532 | Efficacy of Large Language Models in Systematic Reviews Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the effectiveness of Large Language Models (LLMs) in interpreting existing literature through a systematic review of the relationship between Environmental, Social, and Governance (ESG) factors and financial performance. |
Aaditya Shah; Shridhar Mehendale; Siddha Kanthi; | arxiv-cs.CL | 2024-08-02 |
533 | Toward Automatic Relevance Judgment Using Vision–Language Models for Image–Text Retrieval Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Vision–Language Models (VLMs) have demonstrated success across diverse applications, yet their potential to assist in relevance judgments remains uncertain. |
Jheng-Hong Yang; Jimmy Lin; | arxiv-cs.IR | 2024-08-02 |
534 | High-Throughput Phenotyping of Clinical Text Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct a performance comparison of GPT-4 and GPT-3.5-Turbo. |
Daniel B. Hier; S. Ilyas Munzir; Anne Stahlfeld; Tayo Obafemi-Ajayi; Michael D. Carrithers; | arxiv-cs.CL | 2024-08-02 |
535 | Advancing Mental Health Pre-Screening: A New Custom GPT for Psychological Distress Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces ‘Psycho Analyst’, a custom GPT model based on OpenAI’s GPT-4, optimized for pre-screening mental health disorders. |
Jinwen Tang; Yi Shang; | arxiv-cs.CY | 2024-08-02 |
536 | Toward Automatic Relevance Judgment Using Vision-Language Models for Image-Text Retrieval Evaluation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Vision–Language Models (VLMs) have demonstrated success across diverse applications, yet their potential to assist in relevance judgments remains uncertain. This paper assesses … |
Jheng-Hong Yang; Jimmy Lin; | ArXiv | 2024-08-02 |
537 | Graph Transformer for 3D Point Clouds Classification and Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Wei Zhou; Qian Wang; Weiwei Jin; X. Shi; Ying He; | Comput. Graph. | 2024-08-01 |
538 | MtArtGPT: A Multi-Task Art Generation System With Pre-Trained Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Instruction tuning large language models are making rapid advances in the field of artificial intelligence where GPT-4 models have exhibited impressive multi-modal perception … |
CONG JIN et. al. | IEEE Transactions on Circuits and Systems for Video … | 2024-08-01 |
539 | Unmasking Large Language Models By Means of OpenAI GPT-4 and Google AI: A Deep Instruction-based Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View |
IDREES A. ZAHID et. al. | Intell. Syst. Appl. | 2024-08-01 |
540 | Bilateral Transformer 3D Planar Recovery Related Papers Related Patents Related Grants Related Venues Related Experts View |
Fei Ren; Chunhua Liao; Zhina Xie; | Graph. Model. | 2024-08-01 |
541 | LCFormer: Linear Complexity Transformer for Efficient Image Super-resolution Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xiang Gao; Sining Wu; Ying Zhou; Fan Wang; Xiaopeng Hu; | Multim. Syst. | 2024-08-01 |
542 | CATNet: Cascaded Attention Transformer Network for Marine Species Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Weidong Zhang; Gongchao Chen; Peixian Zhuang; Wenyi Zhao; Ling Zhou; | Expert Syst. Appl. | 2024-08-01 |
543 | Leveraging Large Language Models (LLMs) for Traffic Management at Urban Intersections: The Case of Mixed Traffic Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the ability of a Large Language Model (LLM), specifically, GPT-4o-mini to improve traffic management at urban intersections. |
Sari Masri; Huthaifa I. Ashqar; Mohammed Elhenawy; | arxiv-cs.CL | 2024-08-01 |
544 | MAE-EEG-Transformer: A Transformer-based Approach Combining Masked Autoencoder and Cross-individual Data Augmentation Pre-training for EEG Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Miao Cai; Yu Zeng; | Biomed. Signal Process. Control. | 2024-08-01 |
545 | Granting GPT-4 License and Opportunity: Enhancing Accuracy and Confidence Estimation for Few-Shot Event Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The present effort explores methods for effective confidence estimation with GPT-4 with few-shot learning for event detection in the BETTER ontology as a vehicle. |
Steven Fincke; Adrien Bibal; Elizabeth Boschee; | arxiv-cs.AI | 2024-08-01 |
546 | TR-TransGAN: Temporal Recurrent Transformer Generative Adversarial Network for Longitudinal MRI Dataset Expansion Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Longitudinal magnetic resonance imaging (MRI) datasets have important implications for the study of degenerative diseases because such datasets have data from multiple points in … |
CHEN-CHEN FAN et. al. | IEEE Transactions on Cognitive and Developmental Systems | 2024-08-01 |
547 | Performance of Recent Large Language Models for A Low-Resourced Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) have shown significant advances in the past year. |
Ravindu Jayakody; Gihan Dias; | arxiv-cs.CL | 2024-07-31 |
548 | OmniParser for Pure Vision Based GUI Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we argue that the power of multimodal models like GPT-4V as a general agent on multiple operating systems across different applications is largely underestimated due to the lack of a robust screen parsing technique capable of: 1) reliably identifying interactable icons within the user interface, and 2) understanding the semantics of various elements in a screenshot and accurately associating the intended action with the corresponding region on the screen. To fill these gaps, we introduce \textsc{OmniParser}, a comprehensive method for parsing user interface screenshots into structured elements, which significantly enhances the ability of GPT-4V to generate actions that can be accurately grounded in the corresponding regions of the interface. |
Yadong Lu; Jianwei Yang; Yelong Shen; Ahmed Awadallah; | arxiv-cs.CV | 2024-07-31 |
549 | The Llama 3 Herd of Models IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a new set of foundation models, called Llama 3. |
ABHIMANYU DUBEY et. al. | arxiv-cs.AI | 2024-07-31 |
550 | Generative Expressive Conversational Speech Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In addition, due to the limitations of small-scale datasets containing scripted recording styles, they often fail to simulate real natural conversational styles. To address the above issues, we propose a novel generative expressive CSS system, termed GPT-Talker. |
Rui Liu; Yifan Hu; Yi Ren; Xiang Yin; Haizhou Li; | arxiv-cs.CL | 2024-07-31 |
551 | Automated Software Vulnerability Static Code Analysis Using Generative Pre-Trained Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Ultimately, we find that the GPT models that we evaluated are not suitable for fully automated vulnerability scanning because the false positive and false negative rates are too high to likely be useful in practice. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CR | 2024-07-31 |
552 | Enhancing Agricultural Machinery Management Through Advanced LLM Integration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel approach that leverages large language models (LLMs), particularly GPT-4, combined with multi-round prompt engineering to enhance decision-making processes in agricultural machinery management. |
Emily Johnson; Noah Wilson; | arxiv-cs.CL | 2024-07-30 |
553 | Robust Load Prediction of Power Network Clusters Based on Cloud-Model-Improved Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Presenting an innovative approach, the Cloud Model Improved Transformer (CMIT) method integrates the Transformer model with the cloud model utilizing the particle swarm optimization algorithm, with the aim of achieving robust and precise power load predictions. |
Cheng Jiang; Gang Lu; Xue Ma; Di Wu; | arxiv-cs.LG | 2024-07-30 |
554 | Interpretable Pre-Trained Transformers for Heart Time-Series Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we employ this framework to the analysis of clinical heart time-series data, to create two pre-trained general purpose cardiac models, termed PPG-PT and ECG-PT. |
Harry J. Davies; James Monsen; Danilo P. Mandic; | arxiv-cs.LG | 2024-07-30 |
555 | Comparison of Large Language Models for Generating Contextually Relevant Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The contribution of this research is the analysis of the capacity of LLMs for Automatic Question Generation in education. |
IVO LODOVICO MOLINA et. al. | arxiv-cs.CL | 2024-07-30 |
556 | AutoScale: Automatic Prediction of Compute-optimal Data Composition for Training LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We derive an analytical model for the dependence of optimal weights on data scale and introduce *AutoScale*, a novel, practical approach for optimizing data compositions at potentially large training data scales. |
FEIYANG KANG et. al. | arxiv-cs.LG | 2024-07-29 |
557 | Sentiment Analysis of Lithuanian Online Reviews Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address sentiment analysis of Lithuanian five-star-based online reviews from multiple domains that we collect and clean. |
Brigita Vileikytė; Mantas Lukoševičius; Lukas Stankevičius; | arxiv-cs.CL | 2024-07-29 |
558 | DuA: Dual Attentive Transformer in Long-Term Continuous EEG Emotion Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these methods encounter significant challenges in real-life scenarios where emotional states evolve over extended periods. To address this issue, we propose a Dual Attentive (DuA) transformer framework for long-term continuous EEG emotion analysis. |
YUE PAN et. al. | arxiv-cs.HC | 2024-07-29 |
559 | Legal Minds, Algorithmic Decisions: How LLMs Apply Constitutional Principles in Complex Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conduct an empirical analysis of how large language models (LLMs), specifically GPT-4, interpret constitutional principles in complex decision-making scenarios. |
Camilla Bignotti; Carolina Camassa; | arxiv-cs.CL | 2024-07-29 |
560 | Motamot: A Dataset for Revealing The Supremacy of Large Language Models Over Transformer Models in Bengali Political Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we investigate political sentiment analysis during Bangladeshi elections, specifically examining how effectively Pre-trained Language Models (PLMs) and Large Language Models (LLMs) capture complex sentiment characteristics. |
FATEMA TUJ JOHORA FARIA et. al. | arxiv-cs.CL | 2024-07-28 |
561 | The Impact of LoRA Adapters for LLMs on Clinical NLP Classification Under Data Limitations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning Large Language Models (LLMs) for clinical Natural Language Processing (NLP) poses significant challenges due to the domain gap and limited data availability. |
Thanh-Dung Le; Ti Ti Nguyen; Vu Nguyen Ha; | arxiv-cs.CL | 2024-07-27 |
562 | FarSSiBERT: A Novel Transformer-based Model for Semantic Similarity Measurement of Persian Social Networks Informal Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a new transformer-based model to measure semantic similarity between Persian informal short texts from social networks. |
Seyed Mojtaba Sadjadi; Zeinab Rajabi; Leila Rabiei; Mohammad-Shahram Moin; | arxiv-cs.CL | 2024-07-27 |
563 | QT-TDM: Planning With Transformer Dynamics Model and Autoregressive Q-Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the success of the Transformer architecture in natural language processing and computer vision, we investigate the use of Transformers in Reinforcement Learning (RL), specifically in modeling the environment’s dynamics using Transformer Dynamics Models (TDMs). |
Mostafa Kotb; Cornelius Weber; Muhammad Burhan Hafez; Stefan Wermter; | arxiv-cs.LG | 2024-07-26 |
564 | GPT Deciphering Fedspeak: Quantifying Dissent Among Hawks and Doves Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We use GPT-4 to quantify dissent among members on the topic of inflation. |
DENIS PESKOFF et. al. | arxiv-cs.AI | 2024-07-26 |
565 | Automatic Detection of Moral Values in Music Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moral values play a fundamental role in how we evaluate information, make decisions, and form judgements around important social issues. |
Vjosa Preniqi; Iacopo Ghinassi; Julia Ive; Kyriaki Kalimeri; Charalampos Saitis; | arxiv-cs.CY | 2024-07-26 |
566 | Is Larger Always Better? Evaluating and Prompting Large Language Models for Non-generative Medical Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study benchmarks various models, including GPT-based LLMs, BERT-based models, and traditional clinical predictive models, for non-generative medical tasks utilizing renowned datasets. |
YINGHAO ZHU et. al. | arxiv-cs.CL | 2024-07-26 |
567 | Using GPT-4 to Guide Causal Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we are interested in the ability of LLMs to identify causal relationships. |
Anthony C. Constantinou; Neville K. Kitson; Alessio Zanga; | arxiv-cs.AI | 2024-07-26 |
568 | Gene Regulatory Network Inference from Pre-trained Single-Cell Transcriptomics Transformer with Joint Graph Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel joint graph learning approach that combines the rich contextual representations learned by pre-trained single-cell language models with the structured knowledge encoded in GRNs using graph neural networks (GNNs). |
Sindhura Kommu; Yizhi Wang; Yue Wang; Xuan Wang; | arxiv-cs.LG | 2024-07-25 |
569 | HDL-GPT: High-Quality HDL Is All You Need Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents Hardware Description Language Generative Pre-trained Transformers (HDL-GPT), a novel approach that leverages the vast repository of open-source Hardware Description Language (HDL) codes to train superior quality large code models. |
BHUVNESH KUMAR et. al. | arxiv-cs.LG | 2024-07-25 |
570 | The Power of Combining Data and Knowledge: GPT-4o Is An Effective Interpreter of Machine Learning Models in Predicting Lymph Node Metastasis of Lung Cancer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel ensemble method that combines the medical knowledge acquired by LLMs with the latent patterns identified by machine learning models to enhance LNM prediction performance. |
Danqing Hu; Bing Liu; Xiaofeng Zhu; Nan Wu; | arxiv-cs.CL | 2024-07-25 |
571 | Testing Large Language Models on Driving Theory Knowledge and Skills for Connected Autonomous Vehicles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate a new approach of applying remote or edge LLMs to support autonomous driving. |
Zuoyin Tang; Jianhua He; Dashuai Pei; Kezhong Liu; Tao Gao; | arxiv-cs.AI | 2024-07-24 |
572 | Cost-effective Instruction Learning for Pathology Vision and Language Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here we propose a cost-effective instruction learning framework for conversational pathology named CLOVER. |
KAITAO CHEN et. al. | arxiv-cs.AI | 2024-07-24 |
573 | Bailicai: A Domain-Optimized Retrieval-Augmented Generation Framework for Medical Applications Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) have exhibited remarkable proficiency in natural language understanding, prompting extensive exploration of their potential applications across … |
Cui Long; Yongbin Liu; Chunping Ouyang; Ying Yu; | ArXiv | 2024-07-24 |
574 | SDoH-GPT: Using Large Language Models to Extract Social Determinants of Health (SDoH) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study we introduced SDoH-GPT, a simple and effective few-shot Large Language Model (LLM) method leveraging contrastive examples and concise instructions to extract SDoH without relying on extensive medical annotations or costly human intervention. |
BERNARDO CONSOLI et. al. | arxiv-cs.CL | 2024-07-24 |
575 | My Ontologist: Evaluating BFO-Based AI for Definition Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through iterative development of a specialized GPT model named My Ontologist, we aimed to generate BFO-conformant ontologies. |
Carter Benson; Alec Sculley; Austin Liebers; John Beverley; | arxiv-cs.DB | 2024-07-24 |
576 | Artificial Intelligence in Extracting Diagnostic Data from Dental Records Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research addresses the issue of missing structured data in dental records by extracting diagnostic information from unstructured text. |
YAO-SHUN CHUANG et. al. | arxiv-cs.CL | 2024-07-23 |
577 | OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While open-source LLMs offer solutions to these concerns, they typically underperform commercial models in RTL code generation tasks, primarily due to the scarcity of high-quality open-source RTL datasets. To address this challenge, we introduce OriGen, a fully open-source framework that incorporates self-reflection capabilities and a novel dataset augmentation methodology for generating high-quality, large-scale RTL code. |
FAN CUI et. al. | arxiv-cs.AR | 2024-07-23 |
578 | Can Large Language Models Automatically Jailbreak GPT-4V? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our study, we introduce AutoJailbreak, an innovative automatic jailbreak technique inspired by prompt optimization. |
YUANWEI WU et. al. | arxiv-cs.CL | 2024-07-23 |
579 | RadioRAG: Factual Large Language Models for Enhanced Diagnostics in Radiology Using Dynamic Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) have advanced the field of artificial intelligence (AI) in medicine. |
SOROOSH TAYEBI ARASTEH et. al. | arxiv-cs.CL | 2024-07-22 |
580 | Inverted Activations: Reducing Memory Footprint in Neural Network Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a modification to the handling of activation tensors in pointwise nonlinearity layers. |
Georgii Novikov; Ivan Oseledets; | arxiv-cs.LG | 2024-07-22 |
581 | Dissecting Multiplication in Transformers: Insights Into LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on observation and analysis, we infer that the reason for transformers' deficiencies in multiplication tasks lies in their difficulty in calculating successive carryovers and caching intermediate results, and we confirmed this inference through experiments. Guided by these findings, we propose improvements to enhance transformer performance on multiplication tasks. |
Luyu Qiu; Jianing Li; Chi Su; Chen Jason Zhang; Lei Chen; | arxiv-cs.CL | 2024-07-22 |
582 | Can GPT-4 Learn to Analyse Moves in Research Article Abstracts? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we employ the affordances of GPT-4 to automate the annotation process by using natural language prompts. |
Danni Yu; Marina Bondi; Ken Hyland; | arxiv-cs.CL | 2024-07-22 |
583 | KWT-Tiny: RISC-V Accelerated, Embedded Keyword Spotting Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the adaptation of Transformer-based models for edge devices through the quantisation and hardware acceleration of the ARM Keyword Transformer (KWT) model on a RISC-V platform. |
Aness Al-Qawlaq; Ajay Kumar M; Deepu John; | arxiv-cs.AR | 2024-07-22 |
584 | Efficient Visual Transformer By Learnable Token Merging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel and compact transformer block, Transformer with Learnable Token Merging (LTM), or LTM-Transformer. |
Yancheng Wang; Yingzhen Yang; | arxiv-cs.CV | 2024-07-21 |
585 | Unipa-GPT: Large Language Models for University-oriented QA in Italian Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our experiments we adopted both the Retrieval Augmented Generation (RAG) approach and fine-tuning to develop the system. |
Irene Siragusa; Roberto Pirrone; | arxiv-cs.CL | 2024-07-19 |
586 | LLMs Left, Right, and Center: Assessing GPT’s Capabilities to Label Political Bias from Web Domains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given the subjective nature of political labels, third-party bias ratings like those from Ad Fontes Media, AllSides, and Media Bias/Fact Check (MBFC) are often used in research to analyze news source diversity. This study aims to determine if GPT-4 can replicate these human ratings on a seven-degree scale (far-left to far-right). |
Raphael Hernandes; Giulio Corsi; | arxiv-cs.CL | 2024-07-19 |
587 | Can Open-Source LLMs Compete with Commercial Models? Exploring The Few-Shot Performance of Current GPT Models in Biomedical Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We participated in the 12th BioASQ challenge, which is a retrieval augmented generation (RAG) setting, and explored the performance of current GPT models Claude 3 Opus, GPT-3.5-turbo and Mixtral 8x7b with in-context learning (zero-shot, few-shot) and QLoRa fine-tuning. |
Samy Ateia; Udo Kruschwitz; | arxiv-cs.CL | 2024-07-18 |
588 | Adaptive Foundation Models for Online Decisions: HyperAgent with Fast Incremental Uncertainty Estimation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce GPT-HyperAgent, an augmentation of GPT with HyperAgent for uncertainty-aware, scalable exploration in contextual bandits, a fundamental online decision problem involving natural language input. |
Yingru Li; Jiawei Xu; Zhi-Quan Luo; | arxiv-cs.LG | 2024-07-18 |
589 | Evaluating Large Language Models for Anxiety and Depression Classification Using Counseling and Psychotherapy Transcripts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We aim to evaluate the efficacy of traditional machine learning and large language models (LLMs) in classifying anxiety and depression from long conversational transcripts. |
Junwei Sun; Siqi Ma; Yiran Fan; Peter Washington; | arxiv-cs.CL | 2024-07-18 |
590 | Sharif-STR at SemEval-2024 Task 1: Transformer As A Regression Model for Fine-Grained Scoring of Textual Semantic Relations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we delve into the investigation of sentence-level STR within Track A (Supervised) by leveraging fine-tuning techniques on the RoBERTa transformer. |
SEYEDEH FATEMEH EBRAHIMI et. al. | arxiv-cs.CL | 2024-07-17 |
591 | LLMs-in-the-loop Part-1: Expert Small AI Models for Bio-Medical Text Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel LLMs-in-the-loop approach to develop supervised neural machine translation models optimized specifically for medical texts. |
Bunyamin Keles; Murat Gunay; Serdar I. Caglar; | arxiv-cs.CL | 2024-07-16 |
592 | Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we explore the effectiveness of fine-tuning a RoBERTa-base transformer, a powerful neural architecture, to address MGT detection as a binary classification task. |
SEYEDEH FATEMEH EBRAHIMI et. al. | arxiv-cs.CL | 2024-07-16 |
593 | Educational Personalized Learning Path Planning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite its potential, traditional PLPP systems often lack adaptability, interactivity, and transparency. This paper proposes a novel approach integrating Large Language Models (LLMs) with prompt engineering to address these challenges. |
Chee Ng; Yuen Fung; | arxiv-cs.CL | 2024-07-16 |
594 | Does Refusal Training in LLMs Generalize to The Past Tense? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We systematically evaluate this method on Llama-3 8B, Claude-3.5 Sonnet, GPT-3.5 Turbo, Gemma-2 9B, Phi-3-Mini, GPT-4o mini, GPT-4o, o1-mini, o1-preview, and R2D2 models using GPT-3.5 Turbo as a reformulation model. |
Maksym Andriushchenko; Nicolas Flammarion; | arxiv-cs.CL | 2024-07-16 |
595 | ECoh: Turn-level Coherence Evaluation for Multilingual Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by the need for lightweight, open source, and multilingual dialogue evaluators, this paper introduces GenResCoh (Generated Responses targeting Coherence). |
John Mendonça; Isabel Trancoso; Alon Lavie; | arxiv-cs.CL | 2024-07-16 |
596 | Review-Feedback-Reason (ReFeR): A Novel Framework for NLG Evaluation and Reasoning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Assessing the quality of Natural Language Generation (NLG) outputs, such as those produced by large language models (LLMs), poses significant challenges. Traditional approaches … |
Yaswanth Narsupalli; Abhranil Chandra; Sreevatsa Muppirala; Manish Gupta; Pawan Goyal; | ArXiv | 2024-07-16 |
597 | A Transformer-based Approach for Augmenting Software Engineering Chatbots Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, previous studies show that creating a high-quality training dataset for software engineering chatbots is expensive in terms of both resources and time. Therefore, in this paper, we present an automated transformer-based approach to augment software engineering chatbot datasets. |
Ahmad Abdellatif; Khaled Badran; Diego Elias Costa; Emad Shihab; | arxiv-cs.SE | 2024-07-16 |
598 | GPT Assisted Annotation of Rhetorical and Linguistic Features for Interpretable Propaganda Technique Detection in News Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our contribution is a set of features, their properties, definitions, and examples in a machine-readable format, along with the code for RhetAnn and the GPT prompts and fine-tuning procedures for advancing state-of-the-art interpretable propaganda technique detection. |
Kyle Hamilton; Luca Longo; Bojan Bozic; | arxiv-cs.CL | 2024-07-16 |
599 | Rethinking Transformer-based Multi-document Summarization: An Empirical Investigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To thoroughly examine the behaviours of Transformer-based MDS models, this paper presents five empirical studies on (1) measuring the impact of document boundary separators quantitatively; (2) exploring the effectiveness of different mainstream Transformer structures; (3) examining the sensitivity of the encoder and decoder; (4) discussing different training strategies; and (5) investigating repetition in summary generation. |
Congbo Ma; Wei Emma Zhang; Dileepa Pitawela; Haojie Zhuang; Yanfeng Shu; | arxiv-cs.CL | 2024-07-16 |
600 | GPT-4V Cannot Generate Radiology Reports Yet Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we perform a systematic evaluation of GPT-4V in generating radiology reports on two chest X-ray report datasets: MIMIC-CXR and IU X-Ray. |
Yuyang Jiang; Chacha Chen; Dang Nguyen; Benjamin M. Mervak; Chenhao Tan; | arxiv-cs.CY | 2024-07-16 |
601 | R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, rigorous insights are provided into the influence of jamming LLM word embeddings in SFL by deriving an expression for the ML training loss divergence and showing that it is upper-bounded by the mean squared error (MSE). |
ALADIN DJUHERA et. al. | arxiv-cs.LG | 2024-07-16 |
602 | Scientific QA System with Verifiable Answers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce the VerifAI project, a pioneering open-source scientific question-answering system, designed to provide answers that are not only referenced but also automatically vetted and verifiable. |
ADELA LJAJIĆ et. al. | arxiv-cs.CL | 2024-07-16 |
603 | Large Language Models As Misleading Assistants in Conversation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate the ability of LLMs to be deceptive in the context of providing assistance on a reading comprehension task, using LLMs as proxies for human users. |
BETTY LI HOU et. al. | arxiv-cs.CL | 2024-07-16 |
604 | Leveraging LLM-Respondents for Item Evaluation: A Psychometric Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, item calibration is time-consuming and costly, requiring a sufficient number of respondents for the response process. We explore using six different LLMs (GPT-3.5, GPT-4, Llama 2, Llama 3, Gemini-Pro, and Cohere Command R Plus) and various combinations of them using sampling methods to produce responses with psychometric properties similar to human answers. |
Yunting Liu; Shreya Bhandari; Zachary A. Pardos; | arxiv-cs.CY | 2024-07-15 |
605 | GPT Sonography: Hand Gesture Decoding from Forearm Ultrasound Images Via VLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that GPT-4o can decode hand gestures from forearm ultrasound data even with no fine-tuning, and improves with few-shot, in-context learning. |
Keshav Bimbraw; Ye Wang; Jing Liu; Toshiaki Koike-Akino; | arxiv-cs.CV | 2024-07-15 |
606 | Beyond Binary: Multiclass Paraphasia Detection with Generative Pretrained Transformers and End-to-End Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present novel approaches that use a generative pretrained transformer (GPT) to identify paraphasias from transcripts as well as two end-to-end approaches that focus on modeling both automatic speech recognition (ASR) and paraphasia classification as multiple sequences vs. a single sequence. |
Matthew Perez; Aneesha Sampath; Minxue Niu; Emily Mower Provost; | arxiv-cs.CL | 2024-07-15 |
607 | Transformer-based Drum-level Prediction in A Boiler Plant with Delayed Relations Among Multivariates Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Leveraging the capabilities of Transformer architectures, this study aims to develop an accurate and robust predictive framework to anticipate water level fluctuations and facilitate proactive control strategies. |
Gang Su; Sun Yang; Zhishuai Li; | arxiv-cs.LG | 2024-07-15 |
608 | Hierarchical Local Temporal Feature Enhancing for Transformer-Based 3D Human Pose Estimation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent advancements in transformer-based methods have yielded substantial success in 2D-to-3D human pose estimation. Transformer-based estimators have their inherent advantages … |
Xin Yan; Chi-Man Pun; Haolun Li; Mengqi Liu; Hao Gao; | 2024 IEEE International Conference on Multimedia and Expo … | 2024-07-15 |
609 | DistillSeq: A Framework for Safety Alignment Testing in Large Language Models Using Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Subsequently, we deploy two distinct strategies for generating malicious queries: one based on a syntax tree approach, and the other leveraging an LLM-based method. |
Mingke Yang; Yuqi Chen; Yi Liu; Ling Shi; | arxiv-cs.SE | 2024-07-14 |
610 | Beyond Accuracy: Investigating Error Types in GPT-4 Responses to USMLE Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, current works use GPT-4 to only predict the correct option without providing any explanation and thus do not provide any insight into the thinking process and reasoning used by GPT-4 or other LLMs. Therefore, we introduce a new domain-specific error taxonomy derived from collaboration with medical students. |
SOUMYADEEP ROY et. al. | sigir | 2024-07-14 |
611 | Drop Your Decoder: Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The usage of additional Transformer-based decoders also incurs significant computational costs. In this study, we aim to shed light on this issue by revealing that masked auto-encoder (MAE) pre-training with enhanced decoding significantly improves the term coverage of input tokens in dense representations, compared to vanilla BERT checkpoints. |
Guangyuan Ma; Xing Wu; Zijia Lin; Songlin Hu; | sigir | 2024-07-14 |
612 | Enhancing Emotion Prediction in News Headlines: Insights from ChatGPT and Seq2Seq Models for Free-Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The free-text explanations also contain more sentimental context than the news headlines alone and can serve as a better input to emotion classification models. Therefore, in this work we explored generating emotion explanations from headlines by training a sequence-to-sequence transformer model and by using a pretrained large language model, ChatGPT (GPT-4). |
GE GAO et. al. | arxiv-cs.CL | 2024-07-14 |
613 | TransOptAS: Transformer-Based Algorithm Selection for Single-Objective Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View |
Gjorgjina Cenikj; G. Petelin; T. Eftimov; | GECCO Companion | 2024-07-14 |
614 | Reflections on The Coding Ability of LLMs for Analyzing Market Research Surveys Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present the first systematic study of applying large language models (in our case, GPT-3.5 and GPT-4) for the automatic coding (multi-class classification) problem in market research. |
Shi Zong; Santosh Kolagati; Amit Chaudhary; Josh Seltzer; Jimmy Lin; | sigir | 2024-07-14 |
615 | CodeV: Empowering LLMs for Verilog Generation Through Multi-Level Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: (2) LLMs like GPT-3.5 excel in summarizing Verilog code rather than generating it. Based on these observations, this paper introduces CodeV, a series of open-source instruction-tuned Verilog generation LLMs. |
YANG ZHAO et. al. | arxiv-cs.PL | 2024-07-14 |
616 | Legal Statute Identification: A Case Study Using State-of-the-Art Datasets and Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we reproduce several LSI models on two popular LSI datasets and study the effect of the above-mentioned challenges. |
Shounak Paul; Rajas Bhatt; Pawan Goyal; Saptarshi Ghosh; | sigir | 2024-07-14 |
617 | Surprising Efficacy of Fine-Tuned Transformers for Fact-Checking Over Larger Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the challenges associated with establishing an end-to-end fact-checking pipeline in a real-world context, covering over 90 languages. |
Vinay Setty; | sigir | 2024-07-14 |
618 | Generalizable Tip-of-the-Tongue Retrieval with LLM Re-ranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper studies the generalization capabilities of existing retrieval methods with ToT queries in multiple domains. |
Luís Borges; Rohan Jha; Jamie Callan; Bruno Martins; | sigir | 2024-07-14 |
619 | Causality Extraction from Medical Text Using Large Language Models (LLMs) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the potential of natural language models, including large language models, to extract causal relations from medical texts, specifically from Clinical Practice Guidelines (CPGs). |
Seethalakshmi Gopalakrishnan; Luciana Garbayo; Wlodek Zadrozny; | arxiv-cs.CL | 2024-07-13 |
620 | Document-level Clinical Entity and Relation Extraction Via Knowledge Base-Guided Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Generative pre-trained transformer (GPT) models have shown promise in clinical entity and relation extraction tasks because of their precise extraction and contextual understanding capability. |
Kriti Bhattarai; Inez Y. Oh; Zachary B. Abrams; Albert M. Lai; | arxiv-cs.CL | 2024-07-13 |
621 | Robustness of LLMs to Perturbations in Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To that end, we artificially introduce varying levels of noise into a diverse set of datasets and systematically evaluate LLMs’ robustness against the corrupt variations of the original text. |
Ayush Singh; Navpreet Singh; Shubham Vatsal; | arxiv-cs.CL | 2024-07-12 |
622 | ASTPrompter: Weakly Supervised Automated Language Model Red-Teaming to Identify Likely Toxic Prompts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we propose a reinforcement learning formulation of the LLM red-teaming task that allows us to discover prompts that both (1) trigger toxic outputs from a frozen defender and (2) have low perplexity as scored by that defender. |
Amelia F. Hardy; Houjun Liu; Bernard Lange; Mykel J. Kochenderfer; | arxiv-cs.CL | 2024-07-12 |
623 | A Survey on Symbolic Knowledge Distillation of Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This survey paper delves into the emerging and critical area of symbolic knowledge distillation in Large Language Models (LLMs). As LLMs like Generative Pre-trained Transformer-3 … |
Kamal Acharya; Alvaro Velasquez; H. Song; | ArXiv | 2024-07-12 |
624 | EVOLVE: Predicting User Evolution and Network Dynamics in Social Media Using Fine-Tuned GPT-like Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we propose a predictive method to understand how a user evolves on social media throughout their life and to forecast the next stage of their evolution. |
Ismail Hossain; Md Jahangir Alam; Sai Puppala; Sajedul Talukder; | arxiv-cs.SI | 2024-07-12 |
625 | The Two Sides of The Coin: Hallucination Generation and Detection with LLMs As Evaluators for LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents our participation in the CLEF ELOQUENT HalluciGen shared task, where the goal is to develop evaluators for both generating and detecting hallucinated content. |
ANH THU MARIA BUI et. al. | arxiv-cs.AI | 2024-07-12 |
626 | Movie Recommendation with Poster Attention Via Multi-modal Transformer Feature Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a multi-modal movie recommendation system that extracts features from each movie's well-designed poster and its narrative text description. |
Linhan Xia; Yicheng Yang; Ziou Chen; Zheng Yang; Shengxin Zhu; | arxiv-cs.IR | 2024-07-12 |
627 | On Exact Bit-level Reversible Transformers Without Changing Architectures Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we present the BDIA-transformer, which is an exact bit-level reversible transformer that uses an unchanged standard architecture for inference. |
Guoqiang Zhang; J. P. Lewis; W. B. Kleijn; | arxiv-cs.LG | 2024-07-12 |
628 | Show, Don’t Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To evaluate the models’ ability to generalize beyond their training data, we introduce two additional games. |
Gonçalo Hora de Carvalho; Oscar Knap; Robert Pollice; | arxiv-cs.AI | 2024-07-12 |
629 | Detect Llama — Finding Vulnerabilities in Smart Contracts Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we test the hypothesis that although OpenAI’s GPT-4 performs well generally, we can fine-tune open-source models to outperform GPT-4 in smart contract vulnerability detection. |
Peter Ince; Xiapu Luo; Jiangshan Yu; Joseph K. Liu; Xiaoning Du; | arxiv-cs.CR | 2024-07-11 |
630 | GPT-4 Is Judged More Human Than Humans in Displaced and Inverted Turing Tests Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We found that both AI and displaced human judges were less accurate than interactive interrogators, with below chance accuracy overall. |
Ishika Rathi; Sydney Taylor; Benjamin K. Bergen; Cameron R. Jones; | arxiv-cs.HC | 2024-07-11 |
631 | LLMs’ Morphological Analyses of Complex FST-generated Finnish Words Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work aims to shed light on the issue by evaluating state-of-the-art LLMs in a task of morphological analysis of complex Finnish noun forms. |
Anssi Moisio; Mathias Creutz; Mikko Kurimo; | arxiv-cs.CL | 2024-07-11 |
632 | Teaching Transformers Causal Reasoning Through Axiomatic Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since interventional data is costly to generate, we study to what extent an agent can learn causal reasoning from passive data. |
Aniket Vashishtha; Abhinav Kumar; Abbavaram Gowtham Reddy; Vineeth N Balasubramanian; Amit Sharma; | arxiv-cs.LG | 2024-07-10 |
633 | FsPONER: Few-shot Prompt Optimization for Named Entity Recognition in Domain-specific Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, none of the previous approaches has investigated the efficiency of LLM-based few-shot learning in domain-specific scenarios. To address this gap, we introduce FsPONER, a novel approach for optimizing few-shot prompts, and evaluate its performance on domain-specific NER datasets, with a focus on industrial manufacturing and maintenance, while using multiple LLMs — GPT-4-32K, GPT-3.5-Turbo, LLaMA 2-chat, and Vicuna. |
Yongjian Tang; Rakebul Hasan; Thomas Runkler; | arxiv-cs.CL | 2024-07-10 |
634 | Transformer Neural Networks with Spatiotemporal Attention for Predictive Control and Optimization of Industrial Processes Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the context of real-time optimization and model predictive control of industrial systems, machine learning, and neural networks represent cutting-edge tools that hold promise … |
Ethan R. Gallup; Jacob F. Tuttle; Jake Immonen; Blake W. Billings; Kody M. Powell; | 2024 American Control Conference (ACC) | 2024-07-10 |
635 | ROSA: Random Subspace Adaptation for Efficient Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we propose Random Subspace Adaptation (ROSA), a method that outperforms previous PEFT methods by a significant margin, while maintaining a zero latency overhead during inference time. |
Marawan Gamal Abdel Hameed; Aristides Milios; Siva Reddy; Guillaume Rabusseau; | arxiv-cs.LG | 2024-07-10 |
636 | A Comparison of Vulnerability Feature Extraction Methods from Textual Attack Patterns Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we examine five feature extraction methods (TF-IDF, LSI, BERT, MiniLM, RoBERTa) and find that Term Frequency-Inverse Document Frequency (TF-IDF) outperforms the other four methods with a precision of 75% and an F1 score of 64%. |
Refat Othman; Bruno Rossi; Russo Barbara; | arxiv-cs.CR | 2024-07-09 |
637 | PEER: Expertizing Domain-Specific Tasks with A Multi-Agent Framework and Tuning Methods Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: High performance requires sophisticated processing techniques, yet managing multiple agents within a complex workflow often proves costly and challenging. To address this, we introduce the PEER (Plan, Execute, Express, Review) multi-agent framework. |
YIYING WANG et. al. | arxiv-cs.AI | 2024-07-09 |
638 | Mixture-of-Modules: Reinventing Transformers As Dynamic Assemblies of Modules Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that MoM provides not only a unified framework for Transformers and their numerous variants but also a flexible and learnable approach for reducing redundancy in Transformer parameterization. |
ZHUOCHENG GONG et. al. | arxiv-cs.CL | 2024-07-09 |
639 | Prompting Techniques for Secure Code Generation: A Systematic Investigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: OBJECTIVE: In this study, we investigate the impact of different prompting techniques on the security of code generated from NL instructions by LLMs. |
Catherine Tony; Nicolás E. Díaz Ferreyra; Markus Mutas; Salem Dhiff; Riccardo Scandariato; | arxiv-cs.SE | 2024-07-09 |
640 | Multilingual Blending: LLM Safety Alignment Evaluation with Language Mixture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce Multilingual Blending, a mixed-language query-response scheme designed to evaluate the safety alignment of various state-of-the-art LLMs (e.g., GPT-4o, GPT-3.5, Llama3) under sophisticated, multilingual conditions. |
Jiayang Song; Yuheng Huang; Zhehua Zhou; Lei Ma; | arxiv-cs.CL | 2024-07-09 |
641 | Using Large Language Models for Generating Smart Contracts for Health Insurance from Textual Policies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To assess the LLM output, we propose completeness, soundness, clarity, syntax, and functioning code as metrics. |
Inwon Kang; William Van Woensel; Oshani Seneviratne; | arxiv-cs.CL | 2024-07-09 |
642 | Short Answer Scoring with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View |
Lan Jiang; Nigel Bosch; | ACM Conference on Learning @ Scale | 2024-07-09 |
643 | Cross-domain Few-shot In-context Learning for Enhancing Traffic Sign Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a cross-domain few-shot in-context learning method based on the MLLM for enhancing traffic sign recognition (TSR). |
YAOZONG GAN et. al. | arxiv-cs.CV | 2024-07-08 |
644 | Intent Aware Data Augmentation By Leveraging Generative AI for Stress Detection in Social Media Texts Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Stress is a major issue in modern society. Researchers focus on identifying stress in individuals, linking language with mental health, and often utilizing social media posts. … |
Minhah Saleem; Jihie Kim; | PeerJ Comput. Sci. | 2024-07-08 |
645 | On The Power of Convolution Augmented Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, recent architectural recipes, such as state-space models, have bridged the performance gap. Motivated by this, we examine the benefits of Convolution-Augmented Transformer (CAT) for recall, copying, and length generalization tasks. |
Mingchen Li; Xuechen Zhang; Yixiao Huang; Samet Oymak; | arxiv-cs.LG | 2024-07-08 |
646 | Surprising Gender Biases in GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present seven experiments exploring gender biases in GPT. |
Raluca Alexandra Fulgu; Valerio Capraro; | arxiv-cs.CY | 2024-07-08 |
647 | Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work introduces the Multimodal Diffusion Transformer (MDT), a novel diffusion policy framework, that excels at learning versatile behavior from multimodal goal specifications with few language annotations. |
Moritz Reuss; Ömer Erdinç Yağmurlu; Fabian Wenzel; Rudolf Lioutikov; | arxiv-cs.RO | 2024-07-08 |
648 | Enhancing Computer Programming Education with LLMs: A Study on Effective Prompt Engineering for Python Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study underscores the crucial role of prompt engineering in maximizing the educational benefits of LLMs. By systematically categorizing and testing these strategies, we provide a comprehensive framework for both educators and students to optimize LLM-based learning experiences. |
Tianyu Wang; Nianjun Zhou; Zhixiong Chen; | arxiv-cs.AI | 2024-07-07 |
649 | A Novel Automated Urban Building Analysis Framework Based on GPT and SAM Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Rapid urban development necessitates advanced methodologies for efficiently acquiring and analyzing detailed building information. This study proposes an automated framework, … |
Yuchao Sun; Xianping Ma; Yizhen Yan; Man-On Pun; Bo Huang; | IGARSS 2024 – 2024 IEEE International Geoscience and Remote … | 2024-07-07 |
650 | Image-Conditional Diffusion Transformer for Underwater Image Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by the recent advance in generative models, we propose a novel UIE method based on image-conditional diffusion transformer (ICDT). |
XINGYANG NIE et. al. | arxiv-cs.CV | 2024-07-07 |
651 | MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The rapid advancement of Large Language Models (LLMs) and Large Multimodal Models (LMMs) has heightened the demand for AI-based scientific assistants capable of understanding … |
ZEKUN LI et. al. | ArXiv | 2024-07-06 |
652 | Associative Recurrent Memory Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper addresses the challenge of creating a neural architecture for very long sequences that requires constant time for processing new information at each time step. |
Ivan Rodkin; Yuri Kuratov; Aydar Bulatov; Mikhail Burtsev; | arxiv-cs.CL | 2024-07-05 |
653 | Are Large Language Models Strategic Decision Makers? A Study of Performance and Bias in Two-Player Non-Zero-Sum Games Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) have been increasingly used in real-world settings, yet their strategic decision-making abilities remain largely unexplored. |
Nathan Herr; Fernando Acero; Roberta Raileanu; María Pérez-Ortiz; Zhibin Li; | arxiv-cs.AI | 2024-07-05 |
654 | Using LLMs to Label Medical Papers According to The CIViC Evidence Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the sequence classification problem CIViC Evidence to the field of medical NLP. |
Markus Hisch; Xing David Wang; | arxiv-cs.CL | 2024-07-05 |
655 | Generalists Vs. Specialists: Evaluating Large Language Models for Urdu Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we compare general-purpose models, GPT-4-Turbo and Llama-3-8b, with special-purpose models–XLM-Roberta-large, mT5-large, and Llama-3-8b–that have been fine-tuned on specific tasks. |
Samee Arif; Abdul Hameed Azeemi; Agha Ali Raza; Awais Athar; | arxiv-cs.CL | 2024-07-05 |
656 | Enhancing Multi-Agent Communication Collaboration Through GPT-Based Semantic Information Extraction and Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xinfeng Deng; Li Zhou; Dezun Dong; Jibo Wei; | ACM Turing Award Celebration Conference 2024 | 2024-07-05 |
657 | Adaptive Step-size Perception Unfolding Network with Non-local Hybrid Attention for Hyperspectral Image Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To overcome the aforementioned drawbacks, we propose an adaptive step-size perception unfolding network (ASPUN), a deep unfolding network based on the FISTA algorithm, which uses an adaptive step-size perception module to estimate the update step-size of each spectral channel. |
Yanan Yang; Like Xin; | arxiv-cs.CV | 2024-07-04 |
658 | HYBRINFOX at CheckThat! 2024 – Task 2: Enriching BERT Models with The Expert System VAGO for Subjectivity Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents the HYBRINFOX method used to solve Task 2 of Subjectivity detection of the CLEF 2024 CheckThat! competition. The specificity of the method is to use a hybrid … |
MORGANE CASANOVA et. al. | Conference and Labs of the Evaluation Forum | 2024-07-04 |
659 | HYBRINFOX at CheckThat! 2024 — Task 2: Enriching BERT Models with The Expert System VAGO for Subjectivity Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the HYBRINFOX method used to solve Task 2 of Subjectivity detection of the CLEF 2024 CheckThat! |
MORGANE CASANOVA et. al. | arxiv-cs.CL | 2024-07-04 |
660 | From Data to Commonsense Reasoning: The Use of Large Language Models for Explainable AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the effectiveness of large language models (LLMs) on different QA tasks with a focus on their abilities in reasoning and explainability. |
Stefanie Krause; Frieder Stolzenburg; | arxiv-cs.AI | 2024-07-04 |
661 | GPT-4 Vs. Human Translators: A Comprehensive Evaluation of Translation Quality Across Languages, Domains, and Expertise Levels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study comprehensively evaluates the translation quality of Large Language Models (LLMs), specifically GPT-4, against human translators of varying expertise levels across multiple language pairs and domains. |
JIANHAO YAN et. al. | arxiv-cs.CL | 2024-07-04 |
662 | TrackPGD: A White-box Attack Using Binary Masks Against Robust Transformer Trackers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We are proposing a novel white-box attack named TrackPGD, which relies on the predicted object binary mask to attack the robust transformer trackers. |
Fatemeh Nourilenjan Nokabadi; Yann Batiste Pequignot; Jean-Francois Lalonde; Christian Gagné; | arxiv-cs.CV | 2024-07-04 |
663 | Towards Automating Text Annotation: A Case Study on Semantic Proximity Annotation Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main aim of this paper is to reuse human annotation guidelines along with some annotated data to design automatic prompts for LLMs, focusing on the semantic proximity annotation task. |
Sachin Yadav; Tejaswi Choppa; Dominik Schlechtweg; | arxiv-cs.CL | 2024-07-04 |
664 | InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present InternLM-XComposer-2.5 (IXC-2.5), a versatile large-vision language model that supports long-contextual input and output. |
PAN ZHANG et. al. | arxiv-cs.CV | 2024-07-03 |
665 | CATT: Character-based Arabic Tashkeel Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces a new approach to training ATD models. |
Faris Alasmary; Orjuwan Zaafarani; Ahmad Ghannam; | arxiv-cs.CL | 2024-07-03 |
666 | Mast Kalandar at SemEval-2024 Task 8: On The Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: SemEval 2024 introduces the task of Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, aiming to develop automated systems for identifying machine-generated text and detecting potential misuse. In this paper, we i) propose a RoBERTa-BiLSTM based classifier designed to classify text into two categories: AI-generated or human ii) conduct a comparative study of our model with baseline approaches to evaluate its effectiveness. |
Jainit Sushil Bafna; Hardik Mittal; Suyash Sethia; Manish Shrivastava; Radhika Mamidi; | arxiv-cs.CL | 2024-07-03 |
667 | Large Language Models As Evaluators for Scientific Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study explores how well the state-of-the-art Large Language Models (LLMs), like GPT-4 and Mistral, can assess the quality of scientific summaries or, more fittingly, scientific syntheses, comparing their evaluations to those of human annotators. |
Julia Evans; Jennifer D’Souza; Sören Auer; | arxiv-cs.CL | 2024-07-03 |
668 | Assessing The Code Clone Detection Capability of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to assess the performance of two advanced Large Language Models (LLMs), GPT-3.5 and GPT-4, in the task of code clone detection. |
Zixian Zhang; Takfarinas Saber; | arxiv-cs.SE | 2024-07-02 |
669 | RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel instruction fine-tuning framework RankRAG, which instruction-tunes a single LLM for the dual purpose of context ranking and answer generation in RAG. |
YUE YU et. al. | arxiv-cs.CL | 2024-07-02 |
670 | Hypformer: Exploring Efficient Hyperbolic Transformer Fully in Hyperbolic Space Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This stems primarily from: (i) the absence of well-defined modules in hyperbolic space, including linear transformation layers, LayerNorm layers, activation functions, dropout operations, etc. (ii) the quadratic time complexity of the existing hyperbolic self-attention module w.r.t the number of input tokens, which hinders its scalability. To address these challenges, we propose, Hypformer, a novel hyperbolic Transformer based on the Lorentz model of hyperbolic geometry. |
MENGLIN YANG et. al. | arxiv-cs.LG | 2024-07-01 |
671 | FATFusion: A Functional-anatomical Transformer for Medical Image Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View |
Wei Tang; Fazhi He; | Inf. Process. Manag. | 2024-07-01 |
672 | Prompting GPT-4 to Support Automatic Safety Case Generation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Mithila Sivakumar; A. B. Belle; Jinjun Shan; K. K. Shahandashti; | Expert Syst. Appl. | 2024-07-01 |
673 | Beyond Numeric Awards: In-Context Dueling Bandits with LLM Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, LLMs struggle to converge even when explicitly prompted to do so and are sensitive to prompt variations. To overcome these issues, we introduce a hybrid algorithm: LLM-Enhanced Adaptive Dueling (LEAD), which takes advantage of both in-context decision-making capabilities of LLMs and theoretical guarantees inherited from classic DB algorithms. |
Fanzeng Xia; Hao Liu; Yisong Yue; Tongxin Li; | arxiv-cs.LG | 2024-07-01 |
674 | Image-to-Text Logic Jailbreak: Your Imagination Can Help You Do Anything Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With the integration of visual and text inputs in VLMs, new security issues emerge, as malicious attackers can exploit multiple modalities to achieve their objectives. |
Xiaotian Zou; Ke Li; Yongkang Chen; | arxiv-cs.CR | 2024-07-01 |
675 | Transformer Autoencoder for K-means Efficient Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Wenhao Wu; Weiwei Wang; Xixi Jia; Xiangchu Feng; | Eng. Appl. Artif. Intell. | 2024-07-01 |
676 | TextCheater: A Query-Efficient Textual Adversarial Attack in The Hard-Label Setting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Designing a query-efficient attack strategy to generate high-quality adversarial examples under the hard-label black-box setting is a fundamental yet challenging problem, … |
HAO PENG et. al. | IEEE Transactions on Dependable and Secure Computing | 2024-07-01 |
677 | MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents MMLongBench-Doc, a long-context, multi-modal benchmark comprising 1,062 expert-annotated questions. |
YUBO MA et. al. | arxiv-cs.CV | 2024-07-01 |
678 | Raptor-T: A Fused and Memory-Efficient Sparse Transformer for Long and Variable-Length Sequences Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer-based models have made significant advancements across various domains, largely due to the self-attention mechanism’s ability to capture contextual relationships in … |
HULIN WANG et. al. | IEEE Transactions on Computers | 2024-07-01 |
679 | Dynamic Region-aware Transformer Backbone Network for Visual Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jun Wang; Shuai Yang; Yuanyun Wang; | Eng. Appl. Artif. Intell. | 2024-07-01 |
680 | Token-disentangling Mutual Transformer for Multimodal Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
GUANGHAO YIN et. al. | Eng. Appl. Artif. Intell. | 2024-07-01 |
681 | ITFuse: An Interactive Transformer for Infrared and Visible Image Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View |
Wei Tang; Fazhi He; Yu Liu; | Pattern Recognit. | 2024-07-01 |
682 | TE-Spikformer: Temporal-enhanced Spiking Neural Network with Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
SHOUWEI GAO et. al. | Neurocomputing | 2024-07-01 |
683 | DC Bias Content Extraction of Power Transformer Under AC and DC Environment and Its Suppression Measures Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The phenomenon of transformer dc bias (TDB) will saturate the transformer core, resulting in the local overheating, accelerating the ageing of insulating material, and even … |
ZHIWEI CHEN et. al. | IEEE Transactions on Industrial Electronics | 2024-07-01 |
684 | Analyzing Persuasive Strategies in Meme Texts: A Fusion of Language Models with Paraphrase Enrichment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes our approach to hierarchical multi-label detection of persuasion techniques in meme texts. |
Kota Shamanth Ramanath Nayak; Leila Kosseim; | arxiv-cs.CL | 2024-07-01 |
685 | Multi-Turn Hidden Backdoor in Large Language Model-powered Chatbot Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Model (LLM)-powered chatbot services like GPTs, simulating human-to-human conversation via machine-generated text, are used in numerous fields. They are enhanced by … |
Bocheng Chen; Nikolay Ivanov; Guangjing Wang; Qiben Yan; | Proceedings of the 19th ACM Asia Conference on Computer and … | 2024-07-01 |
686 | LegalTurk Optimized BERT for Multi-Label Text Classification and NER Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce our innovative modified pre-training approach by combining diverse masking strategies. |
Farnaz Zeidi; Mehmet Fatih Amasyali; Çiğdem Erol; | arxiv-cs.CL | 2024-06-30 |
687 | WallFacer: Harnessing Multi-dimensional Ring Parallelism for Efficient Long Sequence Model Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current methods are either constrained by the number of attention heads or excessive communication overheads. To address this problem, we propose WallFacer, a multi-dimensional distributed training system for long sequences, fostering an efficient communication paradigm and providing additional tuning flexibility for communication arrangements. |
ZIMING LIU et. al. | arxiv-cs.DC | 2024-06-30 |
688 | Ensemble Learning with Pre-Trained Transformers for Crash Severity Classification: A Deep NLP Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transfer learning has gained significant traction in natural language processing due to the emergence of state-of-the-art pre-trained language models (P.L.M.s). Unlike traditional … |
Shadi Jaradat; Richi Nayak; Alexander Paz; Mohammed Elhenawy; | Algorithms | 2024-06-30 |
689 | LLM-Generated Natural Language Meets Scaling Laws: New Explorations and Data Augmentation Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nonetheless, prior research harbors two primary concerns: firstly, a lack of contemplation regarding whether the natural language generated by LLM (LLMNL) truly aligns with human natural language (HNL), a critical foundational question; secondly, an oversight that augmented data is randomly generated by LLM, implying that not all data may possess equal training value, that could impede the performance of classifiers. To address these challenges, we introduce the scaling laws to intrinsically calculate LLMNL and HNL. |
Zhenhua Wang; Guang Xu; Ming Ren; | arxiv-cs.CL | 2024-06-29 |
690 | Machine Learning Predictors for Min-Entropy Estimation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Utilizing data from Generalized Binary Autoregressive Models, a subset of Markov processes, we demonstrate that machine learning models (including a hybrid of convolutional and recurrent Long Short-Term Memory layers and the transformer-based GPT-2 model) outperform traditional NIST SP 800-90B predictors in certain scenarios. |
Javier Blanco-Romero; Vicente Lorenzo; Florina Almenares Mendoza; Daniel Díaz-Sánchez; | arxiv-cs.LG | 2024-06-28 |
691 | Optimizing Uyghur Speech Synthesis By Combining Pretrained Cross-Lingual Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: End-to-end speech synthesis methodologies have exhibited considerable advancements for languages with abundant corpus resources. Nevertheless, such achievements are yet to be … |
Kexin Lu; Zhihua Huang; Mingming Yin; Ke Chen; | ACM Transactions on Asian and Low-Resource Language … | 2024-06-28 |
692 | Can GPT-4 Help Detect Quit Vaping Intentions? An Exploration of Automatic Data Annotation Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we extracted a sample dataset from one vaping sub-community on Reddit to analyze users’ quit-vaping intentions. |
SAI KRISHNA REVANTH VURUMA et. al. | arxiv-cs.CL | 2024-06-28 |
693 | The Model Arena for Cross-lingual Sentiment Analysis: A Comparative Study in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Sentiment analysis serves as a pivotal component in Natural Language Processing (NLP). |
Xiliang Zhu; Shayna Gardiner; Tere Roldán; David Rossouw; | arxiv-cs.CL | 2024-06-27 |
694 | FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose FRED, a wafer-scale interconnect that is tailored for the high-BW requirements of wafer-scale networks and can efficiently execute communication patterns of different parallelization strategies. |
Saeed Rashidi; William Won; Sudarshan Srinivasan; Puneet Gupta; Tushar Krishna; | arxiv-cs.AR | 2024-06-27 |
695 | Fine-tuned Network Relies on Generic Representation to Solve Unseen Cognitive Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning pretrained language models has shown promising results on a wide range of tasks, but when encountering a novel task, do they rely more on generic pretrained representation, or develop brand new task-specific solutions? |
Dongyan Lin; | arxiv-cs.LG | 2024-06-27 |
696 | BADGE: BADminton Report Generation and Evaluation with LLM Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel framework named BADGE, designed for this purpose using LLM. |
Shang-Hsuan Chiang; Lin-Wei Chao; Kuang-Da Wang; Chih-Chuan Wang; Wen-Chih Peng; | arxiv-cs.CL | 2024-06-26 |
697 | Automating Clinical Trial Eligibility Screening: Quantitative Analysis of GPT Models Versus Human Expertise Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Objective: This study quantitatively assesses the performance of the GPT model in classifying patient eligibility for clinical trials, aiming to minimize the need for expert clinical … |
ARTI DEVI et. al. | Proceedings of the 17th International Conference on … | 2024-06-26 |
698 | This Paper Had The Smartest Reviewers — Flattery Detection Utilising An Audio-Textual Transformer-Based Approach Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Its automatic detection can thus enhance the naturalness of human-AI interactions. To meet this need, we present a novel audio textual dataset comprising 20 hours of speech and train machine learning models for automatic flattery detection. |
LUKAS CHRIST et. al. | arxiv-cs.SD | 2024-06-25 |
699 | SetBERT: Enhancing Retrieval Performance for Boolean Logic and Set Operation Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce SetBERT, a fine-tuned BERT-based model designed to enhance query embeddings for set operations and Boolean logic queries, such as Intersection (AND), Difference (NOT), and Union (OR). |
Quan Mai; Susan Gauch; Douglas Adams; | arxiv-cs.CL | 2024-06-25 |
700 | Improving Entity Recognition Using Ensembles of Deep Learning and Fine-tuned Large Language Models: A Case Study on Adverse Event Extraction from Multiple Sources Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we utilized reports and posts from the VAERS (n=621), Twitter (n=9,133), and Reddit (n=131) as our corpora. |
YIMING LI et. al. | arxiv-cs.CL | 2024-06-25 |
701 | CTBench: A Comprehensive Benchmark for Evaluating Language Model Capabilities in Clinical Trial Design Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: CTBench is introduced as a benchmark to assess language models (LMs) in aiding clinical study design. |
NAFIS NEEHAL et. al. | arxiv-cs.CL | 2024-06-25 |
702 | Autonomous Prompt Engineering in Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Prompt engineering is a crucial yet challenging task for optimizing the performance of large language models (LLMs) on customized tasks. This pioneering research introduces the … |
Daan Kepel; Konstantina Valogianni; | ArXiv | 2024-06-25 |
703 | This Paper Had The Smartest Reviewers – Flattery Detection Utilising An Audio-Textual Transformer-Based Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Flattery is an important aspect of human communication that facilitates social bonding, shapes perceptions, and influences behavior through strategic compliments and praise, … |
LUKAS CHRIST et. al. | ArXiv | 2024-06-25 |
704 | Unambiguous Recognition Should Not Rely Solely on Natural Language Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This bias stems from the inherent characteristics of the dataset. To mitigate this bias, we propose a LaTeX printed text recognition model trained on a mixed dataset of pseudo-formulas and pseudo-text. |
Renqing Luo; Yuhan Xu; | arxiv-cs.CV | 2024-06-24 |
705 | The GPT-WritingPrompts Dataset: A Comparative Analysis of Character Portrayal in Short Stories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We quantify and compare the emotional and descriptive features of storytelling from both generative processes, human and machine, along a set of six dimensions. |
Xi Yu Huang; Krishnapriya Vishnubhotla; Frank Rudzicz; | arxiv-cs.CL | 2024-06-24 |
706 | Exploring The Capability of Mamba in Speech Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compared Mamba with state-of-the-art Transformer variants for various speech applications, including ASR, text-to-speech, spoken language understanding, and speech summarization. |
Koichi Miyazaki; Yoshiki Masuyama; Masato Murata; | arxiv-cs.SD | 2024-06-24 |
707 | GPT-4V Explorations: Mining Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the application of the GPT-4V(ision) large visual language model to autonomous driving in mining environments, where traditional systems often falter in understanding intentions and making accurate decisions during emergencies. |
Zixuan Li; | arxiv-cs.CV | 2024-06-24 |
708 | Using GPT-4 Turbo to Automatically Identify Defeaters in Assurance Cases Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Assurance cases (ACs) are convincing arguments, supported by a body of evidence and aiming at demonstrating that a system will function as intended. Producers of systems can rely … |
K. K. SHAHANDASHTI et. al. | 2024 IEEE 32nd International Requirements Engineering … | 2024-06-24 |
709 | Exploring The Capabilities of Large Language Models for The Generation of Safety Cases: The Case of GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The emergence of large language models (LLMs) and conversational interfaces, exemplified by ChatGPT, is nothing short of revolutionary. While their potential is undeniable across … |
Mithila Sivakumar; A. B. Belle; Jinjun Shan; K. K. Shahandashti; | 2024 IEEE 32nd International Requirements Engineering … | 2024-06-24 |
710 | Exploring Factual Entailment with NLI: A News Media Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the relationship between factuality and Natural Language Inference (NLI) by introducing FactRel — a novel annotation scheme that models "factual" rather than "textual" entailment, and use it to annotate a dataset of naturally occurring sentences from news articles. |
Guy Mor-Lan; Effi Levi; | arxiv-cs.CL | 2024-06-24 |
711 | DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present DreamBench++, a human-aligned benchmark automated by advanced multimodal GPT models. |
YUANG PENG et. al. | arxiv-cs.CV | 2024-06-24 |
712 | OlympicArena Medal Ranks: Who Is The Most Intelligent AI So Far? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this report, we pose the following question: Who is the most intelligent AI model to date, as measured by the OlympicArena (an Olympic-level, multi-discipline, multi-modal benchmark for superintelligent AI)? |
Zhen Huang; Zengzhi Wang; Shijie Xia; Pengfei Liu; | arxiv-cs.CL | 2024-06-24 |
713 | Multi-Scale Temporal Difference Transformer for Video-Text Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they commonly neglect the transformer's inferior ability to model local temporal information. To tackle this problem, we propose a transformer variant named Multi-Scale Temporal Difference Transformer (MSTDT). |
Ni Wang; Dongliang Liao; Xing Xu; | arxiv-cs.CV | 2024-06-23 |
714 | GraphEval2000: Benchmarking and Improving Large Language Models on Graph Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, recent studies have identified limitations in LLMs’ ability to reason about graph-structured data. To address this gap, we introduce GraphEval2000, the first comprehensive graph dataset, comprising 40 graph data structure problems along with 2000 test cases. |
Qiming Wu; Zichen Chen; Will Corcoran; Misha Sra; Ambuj K. Singh; | arxiv-cs.AI | 2024-06-23 |
715 | Beyond Individual Facts: Investigating Categorical Knowledge Locality of Taxonomy and Meronomy Concepts in GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate a broader view of knowledge location, that of concepts or clusters of related information, instead of disparate individual facts. |
Christopher Burger; Yifan Hu; Thai Le; | arxiv-cs.LG | 2024-06-22 |
716 | The Role of Generative AI in Qualitative Research: GPT-4’s Contributions to A Grounded Theory Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We present reflections on our experience using a generative AI model in qualitative research, to illuminate the AI’s contributions to our analytic process. Our analytic focus was … |
Ravi Sinha; Idris Solola; Ha Nguyen; H. Swanson; LuEttaMae Lawrence; | Proceedings of the 2024 Symposium on Learning, Design and … | 2024-06-21 |
717 | How Effective Is GPT-4 Turbo in Generating School-Level Questions from Textbooks Based on Bloom’s Revised Taxonomy? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate the effectiveness of GPT-4 Turbo in generating educational questions from NCERT textbooks in zero-shot mode. |
Subhankar Maity; Aniket Deroy; Sudeshna Sarkar; | arxiv-cs.CL | 2024-06-21 |
718 | VertAttack: Taking Advantage of Text Classifiers' Horizontal Vision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Vertically written words will not be recognized by a classifier. In contrast, humans are easily able to recognize and read words written both horizontally and vertically. Hence, a human adversary could write problematic words vertically and the meaning would still be preserved to other humans. We simulate such an attack, VertAttack. |
Jonathan Rusert; | naacl | 2024-06-20 |
719 | MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to perform a thorough evaluation of the non-English capabilities of SoTA LLMs (GPT-3. |
SANCHIT AHUJA et. al. | naacl | 2024-06-20 |
720 | Toward Informal Language Processing: Knowledge of Slang in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Using movie subtitles, we construct a dataset that supports evaluation on a diverse set of tasks pertaining to automatic processing of slang. |
Zhewei Sun; Qian Hu; Rahul Gupta; Richard Zemel; Yang Xu; | naacl | 2024-06-20 |
721 | Communication-Efficient Byzantine-Resilient Federated Zero-Order Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CYBER-0, the first zero-order optimization algorithm for memory-and-communication efficient Federated Learning, resilient to Byzantine faults. |
Afonso de Sá Delgado Neto; Maximilian Egger; Mayank Bakshi; Rawad Bitar; | arxiv-cs.LG | 2024-06-20 |
722 | Does GPT Really Get It? A Hierarchical Scale to Quantify Human Vs AI’s Understanding of Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use the hierarchy to design and conduct a study with human subjects (undergraduate and graduate students) as well as large language models (generations of GPT), revealing interesting similarities and differences. |
Mirabel Reid; Santosh S. Vempala; | arxiv-cs.AI | 2024-06-20 |
723 | CryptoGPT: A 7B Model Rivaling GPT-4 in The Task of Analyzing and Classifying Real-time Financial News Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: CryptoGPT: a 7B model competing with GPT-4 in a specific task: the impact of automatic annotation and strategic fine-tuning via QLoRA. In this article, we present a method aimed at refining a dedicated LLM of reasonable quality with limited resources in an industrial setting via CryptoGPT. |
Ying Zhang; Matthieu Petit Guillaume; Aurélien Krauth; Manel Labidi; | arxiv-cs.AI | 2024-06-20 |
724 | Does GPT-4 Pass The Turing Test? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: AI models with the ability to masquerade as humans could have widespread societal consequences, and we analyse the effectiveness of different strategies and criteria for judging humanlikeness. |
Cameron Jones; Ben Bergen; | naacl | 2024-06-20 |
725 | Removing RLHF Protections in GPT-4 Via Fine-Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show the contrary: fine-tuning allows attackers to remove RLHF protections with as few as 340 examples and a 95% success rate. |
QIUSI ZHAN et. al. | naacl | 2024-06-20 |
726 | Struc-Bench: Are Large Language Models Good at Generating Complex Structured Tabular Data? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study assesses LLMs' proficiency in structuring tables and introduces a novel fine-tuning method, cognizant of data structures, to bolster their performance. |
XIANGRU TANG et. al. | naacl | 2024-06-20 |
727 | A Continued Pretrained LLM Approach for Automatic Medical Note Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce HEAL, the first continuously trained 13B LLaMA2-based LLM that is purpose-built for medical conversations and measured on automated scribing. |
DONG YUAN et. al. | naacl | 2024-06-20 |
728 | A Symbolic Framework for Evaluating Mathematical Reasoning and Generalisation with Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a methodology for generating and perturbing detailed derivations of equations at scale, aided by a symbolic engine, to evaluate the generalisability of Transformers to out-of-distribution mathematical reasoning problems. |
Jordan Meadows; Marco Valentino; Damien Teney; Andre Freitas; | naacl | 2024-06-20 |
729 | Revisiting Zero-Shot Abstractive Summarization in The Era of Large Language Models from The Perspective of Position Bias IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We characterize and study zero-shot abstractive summarization in Large Language Models (LLMs) by measuring position bias, which we propose as a general formulation of the more restrictive lead bias phenomenon studied previously in the literature. |
Anshuman Chhabra; Hadi Askari; Prasant Mohapatra; | naacl | 2024-06-20 |
730 | SemRoDe: Macro Adversarial Training to Learn Representations That Are Robust to Word-Level Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial Training strategy to enhance the robustness of LMs. |
Brian Formento; Wenjie Feng; Chuan-Sheng Foo; Anh Tuan Luu; See-Kiong Ng; | naacl | 2024-06-20 |
731 | ChatGPT As Research Scientist: Probing GPT’s Capabilities As A Research Librarian, Research Ethicist, Data Generator and Data Predictor Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: How good a research scientist is ChatGPT? We systematically probed the capabilities of GPT-3.5 and GPT-4 across four central components of the scientific process: as a Research … |
Steven A. Lehr; Aylin Caliskan; Suneragiri Liyanage; Mahzarin R. Banaji; | arxiv-cs.AI | 2024-06-20 |
732 | On Retrieval Augmentation and The Limitations of Language Model Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we rule out one previously posited possibility: the "softmax bottleneck." |
TING-RUI CHIANG et. al. | naacl | 2024-06-20 |
733 | CPopQA: Ranking Cultural Concept Popularity By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the extent to which an LLM effectively captures corpus-level statistical trends of concepts for reasoning, especially long-tail ones, is largely underexplored. In this study, we introduce a novel few-shot question-answering task (CPopQA) that examines LLMs' statistical ranking abilities for long-tail cultural concepts (e.g., holidays), particularly focusing on these concepts' popularity in the United States and the United Kingdom, respectively. |
Ming Jiang; Mansi Joshi; | naacl | 2024-06-20 |
734 | Unveiling Divergent Inductive Biases of LLMs on Temporal Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the adeptness of Large Language Models (LLMs) in discerning patterns and relationships from data, their inherent comprehension of temporal dynamics remains a formidable challenge. This research meticulously explores these intrinsic challenges within LLMs, with a specific emphasis on evaluating the performance of GPT-3. |
Sindhu Kishore; Hangfeng He; | naacl | 2024-06-20 |
735 | Transformers Can Represent N-gram Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the relationship between transformer LMs and n-gram LMs, a simple and historically relevant class of language models. |
Anej Svete; Ryan Cotterell; | naacl | 2024-06-20 |
736 | SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using Training Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these models are extremely memory intensive during their fine-tuning process, making them difficult to deploy on GPUs with limited memory resources. To address this issue, we introduce a new tool called SlimFit that reduces the memory requirements of these models by dynamically analyzing their training dynamics and freezing less-contributory layers during fine-tuning. |
ARASH ARDAKANI et. al. | naacl | 2024-06-20 |
737 | Metacognitive Prompting Improves Understanding in Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce Metacognitive Prompting (MP), a strategy inspired by human introspective reasoning processes. |
Yuqing Wang; Yun Zhao; | naacl | 2024-06-20 |
738 | Branch-Solve-Merge Improves Large Language Model Evaluation and Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their performance can fall short, due to the model's lack of coherence and inability to plan and decompose the problem. We propose Branch-Solve-Merge (BSM), a Large Language Model program (Schlag et al., 2023) for tackling such challenging natural language tasks. |
SWARNADEEP SAHA et. al. | naacl | 2024-06-20 |
739 | Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates how LLMs, specifically GPT-3.5 and GPT-4, can develop tailored questions for Grade 9 math, aligning with active learning principles. |
Hamdireza Rouzegar; Masoud Makrehchi; | arxiv-cs.CL | 2024-06-19 |
740 | Fine-Tuning BERTs for Definition Extraction from Mathematical Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we fine-tuned three pre-trained BERT models on the task of definition extraction from mathematical English written in LaTeX. |
Lucy Horowitz; Ryan Hathaway; | arxiv-cs.CL | 2024-06-19 |
741 | Putting GPT-4o to The Sword: A Comprehensive Evaluation of Language, Vision, Speech, and Multimodal Proficiency Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: As large language models (LLMs) continue to advance, evaluating their comprehensive capabilities becomes significant for their application in various fields. This research study … |
SAKIB SHAHRIAR et. al. | ArXiv | 2024-06-19 |
742 | A Decision-Making GPT Model Augmented with Entropy Regularization for Autonomous Vehicles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, the decision-making challenges associated with autonomous vehicles are conceptualized through the framework of the Constrained Markov Decision Process (CMDP) and approached as a sequence modeling problem. |
JIAQI LIU et. al. | arxiv-cs.RO | 2024-06-19 |
743 | ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce ChatGLM, an evolving family of large language models that we have been developing over time. |
TEAM GLM et. al. | arxiv-cs.CL | 2024-06-18 |
744 | SwinStyleformer Is A Favorable Choice for Image Inversion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the first pure Transformer structure inversion network, called SwinStyleformer, which can compensate for the shortcomings of CNN-based inversion frameworks by handling long-range dependencies and learning the global structure of objects. |
Jiawei Mao; Guangyi Zhao; Xuesong Yin; Yuanqi Chang; | arxiv-cs.CV | 2024-06-18 |
745 | Reality Check: Assessing GPT-4 in Fixing Real-World Software Vulnerabilities Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Discovering and mitigating software vulnerabilities is a challenging task. These vulnerabilities are often caused by simple, otherwise (and in other contexts) harmless code … |
ZOLTÁN SÁGODI et. al. | Proceedings of the 28th International Conference on … | 2024-06-18 |
746 | Generating Educational Materials with Different Levels of Readability Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the leveled-text generation task, aiming to rewrite educational materials to specific readability levels while preserving meaning. |
Chieh-Yang Huang; Jing Wei; Ting-Hao ‘Kenneth’ Huang; | arxiv-cs.CL | 2024-06-18 |
747 | What Makes Two Language Models Think Alike? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Do architectural differences significantly affect the way models represent and process language? We propose a new approach, based on metric-learning encoding models (MLEMs), as a first step to answer this question. |
Jeanne Salle; Louis Jalouzot; Nur Lan; Emmanuel Chemla; Yair Lakretz; | arxiv-cs.CL | 2024-06-18 |
748 | ChatGPT: Perspectives from Human–computer Interaction and Psychology Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The release of GPT-4 has garnered widespread attention across various fields, signaling the impending widespread adoption and application of Large Language Models (LLMs). However, … |
Jiaxi Liu; | Frontiers in Artificial Intelligence | 2024-06-18 |
749 | Adversarial Attacks on Multimodal Agents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that multimodal agents raise new safety risks, even though attacking agents is more challenging than prior attacks due to limited access to and knowledge about the environment. |
Chen Henry Wu; Jing Yu Koh; Ruslan Salakhutdinov; Daniel Fried; Aditi Raghunathan; | arxiv-cs.LG | 2024-06-18 |
750 | Vernacular? I Barely Know Her: Challenges with Style Control and Stereotyping Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a thorough analysis and discussion of the results. |
ANKIT AICH et. al. | arxiv-cs.CL | 2024-06-18 |
751 | Promises, Outlooks and Challenges of Diffusion Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For example, autoregressive token generation is notably slow and can be prone to "exposure bias." The diffusion-based language models were proposed as an alternative to autoregressive generation to address some of these limitations. |
Justin Deschenaux; Caglar Gulcehre; | arxiv-cs.CL | 2024-06-17 |
752 | Significant Productivity Gains Through Programming with Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models like GPT and Codex drastically alter many daily tasks, including programming, where they can rapidly generate code from natural language or informal … |
Thomas Weber; Maximilian Brandmaier; Albrecht Schmidt; Sven Mayer; | Proceedings of the ACM on Human-Computer Interaction | 2024-06-17 |
753 | Minimal Self in Humanoid Robot Alter3 Driven By Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Alter3, a humanoid robot that demonstrates spontaneous motion generation through the integration of GPT-4, a Large Language Model (LLM). |
Takahide Yoshida; Suzune Baba; Atsushi Masumori; Takashi Ikegami; | arxiv-cs.RO | 2024-06-17 |
754 | Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we identify a pitfall of vanilla iterative DPO – improved response quality can lead to increased verbosity. |
JIE LIU et. al. | arxiv-cs.CL | 2024-06-17 |
755 | DB-GPT-Hub: Towards Open Benchmarking Text-to-SQL Empowered By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present DB-GPT-Hub, an open benchmark suite for LLM-empowered text-to-SQL, which primarily focuses on tuning LLMs at large scales. |
FAN ZHOU et. al. | arxiv-cs.DB | 2024-06-17 |
756 | A Two-dimensional Zero-shot Dialogue State Tracking Evaluation Method Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a two-dimensional zero-shot evaluation method for DST using GPT-4, which divides the evaluation into two dimensions: accuracy and completeness. |
Ming Gu; Yan Yang; | arxiv-cs.CL | 2024-06-17 |
757 | Cultural Conditioning or Placebo? On The Effectiveness of Socio-Demographic Prompting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we systematically probe four LLMs (Llama 3, Mistral v0.2, GPT-3.5 Turbo and GPT-4) with prompts that are conditioned on culturally sensitive and non-sensitive cues, on datasets that are supposed to be culturally sensitive (EtiCor and CALI) or neutral (MMLU and ETHICS). |
SAGNIK MUKHERJEE et. al. | arxiv-cs.CL | 2024-06-17 |
758 | Problematic Tokens: Tokenizer Bias in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This misrepresentation results in the propagation of under-trained or untrained tokens, which perpetuate biases and pose serious concerns related to data security and ethical standards. We aim to dissect the tokenization mechanics of GPT-4o, illustrating how its simplified token-handling methods amplify these risks and offer strategic solutions to mitigate associated security and ethical issues. |
Jin Yang; Zhiqiang Wang; Yanbin Lin; Zunduo Zhao; | arxiv-cs.CL | 2024-06-17 |
759 | Look Further Ahead: Testing The Limits of GPT-4 in Path Planning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, they still face challenges with long-horizon planning. To study this, we propose path planning tasks as a platform to evaluate LLMs’ ability to navigate long trajectories under geometric constraints. |
Mohamed Aghzal; Erion Plaku; Ziyu Yao; | arxiv-cs.AI | 2024-06-17 |
760 | GPT-Powered Elicitation Interview Script Generator for Requirements Engineering Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We employ a prompt chaining approach to mitigate the output length constraint of GPT to be able to generate thorough and detailed interview scripts. |
Binnur Görer; Fatma Başak Aydemir; | arxiv-cs.SE | 2024-06-17 |
761 | WellDunn: On The Robustness and Explainability of Language Models and Large Language Models in Identifying Wellness Dimensions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Language Models (LMs) are being proposed for mental health applications where the heightened risk of adverse outcomes means predictive performance may not be a sufficient litmus test of a model’s utility in clinical practice. |
SEYEDALI MOHAMMADI et. al. | arxiv-cs.AI | 2024-06-17 |
762 | Exposing The Achilles’ Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel dataset MWP-MISTAKE, incorporating MWPs with both correct and incorrect reasoning steps generated through rule-based methods and smaller language models. |
Joykirat Singh; Akshay Nambi; Vibhav Vineet; | arxiv-cs.CL | 2024-06-16 |
763 | Large Language Models for Automatic Milestone Detection in Group Discussions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate an LLM’s performance on recordings of a group oral communication task in which utterances are often truncated or not well-formed. |
ZHUOXU DUAN et. al. | arxiv-cs.CL | 2024-06-16 |
764 | Generating Tables from The Parametric Knowledge of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For evaluation, we introduce a novel benchmark, WikiTabGen which contains 100 curated Wikipedia tables. |
Yevgeni Berkovitch; Oren Glickman; Amit Somech; Tomer Wolfson; | arxiv-cs.CL | 2024-06-16 |
765 | Breaking Boundaries: Investigating The Effects of Model Editing on Cross-linguistic Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper strategically identifies the need for linguistic equity by examining several knowledge editing techniques in multilingual contexts. |
SOMNATH BANERJEE et. al. | arxiv-cs.CL | 2024-06-16 |
766 | ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, we present Video Diffusion GPT (ViD-GPT). |
Kaifeng Gao; Jiaxin Shi; Hanwang Zhang; Chunping Wang; Jun Xiao; | arxiv-cs.CV | 2024-06-16 |
767 | Distilling Opinions at Scale: Incremental Opinion Summarization Using XL-OPSUMM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To mitigate this, we propose a scalable framework called Xl-OpSumm that generates summaries incrementally. |
SRI RAGHAVA MUDDU et. al. | arxiv-cs.CL | 2024-06-16 |
768 | KGPA: Robustness Evaluation for Large Language Models Via Cross-Domain Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a framework that systematically evaluates the robustness of LLMs under adversarial attack scenarios by leveraging knowledge graphs (KGs). |
AIHUA PEI et. al. | arxiv-cs.CL | 2024-06-16 |
769 | ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning Via Shared Low-Rank Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces an approach to optimize Parameter Efficient Fine Tuning (PEFT) for Pretrained Language Models (PLMs) by implementing a Shared Low Rank Adaptation (ShareLoRA). |
Yurun Song; Junchen Zhao; Ian G. Harris; Sangeetha Abdu Jyothi; | arxiv-cs.CL | 2024-06-15 |
770 | Multilingual Large Language Models and Curse of Multilinguality Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Multilingual Large Language Models (LLMs) have gained large popularity among Natural Language Processing (NLP) researchers and practitioners. These models, trained on huge datasets, show proficiency across various languages and demonstrate effectiveness in numerous downstream tasks. |
Daniil Gurgurov; Tanja Bäumel; Tatiana Anikina; | arxiv-cs.CL | 2024-06-15 |
771 | GPT-Fabric: Folding and Smoothing Fabric By Leveraging Pre-Trained Foundation Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Fabric manipulation has applications in folding blankets, handling patient clothing, and protecting items with covers. It is challenging for robots to perform fabric manipulation … |
Vedant Raval; Enyu Zhao; Hejia Zhang; S. Nikolaidis; Daniel Seita; | ArXiv | 2024-06-14 |
772 | GPT-4o: Visual Perception Performance of Multimodal Large Language Models in Piglet Activity Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The initial evaluation experiments in this study validate the potential of multimodal large language models in livestock scene video understanding and provide new directions and references for future research on animal behavior video understanding. |
Yiqi Wu; Xiaodan Hu; Ziming Fu; Siling Zhou; Jiangong Li; | arxiv-cs.CV | 2024-06-14 |
773 | Optimizing Layer-Fused Scheduling of Transformer Networks on Multi-accelerator Platforms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work enables extensive hardware/mapping exploration by extending the DSE framework Stream towards support for transformers across a wide variety of hardware architectures and different execution schedules. |
Steven Colleman; Arne Symons; Victor J. B. Jung; Marian Verhelst; | arxiv-cs.AR | 2024-06-14 |
774 | General Point Model Pretraining with Autoencoding and Autoregressive Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the General Language Model we propose a General Point Model (GPM) that seamlessly integrates autoencoding and autoregressive tasks in a point cloud transformer. |
ZHE LI et. al. | cvpr | 2024-06-13 |
775 | GPT-ology, Computational Models, Silicon Sampling: How Should We Think About LLMs in Cognitive Science? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models have taken the cognitive science world by storm. It is perhaps timely now to take stock of the various research paradigms that have been used to make … |
Desmond C. Ong; | ArXiv | 2024-06-13 |
776 | Complex Image-Generative Diffusion Transformer for Audio Denoising Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to enhance audio denoising performance, this paper introduces a complex image-generative diffusion transformer that captures more information from the complex Fourier domain. |
Junhui Li; Pu Wang; Jialu Li; Youshan Zhang; | arxiv-cs.SD | 2024-06-13 |
777 | Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, bidirectional feature propagation is highly sensitive to inaccurate optical flow in blurry frames, leading to error accumulation during the propagation process. To address these issues, we propose BSSTNet, a Blur-aware Spatio-temporal Sparse Transformer Network. |
Huicong Zhang; Haozhe Xie; Hongxun Yao; | cvpr | 2024-06-13 |
778 | Talking Heads: Understanding Inter-layer Communication in Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We analyze a mechanism used in two LMs to selectively inhibit items in a context in one task, and find that it underlies a commonly used abstraction across many context-retrieval behaviors. |
Jack Merullo; Carsten Eickhoff; Ellie Pavlick; | arxiv-cs.CL | 2024-06-13 |
779 | MoMask: Generative Masked Modeling of 3D Human Motions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce MoMask, a novel masked modeling framework for text-driven 3D human motion generation. |
Chuan Guo; Yuxuan Mu; Muhammad Gohar Javed; Sen Wang; Li Cheng; | cvpr | 2024-06-13 |
780 | SDPose: Tokenized Pose Estimation Via Circulation-Guide Self-Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Those transformer-based models that require fewer resources are prone to under-fitting due to their smaller scale and thus perform notably worse than their larger counterparts. Given this conundrum, we introduce SDPose, a new self-distillation method for improving the performance of small transformer-based models. |
SICHEN CHEN et. al. | cvpr | 2024-06-13 |
781 | MoST: Motion Style Transformer Between Diverse Action Contents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This challenge lies in the lack of clear separation between content and style of a motion. To tackle this challenge, we propose a novel motion style transformer that effectively disentangles style from content and generates a plausible motion with transferred style from a source motion. |
Boeun Kim; Jungho Kim; Hyung Jin Chang; Jin Young Choi; | cvpr | 2024-06-13 |
782 | OmniMotionGPT: Animal Motion Generation with Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our paper aims to generate diverse and realistic animal motion sequences from textual descriptions without a large-scale animal text-motion dataset. |
ZHANGSIHAO YANG et. al. | cvpr | 2024-06-13 |
783 | ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Most existing studies are devoted to designing vision-specific transformers to solve the above problems, which introduce additional pre-training costs. Therefore, we present a plain, pre-training-free, and feature-enhanced ViT backbone with Convolutional Multi-scale feature interaction, named ViT-CoMer, which facilitates bidirectional interaction between CNN and transformer. |
Chunlong Xia; Xinliang Wang; Feng Lv; Xin Hao; Yifeng Shi; | cvpr | 2024-06-13 |
784 | Condition-Aware Neural Network for Controlled Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. |
Han Cai; Muyang Li; Qinsheng Zhang; Ming-Yu Liu; Song Han; | cvpr | 2024-06-13 |
785 | Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose VisualFactChecker (VFC), a flexible, training-free pipeline that generates high-fidelity and detailed captions for both 2D images and 3D objects. |
YUNHAO GE et. al. | cvpr | 2024-06-13 |
786 | GPT-Fabric: Smoothing and Folding Fabric By Leveraging Pre-Trained Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose GPT-Fabric for the canonical tasks of fabric smoothing and folding, where GPT directly outputs an action informing a robot where to grasp and pull a fabric. |
Vedant Raval; Enyu Zhao; Hejia Zhang; Stefanos Nikolaidis; Daniel Seita; | arxiv-cs.RO | 2024-06-13 |
787 | Permutation Equivariance of Transformers and Its Applications Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose our definition of permutation equivariance, a broader concept covering both inter- and intra-token permutation in the forward and backward propagation of neural networks. |
HENGYUAN XU et. al. | cvpr | 2024-06-13 |
788 | GPT-4V(ision) Is A Human-Aligned Evaluator for Text-to-3D Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents an automatic, versatile, and human-aligned evaluation metric for text-to-3D generative models. |
TONG WU et. al. | cvpr | 2024-06-13 |
789 | Mean-Shift Feature Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Transformer models developed in NLP have made a great impact on computer vision fields, producing promising performance on various tasks. |
Takumi Kobayashi; | cvpr | 2024-06-13 |
790 | Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these challenges, we first introduce LoCoV1, a 12 task benchmark constructed to measure long-context retrieval where chunking is not possible or not effective. We next present the M2-BERT retrieval encoder, an 80M parameter state-space encoder model built from the Monarch Mixer architecture, capable of scaling to documents up to 32K tokens long. |
Jon Saad-Falcon; Daniel Y Fu; Simran Arora; Neel Guha; Christopher Re; | icml | 2024-06-12 |
791 | Stealing Part of A Production Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI’s ChatGPT or Google’s PaLM-2. |
NICHOLAS CARLINI et. al. | icml | 2024-06-12 |
792 | Label-aware Hard Negative Sampling Strategies with Momentum Contrastive Learning for Implicit Hate Speech Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Label-aware Hard Negative sampling strategies (LAHN) that encourage the model to learn detailed features from hard negative samples, instead of naive negative samples in random batch, using momentum-integrated contrastive learning. |
Jaehoon Kim; Seungwan Jin; Sohyun Park; Someen Park; Kyungsik Han; | arxiv-cs.CL | 2024-06-12 |
793 | An Empirical Study of Mamba-based Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In a controlled setting (e.g., same data), however, studies so far have only presented small scale experiments comparing SSMs to Transformers. To understand the strengths and weaknesses of these architectures at larger scales, we present a direct comparison between 8B-parameter Mamba, Mamba-2, and Transformer models trained on the same datasets of up to 3.5T tokens. |
ROGER WALEFFE et. al. | arxiv-cs.LG | 2024-06-12 |
794 | Trainable Transformer in Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a new efficient construction, Transformer in Transformer (in short, TINT), that allows a transformer to simulate and fine-tune more complex models during inference (e.g., pre-trained language models). |
Abhishek Panigrahi; Sadhika Malladi; Mengzhou Xia; Sanjeev Arora; | icml | 2024-06-12 |
795 | Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by previous theoretical studies of the static version of the attention multiplication problem [Zandieh, Han, Daliri, and Karbasi ICML 2023; Alman and Song NeurIPS 2023], we formally define a dynamic version of the attention matrix multiplication problem. |
Jan van den Brand; Zhao Song; Tianyi Zhou; | icml | 2024-06-12 |
796 | Asymmetry in Low-Rank Adapters of Foundation Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by an effort to investigate the different roles of LoRA matrices during fine-tuning, this paper characterizes and leverages unexpected asymmetry in the importance of low-rank adapter matrices. |
JIACHENG ZHU et. al. | icml | 2024-06-12 |
797 | Entropy-Reinforced Planning with Large Language Models for Drug Discovery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we propose ERP, Entropy-Reinforced Planning for Transformer Decoding, which employs an entropy-reinforced planning algorithm to enhance the Transformer decoding process and strike a balance between exploitation and exploration. |
Xuefeng Liu; Chih-chan Tien; Peng Ding; Songhao Jiang; Rick L. Stevens; | icml | 2024-06-12 |
798 | AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, achieving faithful attributions for the entirety of a black-box transformer model and maintaining computational efficiency is an unsolved challenge. By extending the Layer-wise Relevance Propagation attribution method to handle attention layers, we address these challenges effectively. |
REDUAN ACHTIBAT et. al. | icml | 2024-06-12 |
799 | PolySketchFormer: Fast Transformers Via Sketching Polynomial Kernels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent theoretical results indicate the intractability of sub-quadratic softmax attention approximation under reasonable complexity assumptions. This paper addresses this challenge by first demonstrating that polynomial attention with high degree can effectively replace softmax without sacrificing model quality. |
Praneeth Kacham; Vahab Mirrokni; Peilin Zhong; | icml | 2024-06-12 |
800 | Long Is More for Alignment: A Simple But Tough-to-Beat Baseline for Instruction Fine-Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: LIMA (NeurIPS 2023) and AlpaGasus (ICLR 2024) are state-of-the-art methods for selecting such high-quality examples, either via manual curation or using GPT-3.5-Turbo as a quality scorer. We show that the extremely simple baseline of selecting the 1,000 instructions with longest responses—that intuitively contain more learnable information and are harder to overfit—from standard datasets can consistently outperform these sophisticated methods according to GPT-4 and PaLM-2 as judges, while remaining competitive on the Open LLM benchmarks that test factual knowledge. |
Hao Zhao; Maksym Andriushchenko; Francesco Croce; Nicolas Flammarion; | icml | 2024-06-12 |
801 | Ditto: Quantization-aware Secure Inference of Transformers Upon MPC Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the integration of quantization widely used in plaintext inference into the MPC domain remains unclear. To bridge this gap, we propose the framework named Ditto to enable more efficient quantization-aware secure Transformer inference. |
HAOQI WU et. al. | icml | 2024-06-12 |
802 | Timer: Generative Pre-trained Transformers Are Large Time Series Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To change the status quo of training scenario-specific small models from scratch, this paper aims at the early development of large time series models (LTSM). |
YONG LIU et. al. | icml | 2024-06-12 |
803 | In-Context Principle Learning from Mistakes IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Nonetheless, all ICL-based approaches only learn from correct input-output pairs. In this paper, we revisit this paradigm, by learning more from the few given input-output examples. |
TIANJUN ZHANG et. al. | icml | 2024-06-12 |
804 | InstructRetro: Instruction Tuning Post Retrieval-Augmented Pretraining IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Retro 48B, the largest LLM pretrained with retrieval. |
BOXIN WANG et. al. | icml | 2024-06-12 |
805 | Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, future superhuman models will behave in complex ways too difficult for humans to reliably evaluate; humans will only be able to *weakly supervise* superhuman models. We study an analogy to this problem: can weak model supervision elicit the full capabilities of a much stronger model? |
COLLIN BURNS et. al. | icml | 2024-06-12 |
806 | GPT-4V(ision) Is A Generalist Web Agent, If Grounded IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we explore the potential of LMMs like GPT-4V as a generalist web agent that can follow natural language instructions to complete tasks on any given website. |
Boyuan Zheng; Boyu Gou; Jihyung Kil; Huan Sun; Yu Su; | icml | 2024-06-12 |
807 | Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that, for linearized transformer networks, ICL can be made explicit and permanent through the inclusion of bias terms. |
Brian K Chen; Tianyang Hu; Hui Jin; Hwee Kuan Lee; Kenji Kawaguchi; | icml | 2024-06-12 |
808 | Outlier-Efficient Hopfield Layers for Large Transformer-Based Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an Outlier-Efficient Modern Hopfield Model (termed `OutEffHop`) and use it to address the outlier inefficiency problem of training gigantic transformer-based models. |
JERRY YAO-CHIEH HU et. al. | icml | 2024-06-12 |
809 | Rethinking Generative Large Language Model Evaluation for Semantic Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we introduce an RWQ-Elo rating system, engaging 24 LLMs such as GPT-4, GPT-3.5, Google-Gemini-Pro and LLaMA-1/-2, in a two-player competitive format, with GPT-4 serving as the judge. |
Fangyun Wei; Xi Chen; Lin Luo; | icml | 2024-06-12 |
810 | Prodigy: An Expeditiously Adaptive Parameter-Free Learner IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Prodigy, an algorithm that provably estimates the distance to the solution $D$, which is needed to set the learning rate optimally. |
Konstantin Mishchenko; Aaron Defazio; | icml | 2024-06-12 |
811 | Prototypical Transformer As Unified Motion Learners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Prototypical Transformer (ProtoFormer), a general and unified framework that approaches various motion tasks from a prototype perspective. |
CHENG HAN et. al. | icml | 2024-06-12 |
812 | Fine-Tuned ‘Small’ LLMs (Still) Significantly Outperform Zero-Shot Generative AI Models in Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that smaller, fine-tuned LLMs (still) consistently and significantly outperform larger, zero-shot prompted models in text classification. |
Martin Juan José Bucher; Marco Martini; | arxiv-cs.CL | 2024-06-12 |
813 | Position: On The Possibilities of AI-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce guidelines on the required text data quantity, either through sample size or sequence length, for reliable AI text detection, through derivations of sample complexity bounds. |
SOURADIP CHAKRABORTY et. al. | icml | 2024-06-12 |
814 | How Language Model Hallucinations Can Snowball IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: A major risk of using language models in practical applications is their tendency to hallucinate incorrect statements. To study this, we construct three question-answering datasets where LMs often state an incorrect answer which is followed by an explanation with at least one incorrect claim. |
Muru Zhang; Ofir Press; William Merrill; Alisa Liu; Noah A. Smith; | icml | 2024-06-12 |
815 | Visual Transformer with Differentiable Channel Selection: An Information Bottleneck Inspired Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel and compact transformer block, Transformer with Differentiable Channel Selection, or DCS-Transformer. |
Yancheng Wang; Ping Li; Yingzhen Yang; | icml | 2024-06-12 |
816 | Discrete Diffusion Modeling By Estimating The Ratios of The Data Distribution IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Crucially, standard diffusion models rely on the well-established theory of score matching, but efforts to generalize this to discrete structures have not yielded the same empirical gains. In this work, we bridge this gap by proposing score entropy, a novel loss that naturally extends score matching to discrete spaces, integrates seamlessly to build discrete diffusion models, and significantly boosts performance. |
Aaron Lou; Chenlin Meng; Stefano Ermon; | icml | 2024-06-12 |
817 | SpikeZIP-TF: Conversion Is All You Need for Transformer-based SNN Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce a novel ANN-to-SNN conversion method called SpikeZIP-TF, where ANN and SNN are exactly equivalent, thus incurring no accuracy degradation. |
KANG YOU et. al. | icml | 2024-06-12 |
818 | What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the capabilities of the transformer architecture with varying depth. |
Xingwu Chen; Difan Zou; | icml | 2024-06-12 |
819 | Improving Autoformalization Using Type Checking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our analysis shows that the performance of these models is largely limited by their inability to generate formal statements that successfully type-check (i.e., are syntactically correct and consistent with types) – with a whopping 86.6% of GPT-4o errors starting from a type-check failure. In this work, we propose a method to fix this issue through decoding with type-check filtering, where we initially sample a diverse set of candidate formalizations for an informal statement, then use the Lean proof assistant to filter out candidates that do not type-check. |
Auguste Poiroux; Gail Weiss; Viktor Kunčak; Antoine Bosselut; | arxiv-cs.CL | 2024-06-11 |
820 | Anomaly Detection on Unstable Logs with GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we report on an experimental comparison of a fine-tuned LLM and alternative models for anomaly detection on unstable logs. |
Fatemeh Hadadi; Qinghua Xu; Domenico Bianculli; Lionel Briand; | arxiv-cs.SE | 2024-06-11 |
821 | LT4SG@SMM4H24: Tweets Classification for Digital Epidemiology of Childhood Health Outcomes Using Pre-Trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents our approaches for the SMM4H24 Shared Task 5 on the binary classification of English tweets reporting children’s medical disorders. |
Dasun Athukoralage; Thushari Atapattu; Menasha Thilakaratne; Katrina Falkner; | arxiv-cs.CL | 2024-06-11 |
822 | Bilingual Sexism Classification: Fine-Tuned XLM-RoBERTa and GPT-3.5 Few-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to improve sexism identification in bilingual contexts (English and Spanish) by leveraging natural language processing models. |
AmirMohammad Azadi; Baktash Ansari; Sina Zamani; | arxiv-cs.CL | 2024-06-11 |
823 | LLM-Powered Multimodal AI Conversations for Diabetes Prevention Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The global prevalence of diabetes remains high despite rising life expectancy with improved quality and access to healthcare services. The significant burden that diabetes imposes … |
Dung Dao; Jun Yi Claire Teo; Wenru Wang; Hoang D. Nguyen; | Proceedings of the 1st ACM Workshop on AI-Powered Q&A … | 2024-06-10 |
824 | Unveiling The Safety of GPT-4o: An Empirical Study Using Jailbreak Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, this paper adopts a series of multi-modal and uni-modal jailbreak attacks on 4 commonly used benchmarks encompassing three modalities (i.e., text, speech, and image), which involves the optimization of over 4,000 initial text queries and the analysis and statistical evaluation of nearly 8,000 responses from GPT-4o. |
Zonghao Ying; Aishan Liu; Xianglong Liu; Dacheng Tao; | arxiv-cs.CR | 2024-06-10 |
825 | LLM-dCache: Improving Tool-Augmented LLMs with GPT-Driven Localized Data Caching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce LLM-dCache to optimize data accesses by treating cache operations as callable API functions exposed to the tool-augmented agent. |
SIMRANJIT SINGH et. al. | arxiv-cs.DC | 2024-06-10 |
826 | Improving ROUGE-1 By 6%: A Novel Multilingual Transformer for Abstractive News Summarization Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Natural language processing (NLP) has undergone a significant transformation, evolving from manually crafted rules to powerful deep learning techniques such as transformers. These … |
Sandeep Kumar; Arun Solanki; | Concurr. Comput. Pract. Exp. | 2024-06-10 |
827 | Validating LLM-Generated Programs with Metamorphic Prompt Testing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Research is required to comprehensively explore these critical concerns surrounding LLM-generated code. In this paper, we propose a novel solution called metamorphic prompt testing to address these challenges. |
Xiaoyin Wang; Dakai Zhu; | arxiv-cs.SE | 2024-06-10 |
828 | In-Context Learning and Fine-Tuning GPT for Argument Mining Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an ICL strategy for ATC combining kNN-based examples selection and majority vote ensembling. |
Jérémie Cabessa; Hugo Hernault; Umer Mushtaq; | arxiv-cs.CL | 2024-06-10 |
829 | Symmetric Dot-Product Attention for Efficient Training of BERT Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose an alternative compatibility function for the self-attention mechanism introduced by the Transformer architecture. |
Martin Courtois; Malte Ostendorff; Leonhard Hennig; Georg Rehm; | arxiv-cs.CL | 2024-06-10 |
830 | Hidden Holes: Topological Aspects of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The methods developed in this paper are novel in the field and based on mathematical apparatus that might be unfamiliar to the target audience. |
Stephen Fitz; Peter Romero; Jiyan Jonas Schneider; | arxiv-cs.CL | 2024-06-09 |
831 | Multi-attribute Auction-based Resource Allocation for Twins Migration in Vehicular Metaverses: A GPT-based DRL Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, resource-intensive VT updating and the high mobility of vehicles require intensive computation, communication, and storage resources, especially for their migration among RSUs with limited coverage. To address these issues, we propose an attribute-aware auction-based mechanism to optimize resource allocation during VT migration by considering both price and non-monetary attributes, e.g., location and reputation. |
YONGJU TONG et. al. | arxiv-cs.AI | 2024-06-08 |
832 | MaTableGPT: GPT-based Table Data Extractor from Materials Science Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the tables reported in materials science papers exist in highly diverse forms; thus, rule-based extractions are an ineffective approach. To overcome this challenge, we present MaTableGPT, which is a GPT-based table data extractor from the materials science literature. |
GYEONG HOON YI et. al. | arxiv-cs.CL | 2024-06-08 |
833 | G-Transformer: Counterfactual Outcome Prediction Under Dynamic and Time-varying Treatment Regimes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present G-Transformer for counterfactual outcome prediction under dynamic and time-varying treatment strategies. |
Hong Xiong; Feng Wu; Leon Deng; Megan Su; Li-wei H Lehman; | arxiv-cs.LG | 2024-06-08 |
834 | Do LLMs Recognize Me, When I Is Not Me: Assessment of LLMs Understanding of Turkish Indexical Pronouns in Indexical Shift Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first study examining indexical shift in any language, releasing a Turkish dataset specifically designed for this purpose. |
Metehan Oğuz; Yusuf Umut Ciftci; Yavuz Faruk Bakman; | arxiv-cs.CL | 2024-06-08 |
835 | Automata Extraction from Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an automata extraction algorithm specifically designed for Transformer models. |
Yihao Zhang; Zeming Wei; Meng Sun; | arxiv-cs.LG | 2024-06-08 |
836 | Are Large Language Models More Empathetic Than Humans? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive study exploring the empathetic responding capabilities of four state-of-the-art LLMs: GPT-4, LLaMA-2-70B-Chat, Gemini-1.0-Pro, and Mixtral-8x7B-Instruct in comparison to a human baseline. |
Anuradha Welivita; Pearl Pu; | arxiv-cs.CL | 2024-06-07 |
837 | VTrans: Accelerating Transformer Compression with Variational Information Bottleneck Based Pruning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, they require extensive compression time with large datasets to maintain performance in pruned models. To address these challenges, we propose VTrans, an iterative pruning framework guided by the Variational Information Bottleneck (VIB) principle. |
Oshin Dutta; Ritvik Gupta; Sumeet Agarwal; | arxiv-cs.LG | 2024-06-07 |
838 | BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper outlines our approach to SemEval 2024 Task 9, BRAINTEASER: A Novel Task Defying Common Sense. |
Baktash Ansari; Mohammadmostafa Rostamkhani; Sauleh Eetemadi; | arxiv-cs.CL | 2024-06-07 |
839 | Transformer Conformal Prediction for Time Series Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a conformal prediction method for time series using the Transformer architecture to capture long-memory and long-range dependencies. |
Junghwan Lee; Chen Xu; Yao Xie; | arxiv-cs.LG | 2024-06-07 |
840 | Low-Resource Cross-Lingual Summarization Through Few-Shot Learning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the few-shot XLS performance of various models, including Mistral-7B-Instruct-v0.2, GPT-3.5, and GPT-4. |
Gyutae Park; Seojin Hwang; Hwanhee Lee; | arxiv-cs.CL | 2024-06-07 |
841 | Logic Synthesis with Generative Deep Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a logic synthesis rewriting operator based on the Circuit Transformer model, named ctrw (Circuit Transformer Rewriting), which incorporates the following techniques: (1) a two-stage training scheme for the Circuit Transformer tailored for logic synthesis, with iterative improvement of optimality through self-improvement training; (2) integration of the Circuit Transformer with state-of-the-art rewriting techniques to address scalability issues, allowing for guided DAG-aware rewriting. |
XIHAN LI et. al. | arxiv-cs.LO | 2024-06-07 |
842 | Mixture-of-Agents Enhances Large Language Model Capabilities IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: With the growing number of LLMs, how to harness the collective expertise of multiple LLMs is an exciting open direction. Toward this goal, we propose a new approach that leverages the collective strengths of multiple LLMs through a Mixture-of-Agents (MoA) methodology. |
Junlin Wang; Jue Wang; Ben Athiwaratkun; Ce Zhang; James Zou; | arxiv-cs.CL | 2024-06-07 |
843 | Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Interestingly, our study presents conflicting evidence for the role of the quality of KG tuples in generating implicit explanations. |
NEEMESH YADAV et. al. | arxiv-cs.CL | 2024-06-06 |
844 | GameBench: Evaluating Strategic Reasoning Abilities of LLM Agents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite several demonstrations of using large language models in complex, strategic scenarios, a comprehensive framework for evaluating agents’ performance across the various types of reasoning found in games is still lacking. To address this gap, we introduce GameBench, a cross-domain benchmark for evaluating strategic reasoning abilities of LLM agents. |
ANTHONY COSTARELLI et. al. | arxiv-cs.CL | 2024-06-06 |
845 | Exploring The Latest LLMs for Leaderboard Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore three types of contextual inputs to the models: DocTAET (Document Title, Abstract, Experimental Setup, and Tabular Information), DocREC (Results, Experiments, and Conclusions), and DocFULL (entire document). |
Salomon Kabongo; Jennifer D’Souza; Sören Auer; | arxiv-cs.CL | 2024-06-06 |
846 | MUSE: Flexible Voiceprint Receptive Fields and Multi-Path Fusion Enhanced Taylor Transformer for U-Net-based Speech Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Multi-path Enhanced Taylor (MET) Transformer based U-net for Speech Enhancement (MUSE), a lightweight speech enhancement network built upon the U-Net architecture. |
Zizhen Lin; Xiaoting Chen; Junyu Wang; | arxiv-cs.SD | 2024-06-06 |
847 | From Tarzan to Tolkien: Controlling The Language Proficiency Level of LLMs for Content Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the problem of controlling the difficulty level of text generated by Large Language Models (LLMs) for contexts where end-users are not fully proficient, such as language learners. |
Ali Malik; Stephen Mayhew; Chris Piech; Klinton Bicknell; | arxiv-cs.CL | 2024-06-05 |
848 | The Good, The Bad, and The Hulk-like GPT: Analyzing Emotional Decisions of Large Language Models in Cooperation and Bargaining Games Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel methodology and the framework to study both, the decision-making of LLMs and their alignment with human behavior under emotional states. |
MIKHAIL MOZIKOV et. al. | arxiv-cs.AI | 2024-06-05 |
849 | CSI-GPT: Integrating Generative Pre-Trained Transformer with Federated-Tuning to Acquire Downlink Massive MIMO Channels Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, by integrating the generative pre-trained Transformer (GPT) with federated-tuning, we propose a CSI-GPT approach to realize efficient downlink CSI acquisition. |
YE ZENG et. al. | arxiv-cs.IT | 2024-06-05 |
850 | Global Clipper: Enhancing Safety and Reliability of Transformer-based Object Detection Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the Global Clipper and Global Hybrid Clipper, effective mitigation strategies specifically designed for transformer-based models. |
QUTUB SYED SHA et. al. | arxiv-cs.CV | 2024-06-05 |
851 | Learning to Grok: Emergence of In-context Learning and Skill Composition in Modular Arithmetic Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we study the emergence of in-context learning and skill composition in a collection of modular arithmetic tasks. |
Tianyu He; Darshil Doshi; Aritra Das; Andrey Gromov; | arxiv-cs.LG | 2024-06-04 |
852 | Multi-layer Learnable Attention Mask for Multimodal Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Comprehensive experimental validation on various datasets, such as MADv2, QVHighlights, ImageNet 1K, and MSRVTT, demonstrates the efficacy of the LAM, exemplifying its ability to enhance model performance while mitigating redundant computations. This pioneering approach presents a significant advancement in enhancing the understanding of complex scenarios, such as in movie understanding. |
Wayner Barrios; SouYoung Jin; | arxiv-cs.CV | 2024-06-04 |
853 | Randomized Geometric Algebra Methods for Convex Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce randomized algorithms to Clifford’s Geometric Algebra, generalizing randomized linear algebra to hypercomplex vector spaces. |
Yifei Wang; Sungyoon Kim; Paul Chu; Indu Subramaniam; Mert Pilanci; | arxiv-cs.LG | 2024-06-04 |
854 | Too Big to Fail: Larger Language Models Are Disproportionately Resilient to Induction of Dementia-Related Linguistic Anomalies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Previous findings show that changes in PPL when masking attention layers in pre-trained transformer-based NLMs reflect linguistic anomalies associated with Alzheimer’s disease dementia. Building upon this, we explore a novel bidirectional attention head ablation method that exhibits properties attributed to the concepts of cognitive and brain reserve in human brain studies, which postulate that people with more neurons in the brain and more efficient processing are more resilient to neurodegeneration. |
Changye Li; Zhecheng Sheng; Trevor Cohen; Serguei Pakhomov; | arxiv-cs.CL | 2024-06-04 |
855 | Probing The Category of Verbal Aspect in Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate how pretrained language models (PLM) encode the grammatical category of verbal aspect in Russian. |
Anisia Katinskaia; Roman Yangarber; | arxiv-cs.CL | 2024-06-04 |
856 | A Temporal Kolmogorov-Arnold Transformer for Time Series Forecasting IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Capturing complex temporal patterns and relationships within multivariate data streams is a difficult task. We propose the Temporal Kolmogorov-Arnold Transformer (TKAT), a novel attention-based architecture designed to address this task using Temporal Kolmogorov-Arnold Networks (TKANs). |
Remi Genet; Hugo Inzirillo; | arxiv-cs.LG | 2024-06-04 |
857 | SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper aims to bridge the gap between Code LLMs’ reliance on static text data and the need for semantic understanding for complex tasks like debugging and program repair. |
YANGRUIBO DING et. al. | arxiv-cs.CL | 2024-06-03 |
858 | Eliciting The Priors of Large Language Models Using Iterated In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop a prompt-based workflow for eliciting prior distributions from LLMs. |
Jian-Qiao Zhu; Thomas L. Griffiths; | arxiv-cs.CL | 2024-06-03 |
859 | Reproducibility Study on Adversarial Attacks Against Robust Transformer Trackers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our empirical study focuses on evaluating adversarial robustness of object trackers based on bounding box versus binary mask predictions, and attack methods at different levels of perturbations. |
Fatemeh Nourilenjan Nokabadi; Jean-François Lalonde; Christian Gagné; | arxiv-cs.CV | 2024-06-03 |
860 | Performance Evaluation of Multimodal Large Language Models (LLaVA and GPT-4-based ChatGPT) in Medical Image Classification Tasks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) have gained significant attention due to their prospective applications in medicine. Utilizing multimodal LLMs can potentially assist clinicians in … |
Yuhang Guo; Zhiyu Wan; | 2024 IEEE 12th International Conference on Healthcare … | 2024-06-03 |
861 | Seeing Beyond Borders: Evaluating LLMs in Multilingual Ophthalmological Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs), such as GPT-3.5 [1] and GPT-4 [2], have significant potential for transforming several aspects of patient care from clinical note summarization to … |
DAVID RESTREPO et. al. | 2024 IEEE 12th International Conference on Healthcare … | 2024-06-03 |
862 | Superhuman Performance in Urology Board Questions By An Explainable Large Language Model Enabled for Context Integration of The European Association of Urology Guidelines: The UroBot Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: UroBot was developed using OpenAI’s GPT-3.5, GPT-4, and GPT-4o models, employing retrieval-augmented generation (RAG) and the latest 2023 guidelines from the European Association of Urology (EAU). |
MARTIN J. HETZ et. al. | arxiv-cs.CL | 2024-06-03 |
863 | Prototypical Transformer As Unified Motion Learners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Prototypical Transformer (ProtoFormer), a general and unified framework that approaches various motion tasks from a prototype perspective. |
CHENG HAN et. al. | arxiv-cs.CV | 2024-06-03 |
864 | In-Context Learning of Physical Properties: Few-Shot Adaptation to Out-of-Distribution Molecular Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address the question: can we leverage in-context learning to predict out-of-distribution materials properties? |
GRZEGORZ KASZUBA et. al. | arxiv-cs.LG | 2024-06-03 |
865 | Annotation Guidelines-Based Knowledge Augmentation: Towards Enhancing Large Language Models for Educational Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose the Annotation Guidelines-based Knowledge Augmentation (AGKA) approach to improve LLMs. |
SHIQI LIU et. al. | arxiv-cs.CL | 2024-06-02 |
866 | Drive As Veteran: Fine-tuning of An Onboard Large Language Model for Highway Autonomous Driving Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Due to the limitations of network communication conditions when calling GPT online, onboard deployment of Large Language Models for autonomous driving is needed. In this … |
YUJIN WANG et. al. | 2024 IEEE Intelligent Vehicles Symposium (IV) | 2024-06-02 |
867 | Transformer-Based Adversarial Network for Semi-supervised Face Sketch Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhihua Shi; Weiguo Wan; | J. Vis. Commun. Image Represent. | 2024-06-01 |
868 | RoBERTa-BiLSTM: A Context-Aware Hybrid Model for Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce a novel hybrid deep learning model, RoBERTa-BiLSTM, which combines the Robustly Optimized BERT Pretraining Approach (RoBERTa) with Bidirectional Long Short-Term Memory (BiLSTM) networks. |
Md. Mostafizer Rahman; Ariful Islam Shiplu; Yutaka Watanobe; Md. Ashad Alam; | arxiv-cs.CL | 2024-06-01 |
869 | Multimodal Metadata Assignment for Cultural Heritage Artifacts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We develop a multimodal classifier for the cultural heritage domain using a late fusion approach and introduce a novel dataset. |
LUIS REI et. al. | arxiv-cs.CV | 2024-06-01 |
870 | Transformer-based Fall Detection in Videos Related Papers Related Patents Related Grants Related Venues Related Experts View |
Adrián Núñez-Marcos; I. Arganda-Carreras; | Eng. Appl. Artif. Intell. | 2024-06-01 |
871 | EdgeTran: Device-Aware Co-Search of Transformers for Efficient Inference on Mobile Edge Platforms Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Automated design of efficient transformer models has recently attracted significant attention from industry and academia. However, most works only focus on certain metrics while … |
Shikhar Tuli; N. Jha; | IEEE Transactions on Mobile Computing | 2024-06-01 |
872 | Beyond Boundaries: A Human-like Approach for Question Answering Over Structured and Unstructured Information Sources Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Answering factual questions from heterogeneous sources, such as graphs and text, is a key capacity of intelligent systems. Current approaches either (i) perform question answering … |
Jens Lehmann; Dhananjay Bhandiwad; Preetam Gattogi; S. Vahdati; | Transactions of the Association for Computational … | 2024-06-01 |
873 | SwinFG: A Fine-grained Recognition Scheme Based on Swin Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhipeng Ma; Xiaoyu Wu; Anzhuo Chu; Lei Huang; Zhiqiang Wei; | Expert Syst. Appl. | 2024-06-01 |
874 | Hyneter: Hybrid Network Transformer for Multiple Computer Vision Tasks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this article, we point out that the essential differences between convolutional neural network (CNN)-based and transformer-based detectors, which cause worse performance of … |
Dong Chen; Duoqian Miao; Xuerong Zhao; | IEEE Transactions on Industrial Informatics | 2024-06-01 |
875 | Multi-granularity Cross Transformer Network for Person Re-identification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yanping Li; Duoqian Miao; Hongyun Zhang; Jie Zhou; Cairong Zhao; | Pattern Recognit. | 2024-06-01 |
876 | FuzzyTP-BERT: Enhancing Extractive Text Summarization with Fuzzy Topic Modeling and Transformer Networks Related Papers Related Patents Related Grants Related Venues Related Experts View |
Aytuğ Onan; Hesham A. Alhumyani; | J. King Saud Univ. Comput. Inf. Sci. | 2024-06-01 |
877 | Low-Contrast Medical Image Segmentation Via Transformer and Boundary Perception Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Low-contrast medical image segmentation is a challenging task that requires full use of local details and global context. However, existing convolutional neural networks (CNNs) … |
YINGLIN ZHANG et. al. | IEEE Transactions on Emerging Topics in Computational … | 2024-06-01 |
878 | Transformer in Reinforcement Learning for Decision-making: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View |
WEILIN YUAN et. al. | Frontiers Inf. Technol. Electron. Eng. | 2024-06-01 |
879 | Bidirectional Interaction of CNN and Transformer for Image Inpainting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jialu Liu; Maoguo Gong; Yuan Gao; Yihe Lu; Hao Li; | Knowl. Based Syst. | 2024-06-01 |
880 | A Transformer and Convolution-Based Learning Framework for Automatic Modulation Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Automatic modulation classification (AMC) is a typical pattern classification task that is an intermediate process between signal detection and demodulation. Deep learning methods … |
Wenxuan Ma; Zhuoran Cai; Chuan Wang; | IEEE Communications Letters | 2024-06-01 |
881 | TSD: Random Feature Query Design for Transformer-based Shrimp Detector Related Papers Related Patents Related Grants Related Venues Related Experts View |
Bo Gong; Ling Jing; Yingyi Chen; | Comput. Electron. Agric. | 2024-06-01 |
882 | Reinforcement Learning and Transformer for Fast Magnetic Resonance Imaging Scan Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A major drawback in Magnetic Resonance Imaging (MRI) is the long scan times necessary to acquire complete K-space matrices using phase encoding. This paper proposes a … |
Yiming Liu; Yanwei Pang; Ruiqi Jin; Yonghong Hou; Xuelong Li; | IEEE Transactions on Emerging Topics in Computational … | 2024-06-01 |
883 | Dual-branch Network Based on Transformer for Texture Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yangqi Liu; Hao Dong; Guodong Wang; Chenglizhao Chen; | Digit. Signal Process. | 2024-06-01 |
884 | Explainable Attention Pruning: A Metalearning-Based Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Pruning, as a technique to reduce the complexity and size of transformer-based models, has gained significant attention in recent years. While various models have been … |
P. Rajapaksha; Noel Crespi; | IEEE Transactions on Artificial Intelligence | 2024-06-01 |
885 | Beyond Metrics: Evaluating LLMs’ Effectiveness in Culturally Nuanced, Low-Resource Real-World Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our evaluation includes both quantitative analysis using metrics like F1 score and qualitative assessment of LLMs’ explanations for their predictions. We find that, while Mistral-7b and Mixtral-8x7b achieved high F1 scores, they and other LLMs such as GPT-3.5-Turbo, Llama-2-70b, and Gemma-7b struggled to understand linguistic and contextual nuances and lacked transparency in their decision-making, as observed from their explanations. |
MILLICENT OCHIENG et. al. | arxiv-cs.CL | 2024-06-01 |
886 | LiteFormer: A Lightweight and Efficient Transformer for Rotating Machine Fault Diagnosis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer has shown impressive performance on global feature modeling in many applications. However, two drawbacks induced by its intrinsic architecture limit its application, … |
WENJUN SUN et. al. | IEEE Transactions on Reliability | 2024-06-01 |
887 | How Random Is Random? Evaluating The Randomness and Humanness of LLMs’ Coin Flips Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: One uniquely human trait is our inability to be random. We see and produce patterns where there should not be any and we do so in a predictable way. LLMs are supplied with human … |
K. V. Koevering; Jon Kleinberg; | ArXiv | 2024-05-31 |
888 | A Comparison of Correspondence Analysis with PMI-based Word Embedding Methods Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we link correspondence analysis (CA) to the factorization of the PMI matrix. |
Qianqian Qi; Ayoub Bagheri; David J. Hessen; Peter G. M. van der Heijden; | arxiv-cs.CL | 2024-05-31 |
889 | Learning General Policies for Planning Through GPT Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer-based architectures, such as T5, BERT and GPT, have demonstrated revolutionary capabilities in Natural Language Processing. Several studies showed that deep learning … |
NICHOLAS ROSSETTI et. al. | International Conference on Automated Planning and … | 2024-05-30 |
890 | Bi-Directional Transformers Vs. Word2vec: Discovering Vulnerabilities in Lifted Compiled Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Detecting vulnerabilities within compiled binaries is challenging due to lost high-level code structures and other factors such as architectural dependencies, compilers, and optimization options. To address these obstacles, this research explores vulnerability detection using natural language processing (NLP) embedding techniques with word2vec, BERT, and RoBERTa to learn semantics from intermediate representation (LLVM IR) code. |
Gary A. McCully; John D. Hastings; Shengjie Xu; Adam Fortier; | arxiv-cs.CR | 2024-05-30 |
891 | The Point of View of A Sentiment: Towards Clinician Bias Detection in Psychiatric Notes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By leveraging large language models, this work aims to identify the sentiment expressed in psychiatric clinical notes based on the reader’s point of view. |
Alissa A. Valentine; Lauren A. Lepow; Alexander W. Charney; Isotta Landi; | arxiv-cs.CL | 2024-05-30 |
892 | Automatic Graph Topology-Aware Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes an evolutionary graph Transformer architecture search framework (EGTAS) to automate the construction of strong graph Transformers. |
CHAO WANG et. al. | arxiv-cs.NE | 2024-05-30 |
893 | DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the knowledge gap, we propose a new benchmark named DevEval, which has three advances. |
JIA LI et. al. | arxiv-cs.CL | 2024-05-30 |
894 | Divide-and-Conquer Meets Consensus: Unleashing The Power of Functions in Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose FunCoder, a code generation framework incorporating the divide-and-conquer strategy with functional consensus. |
JINGCHANG CHEN et. al. | arxiv-cs.CL | 2024-05-30 |
895 | Reverse Image Retrieval Cues Parametric Memory in Multimodal LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: RIR robustly improves knowledge-intensive visual question answering (VQA) of GPT-4V by 37-43%, GPT-4 Turbo by 25-27%, and GPT-4o by 18-20% in terms of open-ended VQA evaluation metrics. To our surprise, we discover that RIR helps the model to better access its own world knowledge. |
Jialiang Xu; Michael Moor; Jure Leskovec; | arxiv-cs.CL | 2024-05-29 |
896 | Multi-objective Cross-task Learning Via Goal-conditioned GPT-based Decision Transformers for Surgical Robot Task Automation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new learning-based framework by leveraging the strong reasoning capability of the GPT-based architecture to automate surgical robotic tasks. |
Jiawei Fu; Yonghao Long; Kai Chen; Wang Wei; Qi Dou; | arxiv-cs.RO | 2024-05-29 |
897 | Towards Next-Generation Urban Decision Support Systems Through AI-Powered Generation of Scientific Ontology Using Large Language Models – A Case in Optimizing Intermodal Freight Transportation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The incorporation of Artificial Intelligence (AI) models into various optimization systems is on the rise. However, addressing complex urban and environmental management … |
JOSE TUPAYACHI et. al. | ArXiv | 2024-05-29 |
898 | MDS-ViTNet: Improving Saliency Prediction for Eye-Tracking with Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel methodology we call MDS-ViTNet (Multi Decoder Saliency by Vision Transformer Network) for enhancing visual saliency prediction or eye-tracking. |
Polezhaev Ignat; Goncharenko Igor; Iurina Natalya; | arxiv-cs.CV | 2024-05-29 |
899 | Evaluating Zero-Shot GPT-4V Performance on 3D Visual Question Answering Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As interest in reformulating the 3D Visual Question Answering (VQA) problem in the context of foundation models grows, it is imperative to assess how these new paradigms influence existing closed-vocabulary datasets. In this case study, we evaluate the zero-shot performance of foundational models (GPT-4 Vision and GPT-4) on well-established 3D VQA benchmarks, namely 3D-VQA and ScanQA. |
Simranjit Singh; Georgios Pavlakos; Dimitrios Stamoulis; | arxiv-cs.CV | 2024-05-29 |
900 | Table-GPT: Table Fine-tuned GPT for Diverse Table Tasks IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Language models, such as GPT-3 and ChatGPT, demonstrate remarkable abilities to follow diverse human instructions and perform a wide range of tasks, using instruction fine-tuning. … |
PENG LI et. al. | Proc. ACM Manag. Data | 2024-05-29 |
901 | A Multi-Source Retrieval Question Answering Framework Based on RAG Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing RAG paradigms are inevitably influenced by erroneous retrieval information, thereby reducing the reliability and correctness of generated results. Therefore, to improve the relevance of retrieval information, this study proposes a method that replaces traditional retrievers with GPT-3.5, leveraging its vast corpus knowledge to generate retrieval information. |
RIDONG WU et. al. | arxiv-cs.IR | 2024-05-29 |
902 | Are You Sure? Rank Them Again: Repeated Ranking For Better Preference Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose the Repeat Ranking method, in which we evaluate the same responses multiple times and train only on those responses that are consistently ranked. |
Peter Devine; | arxiv-cs.CL | 2024-05-29 |
903 | AutoBreach: Universal and Adaptive Jailbreaking with Efficient Wordplay-Guided Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we rethink the approach to jailbreaking LLMs and formally define three essential properties from the attacker’s perspective, which help guide the design of jailbreak methods. |
JIAWEI CHEN et. al. | arxiv-cs.CV | 2024-05-29 |
904 | LMO-DP: Optimizing The Randomization Mechanism for Differentially Private Fine-Tuning (Large) Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they rely heavily on the Gaussian mechanism, which may overly perturb the gradients and degrade the accuracy, especially in stronger privacy regimes (e.g., the privacy budget $\epsilon < 3$). To address such limitations, we propose a novel Language Model-based Optimal Differential Privacy (LMO-DP) mechanism, which takes the first step to enable the tight composition of accurately fine-tuning (large) language models with a sub-optimal DP mechanism, even in strong privacy regimes (e.g., $0.1\leq \epsilon<3$). |
QIN YANG et. al. | arxiv-cs.CR | 2024-05-29 |
905 | Beyond Agreement: Diagnosing The Rationale Alignment of Automated Essay Scoring Methods Based on Linguistically-informed Counterfactuals Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our proposed method, using counterfactual intervention assisted by Large Language Models (LLMs), reveals that BERT-like models primarily focus on sentence-level features, whereas LLMs such as GPT-3.5, GPT-4 and Llama-3 are sensitive to conventions & accuracy, language complexity, and organization, indicating a more comprehensive rationale alignment with scoring rubrics. |
Yupei Wang; Renfen Hu; Zhe Zhao; | arxiv-cs.CL | 2024-05-29 |
906 | Voice Jailbreak Attacks Against GPT-4o Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present the first systematic measurement of jailbreak attacks against the voice mode of GPT-4o. |
Xinyue Shen; Yixin Wu; Michael Backes; Yang Zhang; | arxiv-cs.CR | 2024-05-29 |
907 | Data-Efficient Approach to Humanoid Control Via Fine-Tuning A Pre-Trained GPT on Action Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we train a GPT on a large dataset of noisy expert policy rollout observations from a humanoid motion dataset as a pre-trained model and fine-tune that model on a smaller dataset of noisy expert policy rollout observations and actions to autoregressively generate physically plausible motion trajectories. |
Siddharth Padmanabhan; Kazuki Miyazawa; Takato Horii; Takayuki Nagai; | arxiv-cs.RO | 2024-05-28 |
908 | Can GPT Redefine Medical Understanding? Evaluating GPT on Biomedical Machine Reading Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate GPT on four closed-book biomedical MRC benchmarks. |
Shubham Vatsal; Ayush Singh; | arxiv-cs.CL | 2024-05-28 |
909 | Notes on Applicability of GPT-4 to Document Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We perform a missing, reproducible evaluation of all publicly available GPT-4 family models concerning the Document Understanding field, where it is frequently required to comprehend text spatial arrangement and visual clues in addition to textual semantics. |
Łukasz Borchmann; | arxiv-cs.CL | 2024-05-28 |
910 | Delving Into Differentially Private Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such ‘reduction’ is done by first identifying the hardness unique to DP Transformer training: the attention distraction phenomenon and a lack of compatibility with existing techniques for efficient gradient clipping. To deal with these two issues, we propose the Re-Attention Mechanism and Phantom Clipping, respectively. |
YOULONG DING et. al. | arxiv-cs.LG | 2024-05-28 |
911 | I See You: Teacher Analytics with GPT-4 Vision-Powered Observational Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our approach aims to revolutionize teachers’ assessment of students’ practices by leveraging Generative Artificial Intelligence (GenAI) to offer detailed insights into classroom dynamics. |
UNGGI LEE et. al. | arxiv-cs.HC | 2024-05-28 |
912 | Look Ahead Text Understanding and LLM Stitching Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper proposes a look ahead text understanding problem with look ahead section identification (LASI) as an example. This problem may appear in generative AI as well as human … |
Junlin Julian Jiang; Xin Li; | International Conference on Web and Social Media | 2024-05-28 |
913 | PivotMesh: Generic 3D Mesh Generation Via Pivot Vertices Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a generic and scalable mesh generation framework PivotMesh, which makes an initial attempt to extend the native mesh generation to large-scale datasets. |
Haohan Weng; Yikai Wang; Tong Zhang; C. L. Philip Chen; Jun Zhu; | arxiv-cs.CV | 2024-05-27 |
914 | How Ready Are Generative Pre-trained Large Language Models for Explaining Bengali Grammatical Errors? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Grammatical error correction (GEC) tools, powered by advanced generative artificial intelligence (AI), competently correct linguistic inaccuracies in user input. However, they … |
Subhankar Maity; Aniket Deroy; Sudeshna Sarkar; | ArXiv | 2024-05-27 |
915 | Multi-objective Representation for Numbers in Clinical Narratives Using CamemBERT-bio Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research aims to classify numerical values extracted from medical documents across seven distinct physiological categories, employing CamemBERT-bio. |
Boammani Aser Lompo; Thanh-Dung Le; | arxiv-cs.CL | 2024-05-27 |
916 | Toward A Holistic Performance Evaluation of Large Language Models Across Diverse AI Accelerators Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Artificial intelligence (AI) methods have become critical in scientific applications to help accelerate scientific discovery. Large language models (LLMs) are being considered a … |
M. EMANI et. al. | 2024 IEEE International Parallel and Distributed Processing … | 2024-05-27 |
917 | Vision-and-Language Navigation Generative Pretrained Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our proposal, the Vision-and-Language Navigation Generative Pretrained Transformer (VLN-GPT), adopts a transformer decoder model (GPT2) to model trajectory sequence dependencies, bypassing the need for historical encoding modules. |
Wen Hanlin; | arxiv-cs.AI | 2024-05-27 |
918 | Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While previous approaches to 3D human motion generation have achieved notable success, they often rely on extensive training and are limited to specific tasks. To address these challenges, we introduce Motion-Agent, an efficient conversational framework designed for general human motion generation, editing, and understanding. |
QI WU et. al. | arxiv-cs.CV | 2024-05-27 |
919 | RLAIF-V: Aligning MLLMs Through Open-Source AI Feedback for Super GPT-4V Trustworthiness IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce RLAIF-V, a novel framework that aligns MLLMs in a fully open-source paradigm for super GPT-4V trustworthiness. |
TIANYU YU et. al. | arxiv-cs.CL | 2024-05-27 |
920 | LoReTrack: Efficient and Accurate Low-Resolution Transformer Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Albeit faster, this substantially hurts tracking accuracy due to information loss in low-resolution tracking. In this paper, we aim to mitigate such information loss to boost the performance of low-resolution Transformer tracking via dual knowledge distillation from a frozen high-resolution (but not larger) Transformer tracker. |
Shaohua Dong; Yunhe Feng; Qing Yang; Yuewei Lin; Heng Fan; | arxiv-cs.CV | 2024-05-27 |
921 | InversionView: A General-Purpose Method for Reading Information from Neural Activations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The inner workings of neural networks can be better understood if we can fully decipher the information encoded in neural activations. In this paper, we argue that this information is embodied by the subset of inputs that give rise to similar activations. |
Xinting Huang; Madhur Panwar; Navin Goyal; Michael Hahn; | arxiv-cs.LG | 2024-05-27 |
922 | Deployment of Large Language Models to Control Mobile Robots at The Edge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the possibility of intuitive human-robot interaction through the application of Natural Language Processing (NLP) and Large Language Models (LLMs) in mobile robotics. This work aims to explore the feasibility of using these technologies for edge-based deployment, where traditional cloud dependencies are eliminated. |
PASCAL SIKORSKI et. al. | arxiv-cs.RO | 2024-05-27 |
923 | Assessing LLMs Suitability for Knowledge Graph Completion Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Recent work has shown the capability of Large Language Models (LLMs) to solve tasks related to Knowledge Graphs, such as Knowledge Graph Completion, even in Zero- or Few-Shot … |
Vasile Ionut Remus Iga; Gheorghe Cosmin Silaghi; | arxiv-cs.CL | 2024-05-27 |
924 | Performance Evaluation of Reddit Comments Using Machine Learning and Natural Language Processing Methods in Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the efficacy of sentiment analysis models is hindered by the lack of expansive and fine-grained emotion datasets. To address this gap, our study leverages the GoEmotions dataset, comprising a diverse range of emotions, to evaluate sentiment analysis methods across a substantial corpus of 58,000 comments. |
Xiaoxia Zhang; Xiuyuan Qi; Zixin Teng; | arxiv-cs.CL | 2024-05-26 |
925 | M3GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents M$^3$GPT, an advanced $\textbf{M}$ultimodal, $\textbf{M}$ultitask framework for $\textbf{M}$otion comprehension and generation. M$^3$GPT operates on three … |
MINGSHUANG LUO et. al. | ArXiv | 2024-05-25 |
926 | M$^3$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents M$^3$GPT, an advanced $\textbf{M}$ultimodal, $\textbf{M}$ultitask framework for $\textbf{M}$otion comprehension and generation. |
MINGSHUANG LUO et. al. | arxiv-cs.CV | 2024-05-25 |
927 | Accelerating Transformers with Spectrum-Preserving Token Merging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Prior works have proposed algorithms based on Bipartite Soft Matching (BSM), which divides tokens into distinct sets and merges the top k similar tokens. |
HOAI-CHAU TRAN et. al. | arxiv-cs.LG | 2024-05-25 |
928 | Activator: GLU Activation Function As The Core Component of A Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Experimental assessments conducted by this research show that both proposed modifications and reductions offer competitive performance in relation to baseline architectures, in support of the aims of this work in establishing a more efficient yet capable alternative to the traditional attention mechanism as the core component in designing transformer architectures. |
Abdullah Nazhat Abdullah; Tarkan Aydin; | arxiv-cs.CV | 2024-05-24 |
929 | Incremental Comprehension of Garden-Path Sentences By Large Language Models: Semantic Interpretation, Syntactic Re-Analysis, and Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we investigate the processing of garden-path sentences and the fate of lingering misinterpretations using four large language models (LLMs): GPT-2, LLaMA-2, Flan-T5, and RoBERTa. |
ANDREW LI et. al. | arxiv-cs.CL | 2024-05-24 |
930 | PoinTramba: A Hybrid Transformer-Mamba Framework for Point Cloud Analysis IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce PoinTramba, a pioneering hybrid framework that synergizes the analytical power of Transformer with the remarkable computational efficiency of Mamba for enhanced point cloud analysis. |
ZICHENG WANG et. al. | arxiv-cs.CV | 2024-05-24 |
931 | Transformer-XL for Long Sequence Tasks in Robotic Learning from Demonstration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents an innovative application of Transformer-XL for long sequence tasks in robotic learning from demonstrations (LfD). |
Gao Tianci; | arxiv-cs.RO | 2024-05-24 |
932 | Enhancing Non-player Characters in Unity 3D Using GPT-3.5 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This case study presents a comprehensive integration process of OpenAI’s GPT-3.5 large language model (LLM) into Unity 3D to enhance non-player characters (NPCs) in video games … |
John Sissler; | ACM Games: Research and Practice | 2024-05-24 |
933 | SMART: Scalable Multi-agent Real-time Simulation Via Next-token Prediction Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Data-driven autonomous driving motion generation tasks are frequently impacted by the limitations of dataset size and the domain gap between datasets, which precludes their … |
Wei Wu; Xiaoxin Feng; Ziyan Gao; Yuheng Kan; | ArXiv | 2024-05-24 |
934 | GPT Is Not An Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores whether an LLM (specifically, GPT-3.5-Turbo) can assist with the task of developing a bias benchmark dataset from responses to an open-ended community survey. |
Virginia K. Felkner; Jennifer A. Thompson; Jonathan May; | arxiv-cs.CL | 2024-05-24 |
935 | GPTZoo: A Large-scale Dataset of GPTs for The Research Community Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To support academic research on GPTs, we introduce GPTZoo, a large-scale dataset comprising 730,420 GPT instances. |
Xinyi Hou; Yanjie Zhao; Shenao Wang; Haoyu Wang; | arxiv-cs.SE | 2024-05-24 |
936 | A Comparative Analysis of Distributed Training Strategies for GPT-2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The rapid advancement in Large Language Models has been met with significant challenges in their training processes, primarily due to their considerable computational and memory demands. This research examines parallelization techniques developed to address these challenges, enabling the efficient and scalable training of Large Language Models. |
Ishan Patwardhan; Shubham Gandhi; Om Khare; Amit Joshi; Suraj Sawant; | arxiv-cs.DC | 2024-05-24 |
937 | SMART: Scalable Multi-agent Real-time Motion Generation Via Next-token Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Data-driven autonomous driving motion generation tasks are frequently impacted by the limitations of dataset size and the domain gap between datasets, which precludes their extensive application in real-world scenarios. To address this issue, we introduce SMART, a novel autonomous driving motion generation paradigm that models vectorized map and agent trajectory data into discrete sequence tokens. |
Wei Wu; Xiaoxin Feng; Ziyan Gao; Yuheng Kan; | arxiv-cs.RO | 2024-05-24 |
938 | Comet: A Communication-efficient and Performant Approximation for Private Transformer Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel plug-in method Comet to effectively reduce the communication cost without compromising the inference performance. |
Xiangrui Xu; Qiao Zhang; Rui Ning; Chunsheng Xin; Hongyi Wu; | arxiv-cs.LG | 2024-05-24 |
939 | The Buffer Mechanism for Multi-Step Information Reasoning in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we constructed a symbolic dataset to investigate the mechanisms by which Transformer models employ vertical thinking strategy based on their inherent structure and horizontal thinking strategy based on Chain of Thought to achieve multi-step reasoning. |
ZHIWEI WANG et. al. | arxiv-cs.AI | 2024-05-24 |
940 | Transformer-based Federated Learning for Multi-Label Remote Sensing Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the capability of state-of-the-art transformer architectures (MLP-Mixer, ConvMixer, and PoolFormer) to address the challenges related to non-IID training data across various clients in the context of FL for multi-label classification (MLC) problems in remote sensing (RS). |
Barış Büyüktaş; Kenneth Weitzel; Sebastian Völkers; Felix Zailskas; Begüm Demir; | arxiv-cs.CV | 2024-05-24 |
941 | SG-Adapter: Enhancing Text-to-Image Generation with Scene Graph Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce the Scene Graph Adapter (SG-Adapter), leveraging the structured representation of scene graphs to rectify inaccuracies in the original text embeddings. |
GUIBAO SHEN et. al. | arxiv-cs.CV | 2024-05-24 |
942 | CulturePark: Boosting Cross-cultural Understanding in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by cognitive theories on social communication, this paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection. |
CHENG LI et. al. | arxiv-cs.AI | 2024-05-23 |
943 | An Evaluation of Estimative Uncertainty in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares estimative uncertainty in commonly used large language models (LLMs) like GPT-4 and ERNIE-4 to that of humans, and to each other. |
Zhisheng Tang; Ke Shen; Mayank Kejriwal; | arxiv-cs.CL | 2024-05-23 |
944 | CEEBERT: Cross-Domain Inference in Early Exit BERT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose an online learning algorithm named Cross-Domain Inference in Early Exit BERT (CeeBERT) that dynamically determines early exits of samples based on the level of confidence at each exit point. |
Divya Jyoti Bajpai; Manjesh Kumar Hanawal; | arxiv-cs.CL | 2024-05-23 |
945 | JiuZhang3.0: Efficiently Improving Mathematical Reasoning By Training Small Data Synthesis Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To reduce the cost, based on open-source available texts, we propose an efficient way that trains a small LLM for math problem synthesis, to efficiently generate sufficient high-quality pre-training data. |
KUN ZHOU et. al. | arxiv-cs.CL | 2024-05-23 |
946 | Grokked Transformers Are Implicit Reasoners: A Mechanistic Journey to The Edge of Generalization IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We study whether transformers can learn to implicitly reason over parametric knowledge, a skill that even the most capable language models struggle with. |
Boshi Wang; Xiang Yue; Yu Su; Huan Sun; | arxiv-cs.CL | 2024-05-23 |
947 | AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We introduce AutoCoder, the first Large Language Model to surpass GPT-4 Turbo (April 2024) and GPT-4o in pass@1 on the Human Eval benchmark test ($\mathbf{90.9\%}$ vs. … |
Bin Lei; Yuchen Li; Qiuwu Chen; | ArXiv | 2024-05-23 |
948 | ViHateT5: Enhancing Hate Speech Detection in Vietnamese With A Unified Text-to-Text Transformer Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this research, we introduce ViHateT5, a T5-based model pre-trained on our proposed large-scale domain-specific dataset named VOZ-HSD. |
Luan Thanh Nguyen; | arxiv-cs.CL | 2024-05-22 |
949 | Transformer in Touch: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This review aims to comprehensively outline the application and development of Transformers in tactile technology. |
Jing Gao; Ning Cheng; Bin Fang; Wenjuan Han; | arxiv-cs.LG | 2024-05-21 |
950 | Quantifying Emergence in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a quantifiable solution for estimating emergence. |
Hang Chen; Xinyu Yang; Jiaying Zhu; Wenya Wang; | arxiv-cs.CL | 2024-05-21 |
951 | Towards Authoring Open-Ended Behaviors for Narrative Puzzle Games with Large Language Model Support Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Designing games with branching story lines, object annotations, scene details, and dialog can be challenging due to the intensive authoring required. We investigate the potential … |
Britney Ngaw; Grishma Jena; João Sedoc; Aline Normoyle; | Proceedings of the 19th International Conference on the … | 2024-05-21 |
952 | How Reliable AI Chatbots Are for Disease Prediction from Patient Complaints? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Artificial Intelligence (AI) chatbots leveraging Large Language Models (LLMs) are gaining traction in healthcare for their potential to automate patient interactions and aid clinical decision-making. |
Ayesha Siddika Nipu; K M Sajjadul Islam; Praveen Madiraju; | arxiv-cs.AI | 2024-05-21 |
953 | Advancing Web Science Through Foundation Model for Tabular Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: As the landscape of web science expands, handling the vast datasets collected from the Web while preserving computational efficiency and privacy remains a significant challenge. … |
Inwon Kang; | Companion Publication of the 16th ACM Web Science Conference | 2024-05-21 |
954 | Generative AI and Large Language Models for Cyber Security: All Insights You Need Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper provides a comprehensive review of the future of cybersecurity through Generative AI and Large Language Models (LLMs). |
MOHAMED AMINE FERRAG et. al. | arxiv-cs.CR | 2024-05-21 |
955 | Cardistry: Exploring A GPT Model Workflow As An Adapted Method of Gaminiscing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Cardistry is an application that enables users to create their own playing cards for use in evocative storytelling games. It is driven by OpenAI’s Generative Pre-trained … |
BRANDON LYMAN et. al. | Proceedings of the 19th International Conference on the … | 2024-05-21 |
956 | Exploring The Gap: The Challenge of Achieving Human-like Generalization for Concept-based Translation Instruction Using Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Our study utilizes concept description instructions and few-shot learning examples to examine the effectiveness of a large language model (GPT-4) in generating Chinese-to-English … |
Ming Qian; Chuiqing Kong; | AAAI Spring Symposia | 2024-05-20 |
957 | Automated Hardware Logic Obfuscation Framework Using GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Obfus-chat, a novel framework leveraging Generative Pre-trained Transformer (GPT) models to automate the obfuscation process. |
BANAFSHEH SABER LATIBARI et. al. | arxiv-cs.CR | 2024-05-20 |
958 | From Chatbots to Phishbots?: Phishing Scam Generation in Commercial Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The advanced capabilities of Large Language Models (LLMs) have made them invaluable across various applications, from conversational agents and content creation to data analysis, … |
PRIYANKA NANAYAKKARA et. al. | 2024 IEEE Symposium on Security and Privacy (SP) | 2024-05-19 |
959 | DaVinci at SemEval-2024 Task 9: Few-shot Prompting GPT-3.5 for Unconventional Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we tackle both types of questions using few-shot prompting on GPT-3.5 and gain insights regarding the difference in the nature of the two types. |
Suyash Vardhan Mathur; Akshett Rai Jindal; Manish Shrivastava; | arxiv-cs.CL | 2024-05-19 |
960 | Zero-Shot Stance Detection Using Contextual Data Generation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this approach, we aim to fine-tune an existing model at test time. |
Ghazaleh Mahmoudi; Babak Behkamkia; Sauleh Eetemadi; | arxiv-cs.CL | 2024-05-19 |
961 | Enhancing User Experience in Large Language Models Through Human-centered Design: Integrating Theoretical Insights with An Experimental Study to Meet Diverse Software Learning Needs with A Single Document Knowledge Base Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The experimental results demonstrate the effect of different elements’ forms and organizational methods in the document, as well as GPT’s relevant configurations, on the interaction effectiveness between GPT and software learners. |
Yuchen Wang; Yin-Shan Lin; Ruixin Huang; Jinyin Wang; Sensen Liu; | arxiv-cs.HC | 2024-05-19 |
962 | Few-Shot API Attack Detection: Overcoming Data Scarcity with GAN-Inspired Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel few-shot detection approach motivated by Natural Language Processing (NLP) and advanced Generative Adversarial Network (GAN)-inspired techniques. |
Udi Aharon; Revital Marbel; Ran Dubin; Amit Dvir; Chen Hajaj; | arxiv-cs.CR | 2024-05-18 |
963 | Benchmarking Large Language Models on CFLUE – A Chinese Financial Language Understanding Evaluation Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In light of recent breakthroughs in large language models (LLMs) that have revolutionized natural language processing (NLP), there is an urgent need for new benchmarks to keep … |
Jie Zhu; Junhui Li; Yalong Wen; Lifan Guo; | Annual Meeting of the Association for Computational … | 2024-05-17 |
964 | GPTs Window Shopping: An Analysis of The Landscape of Custom ChatGPT Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Customization comes in the form of prompt-tuning, analysis of reference resources, browsing, and external API interactions, alongside a promise of revenue sharing for created custom GPTs. In this work, we peer into the window of the GPT Store and measure its impact. |
Benjamin Zi Hao Zhao; Muhammad Ikram; Mohamed Ali Kaafar; | arxiv-cs.SI | 2024-05-17 |
965 | Benchmarking Large Language Models on CFLUE — A Chinese Financial Language Understanding Evaluation Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose CFLUE, the Chinese Financial Language Understanding Evaluation benchmark, designed to assess the capability of LLMs across various dimensions. |
Jie Zhu; Junhui Li; Yalong Wen; Lifan Guo; | arxiv-cs.CL | 2024-05-17 |
966 | GPT Store Mining and Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our findings aim to enhance understanding of the GPT ecosystem, providing valuable insights for future research, development, and policy-making in generative AI. |
Dongxun Su; Yanjie Zhao; Xinyi Hou; Shenao Wang; Haoyu Wang; | arxiv-cs.LG | 2024-05-16 |
967 | Leveraging Large Language Models for Automated Web-Form-Test Generation: An Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To the best of our knowledge, no comparative study examining different LLMs has yet been reported for web-form-test generation. |
TAO LI et. al. | arxiv-cs.SE | 2024-05-16 |
968 | HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Identifying optimal model configurations under specific hardware constraints is becoming essential but remains challenging due to the computational load of exhaustive training and evaluation on multiple devices. To address this, we introduce HW-GPT-Bench, a hardware-aware benchmark that utilizes surrogate predictions to approximate various hardware metrics across 13 devices of architectures in the GPT-2 family, with architectures containing up to 1.55B parameters. |
RHEA SANJAY SUKTHANKER et. al. | arxiv-cs.LG | 2024-05-16 |
969 | Comparing The Efficacy of GPT-4 and Chat-GPT in Mental Health Care: A Blind Assessment of Large Language Models for Psychological Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: This study aims to compare the performance of two large language models, GPT-4 and Chat-GPT, in responding to a set of 18 psychological prompts, to assess their potential applicability in mental health care settings. |
Birger Moell; | arxiv-cs.CL | 2024-05-15 |
970 | Optimization Techniques for Sentiment Analysis Based on LLM (GPT-3) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to explore sentiment analysis optimization techniques based on large pre-trained language models such as GPT-3 to improve model performance and effect and further promote the development of natural language processing (NLP). |
Tong Zhan; Chenxi Shi; Yadong Shi; Huixiang Li; Yiyu Lin; | arxiv-cs.CL | 2024-05-15 |
971 | GPT-3.5 for Grammatical Error Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the application of GPT-3.5 for Grammatical Error Correction (GEC) in multiple languages in several settings: zero-shot GEC, fine-tuning for GEC, and using GPT-3.5 to re-rank correction hypotheses generated by other GEC models. |
Anisia Katinskaia; Roman Yangarber; | arxiv-cs.CL | 2024-05-14 |
972 | Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a theoretical framework that sheds light on the memorization process and performance dynamics of transformer-based language models. |
Xueyan Niu; Bo Bai; Lei Deng; Wei Han; | arxiv-cs.LG | 2024-05-14 |
973 | Evaluating Arabic Emotion Recognition Task Using ChatGPT Models: A Comparative Analysis Between Emotional Stimuli Prompt, Fine-Tuning, and In-Context Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Textual emotion recognition (TER) has significant commercial potential since it can be used as an excellent tool to monitor a brand/business reputation, understand customer … |
E. Nfaoui; Hanane Elfaik; | J. Theor. Appl. Electron. Commer. Res. | 2024-05-14 |
974 | Towards Robust Audio Deepfake Detection: A Evolving Benchmark for Continual Learning Highlight: Continual learning, which acts as an effective tool for detecting newly emerged deepfake audio while maintaining performance on older types, lacks a well-constructed and user-friendly evaluation framework. To address this gap, we introduce EVDA, a benchmark for evaluating continual learning methods in deepfake audio detection. |
Xiaohui Zhang; Jiangyan Yi; Jianhua Tao; | arxiv-cs.SD | 2024-05-14 |
975 | Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis Highlight: This work describes a concurrent programming framework for quantitatively analyzing the efficiency challenges in serving multiple long-context requests under a limited GPU high-bandwidth memory (HBM) regime. |
Yao Fu; | arxiv-cs.LG | 2024-05-14 |
976 | Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots Highlight: However, their capabilities in turning visual figures into executable code have not been evaluated thoroughly. To address this, we introduce Plot2Code, a comprehensive visual coding benchmark designed for a fair and in-depth assessment of MLLMs. |
CHENGYUE WU et. al. | arxiv-cs.CL | 2024-05-13 |
977 | Decoding Memes: A Comprehensive Analysis of Late and Early Fusion Models for Explainable Meme Analysis Abstract: Memes are important because they serve as conduits for expressing emotions, opinions, and social commentary online, providing valuable insight into public sentiment, trends, and … |
F. Abdullakutty; Usman Naseem; | Companion Proceedings of the ACM on Web Conference 2024 | 2024-05-13 |
978 | Relationalizing Tables with Large Language Models: The Promise and Challenges Abstract: Tables in the wild are usually not relationalized, making querying them difficult. To relationalize tables, recent works designed seven transformation operators, and deep neural … |
Zezhou Huang; Eugene Wu; | 2024 IEEE 40th International Conference on Data Engineering … | 2024-05-13 |
979 | Coding Historical Causes of Death Data with Large Language Models Highlight: This paper investigates the feasibility of using pre-trained generative Large Language Models (LLMs) to automate the assignment of ICD-10 codes to historical causes of death. |
BJØRN PEDERSEN et. al. | arxiv-cs.LG | 2024-05-13 |
980 | Large Language Models: Principles and Practice Abstract: The last few years have been marked by several breakthroughs in the domain of generative AI. Large language models such as GPT-4 are able to solve a plethora of tasks, ranging … |
Immanuel Trummer; | 2024 IEEE 40th International Conference on Data Engineering … | 2024-05-13 |
981 | Decision Mamba Architectures Highlight: In this work, we introduce two novel methods, Decision Mamba (DM) and Hierarchical Decision Mamba (HDM), aimed at enhancing the performance of the Transformer models. |
André Correia; Luís A. Alexandre; | arxiv-cs.LG | 2024-05-13 |
982 | PRECYSE: Predicting Cybersickness Using Transformer for Multimodal Time-Series Sensor Data Abstract: Cybersickness, a factor that hinders user immersion in VR, has been the subject of ongoing attempts to predict it using AI. Previous studies have used CNN and LSTM for prediction … |
Dayoung Jeong; Kyungsik Han; | Proceedings of the ACM on Interactive, Mobile, Wearable and … | 2024-05-13 |
983 | COLA: Cross-city Mobility Transformer for Human Trajectory Simulation Highlight: In this paper, we are motivated to explore the intriguing problem of mobility transfer across cities, grasping the universal patterns of human trajectories to augment the powerful Transformer with external mobility data. |
Yu Wang; Tongya Zheng; Yuxuan Liang; Shunyu Liu; Mingli Song; | www | 2024-05-13 |
984 | Lgt: Long-range Graph Transformer for Early Rumor Detection |
Jinghong Xia; Yuling Li; Kui Yu; | Social Network Analysis and Mining | 2024-05-13 |
985 | The Personality Dimensions GPT-3 Expresses During Human-Chatbot Interactions Abstract: Large language models such as GPT-3 and ChatGPT can mimic human-to-human conversation with unprecedented fidelity, which enables many applications such as conversational agents … |
N. Kovačević; Christian Holz; Markus Gross; Rafael Wampfler; | Proceedings of the ACM on Interactive, Mobile, Wearable and … | 2024-05-13 |
986 | Can GNN Be Good Adapter for LLMs? Highlight: Additionally, they also ignore the zero-shot inference capabilities of LLMs. Therefore, we propose GraphAdapter, which uses a graph neural network (GNN) as an efficient adapter in collaboration with LLMs to tackle TAGs. |
XUANWEN HUANG et. al. | www | 2024-05-13 |
987 | Open-vocabulary Auditory Neural Decoding Using FMRI-prompted LLM Highlight: In this paper, we introduce a novel method, the Brain Prompt GPT (BP-GPT). |
Xiaoyu Chen; Changde Du; Che Liu; Yizhe Wang; Huiguang He; | arxiv-cs.HC | 2024-05-13 |
988 | L(u)PIN: LLM-based Political Ideology Nowcasting Highlight: In this paper, we present a method to analyze ideological positions of individual parliamentary representatives by leveraging the latent knowledge of LLMs. |
Ken Kato; Annabelle Purnomo; Christopher Cochrane; Raeid Saqur; | arxiv-cs.CL | 2024-05-12 |
989 | Limited Ability of LLMs to Simulate Human Psychological Behaviours: A Psychometric Analysis Highlight: In this study, we prompt OpenAI’s flagship models, GPT-3.5 and GPT-4, to assume different personas and respond to a range of standardized measures of personality constructs. |
Nikolay B Petrov; Gregory Serapio-García; Jason Rentfrow; | arxiv-cs.CL | 2024-05-12 |
990 | Can Language Models Explain Their Own Classification Behavior? Highlight: This paper investigates whether LLMs can give faithful high-level explanations of their own internal processes. To explore this, we introduce a dataset, ArticulateRules, of few-shot text-based classification tasks generated by simple rules. |
Dane Sherburn; Bilal Chughtai; Owain Evans; | arxiv-cs.LG | 2024-05-12 |
991 | Retrieval Enhanced Zero-Shot Video Captioning Highlight: In this paper, we propose to take advantage of existing pre-trained large-scale vision and language models to directly generate captions with test time adaptation. |
YUNCHUAN MA et. al. | arxiv-cs.CV | 2024-05-11 |
992 | Integrating Expertise in LLMs: Crafting A Customized Nutrition Assistant with Refined Template Instructions Abstract: Large Language Models (LLMs) have the potential to contribute to the fields of nutrition and dietetics in generating food product explanations that facilitate informed food … |
Annalisa Szymanski; Brianna L Wimer; Oghenemaro Anuyah; H. Eicher-Miller; Ronald A Metoyer; | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
993 | ClassMeta: Designing Interactive Virtual Classmate to Promote VR Classroom Participation Abstract: Peer influence plays a crucial role in promoting classroom participation, where behaviors from active students can contribute to a collective classroom learning experience. … |
ZIYI LIU et. al. | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
994 | GPTs in Mafia-like Game Simulation Abstract: In this research, we explore the potential of Generative AI models, focusing on their application in role-playing simulations through Spyfall, a renowned mafia-style game. By … |
Munyeong Kim; | Extended Abstracts of the CHI Conference on Human Factors … | 2024-05-11 |
995 | An Autoethnographic Reflection of Prompting A Custom GPT Based on Oneself Abstract: What if you could have a chat with yourself? OpenAI’s introduction of custom GPTs in November 2023 provides an opportunity for non-technical users to create specialized generative … |
Priscilla Y. Lo; | Extended Abstracts of the CHI Conference on Human Factors … | 2024-05-11 |
996 | Residual-based Attention Physics-informed Neural Networks for Spatio-Temporal Ageing Assessment of Transformers Operated in Renewable Power Plants Highlight: This article introduces a spatio-temporal model for transformer winding temperature and ageing estimation, which leverages physics-based partial differential equations (PDEs) with data-driven Neural Networks (NN) in a Physics Informed Neural Networks (PINNs) configuration to improve prediction accuracy and acquire spatio-temporal resolution. |
IBAI RAMIREZ et. al. | arxiv-cs.LG | 2024-05-10 |
997 | TacoERE: Cluster-aware Compression for Event Relation Extraction Highlight: Existing work mainly focuses on directly modeling the entire document, which cannot effectively handle long-range dependencies and information redundancy. To address these issues, we propose a cluster-aware compression method for improving event relation extraction (TacoERE), which explores a compression-then-extraction paradigm. |
YONG GUAN et. al. | arxiv-cs.CL | 2024-05-10 |
998 | Multimodal LLMs Struggle with Basic Visual Network Analysis: A VNA Benchmark Highlight: We find that while GPT-4 consistently outperforms LLaVa, both models struggle with every visual network analysis task we propose. |
Evan M. Williams; Kathleen M. Carley; | arxiv-cs.CV | 2024-05-10 |
999 | A Lightweight Sparse Focus Transformer for Remote Sensing Image Change Captioning Highlight: However, existing transformer-based RSICC methods face challenges, e.g., high parameter counts and high computational complexity caused by the self-attention operation in the transformer encoder component. To alleviate these issues, this paper proposes a Sparse Focus Transformer (SFT) for the RSICC task. |
Dongwei Sun; Yajie Bao; Junmin Liu; Xiangyong Cao; | arxiv-cs.CV | 2024-05-10 |
1000 | ChatGPTest: Opportunities and Cautionary Tales of Utilizing AI for Questionnaire Pretesting Highlight: This article explores the use of GPT models as a useful tool for pretesting survey questionnaires, particularly in the early stages of survey design. |
Francisco Olivos; Minhui Liu; | arxiv-cs.CY | 2024-05-10 |
1001 | Reddit-Impacts: A Named Entity Recognition Dataset for Analyzing Clinical and Social Effects of Substance Use Derived from Social Media Highlight: In this paper, we introduce Reddit-Impacts, a challenging Named Entity Recognition (NER) dataset curated from subreddits dedicated to discussions on prescription and illicit opioids, as well as medications for opioid use disorder. |
YAO GE et. al. | arxiv-cs.CL | 2024-05-09 |
1002 | People Cannot Distinguish GPT-4 from A Human in A Turing Test Abstract: We evaluated 3 systems (ELIZA, GPT-3.5 and GPT-4) in a randomized, controlled, and preregistered Turing test. Human participants had a 5 minute conversation with either a human or … |
Cameron R. Jones; Benjamin K. Bergen; | ArXiv | 2024-05-09 |
1003 | Optimizing Software Vulnerability Detection Using RoBERTa and Machine Learning |
Cho Xuan Do; Nguyen Trong Luu; Phuong Thi Lan Nguyen; | Autom. Softw. Eng. | 2024-05-08 |
1004 | Ditto: Quantization-aware Secure Inference of Transformers Upon MPC Highlight: However, the integration of quantization widely used in plaintext inference into the MPC domain remains unclear. To bridge this gap, we propose the framework named Ditto to enable more efficient quantization-aware secure Transformer inference. |
HAOQI WU et. al. | arxiv-cs.CR | 2024-05-08 |
1005 | Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models Highlight: In this work, we leverage a decision transformer architecture for topic recommendation in counseling conversations between patients and mental health professionals. |
Aylin Gunal; Baihan Lin; Djallel Bouneffouf; | arxiv-cs.CL | 2024-05-08 |
1006 | Integrating Pepper Robot and GPT for Neuromyth Educational Conversation Abstract: The emergence of neuromyths, or false beliefs about brain function and learning, has been a significant challenge in the field of education. These myths often hinder the learning … |
Abdelhadi Hireche; Abdelkader Nasreddine Belkacem; | 2024 IEEE Global Engineering Education Conference (EDUCON) | 2024-05-08 |
1007 | Few-Shot Class Incremental Learning Via Robust Transformer Approach |
NAEEM PAEEDEH et. al. | Inf. Sci. | 2024-05-08 |
1008 | Open Source Language Models Can Provide Feedback: Evaluating LLMs’ Ability to Help Students Using GPT-4-As-A-Judge Highlight: Inspired by recent work that has utilised very powerful LLMs, such as GPT-4, to evaluate the outputs produced by less powerful models, we conduct an automated analysis of the quality of the feedback produced by several open source models using a dataset from an introductory programming course. |
CHARLES KOUTCHEME et. al. | arxiv-cs.CL | 2024-05-08 |
1009 | A Transformer with Stack Attention Highlight: Despite underpinning remarkably capable large language models, transformers are unable to model many context-free language tasks. In an attempt to address this limitation in the modeling power of transformer-based language models, we propose augmenting them with a differentiable, stack-based attention mechanism. |
Jiaoda Li; Jennifer C. White; Mrinmaya Sachan; Ryan Cotterell; | arxiv-cs.CL | 2024-05-07 |
1010 | Evaluating Text Summaries Generated By Large Language Models Using OpenAI’s GPT Highlight: This research examines the effectiveness of OpenAI’s GPT models as independent evaluators of text summaries generated by six transformer-based models from Hugging Face: DistilBART, BERT, ProphetNet, T5, BART, and PEGASUS. |
Hassan Shakil; Atqiya Munawara Mahi; Phuoc Nguyen; Zeydy Ortiz; Mamoun T. Mardini; | arxiv-cs.CL | 2024-05-07 |
1011 | Utilizing GPT to Enhance Text Summarization: A Strategy to Minimize Hallucinations Highlight: In this research, we use the DistilBERT model to generate extractive summaries and the T5 model to generate abstractive summaries. |
Hassan Shakil; Zeydy Ortiz; Grant C. Forbes; | arxiv-cs.CL | 2024-05-07 |
1012 | GPT-Enabled Cybersecurity Training: A Tailored Approach for Effective Awareness Highlight: This study explores the limitations of traditional Cybersecurity Awareness and Training (CSAT) programs and proposes an innovative solution using Generative Pre-Trained Transformers (GPT) to address these shortcomings. |
Nabil Al-Dhamari; Nathan Clarke; | arxiv-cs.CR | 2024-05-07 |
1013 | How Does GPT-2 Predict Acronyms? Extracting and Understanding A Circuit Via Mechanistic Interpretability Highlight: In this work, we focus on understanding how GPT-2 Small performs the task of predicting three-letter acronyms. |
Jorge García-Carrasco; Alejandro Maté; Juan Trujillo; | arxiv-cs.LG | 2024-05-07 |
1014 | Structured Click Control in Transformer-based Interactive Segmentation Highlight: To improve the robustness of the response, we propose a structured click intent model based on graph neural networks, which adaptively obtains graph nodes via the global similarity of user-clicked Transformer tokens. |
Long Xu; Yongquan Chen; Rui Huang; Feng Wu; Shiwu Lai; | arxiv-cs.CV | 2024-05-07 |
1015 | The Silicon Ceiling: Auditing GPT’s Race and Gender Biases in Hiring Highlight: Large language models (LLMs) are increasingly being introduced in workplace settings, with the goals of improving efficiency and fairness. |
Lena Armstrong; Abbey Liu; Stephen MacNeil; Danaë Metaxa; | arxiv-cs.CY | 2024-05-07 |
1016 | Anchored Answers: Unravelling Positional Bias in GPT-2’s Multiple-Choice Questions Highlight: This anchored bias challenges the integrity of GPT-2’s decision-making process, as it skews performance based on the position rather than the content of the choices in MCQs. In this study, we utilise the mechanistic interpretability approach to identify the internal modules within GPT-2 models responsible for this bias. |
Ruizhe Li; Yanjun Gao; | arxiv-cs.CL | 2024-05-06 |
1017 | Hire Me or Not? Examining Language Model’s Behavior with Occupation Attributes Highlight: With the impressive performance in various downstream tasks, large language models (LLMs) have been widely integrated into production pipelines, like recruitment and recommendation systems. |
Damin Zhang; Yi Zhang; Geetanjali Bihani; Julia Rayz; | arxiv-cs.CL | 2024-05-06 |
1018 | Towards A Human-in-the-Loop LLM Approach to Collaborative Discourse Analysis Highlight: However, LLMs have not yet been used to characterize synergistic learning in students’ collaborative discourse. In this exploratory work, we take a first step towards adopting a human-in-the-loop prompt engineering approach with GPT-4-Turbo to summarize and categorize students’ synergistic learning during collaborative discourse. |
Clayton Cohn; Caitlin Snyder; Justin Montenegro; Gautam Biswas; | arxiv-cs.CL | 2024-05-06 |
1019 | Addressing Data Scarcity in The Medical Domain: A GPT-Based Approach for Synthetic Data Generation and Feature Extraction Abstract: This research confronts the persistent challenge of data scarcity in medical machine learning by introducing a pioneering methodology that harnesses the capabilities of Generative … |
F. Sufi; | Inf. | 2024-05-06 |
1020 | Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames Highlight: Despite their widespread occurrence and potential impacts, our understanding of influence campaigns is limited by manual analysis of messages and subjective interpretation of their observable behavior. In this paper, we explore whether these limitations can be mitigated with large language models (LLMs), using GPT-3.5 as a case-study for coordinated campaign annotation. |
Keith Burghardt; Kai Chen; Kristina Lerman; | arxiv-cs.CL | 2024-05-06 |
1021 | Detecting Anti-Semitic Hate Speech Using Transformer-based Large Language Models Highlight: Specifically, we developed a new data labeling technique and established a proof of concept targeting anti-Semitic hate speech, utilizing a variety of transformer models such as BERT (arXiv:1810.04805), DistilBERT (arXiv:1910.01108), RoBERTa (arXiv:1907.11692), and LLaMA-2 (arXiv:2307.09288), complemented by the LoRA fine-tuning approach (arXiv:2106.09685). |
Dengyi Liu; Minghao Wang; Andrew G. Catlin; | arxiv-cs.CL | 2024-05-06 |
1022 | Traffic Performance GPT (TP-GPT): Real-Time Data Informed Intelligent ChatBot for Transportation Surveillance and Management Highlight: Furthermore, real-time traffic data access is typically limited due to privacy concerns. To bridge this gap, the integration of Large Language Models (LLMs) into the domain of traffic management presents a transformative approach to addressing the complexities and challenges inherent in modern transportation systems. |
Bingzhang Wang; Muhammad Monjurul Karim; Chenxi Liu; Yinhai Wang; | arxiv-cs.MA | 2024-05-05 |
1023 | Can Large Language Models Make The Grade? An Empirical Study Evaluating LLMs Ability to Mark Short Answer Questions in K-12 Education Highlight: This paper reports on a series of experiments with a novel dataset evaluating how well Large Language Models (LLMs) can mark (i.e. grade) open text responses to short answer questions. Specifically, we explore how well different combinations of GPT version and prompt engineering strategies performed at marking real student answers to short answer questions across different domain areas (Science and History) and grade-levels (spanning ages 5-16) using a new, never-used-before dataset from Carousel, a quizzing platform. |
Owen Henkel; Adam Boxer; Libby Hills; Bill Roberts; | arxiv-cs.CL | 2024-05-05 |
1024 | Leveraging Lecture Content for Improved Feedback: Explorations with GPT-4 and Retrieval Augmented Generation Abstract: This paper presents the use of Retrieval Augmented Generation (RAG) to improve the feedback generated by Large Language Models for programming tasks. For this purpose, … |
Sven Jacobs; Steffen Jaschke; | 2024 36th International Conference on Software Engineering … | 2024-05-05 |
1025 | Unraveling The Dominance of Large Language Models Over Transformer Models for Bangla Natural Language Inference: A Comprehensive Study Highlight: This study addresses the underexplored area of evaluating LLMs in low-resourced languages such as Bengali. |
Fatema Tuj Johora Faria; Mukaffi Bin Moin; Asif Iftekher Fahim; Pronay Debnath; Faisal Muhammad Shah; | arxiv-cs.CL | 2024-05-05 |
1026 | SCATT: Transformer Tracking with Symmetric Cross-attention |
Jianming Zhang; Wentao Chen; Jiangxin Dai; Jin Zhang; | Appl. Intell. | 2024-05-04 |
1027 | A Combination of BERT and Transformer for Vietnamese Spelling Correction Highlight: However, to our knowledge, there is no implementation in Vietnamese yet. Therefore, in this study, a combination of Transformer architecture (state-of-the-art for Encoder-Decoder model) and BERT was proposed to deal with Vietnamese spelling correction. |
Hieu Ngo Trung; Duong Tran Ham; Tin Huynh; Kiem Hoang; | arxiv-cs.CL | 2024-05-04 |
1028 | U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers Highlight: Based on self-attention with downsampled tokens, we propose a series of U-shaped DiTs (U-DiTs) in the paper and conduct extensive experiments to demonstrate the extraordinary performance of U-DiT models. |
YUCHUAN TIAN et. al. | arxiv-cs.CV | 2024-05-04 |
1029 | Structural Pruning of Pre-trained Language Models Via Neural Architecture Search Highlight: Unlike traditional pruning methods with fixed thresholds, we propose to adopt a multi-objective approach that identifies the Pareto optimal set of sub-networks, allowing for a more flexible and automated compression process. |
Aaron Klein; Jacek Golebiowski; Xingchen Ma; Valerio Perrone; Cedric Archambeau; | arxiv-cs.LG | 2024-05-03 |
1030 | Exploring Combinatorial Problem Solving with Large Language Models: A Case Study on The Travelling Salesman Problem Using GPT-3.5 Turbo Highlight: In this research, we investigate the potential of LLMs to solve the Travelling Salesman Problem (TSP). |
Mahmoud Masoud; Ahmed Abdelhay; Mohammed Elhenawy; | arxiv-cs.CL | 2024-05-03 |
1031 | Analyzing Narrative Processing in Large Language Models (LLMs): Using GPT4 to Test BERT Highlight: In the present study, we have performed the first steps to use LLMs as a model to understand fundamental mechanisms of language processing in neural networks, in order to make predictions and generate hypotheses on how the human brain does language processing. |
PATRICK KRAUSS et. al. | arxiv-cs.CL | 2024-05-03 |
1032 | REASONS: A Benchmark for REtrieval and Automated CitationS Of ScieNtific Sentences Using Public and Proprietary LLMs Highlight: In this research, we investigate whether large language models (LLMs) are capable of generating references based on two forms of sentence queries: (a) Direct Queries, in which LLMs are asked to provide author names of the given research article, and (b) Indirect Queries, in which LLMs are asked to provide the title of a mentioned article when given a sentence from a different article. |
DEEPA TILWANI et. al. | arxiv-cs.CL | 2024-05-03 |
1033 | The Role of Model Architecture and Scale in Predicting Molecular Properties: Insights from Fine-Tuning RoBERTa, BART, and LLaMA Abstract: This study introduces a systematic framework to compare the efficacy of Large Language Models (LLMs) for fine-tuning across various cheminformatics tasks. Employing a uniform … |
Youngmin Lee; Andrew S. I. D. Lang; Duoduo Cai; Wheat R. Stephen; | ArXiv | 2024-05-02 |
1034 | The Effectiveness of LLMs As Annotators: A Comparative Overview and Empirical Analysis of Direct Representation Highlight: This paper provides a comparative overview of twelve studies investigating the potential of LLMs in labelling data. |
Maja Pavlovic; Massimo Poesio; | arxiv-cs.CL | 2024-05-02 |
1035 | UQA: Corpus for Urdu Question Answering Highlight: This paper introduces UQA, a novel dataset for question answering and text comprehension in Urdu, a low-resource language with over 70 million native speakers. |
Samee Arif; Sualeha Farid; Awais Athar; Agha Ali Raza; | arxiv-cs.CL | 2024-05-02 |
1036 | Investigating Wit, Creativity, and Detectability of Large Language Models in Domain-Specific Writing Style Adaptation of Reddit’s Showerthoughts Abstract: Recent Large Language Models (LLMs) have shown the ability to generate content that is difficult or impossible to distinguish from human writing. We investigate the ability of … |
TOLGA BUZ et. al. | STARSEM | 2024-05-02 |
1037 | Investigating Wit, Creativity, and Detectability of Large Language Models in Domain-Specific Writing Style Adaptation of Reddit’s Showerthoughts Highlight: Recent Large Language Models (LLMs) have shown the ability to generate content that is difficult or impossible to distinguish from human writing. |
TOLGA BUZ et. al. | arxiv-cs.CL | 2024-05-02 |
1038 | Empowering IoT with Generative AI: Applications, Case Studies, and Limitations Abstract: The rise of the Generative Pre-Trained Transformer (GPT) language model, more commonly used as ChatGPT, has brought a spotlight on the ever-developing field of Generative AI (GAI). … |
Siva Sai; Mizaan Kanadia; V. Chamola; | IEEE Internet of Things Magazine | 2024-05-01 |
1039 | Transformer Dense Center Network for Liver Tumor Detection |
JINLIN MA et. al. | Biomed. Signal Process. Control. | 2024-05-01 |
1040 | A Named Entity Recognition and Topic Modeling-based Solution for Locating and Better Assessment of Natural Disasters in Social Media Highlight: To fully explore the potential of social media content in disaster informatics, access to relevant content and the correct geo-location information is very critical. In this paper, we propose a three-step solution to tackling these challenges. |
Ayaz Mehmood; Muhammad Tayyab Zamir; Muhammad Asif Ayub; Nasir Ahmad; Kashif Ahmad; | arxiv-cs.CL | 2024-05-01 |
1041 | PWLT: Pyramid Window-based Lightweight Transformer for Image Classification |
YUWEI MO et. al. | Comput. Electr. Eng. | 2024-05-01 |
1042 | Vision Transformer: To Discover The four Secrets of Image Patches |
TAO ZHOU et. al. | Inf. Fusion | 2024-05-01 |
1043 | Chat-GPT; Validating Technology Acceptance Model (TAM) in Education Sector Via Ubiquitous Learning Mechanism IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
N. SAIF et. al. | Comput. Hum. Behav. | 2024-05-01 |
1044 | FedViT: Federated Continual Learning of Vision Transformer at Edge Related Papers Related Patents Related Grants Related Venues Related Experts View |
XIAOJIANG ZUO et. al. | Future Gener. Comput. Syst. | 2024-05-01 |
1045 | CSPFormer: A Cross-spatial Pyramid Transformer for Visual Place Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhenyu Li; Pengjie Xu; | Neurocomputing | 2024-05-01 |
1046 | Collaborative Compensative Transformer Network for Salient Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jun Chen; Heye Zhang; Mingming Gong; Zhifan Gao; | Pattern Recognit. | 2024-05-01 |
1047 | Dynamic Spatial Aware Graph Transformer for Spatiotemporal Traffic Flow Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zequan Li; Jinglin Zhou; Zhizhe Lin; Teng Zhou; | Knowl. Based Syst. | 2024-05-01 |
1048 | Font Transformer for Few-shot Font Generation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xu Chen; Lei Wu; Yongliang Su; Lei Meng; Xiangxu Meng; | Comput. Vis. Image Underst. | 2024-05-01 |
1049 | Reinforced Res-Unet Transformer for Underwater Image Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View |
Peitong Li; Jiaying Chen; Chengtao Cai; | Signal Process. Image Commun. | 2024-05-01 |
1050 | How Can I Get It Right? Using GPT to Rephrase Incorrect Trainee Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the recent advancements of large language models (LLMs), our study employed the GPT-4 model to build an explanatory feedback system. |
JIONGHAO LIN et. al. | arxiv-cs.CL | 2024-05-01 |
1051 | Structural and Positional Ensembled Encoding for Graph Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jeyoon Yeom; Taero Kim; Rakwoo Chang; Kyungwoo Song; | Pattern Recognit. Lett. | 2024-05-01 |
1052 | Semantic Perceptive Infrared and Visible Image Fusion Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
XIN YANG et. al. | Pattern Recognit. | 2024-05-01 |
1053 | On Compositional Generalization of Transformer-based Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yongjing Yin; Lian Fu; Yafu Li; Yue Zhang; | Inf. Fusion | 2024-05-01 |
1054 | Energy-informed Graph Transformer Model for Solid Mechanical Analyses Related Papers Related Patents Related Grants Related Venues Related Experts View |
Bo Feng; Xiaoping Zhou; | Commun. Nonlinear Sci. Numer. Simul. | 2024-05-01 |
1055 | Do Large Language Models Understand Conversational Implicature — A Case Study with A Chinese Sitcom Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce SwordsmanImp, the first Chinese multi-turn-dialogue-based dataset aimed at conversational implicature, sourced from dialogues in the Chinese sitcom $\textit{My Own Swordsman}$. |
Shisen Yue; Siyuan Song; Xinyuan Cheng; Hai Hu; | arxiv-cs.CL | 2024-04-30 |
1056 | Harmonic LLMs Are Trustworthy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an intuitive method to test the robustness (stability and explainability) of any black-box LLM in real-time via its local deviation from harmonicity, denoted as $\gamma$. |
Nicholas S. Kersting; Mohammad Rahman; Suchismitha Vedala; Yang Wang; | arxiv-cs.LG | 2024-04-30 |
1057 | Do Large Language Models Understand Conversational Implicature – A Case Study with A Chinese Sitcom Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Understanding the non-literal meaning of an utterance is critical for large language models (LLMs) to become human-like social communicators. In this work, we introduce … |
Shisen Yue; Siyuan Song; Xinyuan Cheng; Hai Hu; | ArXiv | 2024-04-30 |
1058 | How Can I Improve? Using GPT to Highlight The Desired and Undesired Parts of Open-ended Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our aim is to equip tutors with actionable, explanatory feedback during online training lessons. |
JIONGHAO LIN et. al. | arxiv-cs.CL | 2024-04-30 |
1059 | Ethical Reasoning and Moral Value Alignment of LLMs Depend on The Language We Prompt Them in Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, moral values are not universal, but rather influenced by language and culture. This paper explores how three prominent LLMs — GPT-4, ChatGPT, and Llama2-70B-Chat — perform ethical reasoning in different languages and whether their moral judgement depends on the language in which they are prompted. |
Utkarsh Agarwal; Kumar Tanmay; Aditi Khandelwal; Monojit Choudhury; | arxiv-cs.CL | 2024-04-29 |
1060 | RSCaMa: Remote Sensing Image Change Captioning with State Space Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite previous methods progressing in the spatial change perception, there are still weaknesses in joint spatial-temporal modeling. To address this, in this paper, we propose a novel RSCaMa model, which achieves efficient joint spatial-temporal modeling through multiple CaMa layers, enabling iterative refinement of bi-temporal features. |
CHENYANG LIU et. al. | arxiv-cs.CV | 2024-04-29 |
1061 | Can GPT-4 Do L2 Analytic Assessment? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we perform a series of experiments using GPT-4 in a zero-shot fashion on a publicly available dataset annotated with holistic scores based on the Common European Framework of Reference and aim to extract detailed information about their underlying analytic components. |
Stefano Bannò; Hari Krishna Vydana; Kate M. Knill; Mark J. F. Gales; | arxiv-cs.CL | 2024-04-29 |
1062 | Normalization of Arabic Dialects Into Modern Standard Arabic Using BERT and GPT-2 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We present an encoder-decoder based model for normalization of Arabic dialects using both BERT and GPT-2 based models. Arabic is a language of many dialects that not only differ … |
Khalid Alnajjar; Mika Hämäläinen; | J. Data Min. Digit. Humanit. | 2024-04-29 |
1063 | GPT-4 Passes Most of The 297 Written Polish Board Certification Examinations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Methods: We developed a software program to download and process PES exams and tested the performance of GPT models using OpenAI Application Programming Interface. |
Jakub Pokrywka; Jeremi Kaczmarek; Edward Gorzelańczyk; | arxiv-cs.CL | 2024-04-29 |
1064 | Time Machine GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new approach: a series of point-in-time LLMs called Time Machine GPT (TiMaGPT), specifically designed to be nonprognosticative. |
Felix Drinkall; Eghbal Rahimikia; Janet B. Pierrehumbert; Stefan Zohren; | arxiv-cs.CL | 2024-04-29 |
1065 | PatentGPT: A Large Language Model for Intellectual Property Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the application of large language models in the Intellectual Property (IP) domain is challenging due to the strong need for specialized knowledge, privacy protection, and the processing of extremely long text in this field. In this technical report, we present for the first time a low-cost, standardized procedure for training IP-oriented LLMs, meeting the unique requirements of the IP domain. |
ZILONG BAI et. al. | arxiv-cs.CL | 2024-04-28 |
1066 | Can Perplexity Predict Fine-Tuning Performance? An Investigation of Tokenization Effects on Sequential Language Models for Nepali Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, studies of how subwording affects the understanding capacity of language models have been very few and limited to a handful of languages. To reduce this gap, we used 6 different tokenization schemes to pretrain relatively small language models in Nepali and used the representations learned to finetune on several downstream tasks. |
Nishant Luitel; Nirajan Bekoju; Anand Kumar Sah; Subarna Shakya; | arxiv-cs.CL | 2024-04-28 |
1067 | Transfer Learning and Transformer Architecture for Financial Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View |
Tohida Rehman; Raghubir Bose; S. Chattopadhyay; Debarshi Kumar Sanyal; | ArXiv | 2024-04-28 |
1068 | GPT for Games: A Scoping Review (2020-2023) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a scoping review of 55 articles to explore GPT’s potential for games, offering researchers a comprehensive understanding of the current applications and identifying both emerging trends and unexplored areas. |
Daijin Yang; Erica Kleinman; Casper Harteveld; | arxiv-cs.HC | 2024-04-27 |
1069 | Detection of Conspiracy Theories Beyond Keyword Bias in German-Language Telegram Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our work addresses the task of detecting conspiracy theories in German Telegram messages. |
Milena Pustet; Elisabeth Steffen; Helena Mihaljević; | arxiv-cs.CL | 2024-04-27 |
1070 | MRScore: Evaluating Radiology Report Generation with LLM-based Reward System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces MRScore, an automatic evaluation metric tailored for radiology report generation by leveraging Large Language Models (LLMs). |
YUNYI LIU et. al. | arxiv-cs.CL | 2024-04-27 |
1071 | CRISPR-GPT: An LLM Agent for Automated Design of Gene-Editing Experiments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce CRISPR-GPT, an LLM agent augmented with domain knowledge and external tools to automate and enhance the design process of CRISPR-based gene-editing experiments. |
KAIXUAN HUANG et. al. | arxiv-cs.AI | 2024-04-27 |
1072 | UMass-BioNLP at MEDIQA-M3G 2024: DermPrompt – A Systematic Exploration of Prompt Engineering with GPT-4V for Dermatological Diagnosis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents our team’s participation in the MEDIQA-ClinicalNLP 2024 shared task B. We present a novel approach to diagnosing clinical dermatology cases by integrating … |
PARTH VASHISHT et. al. | ArXiv | 2024-04-27 |
1073 | ChatGPT Is Here to Help, Not to Replace Anybody — An Evaluation of Students’ Opinions On Integrating ChatGPT In CS Courses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, 52 first-year CS students were surveyed in order to assess their views on technologies with code-generation capabilities, both from academic and professional perspectives. |
Bruno Pereira Cipriano; Pedro Alves; | arxiv-cs.ET | 2024-04-26 |
1074 | Enhancing Legal Compliance and Regulation Analysis with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With Industry 4.0 revolutionizing the food industry and with the General Data Protection Regulation (GDPR) reshaping privacy policies and data processing agreements, there is a growing gap between regulatory analysis and recent technological advancements. This study aims to bridge this gap by leveraging LLMs, namely BERT and GPT models, to accurately classify legal provisions and automate compliance checks. |
Shabnam Hassani; | arxiv-cs.SE | 2024-04-26 |
1075 | UMass-BioNLP at MEDIQA-M3G 2024: DermPrompt — A Systematic Exploration of Prompt Engineering with GPT-4V for Dermatological Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a novel approach to diagnosing clinical dermatology cases by integrating large multimodal models, specifically leveraging the capabilities of GPT-4V under a retriever and a re-ranker framework. |
PARTH VASHISHT et. al. | arxiv-cs.AI | 2024-04-26 |
1076 | ChatGPT Is Here to Help, Not to Replace Anybody – An Evaluation of Students’ Opinions On Integrating ChatGPT In CS Courses Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) like GPT and Bard are capable of producing code based on textual descriptions, with remarkable efficacy. Such technology will have profound … |
Bruno Pereira Cipriano; P. Alves; | ArXiv | 2024-04-26 |
1077 | Prompting Towards Alleviating Code-Switched Data Scarcity in Under-Resourced Languages with GPT As A Pivot Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: There is therefore a notable opportunity to refine prompting guidelines to yield sentences suitable for the fine-tuning of language models. We propose a framework for augmenting the diversity of synthetically generated code-switched data using GPT and propose leveraging this technology to mitigate data scarcity in low-resourced languages, underscoring the essential role of native speakers in this process. |
Michelle Terblanche; Kayode Olaleye; Vukosi Marivate; | arxiv-cs.CL | 2024-04-26 |
1078 | Influence of Solution Efficiency and Valence of Instruction on Additive and Subtractive Solution Strategies in Humans and GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative artificial intelligences, particularly large language models (LLMs), play an increasingly prominent role in human decision-making contexts, necessitating transparency … |
Lydia Uhler; Verena Jordan; Jürgen Buder; Markus Huff; Frank Papenmeier; | arxiv-cs.CL | 2024-04-25 |
1079 | Exploring Internal Numeracy in Language Models: A Case Study on ALBERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It has been found that Transformer-based language models have the ability to perform basic quantitative reasoning. In this paper, we propose a method for studying how these models internally represent numerical data, and use our proposal to analyze the ALBERT family of language models. |
Ulme Wennberg; Gustav Eje Henter; | arxiv-cs.CL | 2024-04-25 |
1080 | Player-Driven Emergence in LLM-Driven Game Narrative Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore how interaction with large language models (LLMs) can give rise to emergent behaviors, empowering players to participate in the evolution of game narratives. |
XIANGYU PENG et. al. | arxiv-cs.CL | 2024-04-25 |
1081 | IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We evaluate a wide range of proprietary and open-source LLMs including GPT-3.5, GPT-4, PaLM-2, mT5, Gemma, BLOOM and LLaMA on IndicGenBench in a variety of settings. |
Harman Singh; Nitish Gupta; Shikhar Bharadwaj; Dinesh Tewari; Partha Talukdar; | arxiv-cs.CL | 2024-04-25 |
1082 | TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present TinyChart, an efficient MLLM for chart understanding with only 3B parameters. |
LIANG ZHANG et. al. | arxiv-cs.CV | 2024-04-25 |
1083 | Towards Efficient Patient Recruitment for Clinical Trials: Application of A Prompt-Based Learning Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we aimed to evaluate the performance of a prompt-based large language model for the cohort selection task from unstructured medical notes collected in the EHR. |
Mojdeh Rahmanian; Seyed Mostafa Fakhrahmad; Seyedeh Zahra Mousavi; | arxiv-cs.CL | 2024-04-24 |
1084 | An Automated Learning Model for Twitter Sentiment Analysis Using Ranger AdaBelief Optimizer Based Bidirectional Long Short Term Memory Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Sentiment analysis is an automated approach which is utilized in process of analysing textual data to describe public opinion. The sentiment analysis has major role in creating … |
Sasirekha Natarajan; Smitha Kurian; P. Divakarachari; Przemysław Falkowski‐Gilski; | Expert Syst. J. Knowl. Eng. | 2024-04-24 |
1085 | The Promise and Challenges of Using LLMs to Accelerate The Screening Process of Systematic Reviews Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our objective is to investigate if Large Language Models (LLMs) can accelerate title-abstract screening by simplifying abstracts for human screeners, and automating title-abstract screening. |
Aleksi Huotala; Miikka Kuutila; Paul Ralph; Mika Mäntylä; | arxiv-cs.CL | 2024-04-24 |
1086 | Automated Creation of Source Code Variants of A Cryptographic Hash Function Implementation Using Generative Pre-Trained Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study the ability of GPT models to generate novel and correct versions, and notably very insecure versions, of implementations of the cryptographic hash function SHA-1 is examined. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CR | 2024-04-24 |
1087 | GeckOpt: LLM System Efficiency Via Intent-Based Tool Selection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this preliminary study, we investigate a GPT-driven intent-based reasoning approach to streamline tool selection for large language models (LLMs) aimed at system efficiency. By … |
Michael Fore; Simranjit Singh; Dimitrios Stamoulis; | Proceedings of the Great Lakes Symposium on VLSI 2024 | 2024-04-24 |
1088 | A Comprehensive Survey on Evaluating Large Language Model Applications in The Medical Industry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since the inception of the Transformer architecture in 2017, Large Language Models (LLMs) such as GPT and BERT have evolved significantly, impacting various industries with their advanced capabilities in language understanding and generation. These models have shown potential to transform the medical field, highlighting the necessity for specialized evaluation frameworks to ensure their effective and ethical deployment. |
Yining Huang; Keke Tang; Meilian Chen; Boyuan Wang; | arxiv-cs.CL | 2024-04-24 |
1089 | Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on a specific use case, pharmaceutical manufacturing investigations, and propose that leveraging historical records of manufacturing incidents and deviations in an organization can be beneficial for addressing and closing new cases, or de-risking new manufacturing campaigns. |
Hossein Salami; Brandye Smith-Goettler; Vijay Yadav; | arxiv-cs.CL | 2024-04-23 |
1090 | Transformers Can Represent $n$-gram Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the relationship between transformer LMs and $n$-gram LMs, a simple and historically relevant class of language models. |
Anej Svete; Ryan Cotterell; | arxiv-cs.CL | 2024-04-23 |
1091 | PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present the first, end-to-end large-scale empirical evaluation of clinical trial matching using real-world EHRs. |
SHASHI KANT GUPTA et. al. | arxiv-cs.CL | 2024-04-23 |
1092 | Pyramid Hierarchical Transformer for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The traditional Transformer model encounters challenges with variable-length input sequences, particularly in Hyperspectral Image Classification (HSIC), leading to efficiency and scalability concerns. To overcome this, we propose a pyramid-based hierarchical transformer (PyFormer). |
Muhammad Ahmad; Muhammad Hassaan Farooq Butt; Manuel Mazzara; Salvatore Distifano; | arxiv-cs.CV | 2024-04-23 |
1093 | Combining Retrieval and Classification: Balancing Efficiency and Accuracy in Duplicate Bug Report Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this context, complex models, despite their high accuracy potential, can be time-consuming, while more efficient models may compromise on accuracy. To address this issue, we propose a transformer-based system designed to strike a balance between time efficiency and accuracy performance. |
Qianru Meng; Xiao Zhang; Guus Ramackers; Visser Joost; | arxiv-cs.SE | 2024-04-23 |
1094 | From Complexity to Clarity: How AI Enhances Perceptions of Scientists and The Public’s Understanding of Science Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper evaluated the effectiveness of using generative AI to simplify science communication and enhance the public’s understanding of science. |
David M. Markowitz; | arxiv-cs.CL | 2024-04-23 |
1095 | Pixels and Predictions: Potential of GPT-4V in Meteorological Imagery Analysis and Forecast Communication Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates GPT-4V’s ability to interpret meteorological charts and communicate weather hazards appropriately to the user, despite challenges of hallucinations, where generative AI delivers coherent, confident, but incorrect responses. We assess GPT-4V’s competence via its web interface ChatGPT in two tasks: (1) generating a severe-weather outlook from weather-chart analysis and conducting self-evaluation, revealing an outlook that corresponds well with a Storm Prediction Center human-issued forecast; and (2) producing hazard summaries in Spanish and English from weather charts. |
JOHN R. LAWSON et. al. | arxiv-cs.CL | 2024-04-22 |
1096 | Marking: Visual Grading with Highlighting Errors and Annotating Missing Bits Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Marking, a novel grading task that enhances automated grading systems by performing an in-depth analysis of student responses and providing students with visual highlights. |
Shashank Sonkar; Naiming Liu; Debshila B. Mallick; Richard G. Baraniuk; | arxiv-cs.CL | 2024-04-22 |
1097 | Automated Multi-Language to English Machine Translation Using Generative Pre-Trained Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we examine using local Generative Pretrained Transformer (GPT) models to perform automated zero-shot, black-box, sentence-wise, multi-natural-language translation into English text. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CL | 2024-04-22 |
1098 | How Well Can LLMs Echo Us? Evaluating AI Chatbots’ Role-Play Ability with ECHO Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Such an oversight limits the potential for advancements in digital human clones and non-player characters in video games. To bridge this gap, we introduce ECHO, an evaluative framework inspired by the Turing test. |
MAN TIK NG et. al. | arxiv-cs.CL | 2024-04-22 |
1099 | Pre-Calc: Learning to Use The Calculator Improves Numeracy in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Pre-Calc, a simple pre-finetuning objective of learning to use the calculator for both encoder-only and encoder-decoder architectures, formulated as a discriminative and generative task respectively. |
Vishruth Veerendranath; Vishwa Shah; Kshitish Ghate; | arxiv-cs.CL | 2024-04-22 |
1100 | Assessing GPT-4-Vision’s Capabilities in UML-Based Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The emergence of advanced neural networks has opened up new ways in automated code generation from conceptual models, promising to enhance software development processes. This paper presents a preliminary evaluation of GPT-4-Vision, a state-of-the-art deep learning model, and its capabilities in transforming Unified Modeling Language (UML) class diagrams into fully operating Java class files. |
Gábor Antal; Richárd Vozár; Rudolf Ferenc; | arxiv-cs.SE | 2024-04-22 |
1101 | What Do Transformers Know About Government? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates what insights about linguistic features and what knowledge about the structure of natural language can be obtained from the encodings in transformer language models. In particular, we explore how BERT encodes the government relation between constituents in a sentence. |
JUE HOU et. al. | arxiv-cs.CL | 2024-04-22 |
1102 | Transformer-Driven Resource Allocation for Enhanced Multi-Carrier NOMA Downlink Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents a transformer-driven resource allocation strategy to optimize channel assignment and power allocation in multi-carrier non-orthogonal multiple access (NOMA) … |
Liang Leon Dong; | 2024 IEEE Wireless Communications and Networking Conference … | 2024-04-21 |
1103 | SVGEditBench: A Benchmark Dataset for Quantitative Assessment of LLM’s SVG Editing Capabilities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For quantitative evaluation of LLMs’ ability to edit SVG, we propose SVGEditBench. |
Kunato Nishina; Yusuke Matsui; | arxiv-cs.CV | 2024-04-21 |
1104 | Automated Text Mining of Experimental Methodologies from Biomedical Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes the fine-tuned DistilBERT, a methodology-specific, pre-trained generative classification language model for mining biomedicine texts. |
Ziqing Guo; | arxiv-cs.CL | 2024-04-21 |
1105 | Do English Named Entity Recognizers Work Well on Global Englishes? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As such, it is unclear whether they generalize for analyzing use of English globally. To test this, we build a newswire dataset, the Worldwide English NER Dataset, to analyze NER model performance on low-resource English variants from around the world. |
Alexander Shan; John Bauer; Riley Carlson; Christopher Manning; | arxiv-cs.CL | 2024-04-20 |
1106 | Evaluating Subword Tokenization: Alien Subword Composition and OOV Generalization Challenge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As a solution, we propose a combined intrinsic-extrinsic evaluation framework for subword tokenization. |
KHUYAGBAATAR BATSUREN et. al. | arxiv-cs.CL | 2024-04-20 |
1107 | Transformer-Based Classification Outcome Prediction for Multimodal Stroke Treatment IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a multi-modal fusion framework, Multitrans, based on the Transformer architecture and self-attention mechanism. |
Danqing Ma; Meng Wang; Ao Xiang; Zongqing Qi; Qin Yang; | arxiv-cs.CV | 2024-04-19 |
1108 | Crowdsourcing Public Attitudes Toward Local Services Through The Lens of Google Maps Reviews: An Urban Density-based Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel data source and methodological framework that can be easily adapted to different regions, offering useful insights into public sentiment toward the built environment and shedding light on how planning policies can be designed to handle related challenges. |
Lingyao Li; Songhua Hu; Atiyya Shaw; Libby Hemphill; | arxiv-cs.SI | 2024-04-19 |
1109 | TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present our submission to the unconstrained subtask of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages for morphological annotation, POS-tagging, lemmatization, character- and word-level gap-filling. |
Aleksei Dorkin; Kairit Sirts; | arxiv-cs.CL | 2024-04-19 |
1110 | Enabling Natural Zero-Shot Prompting on Encoder Models Via Statement-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we propose Statement-Tuning, a technique that models discriminative tasks as a set of finite statements and trains an encoder model to discriminate between the potential statements to determine the label. |
Ahmed Elshabrawy; Yongxin Huang; Iryna Gurevych; Alham Fikri Aji; | arxiv-cs.CL | 2024-04-19 |
1111 | Enhancing Child Safety in Online Gaming: The Development and Application of Protectbot, An AI-Powered Chatbot Framework Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study introduces Protectbot, an innovative chatbot framework designed to improve safety in children’s online gaming environments. At its core, Protectbot incorporates … |
Anum Faraz; Fardin Ahsan; Jinane Mounsef; Ioannis Karamitsos; A. Kanavos; | Inf. | 2024-04-19 |
1112 | Linearly-evolved Transformer for Pan-sharpening Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The standard modeling rules within these promising pan-sharpening methods are to roughly stack the transformer variants in a cascaded manner. Despite the remarkable advancement, their success may come at a huge cost in model parameters and FLOPs, thus preventing their application on low-resource satellites. To address this trade-off between favorable performance and expensive computation, we tailor an efficient linearly-evolved transformer variant and employ it to construct a lightweight pan-sharpening framework. |
JUNMING HOU et. al. | arxiv-cs.CV | 2024-04-19 |
1113 | Dubo-SQL: Diverse Retrieval-Augmented Generation and Fine Tuning for Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce two new methods, Dubo-SQL v1 and v2. |
Dayton G. Thorpe; Andrew J. Duberstein; Ian A. Kinsey; | arxiv-cs.CL | 2024-04-18 |
1114 | Transformer Tricks: Removing Weights for Skipless Transformers Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: He and Hofmann (arXiv:2311.01906) detailed a skipless transformer without the V and P (post-attention projection) linear layers, which reduces the total number of weights. … |
Nils Graef; | arxiv-cs.LG | 2024-04-18 |
1115 | MIDGET: Music Conditioned 3D Dance Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a MusIc conditioned 3D Dance GEneraTion model, named MIDGET, based on a Dance motion Vector Quantised Variational AutoEncoder (VQ-VAE) model and a Motion Generative Pre-Training (GPT) model to generate vibrant and high-quality dances that match the music rhythm. |
Jinwu Wang; Wei Mao; Miaomiao Liu; | arxiv-cs.SD | 2024-04-18 |
1116 | Augmenting Emotion Features in Irony Detection with Large Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel method for irony detection, applying Large Language Models (LLMs) with prompt-based learning to facilitate emotion-centric text augmentation. |
Yucheng Lin; Yuhan Xia; Yunfei Long; | arxiv-cs.CL | 2024-04-18 |
1117 | Large Language Models in Targeted Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we investigate the use of decoder-based generative transformers for extracting sentiment towards the named entities in Russian news articles. |
Nicolay Rusnachenko; Anton Golubev; Natalia Loukachevitch; | arxiv-cs.CL | 2024-04-18 |
1118 | EmrQA-msquad: A Medical Dataset Structured with The SQuAD V2.0 Framework, Enriched with EmrQA Medical Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: One key solution involves integrating specialized medical datasets and creating dedicated datasets. This strategic approach enhances the accuracy of QAS, contributing to advancements in clinical decision-making and medical research. |
Jimenez Eladio; Hao Wu; | arxiv-cs.CL | 2024-04-18 |
1119 | Octopus V3: Technical Report for On-device Sub-billion Multimodal AI Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a multimodal model that incorporates the concept of functional token specifically designed for AI agent applications. |
Wei Chen; Zhiyuan Li; | arxiv-cs.CL | 2024-04-17 |
1120 | CAUS: A Dataset for Question Generation Based on Human Cognition Leveraging Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Curious About Uncertain Scene (CAUS) dataset, designed to enable Large Language Models, specifically GPT-4, to emulate human cognitive processes for resolving uncertainties. |
Minjung Shin; Donghyun Kim; Jeh-Kwang Ryu; | arxiv-cs.AI | 2024-04-17 |
1121 | CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an attribution-oriented Chain-of-Thought reasoning method to enhance the accuracy of attributions. |
Moshe Berchansky; Daniel Fleischer; Moshe Wasserblat; Peter Izsak; | arxiv-cs.CL | 2024-04-16 |
1122 | Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present COME-robot, the first closed-loop framework utilizing the GPT-4V vision-language foundation model for open-ended reasoning and adaptive planning in real-world scenarios. |
PEIYUAN ZHI et. al. | arxiv-cs.RO | 2024-04-15 |
1123 | AIGeN: An Adversarial Approach for Instruction Generation in VLN Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose AIGeN, a novel architecture inspired by Generative Adversarial Networks (GANs) that produces meaningful and well-formed synthetic instructions to improve navigation agents’ performance. |
Niyati Rawal; Roberto Bigazzi; Lorenzo Baraldi; Rita Cucchiara; | arxiv-cs.CV | 2024-04-15 |
1124 | Transformers, Contextualism, and Polysemy Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, I argue that we can extract from the way the transformer architecture works a theory of the relationship between context and meaning. |
Jumbly Grindrod; | arxiv-cs.CL | 2024-04-15 |
1125 | Leveraging GPT-like LLMs to Automate Issue Labeling Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Issue labeling is a crucial task for the effective management of software projects. To date, several approaches have been put forth for the automatic assignment of labels to issue … |
Giuseppe Colavito; F. Lanubile; Nicole Novielli; L. Quaranta; | 2024 IEEE/ACM 21st International Conference on Mining … | 2024-04-15 |
1126 | Zero-shot Building Age Classification from Facade Image Using GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Abstract. A building’s age of construction is crucial for supporting many geospatial applications. Much current research focuses on estimating building age from facade images … |
ZICHAO ZENG et. al. | ArXiv | 2024-04-15 |
1127 | Benchmarking Llama2, Mistral, Gemma and GPT for Factuality, Toxicity, Bias and Propensity for Hallucinations Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: This paper introduces fourteen novel datasets for the evaluation of Large Language Models’ safety in the context of enterprise tasks. A method was devised to evaluate a model’s … |
David Nadeau; Mike Kroutikov; Karen McNeil; Simon Baribeau; | ArXiv | 2024-04-15 |
1128 | Demonstration of DB-GPT: Next Generation Data Interaction System Empowered By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present DB-GPT, a revolutionary and product-ready Python library that integrates LLMs into traditional data interaction tasks to enhance user experience and accessibility. |
SIQIAO XUE et. al. | arxiv-cs.AI | 2024-04-15 |
1129 | Harnessing GPT-4V(ision) for Insurance: A Preliminary Exploration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore GPT-4V’s capabilities in the insurance domain. |
Chenwei Lin; Hanjia Lyu; Jiebo Luo; Xian Xu; | arxiv-cs.CV | 2024-04-15 |
1130 | Hybrid Convolution-Transformer for Lightweight Single Image Super-Resolution Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The rapid development of deep learning has driven the breakthrough in performance of single image super-resolution (SISR). However, many existing works deepen the network to … |
Jiuqiang Li; Yutong Ke; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1131 | TD-GPT: Target Protein-Specific Drug Molecule Generation GPT Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Drug discovery faces challenges due to the vast chemical space and complex drug-target interactions. This paper proposes a novel deep learning framework TD-GPT for targeted drug … |
ZHENGDA HE et. al. | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1132 | A Scalable Sparse Transformer Model for Singing Melody Extraction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Extracting the melody of a singing voice is an essential task within the realm of music information retrieval (MIR). Recently, transformer based models have drawn great attention … |
Shuai Yu; Jun Liu; Yi Yu; Wei Li; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1133 | GPT-4 Driven Cinematic Music Generation Through Text Processing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents Herrmann-11, a multimodal framework to generate background music tailored to movie scenes, by integrating state-of-the-art vision, language, music, and speech … |
Muhammad Taimoor Haseeb; Ahmad Hammoudeh; Gus G. Xia; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1134 | Inducing Inductive Bias in Vision Transformer for EEG Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Human brain signals are highly complex and dynamic in nature. Electroencephalogram (EEG) devices capture some of this complexity, both in space and in time, with a certain … |
Rabindra Khadka; Pedro G. Lind; G. Mello; M. Riegler; Anis Yazidi; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1135 | A Hybrid CNN-Transformer for Focal Liver Lesion Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The early diagnosis of focal liver lesions (FLLs) plays a key role in the successful treatment of liver cancer. To effectively diagnose focal liver lesions, we used … |
LING ZHAO et. al. | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1136 | Improving Domain Generalization in Speech Emotion Recognition with Whisper Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformers have been used successfully in a variety of settings, including Speech Emotion Recognition (SER). However, use of the latest transformer base models in domain … |
Erik Goron; Lena Asai; Elias Rut; Martin Dinov; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1137 | OpenTE: Open-Structure Table Extraction From Text Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents an Open-Structure Table Extraction (OpenTE) task, which aims to extract a table with intrinsic semantic, calculational, and hierarchical structure from … |
Haoyu Dong; Mengkang Hu; Qinyu Xu; Haocheng Wang; Yue Hu; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1138 | Assessing The Impact of GPT-4 Turbo in Generating Defeaters for Assurance Cases Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Assurance cases (ACs) are structured arguments that allow verifying the correct implementation of the created systems’ non-functional requirements (e.g., safety, security). This … |
KIMYA KHAKZAD SHAHANDASHTI et. al. | 2024 IEEE/ACM First International Conference on AI … | 2024-04-14 |
1139 | Planning to Guide LLM for Code Coverage Prediction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Code coverage serves as a crucial metric to assess testing effectiveness, measuring the degree to which a test suite exercises different facets of the code, such as statements, … |
Hridya Dhulipala; Aashish Yadavally; Tien N. Nguyen; | 2024 IEEE/ACM First International Conference on AI … | 2024-04-14 |
1140 | Fine Tuning Large Language Model for Secure Code Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: AI pair programmers, such as GitHub’s Copilot, have shown great success in automatic code generation. However, such large language model-based code generation techniques face the … |
Junjie Li; Aseem Sangalay; Cheng Cheng; Yuan Tian; Jinqiu Yang; | 2024 IEEE/ACM First International Conference on AI … | 2024-04-14 |
1141 | LLET: Lightweight Lexicon-Enhanced Transformer for Chinese NER Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The Flat-LAttice Transformer (FLAT) has achieved notable success in Chinese named entity recognition (NER) by integrating lexical information into the widely-used Transformer … |
Zongcheng Ji; Yinlong Xiao; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
1142 | The Impact of Knowledge Distillation on The Energy Consumption and Runtime Efficiency of NLP Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Context. While models like BERT and GPT are powerful, they require substantial resources. Knowledge distillation can be employed as a technique to enhance their efficiency. Yet, … |
YE YUAN et. al. | 2024 IEEE/ACM 3rd International Conference on AI … | 2024-04-14 |
1143 | A Lightweight Transformer-based Neural Network for Large-scale Masonry Arch Bridge Point Cloud Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer architecture based on the attention mechanism achieves impressive results in natural language processing (NLP) tasks. This paper transfers the successful experience to … |
Yixiong Jing; Brian Sheil; S. Acikgoz; | Comput. Aided Civ. Infrastructure Eng. | 2024-04-14 |
1144 | Few-shot Name Entity Recognition on StackOverflow IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: StackOverflow, with its vast question repository and limited labeled examples, raises an annotation challenge for us. We address this gap by proposing RoBERTa+MAML, a few-shot named entity recognition (NER) method leveraging meta-learning. |
Xinwei Chen; Kun Li; Tianyou Song; Jiangjian Guo; | arxiv-cs.CL | 2024-04-14 |
1145 | CreativEval: Evaluating Creativity of LLM-Based Hardware Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this research gap, we present CreativEval, a framework for evaluating the creativity of LLMs within the context of generating hardware designs. |
Matthew DeLorenzo; Vasudev Gohil; Jeyavijayan Rajendran; | arxiv-cs.CL | 2024-04-12 |
1146 | Inheritune: Training Smaller Yet More Attentive Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Layers in this state are unable to learn anything meaningful and are mostly redundant; we refer to these as lazy layers. The goal of this paper is to train smaller models by eliminating this structural inefficiency without compromising performance. |
Sunny Sanyal; Ravid Shwartz-Ziv; Alexandros G. Dimakis; Sujay Sanghavi; | arxiv-cs.CL | 2024-04-12 |
1147 | Constrained C-Test Generation Via Mixed-Integer Programming Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work proposes a novel method to generate C-Tests; a deviated form of cloze tests (a gap filling exercise) where only the last part of a word is turned into a gap. |
Ji-Ung Lee; Marc E. Pfetsch; Iryna Gurevych; | arxiv-cs.CL | 2024-04-12 |
1148 | Is Next Token Prediction Sufficient for GPT? Exploration on Code Logic Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to prove it, we introduce a new task, Logically Equivalent Code Selection, which necessitates the selection of logically equivalent code from a candidate set, given a query code. |
MENGNAN QI et. al. | arxiv-cs.PL | 2024-04-12 |
1149 | Small Models Are (Still) Effective Cross-Domain Argument Extractors Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, detailed explorations of these techniques’ ability to actually enable this transfer are lacking. In this work, we provide such a study, exploring zero-shot transfer using both techniques on six major EAE datasets at both the sentence and document levels. |
William Gantt; Aaron Steven White; | arxiv-cs.CL | 2024-04-12 |
1150 | Measuring Geographic Diversity of Foundation Models with A Natural Language-based Geo-guessing Experiment on GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Abstract. Generative AI based on foundation models provides a first glimpse into the world represented by machines trained on vast amounts of multimodal data ingested by these … |
Zilong Liu; Krzysztof Janowicz; Kitty Currier; Meilin Shi; | ArXiv | 2024-04-11 |
1151 | From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We analyze how well pre-trained large language models (e.g., Llama2, GPT-4, Claude 3, etc) can do linear and non-linear regression when given in-context examples, without any additional training or gradient updates. |
Robert Vacareanu; Vlad-Andrei Negru; Vasile Suciu; Mihai Surdeanu; | arxiv-cs.CL | 2024-04-11 |
1152 | LLM Agents Can Autonomously Exploit One-day Vulnerabilities IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that LLM agents can autonomously exploit one-day vulnerabilities in real-world systems. |
Richard Fang; Rohan Bindu; Akul Gupta; Daniel Kang; | arxiv-cs.CR | 2024-04-11 |
1153 | Map Reading and Analysis with GPT-4V(ision) Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In late 2023, the image-reading capability added to a Generative Pre-trained Transformer (GPT) framework provided the opportunity to potentially revolutionize the way we view and … |
Jinwen Xu; Ran Tao; | ISPRS Int. J. Geo Inf. | 2024-04-11 |
1155 | Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore how large language models (LLMs) can be used to generate and evaluate multiple-choice reading comprehension items. |
Andreas Säuberli; Simon Clematide; | arxiv-cs.CL | 2024-04-11 |
1156 | Reflectance Estimation for Proximity Sensing By Vision-Language Models: Utilizing Distributional Semantics for Low-Level Cognition in Robotics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we verify that 1) LLMs such as GPT-3.5 and GPT-4 can estimate an object’s reflectance using only text as input; and 2) VLMs such as CLIP can increase their generalization capabilities in reflectance estimation from images. |
Masashi Osada; Gustavo A. Garcia Ricardez; Yosuke Suzuki; Tadahiro Taniguchi; | arxiv-cs.RO | 2024-04-11 |
1157 | Automated Mapping of Common Vulnerabilities and Exposures to MITRE ATT&CK Tactics Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Effectively understanding and categorizing vulnerabilities is vital in the ever-evolving cybersecurity landscape, since only one exposure can have a devastating effect on the … |
Ioana Branescu; Octavian Grigorescu; Mihai Dascălu; | Inf. | 2024-04-10 |
1158 | Simpler Becomes Harder: Do LLMs Exhibit A Coherent Behavior on Simplified Corpora? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Text simplification seeks to improve readability while retaining the original content and meaning. Our study investigates whether pre-trained classifiers also maintain such coherence by comparing their predictions on both original and simplified inputs. |
Miriam Anschütz; Edoardo Mosca; Georg Groh; | arxiv-cs.CL | 2024-04-10 |
1159 | Learning A Multimodal Feature Transformer for RGBT Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View |
Hui-ling Shi; Xiaodong Mu; Danyao Shen; Chengliang Zhong; | Signal Image Video Process. | 2024-04-09 |
1160 | Characterizing Multimodal Long-form Summarization: A Case Study on Financial Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a computational framework for characterizing multimodal long-form summarization and investigate the behavior of Claude 2.0/2.1, GPT-4/3.5, and Cohere. |
Tianyu Cao; Natraj Raman; Danial Dervovic; Chenhao Tan; | arxiv-cs.CL | 2024-04-09 |
1161 | Generative Pre-Trained Transformer for Symbolic Regression Base In-Context Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose \textbf{FormulaGPT}, which trains a GPT using massive sparse reward learning histories of reinforcement learning-based SR algorithms as training data. |
YANJIE LI et. al. | arxiv-cs.LG | 2024-04-09 |
1162 | InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, this research advances the patch division paradigm by introducing a novel extension: dynamic resolution with automatic patch configuration. |
XIAOYI DONG et. al. | arxiv-cs.CV | 2024-04-09 |
1163 | PetKaz at SemEval-2024 Task 8: Can Linguistics Capture The Specifics of LLM-generated Text? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present our submission to SemEval-2024 Task 8: Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, focusing on the detection of machine-generated texts (MGTs) in English. |
Kseniia Petukhova; Roman Kazakov; Ekaterina Kochmar; | arxiv-cs.CL | 2024-04-08 |
1164 | VulnHunt-GPT: A Smart Contract Vulnerabilities Detector Based on OpenAI ChatGPT Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Smart contracts are self-executing programs that can run on a blockchain. Due to the fact of being immutable after their deployment on blockchain, it is crucial to ensure their … |
Biagio Boi; Christian Esposito; Sokjoon Lee; | Proceedings of the 39th ACM/SIGAPP Symposium on Applied … | 2024-04-08 |
1165 | OPSD: An Offensive Persian Social Media Dataset and Its Baseline Evaluations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While numerous datasets in the English language exist in this domain, few equivalent resources are available for the Persian language. To address this gap, this paper introduces two offensive datasets. |
MEHRAN SAFAYANI et. al. | arxiv-cs.CL | 2024-04-08 |
1166 | Use of A Structured Knowledge Base Enhances Metadata Curation By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates the potential of large language models (LLMs), specifically GPT-4, to improve adherence to metadata standards. |
SOWMYA S. SUNDARAM et. al. | arxiv-cs.AI | 2024-04-08 |
1167 | Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While these models demonstrate remarkable performance on general datasets, they can struggle in specialized domains such as medicine, where unique domain-specific terminologies, domain-specific abbreviations, and varying document structures are common. This paper explores strategies for adapting these models to domain-specific requirements, primarily through continuous pre-training on domain-specific data. |
AHMAD IDRISSI-YAGHIR et. al. | arxiv-cs.CL | 2024-04-08 |
1168 | Relation Extraction Using Large Language Models: A Case Study on Acupuncture Point Locations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to compare the performance of GPT with traditional deep learning models (Long Short-Term Memory (LSTM) and Bidirectional Encoder Representations from Transformers for Biomedical Text Mining (BioBERT)) in extracting acupoint-related location relations and assess the impact of pretraining and fine-tuning on GPT’s performance. |
YIMING LI et. al. | arxiv-cs.CL | 2024-04-08 |
1169 | Clinical Trials Protocol Authoring Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This report embarks on a mission to revolutionize clinical trial protocol development through the integration of advanced AI technologies. |
Morteza Maleki; SeyedAli Ghahari; | arxiv-cs.CE | 2024-04-07 |
1170 | PagPassGPT: Pattern Guided Password Guessing Via Generative Pretrained Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Amidst the surge in deep learning-based password guessing models, challenges of generating high-quality passwords and reducing duplicate passwords persist. To address these challenges, we present PagPassGPT, a password guessing model constructed on Generative Pretrained Transformer (GPT). |
XINGYU SU et. al. | arxiv-cs.CR | 2024-04-07 |
1171 | Initial Exploration of Zero-Shot Privacy Utility Tradeoffs in Tabular Data Using GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We investigate the application of large language models (LLMs), specifically GPT-4, to scenarios involving the tradeoff between privacy and utility in tabular data. Our approach … |
Bishwas Mandal; G. Amariucai; Shuangqing Wei; | 2024 International Joint Conference on Neural Networks … | 2024-04-07 |
1172 | RecGPT: Generative Personalized Prompts for Sequential Recommendation Via ChatGPT Training Paradigm Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: ChatGPT has achieved remarkable success in natural language understanding. Considering that recommendation is indeed a conversation between users and the system with items as … |
YABIN ZHANG et. al. | ArXiv | 2024-04-06 |
1173 | Joint Visual and Text Prompting for Improved Object-Centric Perception with Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel approach, Joint Visual and Text Prompting (VTPrompt), that employs fine-grained visual information to enhance the capability of MLLMs in VQA, especially for object-oriented perception. |
SONGTAO JIANG et. al. | arxiv-cs.CL | 2024-04-06 |
1174 | Scope Ambiguities in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite this, there has been little research into how modern large language models treat them. In this paper, we investigate how different versions of certain autoregressive language models — GPT-2, GPT-3/3.5, Llama 2 and GPT-4 — treat scope ambiguous sentences, and compare this with human judgments. |
Gaurav Kamath; Sebastian Schuster; Sowmya Vajjala; Siva Reddy; | arxiv-cs.CL | 2024-04-05 |
1175 | Evaluating LLMs at Detecting Errors in LLM Responses Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work introduces ReaLMistake, the first error detection benchmark consisting of objective, realistic, and diverse errors made by LLMs. |
RYO KAMOI et. al. | arxiv-cs.CL | 2024-04-04 |
1176 | Outlier-Efficient Hopfield Layers for Large Transformer-Based Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an Outlier-Efficient Modern Hopfield Model (termed $\mathrm{OutEffHop}$) and use it to address the outlier inefficiency problem of {training} gigantic transformer-based models. |
JERRY YAO-CHIEH HU et. al. | arxiv-cs.LG | 2024-04-04 |
1177 | Hierarchical Patch Aggregation Transformer for Motion Deblurring Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yujie Wu; Lei Liang; Siyao Ling; Zhisheng Gao; | Neural Process. Lett. | 2024-04-04 |
1178 | Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks? IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Various jailbreak attacks have been proposed to red-team Large Language Models (LLMs) and revealed the vulnerable safeguards of LLMs. Besides, some methods are not limited to the … |
SHUO CHEN et. al. | arxiv-cs.LG | 2024-04-04 |
1179 | NLP at UC Santa Cruz at SemEval-2024 Task 5: Legal Answer Validation Using Few-Shot Multi-Choice QA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present two approaches to solving the task of legal answer validation, given an introduction to the case, a question and an answer candidate. |
Anish Pahilajani; Samyak Rajesh Jain; Devasha Trivedi; | arxiv-cs.CL | 2024-04-03 |
1180 | UTeBC-NLP at SemEval-2024 Task 9: Can LLMs Be Lateral Thinkers? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Through participating in SemEval-2024, task 9, Sentence Puzzle sub-task, we explore prompt engineering methods: chain-of-thought (CoT) and direct prompting, enhancing with informative descriptions, and employing contextualizing prompts using a retrieval augmented generation (RAG) pipeline. |
Pouya Sadeghi; Amirhossein Abaskohi; Yadollah Yaghoobzadeh; | arxiv-cs.CL | 2024-04-03 |
1181 | FGeo-TP: A Language Model-Enhanced Solver for Euclidean Geometry Problems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The application of contemporary artificial intelligence techniques to address geometric problems and automated deductive proofs has always been a grand challenge to the … |
Yiming He; Jia Zou; Xiaokai Zhang; Na Zhu; Tuo Leng; | Symmetry | 2024-04-03 |
1182 | BCAmirs at SemEval-2024 Task 4: Beyond Words: A Multimodal and Multilingual Exploration of Persuasion in Memes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by this, our team engaged in SemEval-2024 Task 4, a hierarchical multi-label classification task designed to identify rhetorical and psychological persuasion techniques embedded within memes. To tackle this problem, we introduced a caption generation step to assess the modality gap and the impact of additional semantic information from images, which improved our result. |
Amirhossein Abaskohi; Amirhossein Dabiriaghdam; Lele Wang; Giuseppe Carenini; | arxiv-cs.CL | 2024-04-03 |
1183 | GPT-DETOX: An In-Context Learning-Based Paraphraser for Text Detoxification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose GPT-DETOX as a framework for prompt-based in-context learning for text detoxification using GPT-3.5 Turbo. |
Ali Pesaranghader; Nikhil Verma; Manasa Bharadwaj; | arxiv-cs.CL | 2024-04-03 |
1184 | Task Agnostic Architecture for Algorithm Induction Via Implicit Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Taking this trend of generalization to the extreme suggests the possibility of a single deep network architecture capable of solving all tasks. This position paper aims to explore developing such a unified architecture and proposes a theoretical framework of how it could be constructed. |
Sahil J. Sindhi; Ignas Budvytis; | arxiv-cs.LG | 2024-04-03 |
1185 | METAL: Towards Multilingual Meta-Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a framework for an end-to-end assessment of LLMs as evaluators in multilingual scenarios. |
Rishav Hada; Varun Gumma; Mohamed Ahmed; Kalika Bali; Sunayana Sitaram; | arxiv-cs.CL | 2024-04-02 |
1186 | Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this way, we achieve 100% attack success rate — according to GPT-4 as a judge — on Vicuna-13B, Mistral-7B, Phi-3-Mini, Nemotron-4-340B, Llama-2-Chat-7B/13B/70B, Llama-3-Instruct-8B, Gemma-7B, GPT-3.5, GPT-4o, and R2D2 from HarmBench that was adversarially trained against the GCG attack. |
Maksym Andriushchenko; Francesco Croce; Nicolas Flammarion; | arxiv-cs.CR | 2024-04-02 |
1187 | SGSH: Stimulate Large Language Models with Skeleton Heuristics for Knowledge Base Question Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose SGSH–a simple and effective framework to Stimulate GPT-3.5 with Skeleton Heuristics to enhance KBQG. |
SHASHA GUO et. al. | arxiv-cs.CL | 2024-04-02 |
1188 | Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first comprehensive benchmarking study of LLMs across diverse Persian language tasks. |
AMIRHOSSEIN ABASKOHI et. al. | arxiv-cs.CL | 2024-04-02 |
1189 | Release of Pre-Trained Models for The Japanese Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: AI democratization aims to create a world in which the average person can utilize AI techniques. |
KEI SAWADA et. al. | arxiv-cs.CL | 2024-04-02 |
1190 | GPT-COPE: A Graph-Guided Point Transformer for Category-Level Object Pose Estimation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Category-level object pose estimation aims to predict the 6D pose and 3D metric size of objects from given categories. Due to significant intra-class shape variations among … |
Lu Zou; Zhangjin Huang; Naijie Gu; Guoping Wang; | IEEE Transactions on Circuits and Systems for Video … | 2024-04-01 |
1191 | Vision Transformer Models for Mobile/edge Devices: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View |
SEUNG IL LEE et. al. | Multim. Syst. | 2024-04-01 |
1192 | Syntactic Robustness for LLM-based Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on prompts that ask for code that generates solutions to variables in an equation, when given coefficients of the equation as input. |
Laboni Sarker; Mara Downing; Achintya Desai; Tevfik Bultan; | arxiv-cs.SE | 2024-04-01 |
1193 | TQRFormer: Tubelet Query Recollection Transformer for Action Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xiangyang Wang; Kun Yang; Qiang Ding; Rui Wang; Jinhua Sun; | Image Vis. Comput. | 2024-04-01 |
1194 | RDTN: Residual Densely Transformer Network for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yan Li; Xiaofei Yang; Dong Tang; Zheng-yang Zhou; | Expert Syst. Appl. | 2024-04-01 |
1195 | ScopeViT: Scale-Aware Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
XUESONG NIE et. al. | Pattern Recognit. | 2024-04-01 |
1196 | Large Language Model Evaluation Via Multi AI Agents: Preliminary Results Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite extensive efforts to examine LLMs from various perspectives, there is a noticeable lack of multi-agent AI models specifically designed to evaluate the performance of different LLMs. To address this gap, we introduce a novel multi-agent AI model that aims to assess and compare the performance of various LLMs. |
Zeeshan Rasheed; Muhammad Waseem; Kari Systä; Pekka Abrahamsson; | arxiv-cs.SE | 2024-04-01 |
1197 | Time Domain Speech Enhancement with CNN and Time-attention Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
N. Saleem; T. S. Gunawan; Sami Dhahbi; Sami Bourouis; | Digit. Signal Process. | 2024-04-01 |
1198 | Advancing AI with Integrity: Ethical Challenges and Solutions in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We identify and address ethical issues through empirical studies. |
Richard Kimera; Yun-Seon Kim; Heeyoul Choi; | arxiv-cs.CL | 2024-04-01 |
1199 | BERT-Enhanced Retrieval Tool for Homework Plagiarism Detection System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In existing research, detection of high level plagiarism is still a challenge due to the lack of high quality datasets. In this paper, we propose a plagiarized text data generation method based on GPT-3.5, which produces 32,927 pairs of text plagiarism detection datasets covering a wide range of plagiarism methods, bridging the gap in this part of research. |
Jiarong Xian; Jibao Yuan; Peiwei Zheng; Dexian Chen; Nie yuntao; | arxiv-cs.CL | 2024-04-01 |
1200 | Leveraging YOLO-World and GPT-4V LMMs for Zero-Shot Person Detection and Action Recognition in Drone Imagery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we explore the potential of zero-shot Large Multimodal Models (LMMs) in the domain of drone perception. |
Christian Limberg; Artur Gonçalves; Bastien Rigault; Helmut Prendinger; | arxiv-cs.CV | 2024-04-01 |
1201 | SIF-TF: A Scene-Interaction Fusion Transformer for Trajectory Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View |
Fei Gao; Wanjun Huang; Libo Weng; Yuanming Zhang; | Knowl. Based Syst. | 2024-04-01 |
1202 | LLM-RadJudge: Achieving Radiologist-Level Evaluation for X-Ray Report Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a novel evaluation framework using large language models (LLMs) to compare radiology reports for assessment. |
Zilong Wang; Xufang Luo; Xinyang Jiang; Dongsheng Li; Lili Qiu; | arxiv-cs.CL | 2024-04-01 |
1203 | TWIN-GPT: Digital Twins for Clinical Trials Via Large Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a large language model-based digital twin creation approach, called TWIN-GPT. |
YUE WANG et. al. | arxiv-cs.LG | 2024-04-01 |
1204 | An Innovative GPT-based Open-source Intelligence Using Historical Cyber Incident Reports Related Papers Related Patents Related Grants Related Venues Related Experts View |
F. Sufi; | Nat. Lang. Process. J. | 2024-04-01 |
1205 | SwinSOD: Salient Object Detection Using Swin-transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Shuang Wu; Guangjian Zhang; Xuefeng Liu; | Image Vis. Comput. | 2024-04-01 |
1206 | Condition-Aware Neural Network for Controlled Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. |
HAN CAI et. al. | arxiv-cs.CV | 2024-04-01 |
1207 | EvoCodeBench: An Evolving Code Generation Benchmark Aligned with Real-World Code Repositories IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a new benchmark – EvoCodeBench to address the preceding problems, which has three primary advances. |
Jia Li; Ge Li; Xuanming Zhang; Yihong Dong; Zhi Jin; | arxiv-cs.CL | 2024-03-31 |
1208 | CHOPS: CHat with CustOmer Profile Systems for Customer Service with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a practical dataset, the CPHOS-dataset, which includes a database, guiding files, and QA pairs collected from CPHOS, an online platform that facilitates the organization of simulated Physics Olympiads for high school teachers and students. |
JINGZHE SHI et. al. | arxiv-cs.CL | 2024-03-31 |
1209 | Edinburgh Clinical NLP at SemEval-2024 Task 2: Fine-tune Your Model Unless You Have Access to GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a PEFT method to improve the consistency of LLMs by merging adapters that were fine-tuned separately using triplet and language modelling objectives. |
Aryo Pradipta Gema; Giwon Hong; Pasquale Minervini; Luke Daines; Beatrice Alex; | arxiv-cs.CL | 2024-03-30 |
1210 | Cross-lingual Named Entity Corpus for Slavic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a corpus manually annotated with named entities for six Slavic languages – Bulgarian, Czech, Polish, Slovenian, Russian, and Ukrainian. |
Jakub Piskorski; Michał Marcińczuk; Roman Yangarber; | arxiv-cs.CL | 2024-03-30 |
1211 | Jetsons at FinNLP 2024: Towards Understanding The ESG Impact of A News Article Using Transformer-based Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we describe the different approaches explored by the Jetsons team for the Multi-Lingual ESG Impact Duration Inference (ML-ESG-3) shared task. |
PARAG PRAVIN DAKLE et. al. | arxiv-cs.CL | 2024-03-30 |
1212 | Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite efforts to create open-source models, their limited parameters often result in insufficient multi-step reasoning capabilities required for solving complex medical problems. To address this, we introduce Meerkat, a new family of medical AI systems ranging from 7 to 70 billion parameters. |
HYUNJAE KIM et. al. | arxiv-cs.CL | 2024-03-30 |
1213 | A Comprehensive Study on NLP Data Augmentation for Hate Speech Detection: Legacy Methods, BERT, and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In pursuit of suitable data augmentation methods, this study explores both established legacy approaches and contemporary practices such as Large Language Models (LLM), including GPT in Hate Speech detection. |
Md Saroar Jahan; Mourad Oussalah; Djamila Romaissa Beddia; Jhuma kabir Mim; Nabil Arhab; | arxiv-cs.CL | 2024-03-30 |
1214 | A Hybrid Transformer and Attention Based Recurrent Neural Network for Robust and Interpretable Sentiment Analysis of Tweets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose TRABSA, a hybrid framework integrating transformer-based architectures, attention mechanisms, and BiLSTM networks to address this. |
Md Abrar Jahin; Md Sakib Hossain Shovon; M. F. Mridha; Md Rashedul Islam; Yutaka Watanobe; | arxiv-cs.CL | 2024-03-30 |
1215 | Transformer Based Pluralistic Image Completion with Reduced Information Loss Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The indices of quantized pixels are used as tokens for the inputs and prediction targets of the transformer. To mitigate these issues, we propose a new transformer based framework called PUT. |
QIANKUN LIU et. al. | arxiv-cs.CV | 2024-03-30 |
1216 | ChatGPT V.s. Media Bias: A Comparative Study of GPT-3.5 and Fine-tuned Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Can ChatGPT detect media bias? This study seeks to answer this question by leveraging the Media Bias Identification Benchmark (MBIB) to assess ChatGPT’s competency in distinguishing six categories of media bias, juxtaposed against fine-tuned models such as BART, ConvBERT, and GPT-2. |
Zehao Wen; Rabih Younes; | arxiv-cs.CL | 2024-03-29 |
1217 | ReALM: Reference Resolution As Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper demonstrates how LLMs can be used to create an extremely effective system to resolve references of various types, by showing how reference resolution can be converted into a language modeling problem, despite involving forms of entities like those on screen that are not traditionally conducive to being reduced to a text-only modality. |
JOEL RUBEN ANTONY MONIZ et. al. | arxiv-cs.CL | 2024-03-29 |
1218 | Classifying Conspiratorial Narratives At Scale: False Alarms and Erroneous Connections Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first large-scale classification study using posts from the most active conspiracy-related Reddit forums and find that only one-third of the posts are classified as positive. |
Ahmad Diab; Rr. Nefriana; Yu-Ru Lin; | arxiv-cs.CL | 2024-03-29 |
1219 | Shallow Cross-Encoders for Low-Latency Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, keeping search latencies low is important for user satisfaction and energy usage. In this paper, we show that weaker shallow transformer models (i.e., transformers with a limited number of layers) actually perform better than full-scale models when constrained to these practical low-latency settings since they can estimate the relevance of more documents in the same time budget. |
Aleksandr V. Petrov; Sean MacAvaney; Craig Macdonald; | arxiv-cs.IR | 2024-03-29 |
1220 | A Review of Multi-Modal Large Language and Vision Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) have recently emerged as a focal point of research and application, driven by their unprecedented ability to understand and generate text with … |
Kilian Carolan; Laura Fennelly; A. Smeaton; | ArXiv | 2024-03-28 |
1221 | Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that off-the-shelf text-based Transformers, with no additional training, can perform few-shot in-context visual imitation learning, mapping visual observations to action sequences that emulate the demonstrator’s behaviour. We achieve this by transforming visual observations (inputs) and trajectories of actions (outputs) into sequences of tokens that a text-pretrained Transformer (GPT-4 Turbo) can ingest and generate, via a framework we call Keypoint Action Tokens (KAT). |
Norman Di Palo; Edward Johns; | arxiv-cs.RO | 2024-03-28 |
1222 | TOMGPT: Reliable Text-Only Training Approach for Cost-Effective Multi-modal Large Language Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Multi-modal large language models (MLLMs), such as GPT-4, exhibit great comprehension capabilities on human instruction, as well as zero-shot ability on new downstream multi-modal … |
YUNKAI CHEN et. al. | ACM Transactions on Knowledge Discovery from Data | 2024-03-28 |
1223 | Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we delve into several mechanisms employed by Transformer-based language models (LLMs) for factual recall tasks. |
ANG LV et. al. | arxiv-cs.CL | 2024-03-28 |
1224 | Decision Mamba: Reinforcement Learning Via Sequence Modeling with Selective State Spaces IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work contributes to the advancement of sequential decision-making models, suggesting that the architecture and training methodology of neural networks can significantly impact their performance in complex tasks, and highlighting the potential of Mamba as a valuable tool for improving the efficacy of Transformer-based models in reinforcement learning scenarios. |
Toshihiro Ota; | arxiv-cs.LG | 2024-03-28 |
1225 | AcTED: Automatic Acquisition of Typical Event Duration for Semi-supervised Temporal Commonsense QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a voting-driven semi-supervised approach to automatically acquire the typical duration of an event and use it as pseudo-labeled data. |
FELIX VIRGO et. al. | arxiv-cs.CL | 2024-03-27 |
1226 | Reshaping Free-Text Radiology Notes Into Structured Reports With Generative Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a pipeline to extract information from free-text radiology reports, that fits with the items of the reference SR registry proposed by a national society of interventional and medical radiology, focusing on CT staging of patients with lymphoma. |
LAURA BERGOMI et. al. | arxiv-cs.CL | 2024-03-27 |
1227 | Evaluating Large Language Models for Health-Related Text Classification Tasks with Public Social Media Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We developed three approaches for leveraging LLMs for text classification: employing LLMs as zero-shot classifiers, using LLMs as annotators to annotate training data for supervised classifiers, and utilizing LLMs with few-shot examples for augmentation of manually annotated data. |
Yuting Guo; Anthony Ovadje; Mohammed Ali Al-Garadi; Abeed Sarker; | arxiv-cs.CL | 2024-03-27 |
1228 | PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a multimodal interactive robot (PhysicsAssistant) built on YOLOv8 object detection, cameras, speech recognition, and chatbot using LLM to provide assistance to students’ physics labs. |
Ehsan Latif; Ramviyas Parasuraman; Xiaoming Zhai; | arxiv-cs.RO | 2024-03-27 |
1229 | 3P-LLM: Probabilistic Path Planning Using Large Language Model for Autonomous Robot Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research assesses the feasibility of using LLM (GPT-3.5-turbo chatbot by OpenAI) for robotic path planning. |
Ehsan Latif; | arxiv-cs.RO | 2024-03-27 |
1230 | The Topos of Transformer Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a theoretical analysis of the expressivity of the transformer architecture through the lens of topos theory. |
Mattia Jacopo Villani; Peter McBurney; | arxiv-cs.LG | 2024-03-27 |
1231 | RankMamba: Benchmarking Mamba’s Document Ranking Performance in The Era of Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we examine Mamba’s efficacy through the lens of a classical IR task — document ranking. |
Zhichao Xu; | arxiv-cs.IR | 2024-03-27 |
1232 | SemRoDe: Macro Adversarial Training to Learn Representations That Are Robust to Word-Level Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial Training strategy to enhance the robustness of LMs. |
Brian Formento; Wenjie Feng; Chuan Sheng Foo; Luu Anh Tuan; See-Kiong Ng; | arxiv-cs.CL | 2024-03-27 |
1233 | A Survey on Large Language Models from Concept to Implementation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent advancements in Large Language Models (LLMs), particularly those built on Transformer architectures, have significantly broadened the scope of natural language processing (NLP) applications, transcending their initial use in chatbot technology. This paper investigates the multifaceted applications of these models, with an emphasis on the GPT series. |
Chen Wang; Jin Zhao; Jiaqi Gong; | arxiv-cs.CL | 2024-03-27 |
1234 | From Traditional Recommender Systems to GPT-Based Chatbots: A Survey of Recent Developments and Future Directions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recommender systems are a key technology for many applications, such as e-commerce, streaming media, and social media. Traditional recommender systems rely on collaborative … |
TAMIM M. AL-HASAN et. al. | Big Data Cogn. Comput. | 2024-03-27 |
1235 | Evaluating The Efficacy of Prompt-Engineered Large Multimodal Models Versus Fine-Tuned Vision Transformers in Image-Based Security Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The success of Large Language Models (LLMs) has led to a parallel rise in the development of Large Multimodal Models (LMMs), which have begun to transform a variety of applications. These sophisticated multimodal models are designed to interpret and analyze complex data by integrating multiple modalities such as text and images, thereby opening new avenues for a range of applications. |
Fouad Trad; Ali Chehab; | arxiv-cs.AI | 2024-03-26 |
1236 | Enhancing Legal Document Retrieval: A Multi-Phase Approach with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research focuses on maximizing the potential of prompting by placing it as the final phase of the retrieval system, preceded by the support of two phases: BM25 Pre-ranking and BERT-based Re-ranking. |
HAI-LONG NGUYEN et. al. | arxiv-cs.CL | 2024-03-26 |
1237 | Verbing Weirds Language (Models): Evaluation of English Zero-Derivation in Five LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We design a task for testing lexical-syntactic flexibility — the degree to which models can generalize over words in a construction with a non-prototypical part of speech. |
David R. Mortensen; Valentina Izrailevitch; Yunze Xiao; Hinrich Schütze; Leonie Weissweiler; | arxiv-cs.CL | 2024-03-26 |
1238 | Decoding Probing: Revealing Internal Linguistic Structures in Neural Language Models Using Minimal Pairs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by cognitive neuroscience studies, we introduce a novel ‘decoding probing’ method that uses the minimal pairs benchmark (BLiMP) to probe internal linguistic characteristics in neural language models layer by layer. |
Linyang He; Peili Chen; Ercong Nie; Yuanning Li; Jonathan R. Brennan; | arxiv-cs.CL | 2024-03-25 |
1239 | State Space Models As Foundation Models: A Control Theoretic Overview Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: In recent years, there has been a growing interest in integrating linear state-space models (SSM) in deep neural network architectures of foundation models. This is exemplified by … |
Carmen Amo Alonso; Jerome Sieber; M. Zeilinger; | ArXiv | 2024-03-25 |
1240 | Grammatical Vs Spelling Error Correction: An Investigation Into The Responsiveness of Transformer-based Language Models Using BART and MarianMT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This project aims at analyzing different kinds of errors that occur in text documents. |
Rohit Raju; Peeta Basa Pati; SA Gandheesh; Gayatri Sanjana Sannala; Suriya KS; | arxiv-cs.CL | 2024-03-25 |
1241 | LLM-Guided Formal Verification Coupled with Mutation Testing IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The increasing complexity of modern hardware designs poses significant challenges for design verification, particularly defining and verifying properties and invariants manually. … |
Muhammad Hassan; Sallar Ahmadi-Pour; Khushboo Qayyum; C. Jha; Rolf Drechsler; | 2024 Design, Automation & Test in Europe Conference & … | 2024-03-25 |
1242 | CYGENT: A Cybersecurity Conversational Agent with Log Summarization Powered By GPT-3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response to the escalating cyber-attacks in the modern IT and IoT landscape, we developed CYGENT, a conversational agent framework powered by GPT-3.5 turbo model, designed to aid system administrators in ensuring optimal performance and uninterrupted resource availability. |
Prasasthy Balasubramanian; Justin Seby; Panos Kostakos; | arxiv-cs.CR | 2024-03-25 |
1243 | Towards Algorithmic Fidelity: Mental Health Representation Across Demographics in Synthetic Vs. Human-generated Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Synthetic data generation has the potential to impact applications and domains with scarce data. However, before such data is used for sensitive tasks such as mental health, we … |
Shinka Mori; Oana Ignat; Andrew Lee; Rada Mihalcea; | International Conference on Language Resources and … | 2024-03-25 |
1244 | GPT-4 Understands Discourse at Least As Well As Humans Do Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We test whether a leading AI system GPT-4 understands discourse as well as humans do, using a standardized test of discourse comprehension. Participants are presented with brief … |
Thomas Shultz; Jamie Wise; Ardavan Salehi Nobandegani; | arxiv-cs.CL | 2024-03-25 |
1245 | Automatic Short Answer Grading for Finnish with ChatGPT Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Automatic short answer grading (ASAG) seeks to mitigate the burden on teachers by leveraging computational methods to evaluate student-constructed text responses. Large language … |
Li-Hsin Chang; Filip Ginter; | AAAI Conference on Artificial Intelligence | 2024-03-24 |
1246 | Proxyformer: Nyström-Based Linear Transformer with Trainable Proxy Tokens Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer-based models have demonstrated remarkable performance in various domains, including natural language processing, image processing and generative modeling. The most … |
Sangho Lee; Hayun Lee; Dongkun Shin; | AAAI Conference on Artificial Intelligence | 2024-03-24 |
1247 | Anomaly Detection and Localization in Optical Networks Using Vision Transformer and SOP Monitoring Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We introduce an innovative vision transformer approach to identify and precisely locate high-risk events, including fiber cut precursors, in state-of-polarization derived … |
K. ABDELLI et. al. | 2024 Optical Fiber Communications Conference and Exhibition … | 2024-03-24 |
1248 | GPT-Enabled Digital Twin Assistant for Multi-task Cooperative Management in Autonomous Optical Network Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A GPT-enabled digital twin (DT) assistant is implemented with the capabilities of intention understanding, analysis, reasoning, and complex multi-task collaboration, which … |
YAO ZHANG et. al. | 2024 Optical Fiber Communications Conference and Exhibition … | 2024-03-24 |
1249 | Reflective Microresonator Based Microwave Photonic Sensor Assisted By Sparse Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We demonstrate a sparse transformer assisted microwave photonic sensor using a microring cascaded with an inverse designed reflector. Even with a small dataset, the … |
XIAOYI TIAN et. al. | 2024 Optical Fiber Communications Conference and Exhibition … | 2024-03-24 |
1250 | Can Language Models Pretend Solvers? Logic Code Simulation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Subsequently, we introduce a pioneering LLM-based code simulation technique, Dual Chains of Logic (DCoL). |
MINYU CHEN et. al. | arxiv-cs.AI | 2024-03-24 |
1251 | A Transformer Approach for Electricity Price Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a novel approach to electricity price forecasting (EPF) using a pure Transformer model. |
Oscar Llorente; Jose Portela; | arxiv-cs.LG | 2024-03-24 |
1252 | LlamBERT: Large-scale Low-cost Data Annotation in NLP Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present LlamBERT, a hybrid approach that leverages LLMs to annotate a small subset of large, unlabeled databases and uses the results for fine-tuning transformer encoders like BERT and RoBERTa. |
Bálint Csanády; Lajos Muzsai; Péter Vedres; Zoltán Nádasdy; András Lukács; | arxiv-cs.CL | 2024-03-23 |
1253 | Using Large Language Models for OntoClean-based Ontology Refinement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the integration of Large Language Models (LLMs) such as GPT-3.5 and GPT-4 into the ontology refinement process, specifically focusing on the OntoClean methodology. |
Yihang Zhao; Neil Vetter; Kaveh Aryan; | arxiv-cs.AI | 2024-03-23 |
1254 | VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For the purpose of future research, CafeBERT is made publicly available for research purposes. |
Phong Nguyen-Thuan Do; Son Quoc Tran; Phu Gia Hoang; Kiet Van Nguyen; Ngan Luu-Thuy Nguyen; | arxiv-cs.CL | 2024-03-23 |
1255 | Evaluating GPT-4’s Proficiency in Addressing Cryptography Examinations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the rapidly advancing domain of artificial intelligence, ChatGPT, powered by the GPT-4 model, has emerged as a state-of-the-art interactive agent, exhibiting substantial … |
Vasily Mikhalev; Nils Kopal; B. Esslinger; | IACR Cryptol. ePrint Arch. | 2024-03-23 |
1256 | On Zero-Shot Counterspeech Generation By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Counterspeech generation is one such key task where efforts are made to develop generative models by fine-tuning LLMs with hatespeech – counterspeech pairs, but none of these attempts explores the intrinsic properties of large language models in zero-shot settings. In this work, we present a comprehensive analysis of the performances of four LLMs namely GPT-2, DialoGPT, ChatGPT and FlanT5 in zero-shot settings for counterspeech generation, which is the first of its kind. |
Punyajoy Saha; Aalok Agrawal; Abhik Jana; Chris Biemann; Animesh Mukherjee; | arxiv-cs.CL | 2024-03-22 |
1257 | Can Large Language Models Explore In-context? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making. |
Akshay Krishnamurthy; Keegan Harris; Dylan J. Foster; Cyril Zhang; Aleksandrs Slivkins; | arxiv-cs.LG | 2024-03-22 |
1258 | Geometry-aware 3D Pose Transfer Using Transformer Autoencoder Related Papers Related Patents Related Grants Related Venues Related Experts View |
Shanghuan Liu; Shaoyan Gai; Feipeng Da; Fazal Waris; | Comput. Vis. Media | 2024-03-22 |
1259 | MasonTigers at SemEval-2024 Task 8: Performance Analysis of Transformer-based Models on Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the MasonTigers entry to the SemEval-2024 Task 8 – Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection. |
SADIYA SAYARA CHOWDHURY PUSPO et. al. | arxiv-cs.CL | 2024-03-22 |
1260 | Comprehensive Evaluation and Insights Into The Use of Large Language Models in The Automation of Behavior-Driven Development Acceptance Test Formulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this manuscript, we propose a novel approach to enhance BDD practices using large language models (LLMs) to automate acceptance test generation. |
SHANTHI KARPURAPU et. al. | arxiv-cs.SE | 2024-03-22 |
1261 | Technical Report: Masked Skeleton Sequence Modeling for Learning Larval Zebrafish Behavior Latent Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this report, we introduce a novel self-supervised learning method for extracting latent embeddings from behaviors of larval zebrafish. |
Lanxin Xu; Shuo Wang; | arxiv-cs.CV | 2024-03-22 |
1262 | GPT-Connect: Interaction Between Text-Driven Human Motion Generator and 3D Scenes in A Training-free Manner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, intuitively training a separate scene-aware motion generator in a supervised way would require a large number of motion samples to be laboriously collected and annotated across a wide range of different 3D scenes. To handle this task in a relatively convenient manner, in this paper we propose a novel GPT-Connect framework. |
Haoxuan Qu; Ziyan Guo; Jun Liu; | arxiv-cs.CV | 2024-03-22 |
1263 | LLM-based Extraction of Contradictions from Patents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper goes one step further, as it presents a method to extract TRIZ contradictions from patent texts based on Prompt Engineering using a generative Large Language Model (LLM), namely OpenAI’s GPT-4. |
Stefan Trapp; Joachim Warschat; | arxiv-cs.CL | 2024-03-21 |
1264 | K-Act2Emo: Korean Commonsense Knowledge Graph for Indirect Emotional Expression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In many literary texts, emotions are indirectly conveyed through descriptions of actions, facial expressions, and appearances, necessitating emotion inference for narrative understanding. In this paper, we introduce K-Act2Emo, a Korean commonsense knowledge graph (CSKG) comprising 1,900 indirect emotional expressions and the emotions inferable from them. |
Kyuhee Kim; Surin Lee; Sangah Lee; | arxiv-cs.CL | 2024-03-21 |
1265 | Extracting Emotion Phrases from Tweets Using BART Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we applied an approach to sentiment analysis based on a question-answering framework. |
Mahdi Rezapour; | arxiv-cs.CL | 2024-03-20 |
1266 | Ax-to-Grind Urdu: Benchmark Dataset for Urdu Fake News Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we curate and contribute the first largest publicly available dataset for Urdu FND, Ax-to-Grind Urdu, to bridge the identified gaps and limitations of existing Urdu datasets in the literature. |
Sheetal Harris; Jinshuo Liu; Hassan Jalil Hadi; Yue Cao; | arxiv-cs.CL | 2024-03-20 |
1267 | Open Access NAO (OAN): A ROS2-based Software Framework for HRI Applications with The NAO Robot Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new software framework for HRI experimentation with the sixth version of the common NAO robot produced by the United Robotics Group. |
Antonio Bono; Kenji Brameld; Luigi D’Alfonso; Giuseppe Fedele; | arxiv-cs.RO | 2024-03-20 |
1268 | Evaluate Chat-GPT’s Programming Capability in Swift Through Real University Exam Questions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this study, we evaluate the programming capabilities of OpenAI’s GPT‐3.5 and GPT‐4 models using Swift‐based exam questions from a third‐year university course. The results … |
Zizhuo Zhang; Lian Wen; Yanfei Jiang; Yongli Liu; | Softw. Pract. Exp. | 2024-03-20 |
1269 | Retina Vision Transformer (RetinaViT): Introducing Scaled Patches Into Vision Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Humans see low and high spatial frequency components at the same time, and combine the information from both to form a visual scene. Drawing on this neuroscientific inspiration, we propose an altered Vision Transformer architecture where patches from scaled down versions of the input image are added to the input of the first Transformer Encoder layer. |
Yuyang Shu; Michael E. Bain; | arxiv-cs.CV | 2024-03-20 |
1270 | AUD-TGN: Advancing Action Unit Detection with Temporal Convolution and GPT-2 in Wild Audiovisual Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional methods for integrating such multimodal information often stumble, leading to less-than-ideal outcomes in the task of facial action unit detection. To overcome these shortcomings, we propose a novel approach utilizing audio-visual multimodal data. |
JUN YU et. al. | arxiv-cs.CV | 2024-03-20 |
1271 | Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a new configuration for encoder-decoder models that improves efficiency on structured output and decomposable tasks where multiple outputs are required for a single shared input. |
BO-RU LU et. al. | arxiv-cs.CL | 2024-03-19 |
1272 | Automated Data Curation for Robust Language Model Fine-Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an automated data curation pipeline CLEAR (Confidence-based LLM Evaluation And Rectification) for instruction tuning datasets, that can be used with any LLM and fine-tuning procedure. |
Jiuhai Chen; Jonas Mueller; | arxiv-cs.CL | 2024-03-19 |
1273 | A Hyperspectral Unmixing Model Using Convolutional Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Sreejam Muraleedhara Bhakthan; L. Agilandeeswari; | Earth Sci. Informatics | 2024-03-19 |
1274 | Generating Automatic Feedback on UI Mockups with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the potential of using large language models for automatic feedback. |
Peitong Duan; Jeremy Warner; Yang Li; Bjoern Hartmann; | arxiv-cs.HC | 2024-03-19 |
1275 | TT-BLIP: Enhancing Fake News Detection Using BLIP and Tri-Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an end-to-end model called TT-BLIP that applies the bootstrapping language-image pretraining for unified vision-language understanding and generation (BLIP) for three types of information: BERT and BLIP-Txt for text, ResNet and BLIP-Img for images, and bidirectional BLIP encoders for multimodal information. |
Eunjee Choi; Jong-Kook Kim; | arxiv-cs.LG | 2024-03-19 |
1276 | Navigating Compiler Errors with AI Assistance — A Study of GPT Hints in An Introductory Programming Course Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We examined the efficacy of AI-assisted learning in an introductory programming course at the university level by using a GPT-4 model to generate personalized hints for compiler errors within a platform for automated assessment of programming assignments. |
Maciej Pankiewicz; Ryan S. Baker; | arxiv-cs.SE | 2024-03-19 |
1277 | LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The challenge is that information entropy may be a suboptimal compression metric: (i) it only leverages unidirectional context and may fail to capture all essential information needed for prompt compression; (ii) it is not aligned with the prompt compression objective. To address these issues, we propose a data distillation procedure to derive knowledge from an LLM to compress prompts without losing crucial information, and meantime, introduce an extractive text compression dataset. |
ZHUOSHI PAN et. al. | arxiv-cs.CL | 2024-03-19 |
1278 | Navigating Compiler Errors with AI Assistance – A Study of GPT Hints in An Introductory Programming Course Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We examined the efficacy of AI-assisted learning in an introductory programming course at the university level by using a GPT-4 model to generate personalized hints for compiler … |
Maciej Pankiewicz; Ryan S. Baker; | Proceedings of the 2024 on Innovation and Technology in … | 2024-03-19 |
1279 | End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a neuro-symbolic framework for jointly learning structured states and symbolic policies, whose key idea is to distill the vision foundation model into an efficient perception module and refine it during policy learning. |
LIRUI LUO et. al. | arxiv-cs.AI | 2024-03-19 |
1280 | GPT-4 As Evaluator: Evaluating Large Language Models on Pest Management in Agriculture Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the rapidly evolving field of artificial intelligence (AI), the application of large language models (LLMs) in agriculture, particularly in pest management, remains nascent. We … |
SHANGLONG YANG et. al. | ArXiv | 2024-03-18 |
1281 | CICLe: Conformal In-Context Learning for Largescale Multi-Class Food Risk Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We describe the dataset and benchmark naive, traditional, and Transformer models. |
Korbinian Randl; John Pavlopoulos; Aron Henriksson; Tony Lindgren; | arxiv-cs.CL | 2024-03-18 |
1282 | Shifting The Lens: Detecting Malicious Npm Packages Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The goal of this study is to assist security analysts in detecting malicious packages through the empirical study of using Large Language Models (LLMs) to detect malicious code in the npm ecosystem. |
Nusrat Zahan; Philipp Burckhardt; Mikola Lysenko; Feross Aboukhadijeh; Laurie Williams; | arxiv-cs.CR | 2024-03-18 |
1283 | Collage Prompting: Budget-Friendly Visual Recognition with GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite its impressive capabilities, the financial cost associated with GPT-4V’s inference presents a substantial barrier for its wide use. To address this challenge, our work introduces Collage Prompting, a budget-friendly prompting approach that concatenates multiple images into a single visual input. |
Siyu Xu; Yunke Wang; Daochang Liu; Chang Xu; | arxiv-cs.CV | 2024-03-18 |
1284 | How Far Are We on The Decision-Making of LLMs? Evaluating LLMs’ Gaming Ability in Multi-Agent Environments IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce GAMA(γ)-Bench, a new framework for evaluating LLMs’ Gaming Ability in Multi-Agent environments. |
JEN-TSE HUANG et. al. | arxiv-cs.AI | 2024-03-18 |
1285 | NotebookGPT – Facilitating and Monitoring Explicit Lightweight Student GPT Help Requests During Programming Exercises Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The success of GPT with coding tasks has made it important to consider the impact of GPT and similar models on teaching programming. Students’ use of GPT to solve programming … |
Samuel D George; P. Dewan; | Companion Proceedings of the 29th International Conference … | 2024-03-18 |
1286 | Human-AI Collaboration in A Student Discussion Forum Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The recent public releases of AI tools such as ChatGPT have forced computer science educators to reconsider how they teach. These tools have demonstrated considerable ability to … |
Mason Laney; P. Dewan; | Companion Proceedings of the 29th International Conference … | 2024-03-18 |
1287 | Prompt-based and Fine-tuned GPT Models for Context-Dependent and -Independent Deductive Coding in Social Annotation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: GPT has demonstrated impressive capabilities in executing various natural language processing (NLP) and reasoning tasks, showcasing its potential for deductive coding in social … |
CHENYU HOU et. al. | Proceedings of the 14th Learning Analytics and Knowledge … | 2024-03-18 |
1288 | AI-Generated Text Detector for Arabic Language Using Encoder-Based Transformer Architecture Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The effectiveness of existing AI detectors is notably hampered when processing Arabic texts. This study introduces a novel AI text classifier designed specifically for Arabic, … |
Hamed Alshammari; Ahmed El-Sayed; Khaled Elleithy; | Big Data Cogn. Comput. | 2024-03-18 |
1289 | Evaluating Named Entity Recognition: A Comparative Analysis of Mono- and Multilingual Transformer Models on A Novel Brazilian Corporate Earnings Call Transcripts Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our study aimed to evaluate their performance on a financial Named Entity Recognition (NER) task and determine the computational requirements for fine-tuning and inference. |
Ramon Abilio; Guilherme Palermo Coelho; Ana Estela Antunes da Silva; | arxiv-cs.CL | 2024-03-18 |
1290 | An Empirical Study on JIT Defect Prediction Based on BERT-style Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we perform a systematic empirical study to understand the impact of the settings of the fine-tuning process on BERT-style pre-trained model for JIT defect prediction. |
Yuxiang Guo; Xiaopeng Gao; Bo Jiang; | arxiv-cs.SE | 2024-03-17 |
1291 | Embracing The Generative AI Revolution: Advancing Tertiary Education in Cybersecurity with GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigated the impact of GPTs, specifically ChatGPT, on tertiary education in cybersecurity, and provided recommendations for universities to adapt their curricula to meet the evolving needs of the industry. |
Raza Nowrozy; David Jam; | arxiv-cs.CY | 2024-03-17 |
1292 | Reasoning in Transformers – Mitigating Spurious Correlations and Reasoning Shortcuts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we investigate to what extent transformers can be trained to a) approximate reasoning in propositional logic while b) avoiding known reasoning shortcuts via spurious correlations in the training data. |
Daniel Enström; Viktor Kjellberg; Moa Johansson; | arxiv-cs.LG | 2024-03-17 |
1293 | Large Language Model-powered Chatbots for Internationalizing Student Support in Higher Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This research explores the integration of chatbot technology powered by GPT-3.5 and GPT-4 Turbo into higher education to enhance internationalization and leverage digital … |
Achraf Hsain; H. E. Housni; | ArXiv | 2024-03-16 |
1294 | Using An LLM to Turn Sign Spottings Into Spoken Language Sentences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a hybrid SLT approach, Spotter+GPT, that utilizes a sign spotter and a powerful Large Language Model (LLM) to improve SLT performance. |
Ozge Mercanoglu Sincan; Necati Cihan Camgoz; Richard Bowden; | arxiv-cs.CV | 2024-03-15 |
1295 | ATOM: Asynchronous Training of Massive Models for Deep Learning in A Decentralized Environment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce ATOM, a resilient distributed training framework designed for asynchronous training of vast models in a decentralized setting using cost-effective hardware, including consumer-grade GPUs and Ethernet. |
Xiaofeng Wu; Jia Rao; Wei Chen; | arxiv-cs.DC | 2024-03-15 |
1296 | Sabiá-2: A New Generation of Portuguese Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Sabiá-2, a family of large language models trained on Portuguese texts. |
Thales Sales Almeida; Hugo Abonizio; Rodrigo Nogueira; Ramon Pires; | arxiv-cs.CL | 2024-03-14 |
1297 | Evaluating LLMs for Gender Disparities in Notable Persons Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study examines the use of Large Language Models (LLMs) for retrieving factual information, addressing concerns over their propensity to produce factually incorrect hallucinated responses or to decline to answer a prompt at all. |
Lauren Rhue; Sofie Goethals; Arun Sundararajan; | arxiv-cs.CL | 2024-03-14 |
1298 | The Future of The Error Message: Comparing Large Language Models and Novice Programmer Effectiveness in Fixing Errors Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Research on enhancing error message presentation is of great interest to teachers and developers alike because improving Integrated Development Environments (IDEs) increases early … |
Brij Howard-Sarin; | Proceedings of the 55th ACM Technical Symposium on Computer … | 2024-03-14 |
1299 | Evaluating Large Language Model Code Generation As An Autograding Mechanism for Explain in Plain English Questions IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The ability of students to ”Explain in Plain English” (EiPE) the purpose of code is a critical skill for students in introductory programming courses to develop. EiPE questions … |
David H. Smith; C. Zilles; | Proceedings of the 55th ACM Technical Symposium on Computer … | 2024-03-14 |
1300 | AI on AI: Exploring The Utility of GPT As An Expert Annotator of AI Publications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our results indicate that with effective prompt engineering, chatbots can be used as reliable data annotators even where subject-area expertise is required. To evaluate the utility of chatbot-annotated datasets on downstream classification tasks, we train a new classifier on GPT-labeled data and compare its performance to the arXiv-trained model. |
Autumn Toney-Wails; Christian Schoeberl; James Dunham; | arxiv-cs.CL | 2024-03-14 |
1301 | Automatic Classification of Multi-attributes from Person Images Using GPT-4 Vision Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Classifying multi-attributes is gaining interest in the research and business community, especially for person re-identification (ReID) and fashion trend analysis. However, manual … |
Yusei Fujimoto; Khayrul Bashar; | Proceedings of the 2024 6th International Conference on … | 2024-03-14 |
1302 | A Hierarchical Underwater Acoustic Target Recognition Method Based on Transformer and Transfer Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Underwater acoustic target recognition (UATR) is one of the essential research directions in the underwater acoustic signal processing field. The machine learning-based … |
Lu Chen; Xinwei Luo; Hanlu Zhou; | Proceedings of the 2024 6th International Conference on … | 2024-03-14 |
1303 | Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Targeting at VL PEFT tasks, we propose a family of operations, called routing functions, to enhance VL alignment in the low-rank bottlenecks. |
Tingyu Qu; Tinne Tuytelaars; Marie-Francine Moens; | arxiv-cs.CV | 2024-03-14 |
1304 | ViTCN: Vision Transformer Contrastive Network For Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Zhang et al. proposed a dataset called RAVEN which can be used to test a machine learning model’s abstract reasoning ability. In this paper, we propose the Vision Transformer Contrastive Network, which builds on previous work with the Contrastive Perceptual Inference network (CoPiNet), which set a new benchmark for permutation-invariant models on Raven’s Progressive Matrices by incorporating contrast effects from psychology, cognition, and education, and extends this foundation by leveraging the cutting-edge Vision Transformer architecture. |
Bo Song; Yuanhao Xu; Yichao Wu; | arxiv-cs.CV | 2024-03-14 |
1305 | Evaluating The Application of Large Language Models to Generate Feedback in Programming Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study investigates the application of large language models, specifically GPT-4, to enhance programming education. The research outlines the design of a web application that … |
Sven Jacobs; Steffen Jaschke; | 2024 IEEE Global Engineering Education Conference (EDUCON) | 2024-03-13 |
1306 | Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare four of the currently most relevant large, web-crawled corpora (CC100, MaCoCu, mC4 and OSCAR) across eleven lower-resourced European languages. |
RIK VAN NOORD et. al. | arxiv-cs.CL | 2024-03-13 |
1307 | A Continued Pretrained LLM Approach for Automatic Medical Note Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce HEAL, the first continuously trained 13B LLaMA2-based LLM that is purpose-built for medical conversations and measured on automated scribing. |
DONG YUAN et. al. | arxiv-cs.CL | 2024-03-13 |
1308 | GPT, Ontology, and CAABAC: A Tripartite Personalized Access Control Model Anchored By Compliance, Context and Attribute Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents the GPT-Onto-CAABAC framework, integrating Generative Pretrained Transformer (GPT), medical-legal ontologies and Context-Aware Attribute-Based Access Control (CAABAC) to enhance EHR access security. |
Raza Nowrozy; Khandakar Ahmed; Hua Wang; | arxiv-cs.CY | 2024-03-13 |
1309 | Distilling Named Entity Recognition Models for Endangered Species from Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Natural language processing (NLP) practitioners are leveraging large language models (LLM) to create structured datasets from semi-structured and unstructured data sources such as … |
Jesse Atuhurra; Seiveright Cargill Dujohn; Hidetaka Kamigaito; Hiroyuki Shindo; Taro Watanabe; | ArXiv | 2024-03-13 |
1310 | Rethinking ASTE: A Minimalist Tagging Scheme Alongside Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing approaches to ASTE often complicate the task with additional structures or external data. In this research, we propose a novel tagging scheme and employ a contrastive learning approach to mitigate these challenges. |
Qiao Sun; Liujia Yang; Minghao Ma; Nanyang Ye; Qinying Gu; | arxiv-cs.CL | 2024-03-12 |
1311 | Pose Pattern Mining Using Transformer for Motion Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Seo-El Lee; Hyun Yoo; Kyungyong Chung; | Appl. Intell. | 2024-03-12 |
1312 | Using Generative AI to Improve The Performance and Interpretability of Rule-Based Diagnosis of Type 2 Diabetes Mellitus Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Introduction: Type 2 diabetes mellitus is a major global health concern, but interpreting machine learning models for diagnosis remains challenging. This study investigates … |
Leon Kopitar; Iztok Fister; Gregor Stiglic; | Inf. | 2024-03-12 |
1313 | The Future of Document Indexing: GPT and Donut Revolutionize Table of Content Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Industrial projects rely heavily on lengthy, complex specification documents, making tedious manual extraction of structured information a major bottleneck. This paper introduces an innovative approach to automate this process, leveraging the capabilities of two cutting-edge AI models: Donut, a model that extracts information directly from scanned documents without OCR, and OpenAI GPT-3.5 Turbo, a robust large language model. |
Degaga Wolde Feyisa; Haylemicheal Berihun; Amanuel Zewdu; Mahsa Najimoghadam; Marzieh Zare; | arxiv-cs.IR | 2024-03-12 |
1314 | GPT-generated Text Detection: Benchmark Dataset and Tensor-based Detection Method Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present GPT Reddit Dataset (GRiD), a novel Generative Pretrained Transformer (GPT)-generated text detection dataset designed to assess the performance of detection models in identifying generated responses from ChatGPT. |
Zubair Qazi; William Shiao; Evangelos E. Papalexakis; | arxiv-cs.CL | 2024-03-12 |
1315 | In-context Learning Enables Multimodal Large Language Models to Classify Cancer Pathology Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In language processing, in-context learning provides an alternative, where models learn from within prompts, bypassing the need for parameter updates. |
DYKE FERBER et. al. | arxiv-cs.CV | 2024-03-12 |
1316 | SIFiD: Reassess Summary Factual Inconsistency Detection with LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we reassess summary inconsistency detection with LLMs, comparing the performances of GPT-3.5 and GPT-4. |
JIUDING YANG et. al. | arxiv-cs.CL | 2024-03-12 |
1317 | Stealing Part of A Production Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI’s ChatGPT or Google’s PaLM-2. |
NICHOLAS CARLINI et. al. | arxiv-cs.CR | 2024-03-11 |
1318 | Development of A Reliable and Accessible Caregiving Language Model (CaLM) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we focused on caregivers of individuals with Alzheimer’s Disease Related Dementias. |
BAMBANG PARMANTO et. al. | arxiv-cs.CL | 2024-03-11 |
1319 | Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We use these in another set of transformer encoder layers to learn the inter-chunk representations. We analyze the adaptability of Large Language Models (LLMs) with multi-billion parameters (GPT-Neo, and GPT-J) with the hierarchical framework of MESc and compare them with their standalone performance on legal texts. |
Nishchal Prasad; Mohand Boughanem; Taoufiq Dkaki; | arxiv-cs.CL | 2024-03-11 |
1320 | Language, Camera, Autonomy! Prompt-engineered Robot Control for Rapidly Evolving Deployment Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The Context-observant LLM-Enabled Autonomous Robots (CLEAR) platform offers a general solution for large language model (LLM)-enabled robot autonomy. CLEAR-controlled robots use … |
JACOB P. MACDONALD et. al. | Companion of the 2024 ACM/IEEE International Conference on … | 2024-03-11 |
1321 | QuantTune: Optimizing Model Quantization with Adaptive Outlier-Driven Fine Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, significant challenges arise during post-training linear quantization, leading to noticeable reductions in inference accuracy. Our study focuses on uncovering the underlying causes of these accuracy drops and proposing a quantization-friendly fine-tuning method, QuantTune. |
JIUN-MAN CHEN et. al. | arxiv-cs.CV | 2024-03-11 |
1322 | JayBot — Aiding University Students and Admission with An LLM-based Chatbot Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This demo paper presents JayBot, an LLM-based chatbot system aimed at enhancing the user experience of prospective and current students, faculty, and staff at a UK university. The … |
Julius Odede; Ingo Frommholz; | Proceedings of the 2024 Conference on Human Information … | 2024-03-10 |
1323 | LLMs Still Can’t Avoid Instanceof: An Investigation Into GPT-3.5, GPT-4 and Bard’s Capacity to Handle Object-Oriented Programming Assignments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we experimented with three prominent LLMs – GPT-3.5, GPT-4, and Bard – to solve real-world OOP exercises used in educational settings, subsequently validating their solutions using an Automatic Assessment Tool (AAT). |
Bruno Pereira Cipriano; Pedro Alves; | arxiv-cs.SE | 2024-03-10 |
1324 | Enhancing Human Annotation: Leveraging Large Language Models and Efficient Batch Processing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) are capable of assessing document and query characteristics, including relevance, and are now being used for a variety of different classification … |
Oleg Zendel; J. Culpepper; Falk Scholer; Paul Thomas; | Proceedings of the 2024 Conference on Human Information … | 2024-03-10 |
1325 | GPT As Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Multimodal large language models (MLLMs) are designed to process and integrate information from multiple sources, such as text, speech, images, and videos. |
HAO LU et. al. | arxiv-cs.CV | 2024-03-09 |
1326 | A Dataset and Benchmark for Hospital Course Summarization with Adapted Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel pre-processed dataset, the MIMIC-IV-BHC, encapsulating clinical note and brief hospital course (BHC) pairs to adapt LLMs for BHC synthesis. |
ASAD AALI et. al. | arxiv-cs.CL | 2024-03-08 |
1327 | How Well Do Multi-modal LLMs Interpret CT Scans? An Auto-Evaluation Framework for Analyses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this is challenging mainly due to the scarcity of adequate datasets and reference standards for evaluation. This study aims to bridge this gap by introducing a novel evaluation framework, named “GPTRadScore”. |
QINGQING ZHU et. al. | arxiv-cs.AI | 2024-03-08 |
1328 | To Err Is Human, But Llamas Can Learn It Too Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study explores enhancing grammatical error correction (GEC) through artificial error generation (AEG) using language models (LMs). |
Agnes Luhtaru; Taido Purason; Martin Vainikko; Maksym Del; Mark Fishel; | arxiv-cs.CL | 2024-03-08 |
1329 | Will GPT-4 Run DOOM? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that GPT-4’s reasoning and planning capabilities extend to the 1993 first-person shooter Doom. |
Adrian de Wynter; | arxiv-cs.CL | 2024-03-08 |
1330 | RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We explore how iteratively revising a chain of thoughts with the help of information retrieval significantly improves large language models’ reasoning and generation ability in long-horizon generation tasks, while substantially mitigating hallucination. |
ZIHAO WANG et. al. | arxiv-cs.CL | 2024-03-08 |
1331 | Electron Density-based GPT for Optimization and Suggestion of Host–guest Binders Related Papers Related Patents Related Grants Related Venues Related Experts View |
JUAN MANUEL PARRILLA GUTIERREZ et. al. | Nature Computational Science | 2024-03-08 |
1332 | The Impact of Quantization on The Robustness of Transformer-based Text Classifiers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the effect of quantization on the robustness of Transformer-based models. |
Seyed Parsa Neshaei; Yasaman Boreshban; Gholamreza Ghassem-Sani; Seyed Abolghasem Mirroshandel; | arxiv-cs.CL | 2024-03-08 |
1333 | Using GPT-4 to Provide Tiered, Formative Code Feedback Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) have shown promise in generating sensible code explanation and feedback in programming exercises. In this experience report, we discuss the process of … |
Ha Nguyen; Vicki Allan; | Proceedings of the 55th ACM Technical Symposium on Computer … | 2024-03-07 |
1334 | An In-depth Evaluation of GPT-4 in Sentence Simplification with Error-based Human Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We design an error-based human annotation framework to assess the GPT-4’s simplification capabilities. |
Xuanxin Wu; Yuki Arase; | arxiv-cs.CL | 2024-03-07 |
1335 | A Large Scale RCT on Effective Error Messages in CS1 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, we evaluate the most effective error message types through a large-scale randomized controlled trial conducted in an open-access, online introductory computer … |
Sierra Wang; John C. Mitchell; C. Piech; | Proceedings of the 55th ACM Technical Symposium on Computer … | 2024-03-07 |
1336 | Feedback-Generation for Programming Exercises With GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Ever since Large Language Models (LLMs) and related applications have become broadly available, several studies investigated their potential for assisting educators and supporting students in higher education. |
Imen Azaiz; Natalie Kiesler; Sven Strickroth; | arxiv-cs.AI | 2024-03-07 |
1337 | Exploring The Impact of Generative AI for StandUp Report Recommendations in Software Capstone Project Development Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: StandUp Reports play an important role in capstone software engineering courses, facilitating progress tracking, obstacle identification, and team collaboration. However, despite … |
ANDRÉS NEYEM et. al. | Proceedings of the 55th ACM Technical Symposium on Computer … | 2024-03-07 |
1338 | Federated Recommendation Via Hybrid Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose GPT-FedRec, a federated recommendation framework leveraging ChatGPT and a novel hybrid Retrieval Augmented Generation (RAG) mechanism. |
Huimin Zeng; Zhenrui Yue; Qian Jiang; Dong Wang; | arxiv-cs.IR | 2024-03-07 |
1339 | Probabilistic Topic Modelling with Transformer Representations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose the Transformer-Representation Neural Topic Model (TNTM), which combines the benefits of topic representations in transformer-based embedding spaces and probabilistic modelling. |
Arik Reuter; Anton Thielmann; Christoph Weisser; Benjamin Säfken; Thomas Kneib; | arxiv-cs.LG | 2024-03-06 |
1340 | Whodunit: Classifying Code As Human Authored or GPT-4 Generated- A Case Study on CodeChef Problems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Artificial intelligence (AI) assistants such as GitHub Copilot and ChatGPT, built on large language models like GPT-4, are revolutionizing how programming tasks are performed, … |
Oseremen Joy Idialu; N. Mathews; Rungroj Maipradit; J. Atlee; Mei Nagappan; | 2024 IEEE/ACM 21st International Conference on Mining … | 2024-03-06 |
1341 | Whodunit: Classifying Code As Human Authored or GPT-4 Generated — A Case Study on CodeChef Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study shows that code stylometry is a promising approach for distinguishing between GPT-4 generated code and human-authored code. |
Oseremen Joy Idialu; Noble Saji Mathews; Rungroj Maipradit; Joanne M. Atlee; Mei Nagappan; | arxiv-cs.SE | 2024-03-06 |
1342 | Assessing The Aesthetic Evaluation Capabilities of GPT-4 with Vision: Insights from Group and Individual Assessments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, it has been recognized that large language models demonstrate high performance on various intellectual tasks. |
Yoshia Abe; Tatsuya Daikoku; Yasuo Kuniyoshi; | arxiv-cs.AI | 2024-03-06 |
1343 | Can Large Language Models Do Analytical Reasoning? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the analytical reasoning capabilities of cutting-edge Large Language Models on sports data. |
YEBOWEN HU et. al. | arxiv-cs.CL | 2024-03-06 |
1344 | Designing Informative Metrics for Few-Shot Example Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a complexity-based prompt selection approach for sequence tagging tasks. |
Rishabh Adiga; Lakshminarayanan Subramanian; Varun Chandrasekaran; | arxiv-cs.CL | 2024-03-06 |
1345 | Rapidly Developing High-quality Instruction Data and Evaluation Benchmark for Large Language Models with Minimal Human Effort: A Case Study on Japanese Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Instead of following the popular practice of directly translating existing English resources into Japanese (e.g., Japanese-Alpaca), we propose an efficient self-instruct method based on GPT-4. |
YIKUN SUN et. al. | arxiv-cs.CL | 2024-03-06 |
1346 | Japanese-English Sentence Translation Exercises Dataset for Automatic Grading Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the task of automatic assessment of Sentence Translation Exercises (STEs), which have been used in the early stages of L2 language learning. |
NAOKI MIURA et. al. | arxiv-cs.CL | 2024-03-05 |
1347 | AI Insights: A Case Study on Utilizing ChatGPT Intelligence for Research Paper Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper discusses the effectiveness of leveraging Chat Generative Pre-trained Transformer (ChatGPT) versions 3.5 and 4 for analyzing research papers to support the effective writing of scientific literature surveys. |
Anjalee De Silva; Janaka L. Wijekoon; Rashini Liyanarachchi; Rrubaa Panchendrarajan; Weranga Rajapaksha; | arxiv-cs.AI | 2024-03-05 |
1348 | Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled By GPT-4 for Enhanced Interpretability and Public Engagement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, the public must navigate complex socio-cultural and institutional factors, which often hinders their understanding of flood risks. To overcome these challenges, our study introduces an innovative solution: a customized AI Assistant powered by the GPT-4 Large Language Model. |
Rafaela Martelo; Ruo-Qian Wang; | arxiv-cs.AI | 2024-03-05 |
1349 | An Empirical Study of LLM-as-a-Judge for LLM Evaluation: Fine-tuned Judge Model Is Not A General Substitute for GPT-4 IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recently, there has been a growing trend of utilizing Large Language Model (LLM) to evaluate the quality of other LLMs. |
HUI HUANG et. al. | arxiv-cs.CL | 2024-03-05 |
1350 | JMI at SemEval 2024 Task 3: Two-step Approach for Multimodal ECAC Using In-context Learning with GPT and Instruction-tuned Llama Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents our system development for SemEval-2024 Task 3: The Competition of Multimodal Emotion Cause Analysis in Conversations. |
Mohammed Abbas Ansari; Chandni Saxena; Tanvir Ahmad; | arxiv-cs.CL | 2024-03-05 |
1351 | Evolution Transformer: In-Context Evolutionary Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: An alternative promising approach is to leverage data and directly discover powerful optimization principles via meta-optimization. In this work, we follow such a paradigm and introduce Evolution Transformer, a causal Transformer architecture, which can flexibly characterize a family of Evolution Strategies. |
Robert Tjarko Lange; Yingtao Tian; Yujin Tang; | arxiv-cs.AI | 2024-03-05 |
1352 | Predicting Learning Performance with Large Language Models: A Study in Adult Literacy Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Intelligent Tutoring Systems (ITSs) have significantly enhanced adult literacy training, a key factor for societal participation, employment opportunities, and lifelong learning. … |
LIANG ZHANG et. al. | ArXiv | 2024-03-04 |
1353 | Automated Generation of Multiple-Choice Cloze Questions for Assessing English Vocabulary Using GPT-turbo 3.5 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A common way of assessing language learners’ mastery of vocabulary is via multiple-choice cloze (i.e., fill-in-the-blank) questions. But the creation of test items can be … |
Qiao Wang; Ralph L. Rose; Naho Orita; Ayaka Sugawara; | ArXiv | 2024-03-04 |
1354 | Using LLMs for The Extraction and Normalization of Product Attribute Values Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Web Data Commons – Product Attribute Value Extraction (WDC-PAVE) benchmark dataset for our experiments. |
Alexander Brinkmann; Nick Baumann; Christian Bizer; | arxiv-cs.CL | 2024-03-04 |
1355 | Transformers for Supervised Online Continual Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformers have become the dominant architecture for sequence modeling tasks such as natural language processing or audio processing, and they are now even considered for tasks … |
J. Bornschein; Yazhe Li; Amal Rannen-Triki; | ArXiv | 2024-03-03 |
1356 | What Is Missing in Multilingual Visual Reasoning and How to Fix It Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: NLP models today strive for supporting multiple languages and modalities, improving accessibility for diverse users. In this paper, we evaluate their multilingual, multimodal … |
Yueqi Song; Simran Khanuja; Graham Neubig; | ArXiv | 2024-03-03 |
1357 | An LPDDR-based CXL-PNM Platform for TCO-efficient Inference of Transformer-based Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer-based large language models (LLMs) such as Generative Pre-trained Transformer (GPT) have become popular due to their remarkable performance across diverse … |
SANGSOO PARK et. al. | 2024 IEEE International Symposium on High-Performance … | 2024-03-02 |
1358 | LM4OPT: Unveiling The Potential of Large Language Models in Formulating Mathematical Optimization Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the rapidly evolving field of natural language processing, the translation of linguistic descriptions into mathematical formulation of optimization problems presents a formidable challenge, demanding intricate understanding and processing capabilities from Large Language Models (LLMs). This study compares prominent LLMs, including GPT-3.5, GPT-4, and Llama-2-7b, in zero-shot and one-shot settings for this task. |
Tasnim Ahmed; Salimur Choudhury; | arxiv-cs.CL | 2024-03-02 |
1359 | Analysis of Privacy Leakage in Federated Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: With the rapid adoption of Federated Learning (FL) as the training and tuning protocol for applications utilizing Large Language Models (LLMs), recent research highlights the need … |
Minh N. Vu; Truc D. T. Nguyen; Tre’ R. Jeter; My T. Thai; | International Conference on Artificial Intelligence and … | 2024-03-02 |
1360 | Using GPT and Authentic Contextual Recognition to Generate Math Word Problems with Difficulty Levels Related Papers Related Patents Related Grants Related Venues Related Experts View |
Wu-Yuin Hwang; Ika Qutsiati Utami; | Educ. Inf. Technol. | 2024-03-02 |
1361 | Low-light Images Enhancement Via A Dense Transformer Network Related Papers Related Patents Related Grants Related Venues Related Experts View |
YI HUANG et. al. | Digit. Signal Process. | 2024-03-01 |
1362 | Attention Combined Pyramid Vision Transformer for Polyp Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xiaogang Liu; Shuang Song; | Biomed. Signal Process. Control. | 2024-03-01 |
1363 | WaterFormer: A Global–Local Transformer for Underwater Image Enhancement With Environment Adaptor Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Underwater image enhancement (UIE) is crucial for high-level vision in underwater robotics. While convolutional neural networks (CNNs) have made significant achievements in UIE, … |
JUNJIE WEN et. al. | IEEE Robotics & Automation Magazine | 2024-03-01 |
1364 | Spikeformer: Training High-performance Spiking Neural Network with Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yudong Li; Yunlin Lei; Xu Yang; | Neurocomputing | 2024-03-01 |
1365 | LCDFormer: Long-term Correlations Dual-graph Transformer for Traffic Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jiongbiao Cai; Chia-Hung Wang; Kun Hu; | Expert Syst. Appl. | 2024-03-01 |
1366 | PPTtrack: Pyramid Pooling Based Transformer Backbone for Visual Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jun Wang; Shuai Yang; Yuanyun Wang; Guang Yang; | Expert Syst. Appl. | 2024-03-01 |
1367 | Probabilistic Gear Fatigue Life Prediction Based on Physics-informed Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yang Li; Huaiju Liu; Yiming Chen; Difa Chen; | Expert Syst. Appl. | 2024-03-01 |
1368 | PWDformer: Deformable Transformer for Long-term Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zheng Wang; Haowei Ran; Jinchang Ren; Meijun Sun; | Pattern Recognit. | 2024-03-01 |
1369 | Spatial–Temporal Synchronous Transformer for Skeleton-Based Hand Gesture Recognition Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Capturing the long-range spatial-temporal correlation among joints of dynamic skeletal data efficiently is very challenging in hand gesture recognition (HGR). The flexibility of … |
Dongdong Zhao; Hongli Li; Shi Yan; | IEEE Transactions on Circuits and Systems for Video … | 2024-03-01 |
1370 | Driver Distraction Detection Using Semi-supervised Lightweight Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Adam A.Q. Mohammed; Xin Geng; Jing Wang; Zafar Ali; | Eng. Appl. Artif. Intell. | 2024-03-01 |
1371 | Transformer Based on The Prediction of Psoriasis Severity Treatment Response Related Papers Related Patents Related Grants Related Venues Related Experts View |
Cho-I Moon; Eun Bin Kim; Yoosang Baek; Onesok Lee; | Biomed. Signal Process. Control. | 2024-03-01 |
1372 | LAB: Large-Scale Alignment for ChatBots IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces LAB (Large-scale Alignment for chatBots), a novel methodology designed to overcome the scalability challenges in the instruction-tuning phase of large language model (LLM) training. |
SHIVCHANDER SUDALAIRAJ et. al. | arxiv-cs.CL | 2024-03-01 |
1373 | A Point Contextual Transformer Network for Point Cloud Completion Related Papers Related Patents Related Grants Related Venues Related Experts View |
Siyi Leng; Zhenxin Zhang; Liqiang Zhang; | Expert Syst. Appl. | 2024-03-01 |
1374 | DGFormer: Dynamic Graph Transformer for 3D Human Pose Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhan-Heng Chen; Ju Dai; Junxuan Bai; Junjun Pan; | Pattern Recognit. | 2024-03-01 |
1375 | MGCoT: Multi-Grained Contextual Transformer for Table-based Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xianjie Mo; Yang Xiang; Youcheng Pan; Yongshuai Hou; Ping Luo; | Expert Syst. Appl. | 2024-03-01 |
1376 | T3SRS: Tensor Train Transformer for Compressing Sequential Recommender Systems Related Papers Related Patents Related Grants Related Venues Related Experts View |
HAO LI et. al. | Expert Syst. Appl. | 2024-03-01 |
1377 | Comparing Large Language Models and Human Programmers for Generating Programming Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In most LeetCode and GeeksforGeeks coding contests evaluated in this study, GPT-4 employing the optimal prompt strategy outperforms 85 percent of human participants. |
Wenpin Hou; Zhicheng Ji; | arxiv-cs.SE | 2024-03-01 |
1378 | A Systematic Evaluation of Large Language Models for Generating Programming Code Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We systematically evaluated the performance of seven large language models in generating programming code using various prompt strategies, programming languages, and task … |
Wenpin Hou; Zhicheng Ji; | ArXiv | 2024-03-01 |
1379 | An Efficient Hybrid CNN-Transformer Approach for Remote Sensing Super-Resolution Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer models have great potential in the field of remote sensing super-resolution (SR) due to their excellent self-attention mechanisms. However, transformer models are … |
WENJIAN ZHANG et. al. | Remote. Sens. | 2024-03-01 |
1380 | K-NN Attention-based Video Vision Transformer for Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
Weirong Sun; Yujun Ma; Ruili Wang; | Neurocomputing | 2024-03-01 |
1381 | Multi-modal Person Re-identification Based on Transformer Relational Regularization Related Papers Related Patents Related Grants Related Venues Related Experts View |
XIANGTIAN ZHENG et. al. | Inf. Fusion | 2024-03-01 |
1382 | A Novel Full-convolution UNet-transformer for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Tianyou Zhu; Derui Ding; Feng Wang; Wei Liang; Bo Wang; | Biomed. Signal Process. Control. | 2024-03-01 |
1383 | Case Study Identification with GPT-4 and Implications for Mapping Studies Related Papers Related Patents Related Grants Related Venues Related Experts View |
Kai Petersen; | Inf. Softw. Technol. | 2024-03-01 |
1384 | FAM: Improving Columnar Vision Transformer with Feature Attention Mechanism Related Papers Related Patents Related Grants Related Venues Related Experts View |
LAN HUANG et. al. | Comput. Vis. Image Underst. | 2024-03-01 |
1385 | Transformer Based Multiple Instance Learning for WSI Breast Cancer Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
CHENGYANG GAO et. al. | Biomed. Signal Process. Control. | 2024-03-01 |
1386 | PeLLE: Encoder-based Language Models for Brazilian Portuguese Based on Open Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we present PeLLE, a family of large language models based on the RoBERTa architecture, for Brazilian Portuguese, trained on curated, open data from the Carolina corpus. |
GUILHERME LAMARTINE DE MELLO et. al. | arxiv-cs.CL | 2024-02-29 |
1387 | Here’s A Free Lunch: Sanitizing Backdoored Models with Model Merge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Compared to multiple advanced defensive approaches, our method offers an effective and efficient inference-stage defense against backdoor attacks on classification and instruction-tuned tasks without additional resources or specific knowledge. |
ANSH ARORA et. al. | arxiv-cs.CL | 2024-02-29 |
1388 | PROC2PDDL: Open-Domain Planning Representations from Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent approaches have used language models to predict a planning domain definition (e.g., PDDL) but have only been evaluated in closed-domain simulated environments. To address this, we present Proc2PDDL, the first dataset containing open-domain procedural texts paired with expert-annotated PDDL representations. |
TIANYI ZHANG et. al. | arxiv-cs.CL | 2024-02-29 |
1389 | Can GPT Improve The State of Prior Authorization Via Guideline Based Automated Question Answering? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate whether GPT can validate numerous key factors, in turn helping health plans reach a decision drastically faster. |
Shubham Vatsal; Ayush Singh; Shabnam Tafreshi; | arxiv-cs.CL | 2024-02-28 |
1390 | H3D-Transformer: A Heterogeneous 3D (H3D) Computing Platform for Transformer Model Acceleration on Edge Devices Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Prior hardware accelerator designs primarily focused on single-chip solutions for 10 MB-class computer vision models. The GB-class transformer models for natural language … |
Yandong Luo; Shimeng Yu; | ACM Transactions on Design Automation of Electronic Systems | 2024-02-28 |
1391 | Demo: On-Device Video Analysis with LLMs Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We present a new on-device pipeline that efficiently summarizes lecture videos and provides relevant answers directly from a smartphone. We utilize widely accessible tools like … |
Vishnu Jaganathan; Deepak Gouda; Kriti Arora; Mohit Aggarwal; Chao Zhang; | Proceedings of the 25th International Workshop on Mobile … | 2024-02-28 |
1392 | A Language Model Based Framework for New Concept Placement in Ontologies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In all steps, we propose to leverage neural methods, where we apply embedding-based methods and contrastive learning with Pre-trained Language Models (PLMs) such as BERT for edge search, and adapt a BERT fine-tuning-based multi-label Edge-Cross-encoder, and Large Language Models (LLMs) such as GPT series, FLAN-T5, and Llama 2, for edge selection. |
Hang Dong; Jiaoyan Chen; Yuan He; Yongsheng Gao; Ian Horrocks; | arxiv-cs.CL | 2024-02-27 |
1393 | Benchmarking GPT-4 on Algorithmic Problems: A Systematic Evaluation of Prompting Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we offer a systematic benchmarking of GPT-4, one of the most advanced LLMs available, on three algorithmic tasks characterized by the possibility to control the problem difficulty with two parameters. |
Flavio Petruzzellis; Alberto Testolin; Alessandro Sperduti; | arxiv-cs.CL | 2024-02-27 |
1394 | Can GPT-4 Identify Propaganda? Annotation and Detection of Propaganda Spans in News Articles IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The majority of the recent initiatives targeting medium to low-resource languages produced relatively small annotated datasets, with a skewed distribution, posing challenges for the development of sophisticated propaganda detection models. To address this challenge, we carefully develop the largest propaganda dataset to date, ArPro, comprised of 8K paragraphs from newspaper articles, labeled at the text span level following a taxonomy of 23 propagandistic techniques. |
Maram Hasanain; Fatema Ahmed; Firoj Alam; | arxiv-cs.CL | 2024-02-27 |
1395 | CAPT: Category-level Articulation Estimation from A Single Point Cloud Using Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose CAPT: category-level articulation estimation from a point cloud using Transformer. |
Lian Fu; Ryoichi Ishikawa; Yoshihiro Sato; Takeshi Oishi; | arxiv-cs.CV | 2024-02-27 |
1396 | GeoLLM: Extracting Geospatial Knowledge from Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here we explore the question of whether the vast amounts of knowledge found in Internet language corpora, now compressed within large language models (LLMs), can be leveraged for geospatial prediction tasks. |
ROHIN MANVI et. al. | iclr | 2024-02-26 |
1397 | If in A Crowdsourced Data Annotation Pipeline, A GPT-4 IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies indicated GPT-4 outperforms online crowd workers in data labeling accuracy, notably workers from Amazon Mechanical Turk (MTurk). However, these studies were … |
Zeyu He; Huang Chieh-Yang; C. C. Ding; Shaurya Rohatgi; Ting-Hao ‘Kenneth’ Huang; | ArXiv | 2024-02-26 |
1398 | Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring The Design of Next-generation Neuromorphic Chips IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a general Transformer-based SNN architecture, termed “Meta-SpikeFormer”, whose goals are: (1) *Lower-power*, supports the spike-driven paradigm that there is only sparse addition in the network; (2) *Versatility*, handles various vision tasks; (3) *High-performance*, shows overwhelming performance advantages over CNN-based SNNs; (4) *Meta-architecture*, provides inspiration for future next-generation Transformer-based neuromorphic chip designs. |
MAN YAO et. al. | iclr | 2024-02-26 |
1399 | Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Generative pre-trained models have demonstrated remarkable effectiveness in language and vision domains by learning useful representations. In this paper, we extend the scope of this effectiveness by showing that visual robot manipulation can significantly benefit from large-scale video generative pre-training. |
HONGTAO WU et. al. | iclr | 2024-02-26 |
1400 | Looped Transformers Are Better at Learning Learning Algorithms IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the absence of an inherent iterative structure in the transformer architecture presents a challenge in emulating the iterative algorithms commonly employed in traditional machine learning methods. To address this, we propose the looped transformer architecture and its associated training methodology, with the aim of incorporating iterative characteristics into transformer architectures. |
Liu Yang; Kangwook Lee; Robert D Nowak; Dimitris Papailiopoulos; | iclr | 2024-02-26 |
1401 | Massive Editing for Large Language Models Via Meta Learning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To mitigate the problem, we propose the MAssive Language Model Editing Network (MALMEN), which formulates the parameter shift aggregation as the least square problem, subsequently updating the LM parameter using the normal equation. |
Chenmien Tan; Ge Zhang; Jie Fu; | iclr | 2024-02-26 |
1402 | Enhancing Neural Decoding with Large Language Models: A GPT-Based Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Many neural decoders specialize in one function. They provide a task-dependent interpretation of the signal based on what is happening in the subject’s brain and the subject’s … |
Dong Hyeok Lee; Chun Kee Chung; | 2024 12th International Winter Conference on Brain-Computer … | 2024-02-26 |
1403 | Graph Transformers on EHRs: Better Representation Improves Downstream Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose GT-BEHRT, a new approach that leverages temporal visit embeddings extracted from a graph transformer and uses a BERT-based model to obtain more robust patient representations, especially on longer EHR sequences. |
Raphael Poulain; Rahmatollah Beheshti; | iclr | 2024-02-26 |
1404 | The Hedgehog & The Porcupine: Expressive Linear Attentions with Softmax Mimicry IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We thus propose Hedgehog, a learnable linear attention that retains the spiky and monotonic properties of softmax attention while maintaining linear complexity. |
Michael Zhang; Kush Bhatia; Hermann Kumbong; Christopher Re; | iclr | 2024-02-26 |
1405 | Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the effect of code on enhancing LLMs’ reasoning capability by introducing different constraints on the Code Usage Frequency of GPT-4 Code Interpreter. |
AOJUN ZHOU et. al. | iclr | 2024-02-26 |
1406 | DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Conventional training-based methods have limitations in flexibility, particularly when adapting to new domains, and they often lack explanatory power. To address this gap, we propose a novel training-free detection strategy called Divergent N-Gram Analysis (DNA-GPT). |
XIANJUN YANG et. al. | iclr | 2024-02-26 |
1407 | Transformer-VQ: Linear-Time Transformers Via Vector Quantization IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Transformer-VQ, a decoder-only transformer computing softmax-based dense self-attention in linear time. |
Lucas Dax Lingle; | iclr | 2024-02-26 |
1408 | Masked Distillation Advances Self-Supervised Transformer Architecture Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this paper proposes a masked image modelling (MIM) based self-supervised neural architecture search method specifically designed for vision transformers, termed as MaskTAS, which completely avoids the expensive costs of data labeling inherited from supervised learning. |
CAIXIA YAN et. al. | iclr | 2024-02-26 |
1409 | Neuroformer: Multimodal and Multitask Generative Pretraining for Brain Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We used an ablation study to show that joint training on neuronal responses and behavior boosted performance, highlighting the model’s ability to associate behavioral and neural representations in an unsupervised manner. |
Antonis Antoniades; Yiyi Yu; Joe S Canzano; William Yang Wang; Spencer Smith; | iclr | 2024-02-26 |
1410 | Generating Effective Ensembles for Sentiment Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In recent years, transformer models have revolutionized Natural Language Processing (NLP), achieving exceptional results across various tasks, including Sentiment Analysis (SA). … |
Itay Etelis; Avi Rosenfeld; Abraham Itzhak Weinberg; David Sarne; | ArXiv | 2024-02-26 |
1411 | The Reversal Curse: LLMs Trained on “A Is B” Fail to Learn “B Is A” Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It is worth noting, however, that if “_A_ is _B_” appears _in-context_, models can deduce the reverse relationship. We provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on fictitious statements such as “Uriah Hawthorne is the composer of _Abyssal Melodies_” and showing that they fail to correctly answer “Who composed _Abyssal Melodies_?” |
LUKAS BERGLUND et. al. | iclr | 2024-02-26 |
1412 | Quantum Linear Algebra Is All You Need for Transformer Architectures Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative machine learning methods such as large-language models are revolutionizing the creation of text and images. While these models are powerful they also harness a large … |
Naixu Guo; Zhan Yu; Aman Agrawal; P. Rebentrost; | ArXiv | 2024-02-26 |
1413 | MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large Language Models (LLMs) and Large Multimodal Models (LMMs) exhibit impressive problem-solving skills in many tasks and domains, but their ability in mathematical reasoning in visual contexts has not been systematically studied. To bridge this gap, we present MathVista, a benchmark designed to combine challenges from diverse mathematical and visual tasks. |
PAN LU et. al. | iclr | 2024-02-26 |
1414 | Large Language Model Cascades with Mixture of Thought Representations for Cost-Efficient Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we are motivated to study building an LLM cascade to save the cost of using LLMs, particularly for performing (e.g., mathematical, causal) reasoning tasks. |
Murong Yue; Jie Zhao; Min Zhang; Liang Du; Ziyu Yao; | iclr | 2024-02-26 |
1415 | Is Self-Repair A Silver Bullet for Code Generation? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we analyze Code Llama, GPT-3.5 and GPT-4’s ability to perform self-repair on problems taken from HumanEval and APPS. |
Theo X. Olausson; Jeevana Priya Inala; Chenglong Wang; Jianfeng Gao; Armando Solar-Lezama; | iclr | 2024-02-26 |
1416 | A Multi-Level Framework for Accelerating Training Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by a set of key observations of inter- and intra-layer similarities among feature maps and attentions that can be identified from typical training processes, we propose a multi-level framework for training acceleration. |
Longwei Zou; Han Zhang; Yangdong Deng; | iclr | 2024-02-26 |
1417 | MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We believe that the enhanced multi-modal generation capabilities of GPT-4 stem from the utilization of sophisticated large language models (LLM). To examine this phenomenon, we present MiniGPT-4, which aligns a frozen visual encoder with a frozen advanced LLM, Vicuna, using one projection layer. |
Deyao Zhu; Jun Chen; Xiaoqian Shen; Xiang Li; Mohamed Elhoseiny; | iclr | 2024-02-26 |
1418 | NOLA: Compressing LoRA Using Linear Combination of Random Basis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce NOLA, which overcomes the rank one lower bound present in LoRA. |
Soroush Abbasi Koohpayegani; Navaneet K L; Parsa Nooralinejad; Soheil Kolouri; Hamed Pirsiavash; | iclr | 2024-02-26 |
1419 | AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Given that vector graphics are typically encoded using low-level graphics primitives, generating them directly is difficult. To address this, we propose the use of TikZ, a well-known abstract graphics language that can be compiled to vector graphics, as an intermediate representation of scientific figures. |
Jonas Belouadi; Anne Lauscher; Steffen Eger; | iclr | 2024-02-26 |
1420 | If in A Crowdsourced Data Annotation Pipeline, A GPT-4 IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies indicated GPT-4 outperforms online crowd workers in data labeling accuracy, notably workers from Amazon Mechanical Turk (MTurk). However, these studies were … |
Zeyu He; Chieh-Yang Huang; Chien-Kuang Cornelia Ding; Shaurya Rohatgi; Ting-Hao ‘Kenneth’ Huang; | arxiv-cs.HC | 2024-02-26 |
1421 | VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Video temporal grounding (VTG) aims to locate specific temporal segments from an untrimmed video based on a linguistic query. Most existing VTG models are trained on extensive … |
Yifang Xu; Yunzhuo Sun; Zien Xie; Benxiang Zhai; Sidan Du; | ArXiv | 2024-02-25 |
1422 | From Text to Transformation: A Comprehensive Review of Large Language Models’ Versatility IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This groundbreaking study explores the expanse of Large Language Models (LLMs), such as Generative Pre-Trained Transformer (GPT) and Bidirectional Encoder Representations from Transformers (BERT) across varied domains ranging from technology, finance, healthcare to education. |
PRAVNEET KAUR et. al. | arxiv-cs.CL | 2024-02-25 |
1423 | Deep Learning Approaches for Improving Question Answering Systems in Hepatocellular Carcinoma Research IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Models such as BERT and GPT-3, trained on vast amounts of data, have revolutionized language understanding and generation. These pre-trained models serve as robust bases for various tasks including semantic understanding, intelligent writing, and reasoning, paving the way for a more generalized form of artificial intelligence. |
Shuning Huo; Yafei Xiang; Hanyi Yu; Mengran Zhu; Yulu Gong; | arxiv-cs.CL | 2024-02-25 |
1424 | TV-SAM: Increasing Zero-Shot Segmentation Performance on Multimodal Medical Images Using GPT-4 Generated Descriptive Prompts Without Human Annotation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study presents a novel multimodal medical image zero-shot segmentation algorithm named the text-visual-prompt segment anything model (TV-SAM) without any manual annotations. |
ZEKUN JIANG et. al. | arxiv-cs.CV | 2024-02-24 |
1425 | SemEval-2024 Task 8: Weighted Layer Averaging RoBERTa for Black-Box Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we lay out how using weighted averages of RoBERTa layers lets us capture information about text that is relevant to machine-generated text detection. |
Ayan Datta; Aryan Chandramania; Radhika Mamidi; | arxiv-cs.CL | 2024-02-24 |
1426 | Weighted Layer Averaging RoBERTa for Black-Box Machine-Generated Text Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We propose a novel approach for machine-generated text detection using a RoBERTa model with weighted layer averaging and AdaLoRA for parameter-efficient fine-tuning. Our method … |
Ayan Datta; Aryan Chandramania; Radhika Mamidi; | ArXiv | 2024-02-24 |
1427 | ArabianGPT: Native Arabic GPT-based Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, there is a theoretical and practical imperative for developing LLMs predominantly focused on Arabic linguistic elements. To address this gap, this paper proposes ArabianGPT, a series of transformer-based models within the ArabianLLM suite designed explicitly for Arabic. |
Anis Koubaa; Adel Ammar; Lahouari Ghouti; Omar Najar; Serry Sibaee; | arxiv-cs.CL | 2024-02-23 |
1428 | Self-Supervised Pre-Training for Table Structure Recognition Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we resolve the issue by proposing a self-supervised pre-training (SSP) method for TSR transformers. |
ShengYun Peng; Seongmin Lee; Xiaojing Wang; Rajarajeswari Balasubramaniyan; Duen Horng Chau; | arxiv-cs.CV | 2024-02-23 |
1429 | Advancing Parameter Efficiency in Fine-tuning Via Representation Editing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing PEFT methods pose challenges in hyperparameter selection, such as choosing the rank for LoRA or Adapter, or specifying the length of soft prompts. To address these challenges, we propose a novel fine-tuning approach for neural models, named Representation EDiting (RED), which modifies the representations generated at some layers through the application of scaling and biasing operations. |
MULING WU et. al. | arxiv-cs.LG | 2024-02-23 |
1430 | Towards Efficient Active Learning in NLP Via Pretrained Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning Large Language Models (LLMs) is now a common approach for text classification in a wide range of applications. |
Artem Vysogorets; Achintya Gopal; | arxiv-cs.LG | 2024-02-23 |
1431 | Multimodal Transformer With A Low-Computational-Cost Guarantee Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, multimodal Transformers significantly suffer from a quadratic complexity of the multi-head attention with the input sequence length, especially as the number of modalities increases. To address this, we introduce Low-Cost Multimodal Transformer (LoCoMT), a novel multimodal attention mechanism that aims to reduce computational cost during training and inference with minimal performance loss. |
Sungjin Park; Edward Choi; | arxiv-cs.LG | 2024-02-23 |
1432 | A First Look at GPT Apps: Landscape and Vulnerability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, given its debut, there is a lack of sufficient understanding of this new ecosystem. To fill this gap, this paper presents a first comprehensive longitudinal (5-month) study of the evolution, landscape, and vulnerability of the emerging LLM app ecosystem, focusing on two GPT app stores: _GPTStore.AI_ and the official _OpenAI GPT Store_. |
ZEJUN ZHANG et. al. | arxiv-cs.CR | 2024-02-23 |
1433 | Whose LLM Is It Anyway? Linguistic Comparison and LLM Attribution for GPT-3.5, GPT-4 and Bard Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through a comprehensive linguistic analysis, we compare the vocabulary, Part-Of-Speech (POS) distribution, dependency distribution, and sentiment of texts generated by three of the most popular LLMS today (GPT-3.5, GPT-4, and Bard) to diverse inputs. |
Ariel Rosenfeld; Teddy Lazebnik; | arxiv-cs.CL | 2024-02-22 |
1434 | Tokenization Counts: The Impact of Tokenization on Arithmetic in Frontier LLMs IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Tokenization, the division of input text into input tokens, is an often overlooked aspect of the large language model (LLM) pipeline and could be the source of useful or harmful … |
Aaditya K. Singh; DJ Strouse; | ArXiv | 2024-02-22 |
1435 | OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, open-source models often lack the execution capabilities and iterative refinement of advanced systems like the GPT-4 Code Interpreter. To address this, we introduce OpenCodeInterpreter, a family of open-source code systems designed for generating, executing, and iteratively refining code. |
TIANYU ZHENG et. al. | arxiv-cs.SE | 2024-02-22 |
1436 | Towards Understanding Counseling Conversations: Domain Knowledge and Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a systematic approach to examine the efficacy of domain knowledge and large language models (LLMs) in better representing conversations between a crisis counselor and a help seeker. |
Younghun Lee; Dan Goldwasser; Laura Schwab Reese; | arxiv-cs.CL | 2024-02-21 |
1437 | Do Efficient Transformers Really Save Computation? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to understand the capabilities and limitations of efficient Transformers, specifically the Sparse Transformer and the Linear Transformer. |
KAI YANG et. al. | arxiv-cs.LG | 2024-02-21 |
1438 | TransGOP: Transformer-Based Gaze Object Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, this paper introduces Transformer into the fields of gaze object prediction and proposes an end-to-end Transformer-based gaze object prediction method named TransGOP. |
Binglu Wang; Chenxi Guo; Yang Jin; Haisheng Xia; Nian Liu; | arxiv-cs.CV | 2024-02-21 |
1439 | On The Expressive Power of A Variant of The Looped Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide theoretical evidence of the expressive power of the AlgoFormer in solving some challenging problems, mirroring human-designed algorithms. |
YIHANG GAO et. al. | arxiv-cs.LG | 2024-02-21 |
1440 | Hallucinations or Attention Misdirection? The Path to Strategic Value Extraction in Business Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper highlights the best practices of the PGI (Persona, Grouping, and Intelligence) method, a strategic framework that achieved a remarkable error rate of only 3.15 percent across 4,000 responses generated by GPT in response to a real business challenge. |
Aline Ioste; | arxiv-cs.CL | 2024-02-21 |
1441 | An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research paper, we present an optimized, fine-tuned transformer-based DistilBERT model designed for the detection of phishing emails. |
Mohammad Amaz Uddin; Iqbal H. Sarker; | arxiv-cs.LG | 2024-02-21 |
1442 | Towards Equipping Transformer with The Ability of Systematic Compositionality Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We tentatively provide a successful implementation of a multi-layer CAT on the basis of the especially popular BERT. |
Chen Huang; Peixin Qin; Wenqiang Lei; Jiancheng Lv; | aaai | 2024-02-20 |
1443 | Advancing GenAI Assisted Programming–A Comparative Study on Prompt Efficiency and Code Quality Between GPT-4 and GLM-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to explore the best practices for utilizing GenAI as a programming tool, through a comparative analysis between GPT-4 and GLM-4. |
Angus Yang; Zehan Li; Jie Li; | arxiv-cs.SE | 2024-02-20 |
1444 | SentinelLMs: Encrypted Input Adaptation and Fine-Tuning of Language Models for Private and Secure Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, this introduces two fundamental risks: (a) the transmission of user inputs to the server via the network gives rise to interception vulnerabilities, and (b) privacy concerns emerge as organizations that deploy such models store user data with restricted context. To address this, we propose a novel method to adapt and fine-tune transformer-based language models on passkey-encrypted user-specific text. |
Abhijit Mishra; Mingda Li; Soham Deo; | aaai | 2024-02-20 |
1445 | Fairness-Aware Structured Pruning in Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: WARNING: This work uses language that is offensive in nature. |
Abdelrahman Zayed; Gonçalo Mordido; Samira Shabanian; Ioana Baldini; Sarath Chandar; | aaai | 2024-02-20 |
1446 | Translate Meanings, Not Just Words: IdiomKB’s Role in Optimizing Idiomatic Translation with Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a multilingual idiom KB (IdiomKB) developed using large LMs to address this. |
SHUANG LI et. al. | aaai | 2024-02-20 |
1447 | Span Graph Transformer for Document-Level Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, due to the length limit for input text, these models typically consider text at the sentence-level and cannot capture the long-range contextual dependency within a document. To address this issue, we propose a novel Span Graph Transformer (SGT) method for document-level NER, which constructs long-range contextual dependencies at both the token and span levels. |
Hongli Mao; Xian-Ling Mao; Hanlin Tang; Yu-Ming Shang; Heyan Huang; | aaai | 2024-02-20 |
1448 | How Easy Is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The remarkable advancements in Multimodal Large Language Models (MLLMs) have not rendered them immune to challenges, particularly in the context of handling deceptive information in prompts, thus producing hallucinated responses under such conditions. To quantitatively assess this vulnerability, we present MAD-Bench, a carefully curated benchmark that contains 1000 test samples divided into 5 categories, such as non-existent objects, count of objects, and spatial relationship. |
Yusu Qian; Haotian Zhang; Yinfei Yang; Zhe Gan; | arxiv-cs.CV | 2024-02-20 |
1449 | CAR-Transformer: Cross-Attention Reinforcement Transformer for Cross-Lingual Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Cross-Attention Reinforcement (CAR) module and incorporate the module into the transformer backbone to formulate the CAR-Transformer. |
Yuang Cai; Yuyu Yuan; | aaai | 2024-02-20 |
1450 | Transformer Tricks: Precomputing The First Layer Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: This micro-paper describes a trick to speed up inference of transformers with RoPE (such as LLaMA, Mistral, PaLM, and Gemma). For these models, a large portion of the first … |
Nils Graef; | arxiv-cs.LG | 2024-02-20 |
1451 | DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency Via Efficient Data Sampling and Routing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we present DeepSpeed Data Efficiency, a framework that makes better use of data, increases training efficiency, and improves model quality. |
CONGLONG LI et. al. | aaai | 2024-02-20 |
1452 | S2WAT: Image Style Transfer Via Hierarchical Vision Transformer Using Strips Window Attention IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces Strips Window Attention Transformer (S2WAT), a novel hierarchical vision transformer designed for style transfer. |
Chiyu Zhang; Xiaogang Xu; Lei Wang; Zaiyan Dai; Jun Yang; | aaai | 2024-02-20 |
1453 | Are ELECTRA’s Sentence Embeddings Beyond Repair? The Case of Semantic Textual Similarity Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We notice a significant drop in performance for the ELECTRA discriminator’s last layer in comparison to prior layers. We explore this drop and propose a way to repair the embeddings using a novel truncated model fine-tuning (TMFT) method. |
Ivan Rep; David Dukić; Jan Šnajder; | arxiv-cs.CL | 2024-02-20 |
1454 | Can Large Language Models Be Used to Provide Psychological Counselling? An Analysis of GPT-4-Generated Responses Using Role-play Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For this study, we collected counseling dialogue data via role-playing scenarios involving expert counselors, and the utterances were annotated with the intentions of the counselors. |
Michimasa Inaba; Mariko Ukiyo; Keiko Takamizo; | arxiv-cs.CL | 2024-02-20 |
1455 | Advancing GenAI Assisted Programming-A Comparative Study on Prompt Efficiency and Code Quality Between GPT-4 and GLM-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study aims to explore the best practices for utilizing GenAI as a programming tool, through a comparative analysis between GPT-4 and GLM-4. By evaluating prompting strategies … |
Angus Yang; Zehan Li; Jie Li; | ArXiv | 2024-02-20 |
1456 | Proxyformer: Nyström-Based Linear Transformer with Trainable Proxy Tokens Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel Nyström method-based transformer, called Proxyformer. |
Sangho Lee; Hayun Lee; Dongkun Shin; | aaai | 2024-02-20 |
1457 | Generalized Planning in PDDL Domains with Pretrained Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We evaluate this approach in seven PDDL domains and compare it to four ablations and four baselines. |
TOM SILVER et. al. | aaai | 2024-02-20 |
1458 | Enhancing Large Language Models for Text-to-Testcase Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: In this paper, we introduce a text-to-testcase generation approach based on a large language model (GPT-3.5) that is fine-tuned on our curated dataset with an effective prompt design. |
Saranya Alagarsamy; Chakkrit Tantithamthavorn; Chetan Arora; Aldeida Aleti; | arxiv-cs.SE | 2024-02-19 |
1459 | Your Large Language Model Is Secretly A Fairness Proponent and You Should Prompt It Like One Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response to this, we validate that prompting LLMs with specific roles can allow LLMs to express diverse viewpoints. Building on this insight and observation, we develop FairThinking, a pipeline designed to automatically generate roles that enable LLMs to articulate diverse perspectives for fair expressions. |
TIANLIN LI et. al. | arxiv-cs.CL | 2024-02-19 |
1460 | Query-Based Adversarial Prompt Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We improve on prior work with a query-based attack that leverages API access to a remote language model to construct adversarial examples that cause the model to emit harmful strings with (much) higher probability than with transfer-only attacks. |
Jonathan Hayase; Ema Borevkovic; Nicholas Carlini; Florian Tramèr; Milad Nasr; | arxiv-cs.CL | 2024-02-19 |
1461 | Enabling Weak LLMs to Judge Response Reliability Via Meta Ranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, to enable weak LLMs to effectively assess the reliability of LLM responses, we propose a novel cross-query-comparison-based method called $\textit{Meta Ranking}$ (MR). |
ZIJUN LIU et. al. | arxiv-cs.CL | 2024-02-19 |
1462 | Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a circuit discovery framework alternative to activation patching. |
ZHENGFU HE et. al. | arxiv-cs.LG | 2024-02-19 |
1463 | Tree-Planted Transformers: Unidirectional Transformer Language Models with Implicit Syntactic Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new method dubbed tree-planting: instead of explicitly generating syntactic structures, we plant trees into attention weights of unidirectional Transformer LMs to implicitly reflect syntactic structures of natural language. |
Ryo Yoshida; Taiga Someya; Yohei Oseki; | arxiv-cs.CL | 2024-02-19 |
1464 | FinBen: A Holistic Financial Benchmark for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce FinBen, the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks, covering seven critical aspects: information extraction (IE), textual analysis, question answering (QA), text generation, risk management, forecasting, and decision-making. |
QIANQIAN XIE et. al. | arxiv-cs.CL | 2024-02-19 |
1465 | Shallow Synthesis of Knowledge in GPT-Generated Texts: A Case Study in Automatic Related Work Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an analysis of AI-assisted scholarly writing generated with ScholaCite, a tool we built that is designed for organizing literature and composing Related Work sections for academic papers. |
Anna Martin-Boyle; Aahan Tyagi; Marti A. Hearst; Dongyeop Kang; | arxiv-cs.CL | 2024-02-19 |
1466 | A Critical Evaluation of AI Feedback for Aligning Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that the improvements of the RL step are virtually entirely due to the widespread practice of using a weaker teacher model (e.g. GPT-3.5) for SFT data collection than the critic (e.g., GPT-4) used for AI feedback generation. |
ARCHIT SHARMA et. al. | arxiv-cs.LG | 2024-02-19 |
1467 | Evaluation of ChatGPT’s Smart Contract Auditing Capabilities Based on Chain of Thought Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the potential of enhancing smart contract security audits using the GPT-4 model. |
Yuying Du; Xueyan Tang; | arxiv-cs.CR | 2024-02-19 |
1468 | Creating A Fine Grained Entity Type Taxonomy Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the potential of GPT-4 and its advanced iteration, GPT-4 Turbo, in autonomously developing a detailed entity type taxonomy. |
Michael Gunn; Dohyun Park; Nidhish Kamath; | arxiv-cs.CL | 2024-02-19 |
1469 | Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Conclusion: In this paper, we show that while GPT-4 is superior to open-source models in zero-shot report labeling, the implementation of few-shot prompting can bring open-source models on par with GPT-4. |
FELIX J. DORFNER et. al. | arxiv-cs.CL | 2024-02-19 |
1470 | 20.5 C-Transformer: A 2.6-18.1μJ/Token Homogeneous DNN-Transformer/Spiking-Transformer Processor with Big-Little Network and Implicit Weight Generation for Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recently, transformer-based large language models (LLMs), shown in Fig. 20.5.1, are widely used, and even on-device LLM systems with real-time responses are anticipated [1]. Many … |
SANGYEOB KIM et. al. | 2024 IEEE International Solid-State Circuits Conference … | 2024-02-18 |
1471 | LongAgent: Scaling Language Models to 128k Context Through Multi-Agent Collaboration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose LongAgent, a method based on multi-agent collaboration, which scales LLMs (e.g., LLaMA) to a context of 128K and demonstrates potential superiority in long-text processing compared to GPT-4. |
JUN ZHAO et. al. | arxiv-cs.CL | 2024-02-18 |
1472 | Decoding News Narratives: A Critical Analysis of Large Language Models in Framing Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive analysis of GPT-4, GPT-3.5 Turbo, and FLAN-T5 models in detecting framing in news headlines. |
Valeria Pastorino; Jasivan A. Sivakumar; Nafise Sadat Moosavi; | arxiv-cs.CL | 2024-02-18 |
1473 | Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, we propose a two-stage instruction tuning framework, in which VLMs are firstly finetuned on Vision-Flan and further tuned on GPT-4 synthesized data. We find this two-stage tuning framework significantly outperforms the traditional single-stage visual instruction tuning framework and achieves the state-of-the-art performance across a wide range of multi-modal evaluation benchmarks. |
ZHIYANG XU et. al. | arxiv-cs.CL | 2024-02-18 |
1474 | A Curious Case of Searching for The Correlation Between Training Data and Adversarial Robustness of Transformer Textual Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we want to prove that there is also a strong correlation between training data and model robustness. |
Cuong Dang; Dung D. Le; Thai Le; | arxiv-cs.LG | 2024-02-18 |
1475 | Reasoning Before Comparison: LLM-Enhanced Semantic Similarity Metrics for Domain Specialized Text Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we leverage LLM to enhance the semantic analysis and develop similarity metrics for texts, addressing the limitations of traditional unsupervised NLP metrics like ROUGE and BLEU. |
SHAOCHEN XU et. al. | arxiv-cs.CL | 2024-02-17 |
1476 | Exploring ChatGPT for Next-generation Information Retrieval: Opportunities and Challenges Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike the traditional supervised learning approach in IR tasks, ChatGPT challenges existing paradigms, bringing forth new challenges and opportunities regarding text quality assurance, model bias, and efficiency. This paper seeks to examine the impact of ChatGPT on IR tasks and offer insights into its potential future developments. |
Yizheng Huang; Jimmy Huang; | arxiv-cs.IR | 2024-02-17 |
1477 | Detecting A Proxy for Potential Comorbid ADHD in People Reporting Anxiety Symptoms from Social Media Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We present a novel task that can elucidate the connection between anxiety and ADHD; use Transformers to make progress toward solving a task that is not solvable by keyword-based … |
Claire S. Lee; Noelle Lim; Michael Guerzhoy; | ArXiv | 2024-02-17 |
1478 | Can Separators Improve Chain-of-Thought Prompting? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by human cognition, we introduce COT-SEP, a method that strategically employs separators at the end of each exemplar in CoT prompting. |
Yoonjeong Park; Hyunjin Kim; Chanyeol Choi; Junseong Kim; Jy-yong Sohn; | arxiv-cs.CL | 2024-02-16 |
1479 | WilKE: Wise-Layer Knowledge Editor for Lifelong Knowledge Editing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a knowledge editing approach named Wise-Layer Knowledge Editor (WilKE), which selects editing layer based on the pattern matching degree of editing knowledge across different layers in language models. |
Chenhui Hu; Pengfei Cao; Yubo Chen; Kang Liu; Jun Zhao; | arxiv-cs.CL | 2024-02-16 |
1480 | Rethinking Position Embedding Methods in The Transformer Architecture Related Papers Related Patents Related Grants Related Venues Related Experts View |
XIN ZHOU et. al. | Neural Process. Lett. | 2024-02-16 |
1481 | Human-object Interaction Detection Based on Cascade Multi-scale Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Limin Xia; Xiaoyue Ding; | Appl. Intell. | 2024-02-16 |
1482 | TBFormer: Three-branch Efficient Transformer for Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Can Wei; Yan Wei; | Signal Image Video Process. | 2024-02-16 |
1483 | Large Language Models Fall Short: Understanding Complex Relationships in Detective Narratives Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing datasets for narrative understanding often fail to represent the complexity and uncertainty of relationships in real-life social scenarios. To address this gap, we introduce a new benchmark, Conan, designed for extracting and analysing intricate character relation graphs from detective narratives. |
RUNCONG ZHAO et. al. | arxiv-cs.CL | 2024-02-16 |
1484 | In Search of Needles in A 11M Haystack: Recurrent Memory Finds What LLMs Miss IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To evaluate different approaches, we introduce BABILong, a new benchmark designed to assess model capabilities in extracting and processing distributed facts within extensive texts. |
YURI KURATOV et. al. | arxiv-cs.CL | 2024-02-16 |
1485 | Enhancing ESG Impact Type Identification Through Early Fusion and Multilingual Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the evolving landscape of Environmental, Social, and Corporate Governance (ESG) impact assessment, the ML-ESG-2 shared task proposes identifying ESG impact types. To address this challenge, we present a comprehensive system leveraging ensemble learning techniques, capitalizing on early and late fusion approaches. |
Hariram Veeramani; Surendrabikram Thapa; Usman Naseem; | arxiv-cs.CL | 2024-02-16 |
1486 | Enriching Urdu NER with BERT Embedding, Data Augmentation, and Hybrid Encoder-CNN Architecture Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Named Entity Recognition (NER) is an indispensable component of Natural Language Processing (NLP), which aims to identify and classify entities within text data. While Deep … |
Anil Ahmed; Degen Huang; Syed Yasser Arafat; Imran Hameed; | ACM Transactions on Asian and Low-Resource Language … | 2024-02-15 |
1487 | Fine-tuning Large Language Model (LLM) Artificial Intelligence Chatbots in Ophthalmology and LLM-based Evaluation Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Notably, qualitative analysis and the glaucoma sub-analysis revealed clinical inaccuracies in the LLM-generated responses, which were appropriately identified by the GPT-4 evaluation. |
TING FANG TAN et. al. | arxiv-cs.AI | 2024-02-15 |
1488 | An Analysis of Language Frequency and Error Correction for Esperanto Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current Grammar Error Correction (GEC) initiatives tend to focus on major languages, with less attention given to low-resource languages like Esperanto. In this article, we begin to bridge this gap by first conducting a comprehensive frequency analysis using the Eo-GP dataset, created explicitly for this purpose. |
Junhong Liang; | arxiv-cs.CL | 2024-02-14 |
1489 | GPT-4’s Assessment of Its Performance in A USMLE-based Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates GPT-4’s assessment of its performance in healthcare applications. |
UTTAM DHAKAL et. al. | arxiv-cs.AI | 2024-02-14 |
1490 | Research and Application of Transformer Based Anomaly Detection Model: A Literature Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To inspire research on Transformer-based anomaly detection, this review offers a fresh perspective on the concept of anomaly detection. |
Mingrui Ma; Lansheng Han; Chunjie Zhou; | arxiv-cs.LG | 2024-02-14 |
1491 | Utilizing BERT for Information Retrieval: Survey, Applications, Resources, and Challenges Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent years have witnessed a substantial increase in the use of deep learning to solve various natural language processing (NLP) problems. Early deep learning models were … |
JIAJIA WANG et. al. | ACM Computing Surveys | 2024-02-14 |
1492 | Predictive Analytics in Mental Health Leveraging LLM Embeddings and Machine Learning Models for Social Media Analysis IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The prevalence of stress-related disorders has increased significantly in recent years, necessitating scalable methods to identify affected individuals. This paper proposes a … |
AHMAD RADWAN et. al. | Int. J. Web Serv. Res. | 2024-02-14 |
1493 | Changes By Butterflies: Farsighted Forecasting with Group Reservoir Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Group Reservoir Transformer to predict long-term events more accurately and robustly by overcoming two challenges in Chaos: (1) the extensive historical sequences and (2) the sensitivity to initial conditions. |
Md Kowsher; Abdul Rafae Khan; Jia Xu; | arxiv-cs.LG | 2024-02-14 |
1494 | API Pack: A Massive Multi-Programming Language Dataset for API Call Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce API Pack, a massive multi-programming language dataset containing more than 1 million instruction-API call pairs to improve the API call generation capabilities of large language models. |
Zhen Guo; Adriana Meza Soria; Wei Sun; Yikang Shen; Rameswar Panda; | arxiv-cs.CL | 2024-02-14 |
1495 | L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a language agent with chain-of-3D-thoughts (L3GO), an inference-time approach that can reason about part-based 3D mesh generation of unconventional objects that current data-driven diffusion models struggle with. |
YUTARO YAMADA et. al. | arxiv-cs.AI | 2024-02-14 |
1496 | Leveraging Large Language Models for Enhanced NLP Task Performance Through Knowledge Distillation and Optimized Training Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our approach presents a scalable methodology that reduces manual annotation costs and increases efficiency, making it especially pertinent in resource-limited and closed-network environments. |
Yining Huang; Keke Tang; Meilian Chen; | arxiv-cs.CL | 2024-02-14 |
1497 | Underwater Image Enhancement Using Scale-patch Synergy Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Lu Fan; Bo Wang; | Signal Image Video Process. | 2024-02-13 |
1498 | Measuring and Controlling Instruction (In)Stability in Language Model Dialogs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To combat attention decay and instruction drift, we propose a lightweight method called split-softmax, which compares favorably against two strong baselines. |
KENNETH LI et. al. | arxiv-cs.CL | 2024-02-13 |
1499 | Combining Insights From Multiple Large Language Models Improves Diagnostic Accuracy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Background: Large language models (LLMs) such as OpenAI’s GPT-4 or Google’s PaLM 2 are proposed as viable diagnostic support tools or even spoken of as replacements for curbside consults. |
Gioele Barabucci; Victor Shia; Eugene Chu; Benjamin Harack; Nathan Fu; | arxiv-cs.AI | 2024-02-13 |
1500 | A Study for Enhancing Low-resource Thai-Myanmar-English Neural Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Several methodologies have recently been proposed to enhance the performance of low-resource Neural Machine Translation (NMT). However, these techniques have yet to be explored … |
Mya Ei San; Sasiporn Usanavasin; Ye Kyaw Thu; Manabu Okumura; | ACM Transactions on Asian and Low-Resource Language … | 2024-02-13 |