Paper Digest: Recent Papers on Transformer
The Paper Digest Team extracted all recent Transformer (NLP) related papers on our radar and generated a highlight sentence for each. The results are sorted by relevance and date. In addition to this ‘static’ page, we also provide a real-time version of this article, which has broader coverage and is continuously updated to include the most recent work on this topic.
Based in New York, Paper Digest is dedicated to producing high-quality text analysis results that people can actually use on a daily basis. Since 2018, we have been serving users across the world with a number of exclusive services to track, search, review and rewrite scientific literature.
You are welcome to follow us on Twitter and LinkedIn to stay updated with new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: Recent Papers on Transformer
Paper | Author(s) | Source | Date |
---|---|---|---|
1 | Refuse Whenever You Feel Unsafe: Improving Safety in LLMs Via Decoupled Refusal Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel approach, Decoupled Refusal Training (DeRTa), designed to empower LLMs to refuse compliance to harmful prompts at any response position, significantly enhancing their safety capabilities. |
YOULIANG YUAN et al. | arxiv-cs.CL | 2024-07-12 |
2 | Robustness of LLMs to Perturbations in Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To that end, we artificially introduce varying levels of noise into a diverse set of datasets and systematically evaluate LLMs’ robustness against the corrupt variations of the original text. |
Ayush Singh; Navpreet Singh; Shubham Vatsal; | arxiv-cs.CL | 2024-07-12 |
3 | ASTPrompter: Weakly Supervised Automated Language Model Red-Teaming to Identify Likely Toxic Prompts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we propose a reinforcement learning formulation of the LLM red-teaming task which allows us to discover prompts that both (1) trigger toxic outputs from a frozen defender and (2) have low perplexity as scored by the defender. |
Amelia F. Hardy; Houjun Liu; Bernard Lange; Mykel J. Kochenderfer; | arxiv-cs.CL | 2024-07-12 |
4 | Movie Recommendation with Poster Attention Via Multi-modal Transformer Feature Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a multi-modal movie recommendation system that extracts features from the well-designed poster of each movie and from the movie’s narrative text description. |
Linhan Xia; Yicheng Yang; Ziou Chen; Zheng Yang; Shengxin Zhu; | arxiv-cs.IR | 2024-07-12 |
5 | The Two Sides of The Coin: Hallucination Generation and Detection with LLMs As Evaluators for LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents our participation in the CLEF ELOQUENT HalluciGen shared task, where the goal is to develop evaluators for both generating and detecting hallucinated content. |
ANH THU MARIA BUI et al. | arxiv-cs.AI | 2024-07-12 |
6 | On Exact Bit-level Reversible Transformers Without Changing Architectures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose exact bit-level reversible transformers without changing the architectures in the inference procedure. |
Guoqiang Zhang; J. P. Lewis; W. B. Kleijn; | arxiv-cs.LG | 2024-07-12 |
7 | Detect Llama — Finding Vulnerabilities in Smart Contracts Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we test the hypothesis that although OpenAI’s GPT-4 performs well generally, we can fine-tune open-source models to outperform GPT-4 in smart contract vulnerability detection. |
Peter Ince; Xiapu Luo; Jiangshan Yu; Joseph K. Liu; Xiaoning Du; | arxiv-cs.CR | 2024-07-11 |
8 | Self-Evolving GPT: A Lifelong Autonomous Experiential Learner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they rely on manual efforts to acquire and apply such experience for each task, which is not feasible for the growing demand for LLMs and the variety of user questions. To address this issue, we design a lifelong autonomous experiential learning framework based on LLMs to explore whether LLMs can imitate human ability for learning and utilizing experience. |
JINGLONG GAO et al. | arxiv-cs.CL | 2024-07-11 |
9 | LLMs’ Morphological Analyses of Complex FST-generated Finnish Words Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work aims to shed light on the issue by evaluating state-of-the-art LLMs in a task of morphological analysis of complex Finnish noun forms. |
Anssi Moisio; Mathias Creutz; Mikko Kurimo; | arxiv-cs.CL | 2024-07-11 |
10 | GPT-4 Is Judged More Human Than Humans in Displaced and Inverted Turing Tests Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We found that both AI and displaced human judges were less accurate than interactive interrogators, with below-chance accuracy overall. |
Ishika Rathi; Sydney Taylor; Benjamin K. Bergen; Cameron R. Jones; | arxiv-cs.HC | 2024-07-11 |
11 | Teaching Transformers Causal Reasoning Through Axiomatic Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since interventional data is costly to generate, we study to what extent an agent can learn causal reasoning from passive data. |
Aniket Vashishtha; Abhinav Kumar; Abbavaram Gowtham Reddy; Vineeth N Balasubramanian; Amit Sharma; | arxiv-cs.LG | 2024-07-10 |
12 | FsPONER: Few-shot Prompt Optimization for Named Entity Recognition in Domain-specific Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, none of the previous approaches has investigated the efficiency of LLM-based few-shot learning in domain-specific scenarios. To address this gap, we introduce FsPONER, a novel approach for optimizing few-shot prompts, and evaluate its performance on domain-specific NER datasets, with a focus on industrial manufacturing and maintenance, while using multiple LLMs — GPT-4-32K, GPT-3.5-Turbo, LLaMA 2-chat, and Vicuna. |
Yongjian Tang; Rakebul Hasan; Thomas Runkler; | arxiv-cs.CL | 2024-07-10 |
13 | ROSA: Random Subspace Adaptation for Efficient Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we propose Random Subspace Adaptation (ROSA), a method that outperforms previous PEFT methods by a significant margin, while maintaining a zero latency overhead during inference time. |
Marawan Gamal Abdel Hameed; Aristides Milios; Siva Reddy; Guillaume Rabusseau; | arxiv-cs.LG | 2024-07-10 |
14 | Prompting Techniques for Secure Code Generation: A Systematic Investigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: OBJECTIVE: In this study, we investigate the impact of different prompting techniques on the security of code generated from NL instructions by LLMs. |
Catherine Tony; Nicolás E. Díaz Ferreyra; Markus Mutas; Salem Dhiff; Riccardo Scandariato; | arxiv-cs.SE | 2024-07-09 |
15 | Mixture-of-Modules: Reinventing Transformers As Dynamic Assemblies of Modules Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that MoM provides not only a unified framework for Transformers and their numerous variants but also a flexible and learnable approach for reducing redundancy in Transformer parameterization. |
ZHUOCHENG GONG et al. | arxiv-cs.CL | 2024-07-09 |
16 | PEER: Expertizing Domain-Specific Tasks with A Multi-Agent Framework and Tuning Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: High performance requires sophisticated processing techniques, yet managing multiple agents within a complex workflow often proves costly and challenging. To address this, we introduce the PEER (Plan, Execute, Express, Review) multi-agent framework. |
YIYING WANG et al. | arxiv-cs.AI | 2024-07-09 |
17 | Using Large Language Models for Generating Smart Contracts for Health Insurance from Textual Policies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To assess the LLM output, we propose completeness, soundness, clarity, syntax, and functioning code as metrics. |
Inwon Kang; William Van Woensel; Oshani Seneviratne; | arxiv-cs.CL | 2024-07-09 |
18 | A Comparison of Vulnerability Feature Extraction Methods from Textual Attack Patterns Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we examine five feature extraction methods (TF-IDF, LSI, BERT, MiniLM, RoBERTa) and find that Term Frequency-Inverse Document Frequency (TF-IDF) outperforms the other four methods with a precision of 75% and an F1 score of 64%. |
Refat Othman; Bruno Rossi; Barbara Russo; | arxiv-cs.CR | 2024-07-09 |
19 | Multilingual Blending: LLM Safety Alignment Evaluation with Language Mixture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce Multilingual Blending, a mixed-language query-response scheme designed to evaluate the safety alignment of various state-of-the-art LLMs (e.g., GPT-4o, GPT-3.5, Llama3) under sophisticated, multilingual conditions. |
Jiayang Song; Yuheng Huang; Zhehua Zhou; Lei Ma; | arxiv-cs.CL | 2024-07-09 |
20 | Cross-domain Few-shot In-context Learning for Enhancing Traffic Sign Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a cross-domain few-shot in-context learning method based on the MLLM for enhancing traffic sign recognition (TSR). |
YAOZONG GAN et al. | arxiv-cs.CV | 2024-07-08 |
21 | Surprising Gender Biases in GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present seven experiments exploring gender biases in GPT. |
Raluca Alexandra Fulgu; Valerio Capraro; | arxiv-cs.CY | 2024-07-08 |
22 | Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces the Multimodal Diffusion Transformer (MDT), a novel diffusion policy framework that excels at learning versatile behavior from multimodal goal specifications with few language annotations. |
Moritz Reuss; Ömer Erdinç Yağmurlu; Fabian Wenzel; Rudolf Lioutikov; | arxiv-cs.RO | 2024-07-08 |
23 | On The Power of Convolution Augmented Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, recent architectural recipes, such as state-space models, have bridged the performance gap. Motivated by this, we examine the benefits of Convolution-Augmented Transformer (CAT) for recall, copying, and length generalization tasks. |
Mingchen Li; Xuechen Zhang; Yixiao Huang; Samet Oymak; | arxiv-cs.LG | 2024-07-08 |
24 | Enhancing Computer Programming Education with LLMs: A Study on Effective Prompt Engineering for Python Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study underscores the crucial role of prompt engineering in maximizing the educational benefits of LLMs. By systematically categorizing and testing these strategies, we provide a comprehensive framework for both educators and students to optimize LLM-based learning experiences. |
Tianyu Wang; Nianjun Zhou; Zhixiong Chen; | arxiv-cs.AI | 2024-07-07 |
25 | Image-Conditional Diffusion Transformer for Underwater Image Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by the recent advance in generative models, we propose a novel UIE method based on image-conditional diffusion transformer (ICDT). |
XINGYANG NIE et al. | arxiv-cs.CV | 2024-07-07 |
26 | Associative Recurrent Memory Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper addresses the challenge of creating a neural architecture for very long sequences that requires constant time for processing new information at each time step. |
Ivan Rodkin; Yuri Kuratov; Aydar Bulatov; Mikhail Burtsev; | arxiv-cs.CL | 2024-07-05 |
27 | MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current datasets and benchmarks primarily focus on relatively simple scientific tasks and figures, lacking comprehensive assessments across diverse advanced scientific disciplines. To bridge this gap, we collected a multimodal, multidisciplinary dataset from open-access scientific articles published in Nature Communications journals. |
ZEKUN LI et al. | arxiv-cs.CL | 2024-07-05 |
28 | Using LLMs to Label Medical Papers According to The CIViC Evidence Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the sequence classification problem CIViC Evidence to the field of medical NLP. |
Markus Hisch; Xing David Wang; | arxiv-cs.CL | 2024-07-05 |
29 | Generalists Vs. Specialists: Evaluating Large Language Models for Urdu Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare general-purpose pretrained models, GPT-4-Turbo and Llama-3-8b-Instruct, with special-purpose models fine-tuned on specific tasks: XLM-Roberta-large, mT5-large, and Llama-3-8b-Instruct. |
Samee Arif; Abdul Hameed Azeemi; Agha Ali Raza; Awais Athar; | arxiv-cs.CL | 2024-07-05 |
30 | GPT Vs RETRO: Exploring The Intersection of Retrieval and Parameter-Efficient Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we apply PEFT methods (P-tuning, Adapters, and LoRA) to a modified Retrieval-Enhanced Transformer (RETRO) and a baseline GPT model across several sizes, ranging from 823 million to 48 billion parameters. |
Aleksander Ficek; Jiaqi Zeng; Oleksii Kuchaiev; | arxiv-cs.CL | 2024-07-05 |
31 | Towards Automating Text Annotation: A Case Study on Semantic Proximity Annotation Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main aim of this paper is to reuse human annotation guidelines along with some annotated data to design automatic prompts for LLMs, focusing on the semantic proximity annotation task. |
Sachin Yadav; Tejaswi Choppa; Dominik Schlechtweg; | arxiv-cs.CL | 2024-07-04 |
32 | From Data to Commonsense Reasoning: The Use of Large Language Models for Explainable AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the effectiveness of large language models (LLMs) on different QA tasks with a focus on their abilities in reasoning and explainability. |
Stefanie Krause; Frieder Stolzenburg; | arxiv-cs.AI | 2024-07-04 |
33 | TrackPGD: A White-box Attack Using Binary Masks Against Robust Transformer Trackers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel white-box attack named TrackPGD, which relies on the predicted object binary mask to attack robust transformer trackers. |
Fatemeh Nourilenjan Nokabadi; Yann Batiste Pequignot; Jean-Francois Lalonde; Christian Gagné; | arxiv-cs.CV | 2024-07-04 |
34 | HYBRINFOX at CheckThat! 2024 — Task 2: Enriching BERT Models with The Expert System VAGO for Subjectivity Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the HYBRINFOX method used to solve Task 2 of Subjectivity detection of the CLEF 2024 CheckThat! |
MORGANE CASANOVA et al. | arxiv-cs.CL | 2024-07-04 |
35 | Adaptive Step-size Perception Unfolding Network with Non-local Hybrid Attention for Hyperspectral Image Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To overcome the aforementioned drawbacks, we propose an adaptive step-size perception unfolding network (ASPUN), a deep unfolding network based on the FISTA algorithm, which uses an adaptive step-size perception module to estimate the update step size of each spectral channel. |
Yanan Yang; Like Xin; | arxiv-cs.CV | 2024-07-04 |
36 | GPT-4 Vs. Human Translators: A Comprehensive Evaluation of Translation Quality Across Languages, Domains, and Expertise Levels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study comprehensively evaluates the translation quality of Large Language Models (LLMs), specifically GPT-4, against human translators of varying expertise levels across multiple language pairs and domains. |
JIANHAO YAN et al. | arxiv-cs.CL | 2024-07-04 |
37 | Question-Analysis Prompting Improves LLM Performance in Reasoning Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel prompting strategy called Question Analysis Prompting (QAP), in which the model is prompted to explain the question in $n$ words before solving. |
Dharunish Yugeswardeenoo; Kevin Zhu; Sean O’Brien; | arxiv-cs.CL | 2024-07-04 |
38 | Slice-100K: A Multimodal Dataset for Extrusion-based 3D Printing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Currently there does not exist a large repository of curated CAD models along with their corresponding G-code files for additive manufacturing. To address this issue, we present SLICE-100K, a first-of-its-kind dataset of over 100,000 G-code files, along with their tessellated CAD model, LVIS (Large Vocabulary Instance Segmentation) categories, geometric properties, and renderings. |
ANUSHRUT JIGNASU et al. | arxiv-cs.CV | 2024-07-04 |
39 | CATT: Character-based Arabic Tashkeel Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a new approach to training ATD models. |
Faris Alasmary; Orjuwan Zaafarani; Ahmad Ghannam; | arxiv-cs.CL | 2024-07-03 |
40 | Large Language Models As Evaluators for Scientific Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study explores how well the state-of-the-art Large Language Models (LLMs), like GPT-4 and Mistral, can assess the quality of scientific summaries or, more fittingly, scientific syntheses, comparing their evaluations to those of human annotators. |
Julia Evans; Jennifer D’Souza; Sören Auer; | arxiv-cs.CL | 2024-07-03 |
41 | Mast Kalandar at SemEval-2024 Task 8: On The Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: SemEval 2024 introduces the task of Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, aiming to develop automated systems for identifying machine-generated text and detecting potential misuse. In this paper, we i) propose a RoBERTa-BiLSTM based classifier designed to classify text into two categories, AI-generated or human, and ii) conduct a comparative study of our model with baseline approaches to evaluate its effectiveness. |
Jainit Sushil Bafna; Hardik Mittal; Suyash Sethia; Manish Shrivastava; Radhika Mamidi; | arxiv-cs.CL | 2024-07-03 |
42 | InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present InternLM-XComposer-2.5 (IXC-2.5), a versatile large-vision language model that supports long-contextual input and output. |
PAN ZHANG et al. | arxiv-cs.CV | 2024-07-03 |
43 | Assessing The Code Clone Detection Capability of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to assess the performance of two advanced Large Language Models (LLMs), GPT-3.5 and GPT-4, in the task of code clone detection. |
Zixian Zhang; Takfarinas Saber; | arxiv-cs.SE | 2024-07-02 |
44 | RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel instruction fine-tuning framework RankRAG, which instruction-tunes a single LLM for the dual purpose of context ranking and answer generation in RAG. |
YUE YU et al. | arxiv-cs.CL | 2024-07-02 |
45 | Hypformer: Exploring Efficient Hyperbolic Transformer Fully in Hyperbolic Space Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This stems primarily from: (i) the absence of well-defined modules in hyperbolic space, including linear transformation layers, LayerNorm layers, activation functions, dropout operations, etc.; (ii) the quadratic time complexity of the existing hyperbolic self-attention module w.r.t. the number of input tokens, which hinders its scalability. To address these challenges, we propose Hypformer, a novel hyperbolic Transformer based on the Lorentz model of hyperbolic geometry. |
MENGLIN YANG et al. | arxiv-cs.LG | 2024-07-01 |
46 | Beyond Numeric Awards: In-Context Dueling Bandits with LLM Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, LLMs struggle to converge even when explicitly prompted to do so, and are sensitive to prompt variations. To overcome these issues, we introduce an LLM-augmented algorithm, IF-Enhanced LLM, which takes advantage of both in-context decision-making capabilities of LLMs and theoretical guarantees inherited from classic DB algorithms. |
Fanzeng Xia; Hao Liu; Yisong Yue; Tongxin Li; | arxiv-cs.LG | 2024-07-01 |
47 | MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents MMLongBench-Doc, a long-context, multi-modal benchmark comprising 1,062 expert-annotated questions. |
YUBO MA et al. | arxiv-cs.CV | 2024-07-01 |
48 | Analyzing Persuasive Strategies in Meme Texts: A Fusion of Language Models with Paraphrase Enrichment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes our approach to hierarchical multi-label detection of persuasion techniques in meme texts. |
Kota Shamanth Ramanath Nayak; Leila Kosseim; | arxiv-cs.CL | 2024-07-01 |
49 | Global-local Feature Learning for Fine-grained Food Classification Based on Swin Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jun-Hwa Kim; Namho Kim; Chee Sun Won; | Eng. Appl. Artif. Intell. | 2024-07-01 |
50 | Parm: Efficient Training of Large Sparsely-Activated Models with Dedicated Schedules Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the wide adoption of hybrid parallel paradigms like model parallelism, expert parallelism, and expert-sharding parallelism (i.e., MP+EP+ESP) to support MoE model training on GPU clusters, the training efficiency is hindered by communication costs introduced by these parallel paradigms. To address this limitation, we propose Parm, a system that accelerates MP+EP+ESP training by designing two dedicated schedules for placing communication tasks. |
XINGLIN PAN et al. | arxiv-cs.DC | 2024-06-30 |
51 | LegalTurk Optimized BERT for Multi-Label Text Classification and NER Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce our innovative modified pre-training approach by combining diverse masking strategies. |
Farnaz Zeidi; Mehmet Fatih Amasyali; Çiğdem Erol; | arxiv-cs.CL | 2024-06-30 |
52 | LLM-Generated Natural Language Meets Scaling Laws: New Explorations and Data Augmentation Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nonetheless, prior research harbors two primary concerns: firstly, a lack of contemplation regarding whether the natural language generated by LLM (LLMNL) truly aligns with human natural language (HNL), a critical foundational question; secondly, an oversight that augmented data is randomly generated by LLM, implying that not all data may possess equal training value, which could impede the performance of classifiers. To address these challenges, we introduce the scaling laws to intrinsically calculate LLMNL and HNL. |
Zhenhua Wang; Guang Xu; Ming Ren; | arxiv-cs.CL | 2024-06-29 |
53 | Can GPT-4 Help Detect Quit Vaping Intentions? An Exploration of Automatic Data Annotation Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we extracted a sample dataset from one vaping sub-community on Reddit to analyze users’ quit-vaping intentions. |
SAI KRISHNA REVANTH VURUMA et al. | arxiv-cs.CL | 2024-06-28 |
54 | Machine Learning Predictors for Min-Entropy Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Utilizing data from Generalized Binary Autoregressive Models, a subset of Markov processes, we demonstrate that machine learning models (including a hybrid of convolutional and recurrent Long Short-Term Memory layers and the transformer-based GPT-2 model) outperform traditional NIST SP 800-90B predictors in certain scenarios. |
Javier Blanco-Romero; Vicente Lorenzo; Florina Almenares Mendoza; Daniel Díaz-Sánchez; | arxiv-cs.LG | 2024-06-28 |
55 | ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, the practical efficiency of this paradigm remains unverified, particularly in the context of large language models (LLMs). This paper introduces the first scalable instantiation of this paradigm called ScaleBiO, focusing on bilevel optimization for large-scale LLM data reweighting. |
RUI PAN et al. | arxiv-cs.LG | 2024-06-28 |
56 | The Model Arena for Cross-lingual Sentiment Analysis: A Comparative Study in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Sentiment analysis serves as a pivotal component in Natural Language Processing (NLP). |
Xiliang Zhu; Shayna Gardiner; Tere Roldán; David Rossouw; | arxiv-cs.CL | 2024-06-27 |
57 | Fine-tuned Network Relies on Generic Representation to Solve Unseen Cognitive Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning pretrained language models has shown promising results on a wide range of tasks, but when encountering a novel task, do they rely more on generic pretrained representation, or develop brand new task-specific solutions? |
Dongyan Lin; | arxiv-cs.LG | 2024-06-27 |
58 | NTFormer: A Composite Node Tokenized Graph Transformer for Node Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a new graph Transformer called NTFormer to address this issue. |
Jinsong Chen; Siyu Jiang; Kun He; | arxiv-cs.LG | 2024-06-27 |
59 | FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose FRED, a wafer-scale interconnect that is tailored for the high-BW requirements of wafer-scale networks and can efficiently execute communication patterns of different parallelization strategies. |
Saeed Rashidi; William Won; Sudarshan Srinivasan; Puneet Gupta; Tushar Krishna; | arxiv-cs.AR | 2024-06-27 |
60 | HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge Into Multimodal LLMs at Scale Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using PubMedVision, we train a 34B medical MLLM HuatuoGPT-Vision, which shows superior performance in medical multimodal scenarios among open-source MLLMs. |
JUNYING CHEN et al. | arxiv-cs.CV | 2024-06-27 |
61 | BADGE: BADminton Report Generation and Evaluation with LLM Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel framework named BADGE, designed for this purpose using LLM. |
Shang-Hsuan Chiang; Lin-Wei Chao; Kuang-Da Wang; Chih-Chuan Wang; Wen-Chih Peng; | arxiv-cs.CL | 2024-06-26 |
62 | SetBERT: Enhancing Retrieval Performance for Boolean Logic and Set Operation Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce SetBERT, a fine-tuned BERT-based model designed to enhance query embeddings for set operations and Boolean logic queries, such as Intersection (AND), Difference (NOT), and Union (OR). |
Quan Mai; Susan Gauch; Douglas Adams; | arxiv-cs.CL | 2024-06-25 |
63 | This Paper Had The Smartest Reviewers — Flattery Detection Utilising An Audio-Textual Transformer-Based Approach Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Its automatic detection can thus enhance the naturalness of human-AI interactions. To meet this need, we present a novel audio-textual dataset comprising 20 hours of speech and train machine learning models for automatic flattery detection. |
LUKAS CHRIST et al. | arxiv-cs.SD | 2024-06-25 |
64 | Improving Entity Recognition Using Ensembles of Deep Learning and Fine-tuned Large Language Models: A Case Study on Adverse Event Extraction from Multiple Sources Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we utilized reports and posts from the VAERS (n=621), Twitter (n=9,133), and Reddit (n=131) as our corpora. |
YIMING LI et al. | arxiv-cs.CL | 2024-06-25 |
65 | CTBench: A Comprehensive Benchmark for Evaluating Language Model Capabilities in Clinical Trial Design Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: CTBench is introduced as a benchmark to assess language models (LMs) in aiding clinical study design. |
NAFIS NEEHAL et al. | arxiv-cs.CL | 2024-06-25 |
66 | CausalFormer: An Interpretable Transformer for Temporal Causal Discovery Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To facilitate the utilization of whole deep learning models in temporal causal discovery, we propose an interpretable transformer-based causal discovery model termed CausalFormer, which consists of a causality-aware transformer and a decomposition-based causality detector. |
LINGBAI KONG et al. | arxiv-cs.LG | 2024-06-24 |
67 | Unambiguous Recognition Should Not Rely Solely on Natural Language Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This bias stems from the inherent characteristics of the dataset. To mitigate this bias, we propose a LaTeX printed text recognition model trained on a mixed dataset of pseudo-formulas and pseudo-text. |
Renqing Luo; Yuhan Xu; | arxiv-cs.CV | 2024-06-24 |
68 | Exploring The Capability of Mamba in Speech Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compared Mamba with state-of-the-art Transformer variants for various speech applications, including ASR, text-to-speech, spoken language understanding, and speech summarization. |
Koichi Miyazaki; Yoshiki Masuyama; Masato Murata; | arxiv-cs.SD | 2024-06-24 |
69 | GPT-4V Explorations: Mining Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the application of the GPT-4V(ision) large visual language model to autonomous driving in mining environments, where traditional systems often falter in understanding intentions and making accurate decisions during emergencies. |
Zixuan Li; | arxiv-cs.CV | 2024-06-24 |
70 | Exploring Factual Entailment with NLI: A News Media Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the relationship between factuality and Natural Language Inference (NLI) by introducing FactRel — a novel annotation scheme that models *factual* rather than *textual* entailment, and use it to annotate a dataset of naturally occurring sentences from news articles. |
Guy Mor-Lan; Effi Levi; | arxiv-cs.CL | 2024-06-24 |
71 | DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present DreamBench++, a human-aligned benchmark automated by advanced multimodal GPT models. |
YUANG PENG et al. | arxiv-cs.CV | 2024-06-24 |
72 | The GPT-WritingPrompts Dataset: A Comparative Analysis of Character Portrayal in Short Stories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We quantify and compare the emotional and descriptive features of storytelling from both generative processes, human and machine, along a set of six dimensions. |
Xi Yu Huang; Krishnapriya Vishnubhotla; Frank Rudzicz; | arxiv-cs.CL | 2024-06-24 |
73 | OlympicArena Medal Ranks: Who Is The Most Intelligent AI So Far? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this report, we pose the following question: Who is the most intelligent AI model to date, as measured by the OlympicArena (an Olympic-level, multi-discipline, multi-modal benchmark for superintelligent AI)? |
Zhen Huang; Zengzhi Wang; Shijie Xia; Pengfei Liu; | arxiv-cs.CL | 2024-06-24 |
74 | Finding Transformer Circuits with Edge Pruning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we frame automated circuit discovery as an optimization problem and propose *Edge Pruning* as an effective and scalable solution. |
Adithya Bhaskar; Alexander Wettig; Dan Friedman; Danqi Chen; | arxiv-cs.CL | 2024-06-24 |
75 | GraphEval2000: Benchmarking and Improving Large Language Models on Graph Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, recent studies have identified limitations in LLMs’ ability to reason about graph-structured data. To address this gap, we introduce GraphEval2000, the first comprehensive graph dataset, comprising 40 graph data structure problems along with 2000 test cases. |
Qiming Wu; Zichen Chen; Will Corcoran; Misha Sra; Ambuj K. Singh; | arxiv-cs.AI | 2024-06-23 |
76 | Multi-Scale Temporal Difference Transformer for Video-Text Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they commonly neglect the transformer’s inferior ability to model local temporal information. To tackle this problem, we propose a transformer variant named Multi-Scale Temporal Difference Transformer (MSTDT). |
Ni Wang; Dongliang Liao; Xing Xu; | arxiv-cs.CV | 2024-06-23 |
77 | Beyond Individual Facts: Investigating Categorical Knowledge Locality of Taxonomy and Meronomy Concepts in GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate a broader view of knowledge location, that of concepts or clusters of related information, instead of disparate individual facts. |
Christopher Burger; Yifan Hu; Thai Le; | arxiv-cs.LG | 2024-06-22 |
78 | Evaluating The Effectiveness of The Foundational Models for Q&A Classification in Mental Health Care Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we conducted experiments using four different types of learning approaches: traditional feature extraction, PLMs as feature extractors, fine-tuning PLMs, and prompting large language models (GPT-3.5 and GPT-4) in zero-shot and few-shot learning settings. |
Hassan Alhuzali; Ashwag Alasmari; | arxiv-cs.CL | 2024-06-22 |
79 | How Effective Is GPT-4 Turbo in Generating School-Level Questions from Textbooks Based on Bloom’s Revised Taxonomy? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate the effectiveness of GPT-4 Turbo in generating educational questions from NCERT textbooks in zero-shot mode. |
Subhankar Maity; Aniket Deroy; Sudeshna Sarkar; | arxiv-cs.CL | 2024-06-21 |
80 | Toward Informal Language Processing: Knowledge of Slang in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Using movie subtitles, we construct a dataset that supports evaluation on a diverse set of tasks pertaining to automatic processing of slang. |
Zhewei Sun; Qian Hu; Rahul Gupta; Richard Zemel; Yang Xu; | naacl | 2024-06-20 |
81 | VertAttack: Taking Advantage of Text Classifiers’ Horizontal Vision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Vertically written words will not be recognized by a classifier. In contrast, humans are easily able to recognize and read words written both horizontally and vertically. Hence, a human adversary could write problematic words vertically and the meaning would still be preserved to other humans. We simulate such an attack, VertAttack. |
Jonathan Rusert; | naacl | 2024-06-20 |
82 | MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to perform a thorough evaluation of the non-English capabilities of SoTA LLMs (GPT-3. |
SANCHIT AHUJA et al. | naacl | 2024-06-20 |
83 | ChatGPT As Research Scientist: Probing GPT’s Capabilities As A Research Librarian, Research Ethicist, Data Generator and Data Predictor Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: How good a research scientist is ChatGPT? We systematically probed the capabilities of GPT-3.5 and GPT-4 across four central components of the scientific process: as a Research … |
Steven A. Lehr; Aylin Caliskan; Suneragiri Liyanage; Mahzarin R. Banaji; | arxiv-cs.AI | 2024-06-20 |
84 | Communication-Efficient Byzantine-Resilient Federated Zero-Order Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CYBER-0, the first zero-order optimization algorithm for memory-and-communication efficient Federated Learning, resilient to Byzantine faults. |
Afonso de Sá Delgado Neto; Maximilian Egger; Mayank Bakshi; Rawad Bitar; | arxiv-cs.LG | 2024-06-20 |
85 | A Symbolic Framework for Evaluating Mathematical Reasoning and Generalisation with Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a methodology for generating and perturbing detailed derivations of equations at scale, aided by a symbolic engine, to evaluate the generalisability of Transformers to out-of-distribution mathematical reasoning problems. |
Jordan Meadows; Marco Valentino; Damien Teney; Andre Freitas; | naacl | 2024-06-20 |
86 | Branch-Solve-Merge Improves Large Language Model Evaluation and Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their performance can fall short, due to the model’s lack of coherence and inability to plan and decompose the problem. We propose Branch-Solve-Merge (BSM), a Large Language Model program (Schlag et al., 2023) for tackling such challenging natural language tasks. |
SWARNADEEP SAHA et al. | naacl | 2024-06-20 |
87 | Does GPT-4 Pass The Turing Test? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: AI models with the ability to masquerade as humans could have widespread societal consequences, and we analyse the effectiveness of different strategies and criteria for judging humanlikeness. |
Cameron Jones; Ben Bergen; | naacl | 2024-06-20 |
88 | Does GPT Really Get It? A Hierarchical Scale to Quantify Human Vs AI’s Understanding of Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use the hierarchy to design and conduct a study with human subjects (undergraduate and graduate students) as well as large language models (generations of GPT), revealing interesting similarities and differences. |
Mirabel Reid; Santosh S. Vempala; | arxiv-cs.AI | 2024-06-20 |
89 | Metacognitive Prompting Improves Understanding in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce Metacognitive Prompting (MP), a strategy inspired by human introspective reasoning processes. |
Yuqing Wang; Yun Zhao; | naacl | 2024-06-20 |
90 | Removing RLHF Protections in GPT-4 Via Fine-Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show the contrary: fine-tuning allows attackers to remove RLHF protections with as few as 340 examples and a 95% success rate. |
QIUSI ZHAN et al. | naacl | 2024-06-20 |
91 | Struc-Bench: Are Large Language Models Good at Generating Complex Structured Tabular Data? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study assesses LLMs’ proficiency in structuring tables and introduces a novel fine-tuning method, cognizant of data structures, to bolster their performance. |
XIANGRU TANG et al. | naacl | 2024-06-20 |
92 | A Continued Pretrained LLM Approach for Automatic Medical Note Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce HEAL, the first continuously trained 13B LLaMA2-based LLM that is purpose-built for medical conversations and measured on automated scribing. |
DONG YUAN et al. | naacl | 2024-06-20 |
93 | Transformers Can Represent N-gram Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the relationship between transformer LMs and n-gram LMs, a simple and historically relevant class of language models. |
Anej Svete; Ryan Cotterell; | naacl | 2024-06-20 |
94 | Revisiting Zero-Shot Abstractive Summarization in The Era of Large Language Models from The Perspective of Position Bias Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We characterize and study zero-shot abstractive summarization in Large Language Models (LLMs) by measuring position bias, which we propose as a general formulation of the more restrictive lead bias phenomenon studied previously in the literature. |
Anshuman Chhabra; Hadi Askari; Prasant Mohapatra; | naacl | 2024-06-20 |
95 | SemRoDe: Macro Adversarial Training to Learn Representations That Are Robust to Word-Level Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial Training strategy to enhance the robustness of LMs. |
Brian Formento; Wenjie Feng; Chuan-Sheng Foo; Anh Tuan Luu; See-Kiong Ng; | naacl | 2024-06-20 |
96 | CPopQA: Ranking Cultural Concept Popularity By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the extent to which an LLM effectively captures corpus-level statistical trends of concepts for reasoning, especially long-tail ones, is largely underexplored. In this study, we introduce a novel few-shot question-answering task (CPopQA) that examines LLMs’ statistical ranking abilities for long-tail cultural concepts (e.g., holidays), particularly focusing on these concepts’ popularity in the United States and the United Kingdom, respectively. |
Ming Jiang; Mansi Joshi; | naacl | 2024-06-20 |
97 | On Retrieval Augmentation and The Limitations of Language Model Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we rule out one previously posited possibility: the “softmax bottleneck”. |
TING-RUI CHIANG et al. | naacl | 2024-06-20 |
98 | Unveiling Divergent Inductive Biases of LLMs on Temporal Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the adeptness of Large Language Models (LLMs) in discerning patterns and relationships from data, their inherent comprehension of temporal dynamics remains a formidable challenge. This research meticulously explores these intrinsic challenges within LLMs, with a specific emphasis on evaluating the performance of GPT-3. |
Sindhu Kishore; Hangfeng He; | naacl | 2024-06-20 |
99 | Evaluating Implicit Bias in Large Language Models By Attacking From A Psychometric Perspective Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As Large Language Models (LLMs) become an important way of information seeking, there have been increasing concerns about the unethical content LLMs may generate. In this paper, we conduct a rigorous evaluation of LLMs’ implicit bias towards certain groups by attacking them with carefully crafted instructions to elicit biased responses. |
Yuchen Wen; Keping Bi; Wei Chen; Jiafeng Guo; Xueqi Cheng; | arxiv-cs.CL | 2024-06-20 |
100 | CryptoGPT: A 7B Model Rivaling GPT-4 in The Task of Analyzing and Classifying Real-time Financial News Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: CryptoGPT: a 7B model competing with GPT-4 in a specific task — The Impact of Automatic Annotation and Strategic Fine-Tuning via QLoRA. In this article, we present a method aimed at refining a dedicated LLM of reasonable quality with limited resources in an industrial setting via CryptoGPT. |
Ying Zhang; Matthieu Petit Guillaume; Aurélien Krauth; Manel Labidi; | arxiv-cs.AI | 2024-06-20 |
101 | SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using Training Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these models are extremely memory intensive during their fine-tuning process, making them difficult to deploy on GPUs with limited memory resources. To address this issue, we introduce a new tool called SlimFit that reduces the memory requirements of these models by dynamically analyzing their training dynamics and freezing less-contributory layers during fine-tuning. |
ARASH ARDAKANI et al. | naacl | 2024-06-20 |
102 | A Decision-Making GPT Model Augmented with Entropy Regularization for Autonomous Vehicles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, the decision-making challenges associated with autonomous vehicles are conceptualized through the framework of the Constrained Markov Decision Process (CMDP) and approached as a sequence modeling problem. |
JIAQI LIU et al. | arxiv-cs.RO | 2024-06-19 |
103 | Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates how LLMs, specifically GPT-3.5 and GPT-4, can develop tailored questions for Grade 9 math, aligning with active learning principles. |
Hamdireza Rouzegar; Masoud Makrehchi; | arxiv-cs.CL | 2024-06-19 |
104 | Fine-Tuning BERTs for Definition Extraction from Mathematical Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we fine-tuned three pre-trained BERT models on the task of definition extraction from mathematical English written in LaTeX. |
Lucy Horowitz; Ryan Hathaway; | arxiv-cs.CL | 2024-06-19 |
105 | ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce ChatGLM, an evolving family of large language models that we have been developing over time. |
TEAM GLM et al. | arxiv-cs.CL | 2024-06-18 |
106 | SwinStyleformer Is A Favorable Choice for Image Inversion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the first pure Transformer structure inversion network called SwinStyleformer, which can compensate for the shortcomings of the CNNs inversion framework by handling long-range dependencies and learning the global structure of objects. |
Jiawei Mao; Guangyi Zhao; Xuesong Yin; Yuanqi Chang; | arxiv-cs.CV | 2024-06-18 |
107 | Adversarial Attacks on Multimodal Agents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that multimodal agents raise new safety risks, even though attacking agents is more challenging than prior attacks due to limited access to and knowledge about the environment. |
Chen Henry Wu; Jing Yu Koh; Ruslan Salakhutdinov; Daniel Fried; Aditi Raghunathan; | arxiv-cs.LG | 2024-06-18 |
108 | What Makes Two Language Models Think Alike? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Do architectural differences significantly affect the way models represent and process language? We propose a new approach, based on metric-learning encoding models (MLEMs), as a first step to answer this question. |
Jeanne Salle; Louis Jalouzot; Nur Lan; Emmanuel Chemla; Yair Lakretz; | arxiv-cs.CL | 2024-06-18 |
109 | Vernacular? I Barely Know Her: Challenges with Style Control and Stereotyping Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a thorough analysis and discussion of the results. |
ANKIT AICH et al. | arxiv-cs.CL | 2024-06-18 |
110 | Generating Educational Materials with Different Levels of Readability Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the leveled-text generation task, aiming to rewrite educational materials to specific readability levels while preserving meaning. |
Chieh-Yang Huang; Jing Wei; Ting-Hao ‘Kenneth’ Huang; | arxiv-cs.CL | 2024-06-18 |
111 | Promises, Outlooks and Challenges of Diffusion Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For example, autoregressive token generation is notably slow and can be prone to *exposure bias*. Diffusion-based language models were proposed as an alternative to autoregressive generation to address some of these limitations. |
Justin Deschenaux; Caglar Gulcehre; | arxiv-cs.CL | 2024-06-17 |
112 | Can Many-Shot In-Context Learning Help Long-Context LLM Judges? See More, Judge Better! Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this type of approach is affected by the potential biases in LLMs, raising concerns about the reliability of the evaluation results. To mitigate this issue, we propose and study two versions of many-shot in-context prompts, which rely on two existing settings of many-shot ICL for helping GPT-4o-as-a-Judge in single answer grading to mitigate the potential biases in LLMs, Reinforced ICL and Unsupervised ICL. |
Mingyang Song; Mao Zheng; Xuan Luo; | arxiv-cs.CL | 2024-06-17 |
113 | DB-GPT-Hub: Towards Open Benchmarking Text-to-SQL Empowered By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present DB-GPT-Hub, an open benchmark suite for LLM-empowered text-to-SQL, which primarily focuses on tuning LLMs at large scales. |
FAN ZHOU et al. | arxiv-cs.DB | 2024-06-17 |
114 | Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we identify a pitfall of vanilla iterative DPO – improved response quality can lead to increased verbosity. |
JIE LIU et al. | arxiv-cs.CL | 2024-06-17 |
115 | Cultural Conditioning or Placebo? On The Effectiveness of Socio-Demographic Prompting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we systematically probe four LLMs (Llama 3, Mistral v0.2, GPT-3.5 Turbo and GPT-4) with prompts that are conditioned on culturally sensitive and non-sensitive cues, on datasets that are supposed to be culturally sensitive (EtiCor and CALI) or neutral (MMLU and ETHICS). |
SAGNIK MUKHERJEE et al. | arxiv-cs.CL | 2024-06-17 |
116 | A Two-dimensional Zero-shot Dialogue State Tracking Evaluation Method Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a two-dimensional zero-shot evaluation method for DST using GPT-4, which divides the evaluation into two dimensions: accuracy and completeness. |
Ming Gu; Yan Yang; | arxiv-cs.CL | 2024-06-17 |
117 | Look Further Ahead: Testing The Limits of GPT-4 in Path Planning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, they still face challenges with long-horizon planning. To study this, we propose path planning tasks as a platform to evaluate LLMs’ ability to navigate long trajectories under geometric constraints. |
Mohamed Aghzal; Erion Plaku; Ziyu Yao; | arxiv-cs.AI | 2024-06-17 |
118 | Minimal Self in Humanoid Robot Alter3 Driven By Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Alter3, a humanoid robot that demonstrates spontaneous motion generation through the integration of GPT-4, a Large Language Model (LLM). |
Takahide Yoshida; Suzune Baba; Atsushi Masumori; Takashi Ikegami; | arxiv-cs.RO | 2024-06-17 |
119 | GPT-Powered Elicitation Interview Script Generator for Requirements Engineering Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We employ a prompt chaining approach to mitigate GPT’s output length constraint and generate thorough and detailed interview scripts. |
Binnur Görer; Fatma Başak Aydemir; | arxiv-cs.SE | 2024-06-17 |
120 | VideoVista: A Versatile Benchmark for Video Understanding and Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, it encompasses 19 types of understanding tasks (e.g., anomaly detection, interaction understanding) and 8 reasoning tasks (e.g., logical reasoning, causal reasoning). To achieve this, we present an automatic data construction framework, leveraging powerful GPT-4o alongside advanced analysis tools (e.g., video splitting, object segmenting, and tracking). |
YUNXIN LI et al. | arxiv-cs.CV | 2024-06-17 |
121 | WellDunn: On The Robustness and Explainability of Language Models and Large Language Models in Identifying Wellness Dimensions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Language Models (LMs) are being proposed for mental health applications where the heightened risk of adverse outcomes means predictive performance may not be a sufficient litmus test of a model’s utility in clinical practice. |
SEYEDALI MOHAMMADI et al. | arxiv-cs.AI | 2024-06-17 |
122 | Connecting The Dots: Evaluating Abstract Reasoning Capabilities of LLMs Using The New York Times Connections Word Game Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To deepen our understanding we create a taxonomy of the knowledge types required to successfully categorize words in the Connections game, revealing that LLMs struggle with associative, encyclopedic, and linguistic knowledge. |
PRISHA SAMADARSHI et. al. | arxiv-cs.CL | 2024-06-16 |
123 | Exposing The Achilles’ Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel dataset MWP-MISTAKE, incorporating MWPs with both correct and incorrect reasoning steps generated through rule-based methods and smaller language models. |
Joykirat Singh; Akshay Nambi; Vibhav Vineet; | arxiv-cs.CL | 2024-06-16 |
124 | Grading Massive Open Online Courses Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explore the feasibility of using large language models (LLMs) to replace peer grading in MOOCs. |
Shahriar Golchin; Nikhil Garuda; Christopher Impey; Matthew Wenger; | arxiv-cs.CL | 2024-06-16 |
125 | Large Language Models for Automatic Milestone Detection in Group Discussions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate an LLM’s performance on recordings of a group oral communication task in which utterances are often truncated or not well-formed. |
ZHUOXU DUAN et. al. | arxiv-cs.CL | 2024-06-16 |
126 | Enhancing Supermarket Robot Interaction: A Multi-Level LLM Conversational Interface for Handling Diverse Customer Intents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the design and evaluation of a novel multi-level LLM interface for supermarket robots to assist customers. |
Chandran Nandkumar; Luka Peternel; | arxiv-cs.RO | 2024-06-16 |
127 | Distilling Opinions at Scale: Incremental Opinion Summarization Using XL-OPSUMM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To mitigate this, we propose a scalable framework called Xl-OpSumm that generates summaries incrementally. |
SRI RAGHAVA MUDDU et. al. | arxiv-cs.CL | 2024-06-16 |
128 | Generating Tables from The Parametric Knowledge of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For evaluation, we introduce a novel benchmark, WikiTabGen which contains 100 curated Wikipedia tables. |
Yevgeni Berkovitch; Oren Glickman; Amit Somech; Tomer Wolfson; | arxiv-cs.CL | 2024-06-16 |
129 | Breaking Boundaries: Investigating The Effects of Model Editing on Cross-linguistic Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper strategically identifies the need for linguistic equity by examining several knowledge editing techniques in multilingual contexts. |
SOMNATH BANERJEE et. al. | arxiv-cs.CL | 2024-06-16 |
130 | ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, we present Video Diffusion GPT (ViD-GPT). |
Kaifeng Gao; Jiaxin Shi; Hanwang Zhang; Chunping Wang; Jun Xiao; | arxiv-cs.CV | 2024-06-16 |
131 | KGPA: Robustness Evaluation for Large Language Models Via Cross-Domain Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a framework that systematically evaluates the robustness of LLMs under adversarial attack scenarios by leveraging knowledge graphs (KGs). |
AIHUA PEI et. al. | arxiv-cs.CL | 2024-06-16 |
132 | Multilingual Large Language Models and Curse of Multilinguality Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Multilingual Large Language Models (LLMs) have gained large popularity among Natural Language Processing (NLP) researchers and practitioners. These models, trained on huge datasets, show proficiency across various languages and demonstrate effectiveness in numerous downstream tasks. |
Daniil Gurgurov; Tanja Bäumel; Tatiana Anikina; | arxiv-cs.CL | 2024-06-15 |
133 | ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning Via Shared Low-Rank Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces an approach to optimize Parameter Efficient Fine Tuning (PEFT) for Pretrained Language Models (PLMs) by implementing a Shared Low Rank Adaptation (ShareLoRA). |
Yurun Song; Junchen Zhao; Ian G. Harris; Sangeetha Abdu Jyothi; | arxiv-cs.CL | 2024-06-15 |
134 | Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we leverage the edited videos on a popular short video platform, i.e., TikTok, and build a video VQA benchmark (named EditVid-QA) covering four typical editing categories, i.e., effect, funny, meme, and game. |
LU XU et. al. | arxiv-cs.CV | 2024-06-14 |
135 | Optimizing Layer-Fused Scheduling of Transformer Networks on Multi-accelerator Platforms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work enables extensive hardware/mapping exploration by extending the DSE framework Stream towards support for transformers across a wide variety of hardware architectures and different execution schedules. |
Steven Colleman; Arne Symons; Victor J. B. Jung; Marian Verhelst; | arxiv-cs.AR | 2024-06-14 |
136 | GPT-4o: Visual Perception Performance of Multimodal Large Language Models in Piglet Activity Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The initial evaluation experiments in this study validate the potential of multimodal large language models in livestock scene video understanding and provide new directions and references for future research on animal behavior video understanding. |
Yiqi Wu; Xiaodan Hu; Ziming Fu; Siling Zhou; Jiangong Li; | arxiv-cs.CV | 2024-06-14 |
137 | The Devil Is in The Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we try to unveil the mystery of social bias inside language models by introducing the concept of Social Bias Neurons. |
YAN LIU et. al. | arxiv-cs.CL | 2024-06-14 |
138 | GPT-4V(ision) Is A Human-Aligned Evaluator for Text-to-3D Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents an automatic, versatile, and human-aligned evaluation metric for text-to-3D generative models. |
TONG WU et. al. | cvpr | 2024-06-13 |
139 | Alleviating Distortion in Image Generation Via Multi-Resolution Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents innovative enhancements to diffusion models by integrating a novel multi-resolution network and time-dependent layer normalization. |
QIHAO LIU et. al. | arxiv-cs.CV | 2024-06-13 |
140 | Complex Image-Generative Diffusion Transformer for Audio Denoising Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to enhance audio denoising performance, this paper introduces a complex image-generative diffusion transformer that captures more information from the complex Fourier domain. |
Junhui Li; Pu Wang; Jialu Li; Youshan Zhang; | arxiv-cs.SD | 2024-06-13 |
141 | Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, bidirectional feature propagation is highly sensitive to inaccurate optical flow in blurry frames, leading to error accumulation during the propagation process. To address these issues, we propose BSSTNet, a Blur-aware Spatio-temporal Sparse Transformer Network. |
Huicong Zhang; Haozhe Xie; Hongxun Yao; | cvpr | 2024-06-13 |
142 | Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose VisualFactChecker (VFC), a flexible, training-free pipeline that generates high-fidelity and detailed captions for both 2D images and 3D objects. |
YUNHAO GE et. al. | cvpr | 2024-06-13 |
143 | GPT-Fabric: Folding and Smoothing Fabric By Leveraging Pre-Trained Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose GPT-Fabric for the canonical tasks of fabric folding and smoothing, where GPT directly outputs an action informing a robot where to grasp and pull a fabric. |
Vedant Raval; Enyu Zhao; Hejia Zhang; Stefanos Nikolaidis; Daniel Seita; | arxiv-cs.RO | 2024-06-13 |
144 | Mean-Shift Feature Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Transformer models developed in NLP have made a great impact on computer vision, producing promising performance on various tasks. |
Takumi Kobayashi; | cvpr | 2024-06-13 |
145 | SDPose: Tokenized Pose Estimation Via Circulation-Guide Self-Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Those transformer-based models that require fewer resources are prone to under-fitting due to their smaller scale, and thus perform notably worse than their larger counterparts. Given this conundrum, we introduce SDPose, a new self-distillation method for improving the performance of small transformer-based models. |
SICHEN CHEN et. al. | cvpr | 2024-06-13 |
146 | MoMask: Generative Masked Modeling of 3D Human Motions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce MoMask, a novel masked modeling framework for text-driven 3D human motion generation. |
Chuan Guo; Yuxuan Mu; Muhammad Gohar Javed; Sen Wang; Li Cheng; | cvpr | 2024-06-13 |
147 | MoST: Motion Style Transformer Between Diverse Action Contents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This challenge lies in the lack of clear separation between content and style of a motion. To tackle this challenge, we propose a novel motion style transformer that effectively disentangles style from content and generates a plausible motion with transferred style from a source motion. |
Boeun Kim; Jungho Kim; Hyung Jin Chang; Jin Young Choi; | cvpr | 2024-06-13 |
148 | ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Most existing studies are devoted to designing vision-specific transformers to solve the above problems, which introduce additional pre-training costs. Therefore, we present a plain, pre-training-free, and feature-enhanced ViT backbone with Convolutional Multi-scale feature interaction, named ViT-CoMer, which facilitates bidirectional interaction between CNN and transformer. |
Chunlong Xia; Xinliang Wang; Feng Lv; Xin Hao; Yifeng Shi; | cvpr | 2024-06-13 |
149 | Condition-Aware Neural Network for Controlled Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. |
Han Cai; Muyang Li; Qinsheng Zhang; Ming-Yu Liu; Song Han; | cvpr | 2024-06-13 |
150 | OmniMotionGPT: Animal Motion Generation with Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our paper aims to generate diverse and realistic animal motion sequences from textual descriptions without a large-scale animal text-motion dataset. |
ZHANGSIHAO YANG et. al. | cvpr | 2024-06-13 |
151 | Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a large-scale study of linguistic bias exhibited by ChatGPT covering ten dialects of English (Standard American English, Standard British English, and eight widely spoken non-standard varieties from around the world). |
EVE FLEISIG et. al. | arxiv-cs.CL | 2024-06-13 |
152 | Permutation Equivariance of Transformers and Its Applications Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose our definition of permutation equivariance, a broader concept covering both inter- and intra-token permutation in the forward and backward propagation of neural networks. |
HENGYUAN XU et. al. | cvpr | 2024-06-13 |
153 | General Point Model Pretraining with Autoencoding and Autoregressive Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the General Language Model, we propose a General Point Model (GPM) that seamlessly integrates autoencoding and autoregressive tasks in a point cloud transformer. |
ZHE LI et. al. | cvpr | 2024-06-13 |
154 | Agent Instructs Large Language Models to Be General Zero-Shot Reasoners IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a method to improve the zero-shot reasoning abilities of large language models on general language understanding tasks. |
Nicholas Crispino; Kyle Montgomery; Fankun Zeng; Dawn Song; Chenguang Wang; | icml | 2024-06-12 |
155 | SpikeZIP-TF: Conversion Is All You Need for Transformer-based SNN Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce a novel ANN-to-SNN conversion method called SpikeZIP-TF, where ANN and SNN are exactly equivalent, thus incurring no accuracy degradation. |
KANG YOU et. al. | icml | 2024-06-12 |
156 | Visual Transformer with Differentiable Channel Selection: An Information Bottleneck Inspired Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel and compact transformer block, Transformer with Differentiable Channel Selection, or DCS-Transformer. |
Yancheng Wang; Ping Li; Yingzhen Yang; | icml | 2024-06-12 |
157 | AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, achieving faithful attributions for the entirety of a black-box transformer model and maintaining computational efficiency is an unsolved challenge. By extending the Layer-wise Relevance Propagation attribution method to handle attention layers, we address these challenges effectively. |
REDUAN ACHTIBAT et. al. | icml | 2024-06-12 |
158 | Privacy-Preserving Embedding Via Look-up Table Evaluation with Fully Homomorphic Encryption Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, our study proposes an efficient algorithm for privacy-preserving embedding via look-up table evaluation with HE (HELUT) by developing an encrypted indicator function (EIF) that assures high precision with the use of the approximate HE scheme (CKKS). |
Jae-yun Kim; Saerom Park; Joohee Lee; Jung Hee Cheon; | icml | 2024-06-12 |
159 | GPT-4V(ision) Is A Generalist Web Agent, If Grounded IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we explore the potential of LMMs like GPT-4V as a generalist web agent that can follow natural language instructions to complete tasks on any given website. |
Boyuan Zheng; Boyu Gou; Jihyung Kil; Huan Sun; Yu Su; | icml | 2024-06-12 |
160 | Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that, for linearized transformer networks, ICL can be made explicit and permanent through the inclusion of bias terms. |
Brian K Chen; Tianyang Hu; Hui Jin; Hwee Kuan Lee; Kenji Kawaguchi; | icml | 2024-06-12 |
161 | In-context Learning on Function Classes Unveiled for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given some training examples, a pre-trained model can make accurate predictions on an unseen input. |
Zhijie Wang; Bo Jiang; Shuai Li; | icml | 2024-06-12 |
162 | Accelerating Transformer Pre-training with 2:4 Sparsity Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: First, we define a “flip rate” to monitor the stability of a 2:4 training process. Utilizing this metric, we propose three techniques to preserve accuracy: to modify the sparse-refined straight-through estimator by applying the masked decay term on gradients, to determine a feasible decay factor in the warm-up stage, and to enhance the model’s quality by a dense fine-tuning procedure near the end of pre-training. |
Yuezhou Hu; Kang Zhao; Weiyu Huang; Jianfei Chen; Jun Zhu; | icml | 2024-06-12 |
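The “flip rate” metric in the highlight above lends itself to a short illustration. Below is a minimal sketch, assuming a 2:4 pattern that keeps the two largest-magnitude weights in each group of four; the paper’s decay and fine-tuning techniques are omitted, and the function names are ours, not the authors’.

```python
import torch

def mask_2to4(w: torch.Tensor) -> torch.Tensor:
    # Keep the 2 largest-magnitude weights in each group of 4
    # (assumes w.numel() is divisible by 4, as in typical linear layers).
    groups = w.abs().reshape(-1, 4)
    idx = groups.topk(2, dim=1).indices
    mask = torch.zeros_like(groups, dtype=torch.bool)
    mask.scatter_(1, idx, True)
    return mask.reshape(w.shape)

def flip_rate(prev_mask: torch.Tensor, curr_mask: torch.Tensor) -> float:
    # Fraction of mask entries that changed between two training steps.
    return (prev_mask != curr_mask).float().mean().item()

w = torch.randn(8, 8)
m1 = mask_2to4(w)
m2 = mask_2to4(w + 0.1 * torch.randn(8, 8))  # simulate a weight update
print(flip_rate(m1, m2))
```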
163 | VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Subsequently, we present a Description-Program-Reasoning (DPR) chain to enhance the logical accuracy of reasoning processes through graphical structure description generation and algorithm-aware multi-step reasoning. |
YUNXIN LI et. al. | icml | 2024-06-12 |
164 | Asymmetry in Low-Rank Adapters of Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by an effort to investigate the different roles of LoRA matrices during fine-tuning, this paper characterizes and leverages unexpected asymmetry in the importance of low-rank adapter matrices. |
JIACHENG ZHU et. al. | icml | 2024-06-12 |
165 | Label-aware Hard Negative Sampling Strategies with Momentum Contrastive Learning for Implicit Hate Speech Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Label-aware Hard Negative sampling strategies (LAHN) that encourage the model to learn detailed features from hard negative samples, instead of naive negative samples in random batch, using momentum-integrated contrastive learning. |
Jaehoon Kim; Seungwan Jin; Sohyun Park; Someen Park; Kyungsik Han; | arxiv-cs.CL | 2024-06-12 |
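A rough sketch of the label-aware hard negative selection described above, under assumed simplifications: negatives come from a momentum feature queue, and “hardness” is cosine similarity to the anchor. The contrastive loss itself is omitted, and the function is illustrative rather than the paper’s implementation.

```python
import torch
import torch.nn.functional as F

def hardest_negatives(anchor, queue_feats, queue_labels, anchor_label, k=8):
    # Cosine similarity between the anchor and every queued feature.
    sims = F.cosine_similarity(anchor.unsqueeze(0), queue_feats)
    # Label-aware masking: same-label entries cannot serve as negatives.
    sims = sims.masked_fill(queue_labels == anchor_label, -1.0)
    # The hardest negatives are the most similar remaining entries.
    return queue_feats[sims.topk(k).indices]
```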
166 | An Empirical Study of Mamba-based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In a controlled setting (e.g., same data), however, studies so far have only presented small scale experiments comparing SSMs to Transformers. To understand the strengths and weaknesses of these architectures at larger scales, we present a direct comparison between 8B-parameter Mamba, Mamba-2, and Transformer models trained on the same datasets of up to 3.5T tokens. |
ROGER WALEFFE et. al. | arxiv-cs.LG | 2024-06-12 |
167 | Stealing Part of A Production Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI’s ChatGPT or Google’s PaLM-2. |
NICHOLAS CARLINI et. al. | icml | 2024-06-12 |
168 | Long Is More for Alignment: A Simple But Tough-to-Beat Baseline for Instruction Fine-Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: LIMA (NeurIPS 2023) and AlpaGasus (ICLR 2024) are state-of-the-art methods for selecting such high-quality examples, either via manual curation or using GPT-3.5-Turbo as a quality scorer. We show that the extremely simple baseline of selecting the 1,000 instructions with longest responses—that intuitively contain more learnable information and are harder to overfit—from standard datasets can consistently outperform these sophisticated methods according to GPT-4 and PaLM-2 as judges, while remaining competitive on the Open LLM benchmarks that test factual knowledge. |
Hao Zhao; Maksym Andriushchenko; Francesco Croce; Nicolas Flammarion; | icml | 2024-06-12 |
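The baseline in this highlight is simple enough to state in a few lines. A sketch, assuming the dataset is a list of dicts with a response field (the field names are illustrative, not from the paper):

```python
def select_longest(dataset, k=1000):
    """Keep the k instruction examples with the longest responses."""
    return sorted(dataset, key=lambda ex: len(ex["response"]), reverse=True)[:k]

toy = [
    {"instruction": "Define recall.", "response": "Recall is TP / (TP + FN)."},
    {"instruction": "Explain overfitting.",
     "response": "Overfitting occurs when a model memorizes training noise ..."},
]
print(select_longest(toy, k=1))
```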
169 | COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For a comprehensive assessment of LLM safety, it is essential to consider jailbreaks with diverse attributes, such as contextual coherence and sentiment/stylistic variations, and hence it is beneficial to study controllable jailbreaking, i.e. how to enforce control on LLM attacks. In this paper, we formally formulate the controllable attack generation problem, and build a novel connection between this problem and controllable text generation, a well-explored topic of natural language processing. |
Xingang Guo; Fangxu Yu; Huan Zhang; Lianhui Qin; Bin Hu; | icml | 2024-06-12 |
170 | Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the slow speed of current trackers limits their applicability on devices with constrained computational resources. To address this challenge, we introduce ABTrack, an adaptive computation framework that adaptively bypasses transformer blocks for efficient visual tracking. |
XIANGYANG YANG et. al. | arxiv-cs.CV | 2024-06-12 |
171 | How Language Model Hallucinations Can Snowball IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: A major risk of using language models in practical applications is their tendency to hallucinate incorrect statements. To study this, we construct three question-answering datasets where LMs often state an incorrect answer which is followed by an explanation with at least one incorrect claim. |
Muru Zhang; Ofir Press; William Merrill; Alisa Liu; Noah A. Smith; | icml | 2024-06-12 |
172 | Timer: Generative Pre-trained Transformers Are Large Time Series Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To change the status quo of training scenario-specific small models from scratch, this paper aims at the early development of large time series models (LTSM). |
YONG LIU et. al. | icml | 2024-06-12 |
173 | Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by previous theoretical study of static version of the attention multiplication problem [Zandieh, Han, Daliri, and Karbasi ICML 2023, Alman and Song NeurIPS 2023], we formally define a dynamic version of attention matrix multiplication problem. |
Jan van den Brand; Zhao Song; Tianyi Zhou; | icml | 2024-06-12 |
174 | Ditto: Quantization-aware Secure Inference of Transformers Upon MPC Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the integration of quantization widely used in plaintext inference into the MPC domain remains unclear. To bridge this gap, we propose the framework named Ditto to enable more efficient quantization-aware secure Transformer inference. |
HAOQI WU et. al. | icml | 2024-06-12 |
175 | Delving Into Differentially Private Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such ‘reduction’ is done by first identifying the hardness unique to DP Transformer training: the attention distraction phenomenon and a lack of compatibility with existing techniques for efficient gradient clipping. To deal with these two issues, we propose the Re-Attention Mechanism and Phantom Clipping, respectively. |
YOULONG DING et. al. | icml | 2024-06-12 |
176 | Trainable Transformer in Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a new efficient construction, Transformer in Transformer (in short, TINT), that allows a transformer to simulate and fine-tune more complex models during inference (e.g., pre-trained language models). |
Abhishek Panigrahi; Sadhika Malladi; Mengzhou Xia; Sanjeev Arora; | icml | 2024-06-12 |
177 | PolySketchFormer: Fast Transformers Via Sketching Polynomial Kernels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent theoretical results indicate the intractability of sub-quadratic softmax attention approximation under reasonable complexity assumptions. This paper addresses this challenge by first demonstrating that polynomial attention with high degree can effectively replace softmax without sacrificing model quality. |
Praneeth Kacham; Vahab Mirrokni; Peilin Zhong; | icml | 2024-06-12 |
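To make the kernel swap concrete, here is a naive, quadratic-time sketch of degree-p polynomial attention. The paper’s sketching technique for achieving sub-quadratic cost is not reproduced here, and the non-negativity clamp is an illustrative choice of ours.

```python
import torch

def polynomial_attention(q, k, v, p=4):
    # Replace exp(q.k) with a degree-p polynomial kernel, then normalize.
    scores = (q @ k.transpose(-2, -1)).clamp(min=0).pow(p)
    weights = scores / scores.sum(dim=-1, keepdim=True).clamp(min=1e-9)
    return weights @ v

q = k = v = torch.randn(16, 32)  # (sequence length, head dimension)
out = polynomial_attention(q, k, v)
```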
178 | Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these challenges, we first introduce LoCoV1, a 12-task benchmark constructed to measure long-context retrieval where chunking is not possible or not effective. We next present the M2-BERT retrieval encoder, an 80M-parameter state-space encoder model built from the Monarch Mixer architecture, capable of scaling to documents up to 32K tokens long. |
Jon Saad-Falcon; Daniel Y Fu; Simran Arora; Neel Guha; Christopher Re; | icml | 2024-06-12 |
179 | Entropy-Reinforced Planning with Large Language Models for Drug Discovery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we propose ERP, Entropy-Reinforced Planning for Transformer Decoding, which employs an entropy-reinforced planning algorithm to enhance the Transformer decoding process and strike a balance between exploitation and exploration. |
Xuefeng Liu; Chih-chan Tien; Peng Ding; Songhao Jiang; Rick L. Stevens; | icml | 2024-06-12 |
180 | InstructRetro: Instruction Tuning Post Retrieval-Augmented Pretraining IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Retro 48B, the largest LLM pretrained with retrieval. |
BOXIN WANG et. al. | icml | 2024-06-12 |
181 | Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, future superhuman models will behave in complex ways too difficult for humans to reliably evaluate; humans will only be able to *weakly supervise* superhuman models. We study an analogy to this problem: can weak model supervision elicit the full capabilities of a much stronger model? |
COLLIN BURNS et. al. | icml | 2024-06-12 |
182 | Do Efficient Transformers Really Save Computation? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to understand the capabilities and limitations of efficient Transformers, specifically the Sparse Transformer and the Linear Transformer. |
KAI YANG et. al. | icml | 2024-06-12 |
183 | In-Context Principle Learning from Mistakes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nonetheless, all ICL-based approaches only learn from correct input-output pairs. In this paper, we revisit this paradigm, by learning more from the few given input-output examples. |
TIANJUN ZHANG et. al. | icml | 2024-06-12 |
184 | Fine-Tuned ‘Small’ LLMs (Still) Significantly Outperform Zero-Shot Generative AI Models in Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that smaller, fine-tuned LLMs (still) consistently and significantly outperform larger, zero-shot prompted models in text classification. |
Martin Juan José Bucher; Marco Martini; | arxiv-cs.CL | 2024-06-12 |
185 | Prodigy: An Expeditiously Adaptive Parameter-Free Learner IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Prodigy, an algorithm that provably estimates the distance to the solution $D$, which is needed to set the learning rate optimally. |
Konstantin Mishchenko; Aaron Defazio; | icml | 2024-06-12 |
186 | What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the capabilities of the transformer architecture with varying depth. |
Xingwu Chen; Difan Zou; | icml | 2024-06-12 |
187 | Rethinking Generative Large Language Model Evaluation for Semantic Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we introduce an RWQ-Elo rating system, engaging 24 LLMs such as GPT-4, GPT-3.5, Google-Gemini-Pro and LLaMA-1/-2, in a two-player competitive format, with GPT-4 serving as the judge. |
Fangyun Wei; Xi Chen; Lin Luo; | icml | 2024-06-12 |
188 | Position: On The Possibilities of AI-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce guidelines on the required text data quantity, either through sample size or sequence length, for reliable AI text detection, through derivations of sample complexity bounds. |
SOURADIP CHAKRABORTY et. al. | icml | 2024-06-12 |
189 | Enhancing Vision Transformer: Amplifying Non-Linearity in Feedforward Network Module Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on improving the FFN module within the vision transformer. |
YIXING XU et. al. | icml | 2024-06-12 |
190 | How Smooth Is Attention? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a detailed study of the Lipschitz constant of self-attention in several practical scenarios, discussing the impact of the sequence length $n$ and layer normalization on the local Lipschitz constant of both unmasked and masked self-attention. |
Valérie Castin; Pierre Ablin; Gabriel Peyré; | icml | 2024-06-12 |
191 | Prototypical Transformer As Unified Motion Learners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Prototypical Transformer (ProtoFormer), a general and unified framework that approaches various motion tasks from a prototype perspective. |
CHENG HAN et. al. | icml | 2024-06-12 |
192 | Outlier-Efficient Hopfield Layers for Large Transformer-Based Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an Outlier-Efficient Modern Hopfield Model (termed OutEffHop) and use it to address the outlier inefficiency problem of training gigantic transformer-based models. |
JERRY YAO-CHIEH HU et. al. | icml | 2024-06-12 |
193 | Gated Linear Attention Transformers with Hardware-Efficient Training IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work describes a hardware-efficient algorithm for linear attention that trades off memory movement against parallelizability. |
Songlin Yang; Bailin Wang; Yikang Shen; Rameswar Panda; Yoon Kim; | icml | 2024-06-12 |
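Linear attention admits a simple recurrent view that helps explain what a data-dependent gate buys. A minimal sketch, assuming a scalar decay gate per step; the paper’s gating parameterization and hardware-efficient chunked algorithm are considerably more involved.

```python
import torch

def gated_linear_attention(q, k, v, g):
    # q, k: (T, d_k); v: (T, d_v); g: (T,) decay gates in (0, 1).
    S = torch.zeros(q.shape[1], v.shape[1])  # running key-value state
    out = []
    for t in range(q.shape[0]):
        S = g[t] * S + torch.outer(k[t], v[t])  # decayed state update
        out.append(q[t] @ S)                    # read-out for query t
    return torch.stack(out)
```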
194 | Discrete Diffusion Modeling By Estimating The Ratios of The Data Distribution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Crucially, standard diffusion models rely on the well-established theory of score matching, but efforts to generalize this to discrete structures have not yielded the same empirical gains. In this work, we bridge this gap by proposing score entropy, a novel loss that naturally extends score matching to discrete spaces, integrates seamlessly to build discrete diffusion models, and significantly boosts performance. |
Aaron Lou; Chenlin Meng; Stefano Ermon; | icml | 2024-06-12 |
195 | Teaching Language Models to Self-Improve By Learning from Language Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present Self-Refinement Tuning (SRT), a method that leverages model feedback for alignment, thereby reducing reliance on human annotations. |
Chi Hu; Yimin Hu; Hang Cao; Tong Xiao; Jingbo Zhu; | arxiv-cs.CL | 2024-06-11 |
196 | Improving Autoformalization Using Type Checking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our analysis shows that the performance of these models is largely limited by their inability to generate formal statements that successfully type-check (i.e., are syntactically correct and consistent with types) – with a whopping 86.6% of GPT-4o errors starting from a type-check failure. In this work, we propose a method to fix this issue through decoding with type-check filtering, where we initially sample a diverse set of candidate formalizations for an informal statement, then use the Lean proof assistant to filter out candidates that do not type-check. |
Auguste Poiroux; Gail Weiss; Viktor Kunčak; Antoine Bosselut; | arxiv-cs.CL | 2024-06-11 |
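The proposed decoding scheme reduces to a sample-then-filter loop. A sketch, where sample_candidates (the LLM call) and lean_type_checks (the Lean verification) are hypothetical stand-ins supplied by the caller:

```python
def formalize(informal_statement, sample_candidates, lean_type_checks, n=16):
    # Sample a diverse set of candidate formalizations, then keep only
    # those the proof assistant accepts.
    candidates = sample_candidates(informal_statement, n=n)
    valid = [c for c in candidates if lean_type_checks(c)]
    return valid[0] if valid else None
```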
197 | Anomaly Detection on Unstable Logs with GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we report on an experimental comparison of a fine-tuned LLM and alternative models for anomaly detection on unstable logs. |
Fatemeh Hadadi; Qinghua Xu; Domenico Bianculli; Lionel Briand; | arxiv-cs.SE | 2024-06-11 |
198 | Towards Generalized Hydrological Forecasting Using Transformer Models for 120-Hour Streamflow Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Utilizing data from the preceding 72 hours, including precipitation, evapotranspiration, and discharge values, we developed a generalized model to predict future streamflow. |
Bekir Z. Demiray; Ibrahim Demir; | arxiv-cs.LG | 2024-06-11 |
199 | LT4SG@SMM4H24: Tweets Classification for Digital Epidemiology of Childhood Health Outcomes Using Pre-Trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents our approaches for the SMM4H24 Shared Task 5 on the binary classification of English tweets reporting children’s medical disorders. |
Dasun Athukoralage; Thushari Atapattu; Menasha Thilakaratne; Katrina Falkner; | arxiv-cs.CL | 2024-06-11 |
200 | Bilingual Sexism Classification: Fine-Tuned XLM-RoBERTa and GPT-3.5 Few-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to improve sexism identification in bilingual contexts (English and Spanish) by leveraging natural language processing models. |
AmirMohammad Azadi; Baktash Ansari; Sina Zamani; | arxiv-cs.CL | 2024-06-11 |
201 | Symmetric Dot-Product Attention for Efficient Training of BERT Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose an alternative compatibility function for the self-attention mechanism introduced by the Transformer architecture. |
Martin Courtois; Malte Ostendorff; Leonhard Hennig; Georg Rehm; | arxiv-cs.CL | 2024-06-10 |
202 | Annotation Alignment: Comparing LLM and Human Annotations of Conversational Safety Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that larger datasets are needed to resolve whether GPT-4 exhibits disparities in how well it correlates with demographic groups. |
Rajiv Movva; Pang Wei Koh; Emma Pierson; | arxiv-cs.CL | 2024-06-10 |
203 | Unveiling The Safety of GPT-4o: An Empirical Study Using Jailbreak Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, this paper adopts a series of multi-modal and uni-modal jailbreak attacks on 4 commonly used benchmarks encompassing three modalities (i.e., text, speech, and image), which involves the optimization of over 4,000 initial text queries and the analysis and statistical evaluation of more than 8,000 responses from GPT-4o. |
Zonghao Ying; Aishan Liu; Xianglong Liu; Dacheng Tao; | arxiv-cs.CR | 2024-06-10 |
204 | LLM-dCache: Improving Tool-Augmented LLMs with GPT-Driven Localized Data Caching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce LLM-dCache to optimize data accesses by treating cache operations as callable API functions exposed to the tool-augmented agent. |
SIMRANJIT SINGH et. al. | arxiv-cs.DC | 2024-06-10 |
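The core idea, cache operations exposed as tools the agent may call, can be illustrated with two hypothetical functions registered in an agent’s tool list; the GPT-driven policy for deciding when to call them is not shown, and the names are ours.

```python
_cache: dict = {}

def cache_get(key: str):
    """Tool: return the cached result for `key`, or None on a miss."""
    return _cache.get(key)

def cache_put(key: str, value) -> str:
    """Tool: store `value` under `key` for later reuse."""
    _cache[key] = value
    return "ok"
```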
205 | Large Language Models for Generating Rules, Yay or Nay? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel approach that leverages Large Language Models (LLMs), such as GPT-3.5 and GPT-4, as a potential world model to accelerate the engineering of software systems. |
SHANGEETHA SIVASOTHY et. al. | arxiv-cs.SE | 2024-06-10 |
206 | Validating LLM-Generated Programs with Metamorphic Prompt Testing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Research is required to comprehensively explore these critical concerns surrounding LLM-generated code. In this paper, we propose a novel solution called metamorphic prompt testing to address these challenges. |
Xiaoyin Wang; Dakai Zhu; | arxiv-cs.SE | 2024-06-10 |
207 | In-Context Learning and Fine-Tuning GPT for Argument Mining Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an ICL strategy for ATC combining kNN-based examples selection and majority vote ensembling. |
Jérémie Cabessa; Hugo Hernault; Umer Mushtaq; | arxiv-cs.CL | 2024-06-10 |
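Read literally, the strategy combines retrieval with ensembling. A sketch under assumptions: embed and llm_classify are hypothetical callables, and Euclidean distance over embeddings stands in for whatever similarity the authors use.

```python
import numpy as np
from collections import Counter

def knn_icl_predict(x, train_set, embed, llm_classify, k=4, n_votes=5):
    # Retrieve the k most similar labeled examples as demonstrations.
    dists = [np.linalg.norm(embed(x) - embed(ex["text"])) for ex in train_set]
    demos = [train_set[i] for i in np.argsort(dists)[:k]]
    # Query the model several times and take the majority label.
    votes = [llm_classify(demos, x) for _ in range(n_votes)]
    return Counter(votes).most_common(1)[0][0]
```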
208 | Hidden Holes: Topological Aspects of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The methods developed in this paper are novel in the field and based on mathematical apparatus that might be unfamiliar to the target audience. |
Stephen Fitz; Peter Romero; Jiyan Jonas Schneider; | arxiv-cs.CL | 2024-06-09 |
209 | Multi-attribute Auction-based Resource Allocation for Twins Migration in Vehicular Metaverses: A GPT-based DRL Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, resource-intensive VTs updating and the high mobility of vehicles require intensive computation, communication, and storage resources, especially for their migration among RSUs with limited coverage. To address these issues, we propose an attribute-aware auction-based mechanism to optimize resource allocation during VTs migration by considering both price and non-monetary attributes, e.g., location and reputation. |
YONGJU TONG et. al. | arxiv-cs.AI | 2024-06-08 |
210 | MaTableGPT: GPT-based Table Data Extractor from Materials Science Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the tables reported in materials science papers exist in highly diverse forms; thus, rule-based extractions are an ineffective approach. To overcome this challenge, we present MaTableGPT, which is a GPT-based table data extractor from the materials science literature. |
GYEONG HOON YI et. al. | arxiv-cs.CL | 2024-06-08 |
211 | Automata Extraction from Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an automata extraction algorithm specifically designed for Transformer models. |
Yihao Zhang; Zeming Wei; Meng Sun; | arxiv-cs.LG | 2024-06-08 |
212 | Do LLMs Recognize Me, When I Is Not Me: Assessment of LLMs Understanding of Turkish Indexical Pronouns in Indexical Shift Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first study examining indexical shift in any language, releasing a Turkish dataset specifically designed for this purpose. |
Metehan Oğuz; Yusuf Umut Ciftci; Yavuz Faruk Bakman; | arxiv-cs.CL | 2024-06-08 |
213 | G-Transformer: Counterfactual Outcome Prediction Under Dynamic and Time-varying Treatment Regimes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present G-Transformer, a Transformer-based framework supporting g-computation for counterfactual prediction under dynamic and time-varying treatment strategies. |
Hong Xiong; Feng Wu; Leon Deng; Megan Su; Li-wei H Lehman; | arxiv-cs.LG | 2024-06-08 |
214 | SelfDefend: LLMs Can Defend Themselves Against Jailbreaking in A Practical Manner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by how the traditional security concept of shadow stacks defends against memory overflow attacks, this paper introduces a generic LLM jailbreak defense framework called SelfDefend, which establishes a shadow LLM defense instance to concurrently protect the target LLM instance in the normal stack and collaborate with it for checkpoint-based access control. |
XUNGUANG WANG et. al. | arxiv-cs.CR | 2024-06-08 |
215 | VTrans: Accelerating Transformer Compression with Variational Information Bottleneck Based Pruning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, they require extensive compression time with large datasets to maintain performance in pruned models. To address these challenges, we propose VTrans, an iterative pruning framework guided by the Variational Information Bottleneck (VIB) principle. |
Oshin Dutta; Ritvik Gupta; Sumeet Agarwal; | arxiv-cs.LG | 2024-06-07 |
216 | Concept Formation and Alignment in Language Models: Bridging Statistical Patterns in Latent Space to Concept Taxonomy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a mechanism for identifying concepts and their hierarchical organization within the semantic representations learned by various LMs, encompassing a spectrum from early models like GloVe to transformer-based language models like ALBERT and T5. |
Mehrdad Khatir; Chandan K. Reddy; | arxiv-cs.CL | 2024-06-07 |
217 | BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper outlines our approach to SemEval 2024 Task 9, BRAINTEASER: A Novel Task Defying Common Sense. |
Baktash Ansari; Mohammadmostafa Rostamkhani; Sauleh Eetemadi; | arxiv-cs.CL | 2024-06-07 |
218 | Are Large Language Models More Empathetic Than Humans? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive study exploring the empathetic responding capabilities of four state-of-the-art LLMs: GPT-4, LLaMA-2-70B-Chat, Gemini-1.0-Pro, and Mixtral-8x7B-Instruct in comparison to a human baseline. |
Anuradha Welivita; Pearl Pu; | arxiv-cs.CL | 2024-06-07 |
219 | Advancing Semantic Textual Similarity Modeling: A Regression Framework with Translated ReLU and Smooth K2 Loss Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Nonetheless, Sentence-BERT tackles STS tasks from a classification perspective, overlooking the progressive nature of semantic relationships, which results in suboptimal performance. To bridge this gap, this paper presents an innovative regression framework and proposes two simple yet effective loss functions: Translated ReLU and Smooth K2 Loss. |
Bowen Zhang; Chunping Li; | arxiv-cs.CL | 2024-06-07 |
220 | Transformer Conformal Prediction for Time Series Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a conformal prediction method for time series using the Transformer architecture to capture long-memory and long-range dependencies. |
Junghwan Lee; Chen Xu; Yao Xie; | arxiv-cs.LG | 2024-06-07 |
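For readers unfamiliar with conformal prediction, a generic split-conformal sketch shows the calibration step a Transformer forecaster would be wrapped in. This is textbook split conformal under an assumed absolute-residual score, not the paper’s specific time-series procedure.

```python
import numpy as np

def conformal_interval(cal_residuals, y_hat, alpha=0.1):
    # Calibrate on held-out absolute residuals, then wrap the point
    # forecast in an interval with roughly (1 - alpha) coverage.
    n = len(cal_residuals)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(np.abs(cal_residuals), level)
    return y_hat - q, y_hat + q

lo, hi = conformal_interval(np.random.randn(500) * 0.3, y_hat=1.7)
```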
221 | Low-Resource Cross-Lingual Summarization Through Few-Shot Learning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the few-shot XLS performance of various models, including Mistral-7B-Instruct-v0.2, GPT-3.5, and GPT-4. |
Gyutae Park; Seojin Hwang; Hwanhee Lee; | arxiv-cs.CL | 2024-06-07 |
222 | Mixture-of-Agents Enhances Large Language Model Capabilities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With the growing number of LLMs, how to harness the collective expertise of multiple LLMs is an exciting open direction. Toward this goal, we propose a new approach that leverages the collective strengths of multiple LLMs through a Mixture-of-Agents (MoA) methodology. |
Junlin Wang; Jue Wang; Ben Athiwaratkun; Ce Zhang; James Zou; | arxiv-cs.CL | 2024-06-07 |
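One MoA layer can be sketched in a few lines, with ask as a hypothetical single-call LLM wrapper and an illustrative synthesis prompt; the full method stacks several such layers, feeding each layer’s outputs into the next.

```python
def moa_layer(prompt, proposer_models, aggregator_model, ask):
    # Each proposer answers independently; an aggregator synthesizes.
    drafts = [ask(m, prompt) for m in proposer_models]
    synthesis = prompt + "\n\nCandidate responses:\n" + "\n---\n".join(drafts)
    return ask(aggregator_model, synthesis)
```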
223 | Logic Synthesis with Generative Deep Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a logic synthesis rewriting operator based on the Circuit Transformer model, named ctrw (Circuit Transformer Rewriting), which incorporates the following techniques: (1) a two-stage training scheme for the Circuit Transformer tailored for logic synthesis, with iterative improvement of optimality through self-improvement training; (2) integration of the Circuit Transformer with state-of-the-art rewriting techniques to address scalability issues, allowing for guided DAG-aware rewriting. |
XIHAN LI et. al. | arxiv-cs.LO | 2024-06-07 |
224 | Exploring The Latest LLMs for Leaderboard Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore three types of contextual inputs to the models: DocTAET (Document Title, Abstract, Experimental Setup, and Tabular Information), DocREC (Results, Experiments, and Conclusions), and DocFULL (entire document). |
Salomon Kabongo; Jennifer D’Souza; Sören Auer; | arxiv-cs.CL | 2024-06-06 |
225 | Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Interestingly, our study presents conflicting evidence for the role of the quality of KG tuples in generating implicit explanations. |
NEEMESH YADAV et. al. | arxiv-cs.CL | 2024-06-06 |
226 | MUSE: Flexible Voiceprint Receptive Fields and Multi-Path Fusion Enhanced Taylor Transformer for U-Net-based Speech Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Multi-path Enhanced Taylor (MET) Transformer based U-net for Speech Enhancement (MUSE), a lightweight speech enhancement network built upon the Unet architecture. |
Zizhen Lin; Xiaoting Chen; Junyu Wang; | arxiv-cs.SD | 2024-06-06 |
227 | Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs By Sampling with People Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by methods from cognitive science, we propose an iterative method for simultaneously eliciting conversational tones and sentences, where participants alternate between two tasks: (1) one participant identifies the tone of a given sentence and (2) a different participant generates a sentence based on that tone. |
Dun-Ming Huang; Pol Van Rijn; Ilia Sucholutsky; Raja Marjieh; Nori Jacoby; | arxiv-cs.CL | 2024-06-06 |
228 | The Good, The Bad, and The Hulk-like GPT: Analyzing Emotional Decisions of Large Language Models in Cooperation and Bargaining Games Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel methodology and framework to study both the decision-making of LLMs and their alignment with human behavior under emotional states. |
MIKHAIL MOZIKOV et. al. | arxiv-cs.AI | 2024-06-05 |
229 | Global Clipper: Enhancing Safety and Reliability of Transformer-based Object Detection Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the Global Clipper and Global Hybrid Clipper, effective mitigation strategies specifically designed for transformer-based models. |
QUTUB SYED SHA et. al. | arxiv-cs.CV | 2024-06-05 |
230 | CSI-GPT: Integrating Generative Pre-Trained Transformer with Federated-Tuning to Acquire Downlink Massive MIMO Channels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, by integrating the generative pre-trained Transformer (GPT) with federated-tuning, we propose a CSI-GPT approach to realize efficient downlink CSI acquisition. |
YE ZENG et. al. | arxiv-cs.IT | 2024-06-05 |
231 | From Tarzan to Tolkien: Controlling The Language Proficiency Level of LLMs for Content Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the problem of controlling the difficulty level of text generated by Large Language Models (LLMs) for contexts where end-users are not fully proficient, such as language learners. |
Ali Malik; Stephen Mayhew; Chris Piech; Klinton Bicknell; | arxiv-cs.CL | 2024-06-05 |
232 | Learning to Grok: Emergence of In-context Learning and Skill Composition in Modular Arithmetic Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we study the emergence of in-context learning and skill composition in a collection of modular arithmetic tasks. |
Tianyu He; Darshil Doshi; Aritra Das; Andrey Gromov; | arxiv-cs.LG | 2024-06-04 |
233 | Probing The Category of Verbal Aspect in Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate how pretrained language models (PLM) encode the grammatical category of verbal aspect in Russian. |
Anisia Katinskaia; Roman Yangarber; | arxiv-cs.CL | 2024-06-04 |
234 | Randomized Geometric Algebra Methods for Convex Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce randomized algorithms to Clifford’s Geometric Algebra, generalizing randomized linear algebra to hypercomplex vector spaces. |
Yifei Wang; Sungyoon Kim; Paul Chu; Indu Subramaniam; Mert Pilanci; | arxiv-cs.LG | 2024-06-04 |
235 | OccamLLM: Fast and Exact Language Model Arithmetic in A Single Step Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a framework that enables exact arithmetic in a single autoregressive step, providing faster, more secure, and more interpretable LLM systems with arithmetic capabilities. |
OWEN DUGAN et. al. | arxiv-cs.CL | 2024-06-04 |
236 | Too Big to Fail: Larger Language Models Are Disproportionately Resilient to Induction of Dementia-Related Linguistic Anomalies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Previous findings show that changes in PPL when masking attention layers in pre-trained transformer-based NLMs reflect linguistic anomalies associated with Alzheimer’s disease dementia. Building upon this, we explore a novel bidirectional attention head ablation method that exhibits properties attributed to the concepts of cognitive and brain reserve in human brain studies, which postulate that people with more neurons in the brain and more efficient processing are more resilient to neurodegeneration. |
Changye Li; Zhecheng Sheng; Trevor Cohen; Serguei Pakhomov; | arxiv-cs.CL | 2024-06-04 |
237 | Multi-layer Learnable Attention Mask for Multimodal Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Comprehensive experimental validation on various datasets, such as MADv2, QVHighlights, ImageNet 1K, and MSRVTT, demonstrates the efficacy of the LAM, exemplifying its ability to enhance model performance while mitigating redundant computations. This pioneering approach presents a significant advancement in enhancing the understanding of complex scenarios, such as in movie understanding. |
Wayner Barrios; SouYoung Jin; | arxiv-cs.CV | 2024-06-04 |
238 | A Temporal Kolmogorov-Arnold Transformer for Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Capturing complex temporal patterns and relationships within multivariate data streams is a difficult task. We propose the Temporal Kolmogorov-Arnold Transformer (TKAT), a novel attention-based architecture designed to address this task using Temporal Kolmogorov-Arnold Networks (TKANs). |
Remi Genet; Hugo Inzirillo; | arxiv-cs.LG | 2024-06-04 |
239 | Eliciting The Priors of Large Language Models Using Iterated In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop a prompt-based workflow for eliciting prior distributions from LLMs. |
Jian-Qiao Zhu; Thomas L. Griffiths; | arxiv-cs.CL | 2024-06-03 |
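The workflow echoes iterated learning from cognitive science: chaining the model’s answers so the sequence drifts toward its prior. A sketch with a hypothetical ask_model call; the prompt wording is illustrative, not the authors’.

```python
def iterated_prior_samples(ask_model, seed_value, n_iters=50):
    # Chain the model's answers: each query conditions on the previous
    # estimate, so later samples approximate draws from the model's prior.
    samples, value = [], seed_value
    for _ in range(n_iters):
        value = ask_model(f"Previous estimate: {value}. Give your own estimate.")
        samples.append(value)
    return samples
```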
240 | Reproducibility Study on Adversarial Attacks Against Robust Transformer Trackers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our empirical study focuses on evaluating adversarial robustness of object trackers based on bounding box versus binary mask predictions, and attack methods at different levels of perturbations. |
Fatemeh Nourilenjan Nokabadi; Jean-François Lalonde; Christian Gagné; | arxiv-cs.CV | 2024-06-03 |
241 | Superhuman Performance in Urology Board Questions By An Explainable Large Language Model Enabled for Context Integration of The European Association of Urology Guidelines: The UroBot Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: UroBot was developed using OpenAI’s GPT-3.5, GPT-4, and GPT-4o models, employing retrieval-augmented generation (RAG) and the latest 2023 guidelines from the European Association of Urology (EAU). |
MARTIN J. HETZ et. al. | arxiv-cs.CL | 2024-06-03 |
242 | SemCoder: Training Code Language Models with Comprehensive Semantics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to bridge the gap between Code LLMs’ reliance on static text data and the need for thorough semantic understanding for complex tasks like debugging and program repair. |
YANGRUIBO DING et. al. | arxiv-cs.CL | 2024-06-03 |
243 | Prototypical Transformer As Unified Motion Learners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Prototypical Transformer (ProtoFormer), a general and unified framework that approaches various motion tasks from a prototype perspective. |
CHENG HAN et. al. | arxiv-cs.CV | 2024-06-03 |
244 | In-Context Learning of Physical Properties: Few-Shot Adaptation to Out-of-Distribution Molecular Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address the question: can we leverage in-context learning to predict out-of-distribution materials properties? |
GRZEGORZ KASZUBA et. al. | arxiv-cs.LG | 2024-06-03 |
245 | Evaluating Mathematical Reasoning of Large Language Models: A Focus on Error Identification and Correction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: From the examiner perspective, we define four evaluation tasks for error identification and correction along with a new dataset with annotated error types and steps. |
XIAOYUAN LI et. al. | arxiv-cs.CL | 2024-06-02 |
246 | Maximum-Entropy Regularized Decision Transformer with Reward Relabelling for Dynamic Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we introduce a novel methodology named Max-Entropy enhanced Decision Transformer with Reward Relabeling for Offline RLRS (EDT4Rec). |
Xiaocong Chen; Siyu Wang; Lina Yao; | arxiv-cs.IR | 2024-06-02 |
247 | Annotation Guidelines-Based Knowledge Augmentation: Towards Enhancing Large Language Models for Educational Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose the Annotation Guidelines-based Knowledge Augmentation (AGKA) approach to improve LLMs. |
SHIQI LIU et. al. | arxiv-cs.CL | 2024-06-02 |
248 | RoBERTa-BiLSTM: A Context-Aware Hybrid Model for Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a novel hybrid deep learning model, RoBERTa-BiLSTM, which combines the Robustly Optimized BERT Pretraining Approach (RoBERTa) with Bidirectional Long Short-Term Memory (BiLSTM) networks. |
Md. Mostafizer Rahman; Ariful Islam Shiplu; Yutaka Watanobe; Md. Ashad Alam; | arxiv-cs.CL | 2024-06-01 |
249 | Multimodal Metadata Assignment for Cultural Heritage Artifacts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We develop a multimodal classifier for the cultural heritage domain using a late fusion approach and introduce a novel dataset. |
LUIS REI et. al. | arxiv-cs.CV | 2024-06-01 |
250 | EdgeTran: Device-Aware Co-Search of Transformers for Efficient Inference on Mobile Edge Platforms Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Automated design of efficient transformer models has recently attracted significant attention from industry and academia. However, most works only focus on certain metrics while … |
Shikhar Tuli; N. Jha; | IEEE Transactions on Mobile Computing | 2024-06-01 |
251 | Multi-granularity Cross Transformer Network for Person Re-identification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yanping Li; Duoqian Miao; Hongyun Zhang; Jie Zhou; Cairong Zhao; | Pattern Recognit. | 2024-06-01 |
252 | Beyond Metrics: Evaluating LLMs’ Effectiveness in Culturally Nuanced, Low-Resource Real-World Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our evaluation includes both quantitative analysis using metrics like F1 score and qualitative assessment of LLMs’ explanations for their predictions. We find that, while Mistral-7b and Mixtral-8x7b achieved high F1 scores, they and other LLMs such as GPT-3.5-Turbo, Llama-2-70b, and Gemma-7b struggled with linguistic and contextual nuances and lacked transparency in their decision-making, as observed from their explanations. |
MILLICENT OCHIENG et. al. | arxiv-cs.CL | 2024-06-01 |
253 | Bi-Directional Transformers Vs. Word2vec: Discovering Vulnerabilities in Lifted Compiled Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Detecting vulnerabilities within compiled binaries is challenging due to lost high-level code structures and other factors such as architectural dependencies, compilers, and optimization options. To address these obstacles, this research explores vulnerability detection by using natural language processing (NLP) embedding techniques with word2vec, BERT, and RoBERTa to learn semantics from intermediate representation (LLVM) code. |
Gary A. McCully; John D. Hastings; Shengjie Xu; Adam Fortier; | arxiv-cs.CR | 2024-05-30 |
254 | Divide-and-Conquer Meets Consensus: Unleashing The Power of Functions in Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose FunCoder, a code generation framework incorporating the divide-and-conquer strategy with functional consensus. |
JINGCHANG CHEN et. al. | arxiv-cs.CL | 2024-05-30 |
255 | QClusformer: A Quantum Transformer-based Framework for Unsupervised Visual Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce QClusformer, a pioneering Transformer-based framework leveraging Quantum machines to tackle unsupervised vision clustering challenges. |
XUAN-BAC NGUYEN et. al. | arxiv-cs.CV | 2024-05-30 |
256 | Automatic Graph Topology-Aware Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes an evolutionary graph Transformer architecture search framework (EGTAS) to automate the construction of strong graph Transformers. |
CHAO WANG et. al. | arxiv-cs.NE | 2024-05-30 |
257 | The Point of View of A Sentiment: Towards Clinician Bias Detection in Psychiatric Notes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By leveraging large language models, this work aims to identify the sentiment expressed in psychiatric clinical notes based on the reader’s point of view. |
Alissa A. Valentine; Lauren A. Lepow; Alexander W. Charney; Isotta Landi; | arxiv-cs.CL | 2024-05-30 |
258 | Hyper-Transformer for Amodal Completion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Learning shape priors is crucial for effective amodal completion, but traditional methods often rely on two-stage processes or additional information, leading to inefficiencies and potential error accumulation. To address these shortcomings, we introduce a novel framework named the Hyper-Transformer Amodal Network (H-TAN). |
Jianxiong Gao; Xuelin Qian; Longfei Liang; Junwei Han; Yanwei Fu; | arxiv-cs.CV | 2024-05-30 |
259 | DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the knowledge gap, we propose a new benchmark named DevEval, which has three advances. |
JIA LI et. al. | arxiv-cs.CL | 2024-05-30 |
260 | Reverse Image Retrieval Cues Parametric Memory in Multimodal LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: RIR robustly improves knowledge-intensive visual question answering (VQA) of GPT-4V by 37-43%, GPT-4 Turbo by 25-27%, and GPT-4o by 18-20% in terms of open-ended VQA evaluation metrics. To our surprise, we discover that RIR helps the model to better access its own world knowledge. |
Jialiang Xu; Michael Moor; Jure Leskovec; | arxiv-cs.CL | 2024-05-29 |
261 | Multi-objective Cross-task Learning Via Goal-conditioned GPT-based Decision Transformers for Surgical Robot Task Automation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new learning-based framework by leveraging the strong reasoning capability of the GPT-based architecture to automate surgical robotic tasks. |
Jiawei Fu; Yonghao Long; Kai Chen; Wang Wei; Qi Dou; | arxiv-cs.RO | 2024-05-29 |
262 | Voice Jailbreak Attacks Against GPT-4o Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present the first systematic measurement of jailbreak attacks against the voice mode of GPT-4o. |
Xinyue Shen; Yixin Wu; Michael Backes; Yang Zhang; | arxiv-cs.CR | 2024-05-29 |
263 | AutoBreach: Universal and Adaptive Jailbreaking with Efficient Wordplay-Guided Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we rethink the approach to jailbreaking LLMs and formally define three essential properties from the attacker’s perspective, which contributes to guiding the design of jailbreak methods. |
JIAWEI CHEN et. al. | arxiv-cs.CV | 2024-05-29 |
264 | Evaluating Zero-Shot GPT-4V Performance on 3D Visual Question Answering Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As interest in reformulating the 3D Visual Question Answering (VQA) problem in the context of foundation models grows, it is imperative to assess how these new paradigms influence existing closed-vocabulary datasets. In this case study, we evaluate the zero-shot performance of foundational models (GPT-4 Vision and GPT-4) on well-established 3D VQA benchmarks, namely 3D-VQA and ScanQA. |
Simranjit Singh; Georgios Pavlakos; Dimitrios Stamoulis; | arxiv-cs.CV | 2024-05-29 |
265 | Are You Sure? Rank Them Again: Repeated Ranking For Better Preference Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose the Repeat Ranking method – where we evaluate the same responses multiple times and train only on those responses which are consistently ranked. |
Peter Devine; | arxiv-cs.CL | 2024-05-29 |
266 | A Multi-Source Retrieval Question Answering Framework Based on RAG Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing RAG paradigms are inevitably influenced by erroneous retrieval information, thereby reducing the reliability and correctness of generated results. Therefore, to improve the relevance of retrieval information, this study proposes a method that replaces traditional retrievers with GPT-3.5, leveraging its vast corpus knowledge to generate retrieval information. |
RIDONG WU et. al. | arxiv-cs.IR | 2024-05-29 |
267 | LMO-DP: Optimizing The Randomization Mechanism for Differentially Private Fine-Tuning (Large) Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they rely heavily on the Gaussian mechanism, which may overly perturb the gradients and degrade the accuracy, especially in stronger privacy regimes (e.g., the privacy budget $\epsilon < 3$). To address such limitations, we propose a novel Language Model-based Optimal Differential Privacy (LMO-DP) mechanism, which takes the first step to enable the tight composition of accurately fine-tuning (large) language models with a sub-optimal DP mechanism, even in strong privacy regimes (e.g., $0.1\leq \epsilon<3$). |
QIN YANG et. al. | arxiv-cs.CR | 2024-05-29 |
268 | MDS-ViTNet: Improving Saliency Prediction for Eye-Tracking with Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel methodology we call MDS-ViTNet (Multi Decoder Saliency by Vision Transformer Network) for enhancing visual saliency prediction or eye-tracking. |
Polezhaev Ignat; Goncharenko Igor; Iurina Natalya; | arxiv-cs.CV | 2024-05-29 |
269 | Data-Efficient Approach to Humanoid Control Via Fine-Tuning A Pre-Trained GPT on Action Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we train a GPT on a large dataset of noisy expert policy rollout observations from a humanoid motion dataset as a pre-trained model and fine-tune that model on a smaller dataset of noisy expert policy rollout observations and actions to autoregressively generate physically plausible motion trajectories. |
Siddharth Padmanabhan; Kazuki Miyazawa; Takato Horii; Takayuki Nagai; | arxiv-cs.RO | 2024-05-28 |
270 | Can GPT Redefine Medical Understanding? Evaluating GPT on Biomedical Machine Reading Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate GPT on four closed-book biomedical MRC benchmarks. |
Shubham Vatsal; Ayush Singh; | arxiv-cs.CL | 2024-05-28 |
271 | Edinburgh Clinical NLP at MEDIQA-CORR 2024: Guiding Large Language Models with Hints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we evaluate the capability of general LLMs, specifically GPT-3.5 and GPT-4, to identify and correct medical errors with multiple prompting strategies. |
ARYO PRADIPTA GEMA et. al. | arxiv-cs.CL | 2024-05-28 |
272 | Notes on Applicability of GPT-4 to Document Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We perform a missing, reproducible evaluation of all publicly available GPT-4 family models concerning the Document Understanding field, where it is frequently required to comprehend the spatial arrangement of text and visual clues in addition to textual semantics. |
Łukasz Borchmann; | arxiv-cs.CL | 2024-05-28 |
273 | An Empirical Analysis on Large Language Models in Debate Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the capabilities and inherent biases of advanced large language models (LLMs) such as GPT-3.5 and GPT-4 in the context of debate evaluation. |
Xinyi Liu; Pinxin Liu; Hangfeng He; | arxiv-cs.CL | 2024-05-28 |
274 | I See You: Teacher Analytics with GPT-4 Vision-Powered Observational Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our approach aims to revolutionize teachers’ assessment of students’ practices by leveraging Generative Artificial Intelligence (GenAI) to offer detailed insights into classroom dynamics. |
UNGGI LEE et. al. | arxiv-cs.HC | 2024-05-28 |
275 | Multi-objective Representation for Numbers in Clinical Narratives Using CamemBERT-bio Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research aims to classify numerical values extracted from medical documents across seven distinct physiological categories, employing CamemBERT-bio. |
Boammani Aser Lompo; Thanh-Dung Le; | arxiv-cs.CL | 2024-05-27 |
276 | RLAIF-V: Aligning MLLMs Through Open-Source AI Feedback for Super GPT-4V Trustworthiness Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce RLAIF-V, a novel framework that aligns MLLMs in a fully open-source paradigm for super GPT-4V trustworthiness. |
TIANYU YU et. al. | arxiv-cs.CL | 2024-05-27 |
277 | PivotMesh: Generic 3D Mesh Generation Via Pivot Vertices Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a generic and scalable mesh generation framework PivotMesh, which makes an initial attempt to extend the native mesh generation to large-scale datasets. |
Haohan Weng; Yikai Wang; Tong Zhang; C. L. Philip Chen; Jun Zhu; | arxiv-cs.CV | 2024-05-27 |
278 | Are Self-Attentions Effective for Time Series Forecasting? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we shift focus from the overall architecture of the Transformer to the effectiveness of self-attentions for time series forecasting. |
Dongbin Kim; Jinseong Park; Jaewook Lee; Hoki Kim; | arxiv-cs.LG | 2024-05-27 |
279 | Vision-and-Language Navigation Generative Pretrained Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our proposal, the Vision-and-Language Navigation Generative Pretrained Transformer (VLN-GPT), adopts a transformer decoder model (GPT2) to model trajectory sequence dependencies, bypassing the need for historical encoding modules. |
Wen Hanlin; | arxiv-cs.AI | 2024-05-27 |
280 | Deployment of NLP and LLM Techniques to Control Mobile Robots at The Edge: A Case Study Using GPT-4-Turbo and LLaMA 2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the possibility of intuitive human-robot interaction through the application of Natural Language Processing (NLP) and Large Language Models (LLMs) in mobile robotics. We aim to explore the feasibility of using these technologies for edge-based deployment, where traditional cloud dependencies are eliminated. |
PASCAL SIKORSKI et. al. | arxiv-cs.RO | 2024-05-27 |
281 | InversionView: A General-Purpose Method for Reading Information from Neural Activations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The inner workings of neural networks can be better understood if we can fully decipher the information encoded in neural activations. In this paper, we argue that this information is embodied by the subset of inputs that give rise to similar activations. |
Xinting Huang; Madhur Panwar; Navin Goyal; Michael Hahn; | arxiv-cs.LG | 2024-05-27 |
282 | LoReTrack: Efficient and Accurate Low-Resolution Transformer Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Albeit faster, this considerably hurts tracking accuracy due to information loss in low-resolution tracking. In this paper, we aim to mitigate such information loss to boost the performance of low-resolution Transformer tracking via dual knowledge distillation from a frozen high-resolution (but not larger) Transformer tracker. |
Shaohua Dong; Yunhe Feng; Qing Yang; Yuewei Lin; Heng Fan; | arxiv-cs.CV | 2024-05-27 |
283 | Performance Evaluation of Reddit Comments Using Machine Learning and Natural Language Processing Methods in Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the efficacy of sentiment analysis models is hindered by the lack of expansive and fine-grained emotion datasets. To address this gap, our study leverages the GoEmotions dataset, comprising a diverse range of emotions, to evaluate sentiment analysis methods across a substantial corpus of 58,000 comments. |
Xiaoxia Zhang; Xiuyuan Qi; Zixin Teng; | arxiv-cs.CL | 2024-05-26 |
284 | M$^3$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents M$^3$GPT, an advanced Multimodal, Multitask framework for Motion comprehension and generation. |
MINGSHUANG LUO et. al. | arxiv-cs.CV | 2024-05-25 |
285 | Accelerating Transformers with Spectrum-Preserving Token Merging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Prior works have proposed algorithms based on Bipartite Soft Matching (BSM), which divides tokens into distinct sets and merges the top k similar tokens. |
HOAI-CHAU TRAN et. al. | arxiv-cs.LG | 2024-05-25 |
286 | AutoManual: Generating Instruction Manuals By LLM Agents Via Interactive Environmental Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce AutoManual, a framework enabling LLM agents to autonomously build their understanding through interaction and adapt to new environments. |
MINGHAO CHEN et. al. | arxiv-cs.AI | 2024-05-25 |
287 | Zero-Shot Spam Email Classification Using Pre-trained Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the application of pre-trained large language models (LLMs) for spam email classification using zero-shot prompting. |
Sergio Rojas-Galeano; | arxiv-cs.CL | 2024-05-24 |
288 | Incremental Comprehension of Garden-Path Sentences By Large Language Models: Semantic Interpretation, Syntactic Re-Analysis, and Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we investigate the processing of garden-path sentences and the fate of lingering misinterpretations using four large language models (LLMs): GPT-2, LLaMA-2, Flan-T5, and RoBERTa. |
ANDREW LI et. al. | arxiv-cs.CL | 2024-05-24 |
289 | Enhancing Augmentative and Alternative Communication with Card Prediction and Colourful Semantics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents an approach to enhancing Augmentative and Alternative Communication (AAC) systems by integrating Colourful Semantics (CS) with transformer-based language models specifically tailored for Brazilian Portuguese. |
Jayr Pereira; Francisco Rodrigues; Jaylton Pereira; Cleber Zanchettin; Robson Fidalgo; | arxiv-cs.CL | 2024-05-24 |
290 | Spectraformer: A Unified Random Feature Framework for Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Spectraformer, a unified framework for approximating and learning the kernel function in linearized attention of the Transformer. |
Duke Nguyen; Aditya Joshi; Flora Salim; | arxiv-cs.LG | 2024-05-24 |
291 | GPTZoo: A Large-scale Dataset of GPTs for The Research Community Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To support academic research on GPTs, we introduce GPTZoo, a large-scale dataset comprising 730,420 GPT instances. |
Xinyi Hou; Yanjie Zhao; Shenao Wang; Haoyu Wang; | arxiv-cs.SE | 2024-05-24 |
292 | A Comparative Analysis of Distributed Training Strategies for GPT-2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The rapid advancement in Large Language Models has been met with significant challenges in their training processes, primarily due to their considerable computational and memory demands. This research examines parallelization techniques developed to address these challenges, enabling the efficient and scalable training of Large Language Models. |
Ishan Patwardhan; Shubham Gandhi; Om Khare; Amit Joshi; Suraj Sawant; | arxiv-cs.DC | 2024-05-24 |
293 | Transformer-XL for Long Sequence Tasks in Robotic Learning from Demonstration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents an innovative application of Transformer-XL for long sequence tasks in robotic learning from demonstrations (LfD). |
Gao Tianci; | arxiv-cs.RO | 2024-05-24 |
294 | Activator: GLU Activations As The Core Functions of A Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Experimental assessments show that both the proposed modifications and reductions offer performance competitive with baseline architectures, supporting this work’s aim of establishing a more efficient yet capable alternative to the traditional attention mechanism as the core component in designing transformer architectures. |
Abdullah Nazhat Abdullah; Tarkan Aydin; | arxiv-cs.CV | 2024-05-24 |
295 | SMART: Scalable Multi-agent Real-time Simulation Via Next-token Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Data-driven autonomous driving motion generation tasks are frequently impacted by the limitations of dataset size and the domain gap between datasets, which precludes their extensive application in real-world scenarios. To address this issue, we introduce SMART, a novel autonomous driving motion generation paradigm that models vectorized map and agent trajectory data into discrete sequence tokens. |
Wei Wu; Xiaoxin Feng; Ziyan Gao; Yuheng Kan; | arxiv-cs.RO | 2024-05-24 |
296 | Transformer-based Federated Learning for Multi-Label Remote Sensing Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the capability of state-of-the-art transformer architectures (MLP-Mixer, ConvMixer, and PoolFormer) to address the challenges related to non-IID training data across various clients in the context of FL for multi-label classification (MLC) problems in remote sensing (RS). |
Barış Büyüktaş; Kenneth Weitzel; Sebastian Völkers; Felix Zailskas; Begüm Demir; | arxiv-cs.CV | 2024-05-24 |
297 | Large Language Models Reflect Human Citation Patterns with A Heightened Citation Bias Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: By analyzing citation graphs, we show that the references recommended by GPT-4 are embedded in the relevant citation context, suggesting an even deeper conceptual internalization of the citation networks. |
ANDRES ALGABA et. al. | arxiv-cs.DL | 2024-05-24 |
298 | SG-Adapter: Enhancing Text-to-Image Generation with Scene Graph Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce the Scene Graph Adapter (SG-Adapter), leveraging the structured representation of scene graphs to rectify inaccuracies in the original text embeddings. |
GUIBAO SHEN et. al. | arxiv-cs.CV | 2024-05-24 |
299 | PoinTramba: A Hybrid Transformer-Mamba Framework for Point Cloud Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce PoinTramba, a pioneering hybrid framework that synergizes the analytical power of the Transformer with the remarkable computational efficiency of Mamba for enhanced point cloud analysis. |
ZICHENG WANG et. al. | arxiv-cs.CV | 2024-05-24 |
300 | GPT Is Not An Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores whether an LLM (specifically, GPT-3.5-Turbo) can assist with the task of developing a bias benchmark dataset from responses to an open-ended community survey. |
Virginia K. Felkner; Jennifer A. Thompson; Jonathan May; | arxiv-cs.CL | 2024-05-24 |
301 | Steerable Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we introduce Steerable Transformers, an extension of the Vision Transformer mechanism that maintains equivariance to the special Euclidean group $\mathrm{SE}(d)$. |
Soumyabrata Kundu; Risi Kondor; | arxiv-cs.CV | 2024-05-24 |
302 | CulturePark: Boosting Cross-cultural Understanding in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by cognitive theories on social communication, this paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection. |
CHENG LI et. al. | arxiv-cs.AI | 2024-05-23 |
303 | An Evaluation of Estimative Uncertainty in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares estimative uncertainty in commonly used large language models (LLMs) like GPT-4 and ERNIE-4 to that of humans, and to each other. |
Zhisheng Tang; Ke Shen; Mayank Kejriwal; | arxiv-cs.CL | 2024-05-23 |
304 | Quantifying The Gain in Weak-to-Strong Generalization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a theoretical framework for understanding weak-to-strong generalization. |
Moses Charikar; Chirag Pabbaraju; Kirankumar Shiragur; | arxiv-cs.LG | 2024-05-23 |
305 | UDKAG: Augmenting Large Vision-Language Models with Up-to-Date Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a plug-and-play framework for augmenting existing LVLMs in handling visual question answering (VQA) about up-to-date knowledge, dubbed UDKAG. |
CHUANHAO LI et. al. | arxiv-cs.CV | 2024-05-23 |
306 | Grokked Transformers Are Implicit Reasoners: A Mechanistic Journey to The Edge of Generalization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We study whether transformers can learn to implicitly reason over parametric knowledge, a skill that even the most capable language models struggle with. |
Boshi Wang; Xiang Yue; Yu Su; Huan Sun; | arxiv-cs.CL | 2024-05-23 |
307 | Understanding The Training and Generalization of Pretrained Transformer for Sequential Decision Making Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider the supervised pretrained transformer for a class of sequential decision-making problems. |
HANZHAO WANG et. al. | arxiv-cs.LG | 2024-05-23 |
308 | JiuZhang3.0: Efficiently Improving Mathematical Reasoning By Training Small Data Synthesis Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To reduce the cost, based on openly available open-source texts, we propose an efficient approach that trains a small LLM for math problem synthesis to generate sufficient high-quality pre-training data. |
KUN ZHOU et. al. | arxiv-cs.CL | 2024-05-23 |
309 | CEEBERT: Cross-Domain Inference in Early Exit BERT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose an online learning algorithm named Cross-Domain Inference in Early Exit BERT (CeeBERT) that dynamically determines early exits of samples based on the level of confidence at each exit point. |
Divya Jyoti Bajpai; Manjesh Kumar Hanawal; | arxiv-cs.CL | 2024-05-23 |
310 | PuTR: A Pure Transformer for Decoupled and Online Multi-Object Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we demonstrate that the trajectory graph is a directed acyclic graph, which can be represented by an object sequence arranged by frame and a binary adjacency matrix. |
Chongwei Liu; Haojie Li; Zhihui Wang; Rui Xu; | arxiv-cs.CV | 2024-05-22 |
311 | ViHateT5: Enhancing Hate Speech Detection in Vietnamese With A Unified Text-to-Text Transformer Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this research, we introduce ViHateT5, a T5-based model pre-trained on our proposed large-scale domain-specific dataset named VOZ-HSD. |
Luan Thanh Nguyen; | arxiv-cs.CL | 2024-05-22 |
312 | Quantifying Emergence in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a quantifiable solution for estimating emergence. |
Hang Chen; Xinyu Yang; Jiaying Zhu; Wenya Wang; | arxiv-cs.CL | 2024-05-21 |
313 | How Reliable AI Chatbots Are for Disease Prediction from Patient Complaints? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Artificial Intelligence (AI) chatbots leveraging Large Language Models (LLMs) are gaining traction in healthcare for their potential to automate patient interactions and aid clinical decision-making. |
Ayesha Siddika Nipu; K M Sajjadul Islam; Praveen Madiraju; | arxiv-cs.AI | 2024-05-21 |
314 | Generative AI and Large Language Models for Cyber Security: All Insights You Need Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper provides a comprehensive review of the future of cybersecurity through Generative AI and Large Language Models (LLMs). |
MOHAMED AMINE FERRAG et. al. | arxiv-cs.CR | 2024-05-21 |
315 | Transformer in Touch: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This review aims to comprehensively outline the application and development of Transformers in tactile technology. |
Jing Gao; Ning Cheng; Bin Fang; Wenjuan Han; | arxiv-cs.LG | 2024-05-21 |
316 | Evaluating and Modeling Social Intelligence: A Comparative Study of Human and AI Capabilities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We developed a comprehensive theoretical framework for social dynamics and introduced two evaluation tasks: Inverse Reasoning (IR) and Inverse Inverse Planning (IIP). |
JUNQI WANG et. al. | arxiv-cs.AI | 2024-05-20 |
317 | Fennec: Fine-grained Language Model Evaluation and Correction Extended Through Branching and Bridging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Particularly, we present a step-by-step evaluation framework, Fennec, capable of Fine-grained EvaluatioN and correctioN Extended through branChing and bridging. |
XIAOBO LIANG et. al. | arxiv-cs.CL | 2024-05-20 |
318 | Automated Hardware Logic Obfuscation Framework Using GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Obfus-chat, a novel framework leveraging Generative Pre-trained Transformer (GPT) models to automate the obfuscation process. |
BANAFSHEH SABER LATIBARI et. al. | arxiv-cs.CR | 2024-05-20 |
319 | DaVinci at SemEval-2024 Task 9: Few-shot Prompting GPT-3.5 for Unconventional Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we tackle both types of questions using few-shot prompting on GPT-3.5 and gain insights regarding the difference in the nature of the two types. |
Suyash Vardhan Mathur; Akshett Rai Jindal; Manish Shrivastava; | arxiv-cs.CL | 2024-05-19 |
320 | Zero-Shot Stance Detection Using Contextual Data Generation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this approach, we aim to fine-tune an existing model at test time. |
Ghazaleh Mahmoudi; Babak Behkamkia; Sauleh Eetemadi; | arxiv-cs.CL | 2024-05-19 |
321 | Enhancing User Experience in Large Language Models Through Human-centered Design: Integrating Theoretical Insights with An Experimental Study to Meet Diverse Software Learning Needs with A Single Document Knowledge Base Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The experimental results demonstrate how the form and organization of different elements in the document, as well as the relevant GPT configurations, affect the effectiveness of interaction between GPT and software learners. |
Yuchen Wang; Yin-Shan Lin; Ruixin Huang; Jinyin Wang; Sensen Liu; | arxiv-cs.HC | 2024-05-19 |
322 | Can Public LLMs Be Used for Self-Diagnosis of Medical Conditions? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we prepare a prompt engineered dataset of 10000 samples and test the performance on the general task of self-diagnosis. |
Nikil Sharan Prabahar Balasubramanian; Sagnik Dakshit; | arxiv-cs.CL | 2024-05-18 |
323 | Transformer Based Neural Networks for Emotion Recognition in Conversations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper outlines the approach of the ISDS-NLP team in the SemEval 2024 Task 10: Emotion Discovery and Reasoning its Flip in Conversation (EDiReF). |
Claudiu Creanga; Liviu P. Dinu; | arxiv-cs.CL | 2024-05-18 |
324 | Few-Shot API Attack Detection: Overcoming Data Scarcity with GAN-Inspired Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel few-shot detection approach motivated by Natural Language Processing (NLP) and advanced Generative Adversarial Network (GAN)-inspired techniques. |
Udi Aharon; Revital Marbel; Ran Dubin; Amit Dvir; Chen Hajaj; | arxiv-cs.CR | 2024-05-18 |
325 | Language Models Can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates whether LLMs can generalize from labeled examples of predefined tasks to novel tasks. |
Anwoy Chatterjee; Eshaan Tanwar; Subhabrata Dutta; Tanmoy Chakraborty; | arxiv-cs.CL | 2024-05-17 |
326 | Benchmarking Large Language Models on CFLUE — A Chinese Financial Language Understanding Evaluation Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose CFLUE, the Chinese Financial Language Understanding Evaluation benchmark, designed to assess the capability of LLMs across various dimensions. |
Jie Zhu; Junhui Li; Yalong Wen; Lifan Guo; | arxiv-cs.CL | 2024-05-17 |
327 | GPTs Window Shopping: An Analysis of The Landscape of Custom ChatGPT Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Customization comes in the form of prompt-tuning, analysis of reference resources, browsing, and external API interactions, alongside a promise of revenue sharing for created custom GPTs. In this work, we peer into the window of the GPT Store and measure its impact. |
Benjamin Zi Hao Zhao; Muhammad Ikram; Mohamed Ali Kaafar; | arxiv-cs.SI | 2024-05-17 |
328 | GPT Store Mining and Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our findings aim to enhance understanding of the GPT ecosystem, providing valuable insights for future research, development, and policy-making in generative AI. |
Dongxun Su; Yanjie Zhao; Xinyi Hou; Shenao Wang; Haoyu Wang; | arxiv-cs.LG | 2024-05-16 |
329 | HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Identifying optimal model configurations under specific hardware constraints is becoming essential but remains challenging due to the computational load of exhaustive training and evaluation on multiple devices. To address this, we introduce HW-GPT-Bench, a hardware-aware benchmark that utilizes surrogate predictions to approximate various hardware metrics across 13 devices for architectures in the GPT-2 family, with architectures containing up to 774M parameters. |
RHEA SANJAY SUKTHANKER et. al. | arxiv-cs.LG | 2024-05-16 |
330 | Many-Shot In-Context Learning in Multimodal Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we evaluate the performance of multimodal foundation models scaling from few-shot to many-shot ICL. |
YIXING JIANG et. al. | arxiv-cs.LG | 2024-05-16 |
331 | FinTextQA: A Dataset for Long-form Financial Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces FinTextQA, a novel dataset for long-form question answering (LFQA) in finance. |
JIAN CHEN et. al. | arxiv-cs.CL | 2024-05-16 |
332 | Leveraging Large Language Models for Automated Web-Form-Test Generation: An Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To the best of our knowledge, no comparative study examining different LLMs has yet been reported for web-form-test generation. |
TAO LI et. al. | arxiv-cs.SE | 2024-05-16 |
333 | Comparing The Efficacy of GPT-4 and Chat-GPT in Mental Health Care: A Blind Assessment of Large Language Models for Psychological Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: This study aims to compare the performance of two large language models, GPT-4 and Chat-GPT, in responding to a set of 18 psychological prompts, to assess their potential applicability in mental health care settings. |
Birger Moell; | arxiv-cs.CL | 2024-05-15 |
334 | Optimization Techniques for Sentiment Analysis Based on LLM (GPT-3) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to explore sentiment analysis optimization techniques based on large pre-trained language models such as GPT-3 to improve model performance and effect and further promote the development of natural language processing (NLP). |
Tong Zhan; Chenxi Shi; Yadong Shi; Huixiang Li; Yiyu Lin; | arxiv-cs.CL | 2024-05-15 |
335 | ALPINE: Unveiling The Planning Capability of Autoregressive Learning in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present the findings of our Project ALPINE, which stands for “Autoregressive Learning for Planning In NEtworks.” |
SIWEI WANG et. al. | arxiv-cs.LG | 2024-05-15 |
336 | GPT-3.5 for Grammatical Error Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the application of GPT-3.5 for Grammatical Error Correction (GEC) in multiple languages in several settings: zero-shot GEC, fine-tuning for GEC, and using GPT-3.5 to re-rank correction hypotheses generated by other GEC models. |
Anisia Katinskaia; Roman Yangarber; | arxiv-cs.CL | 2024-05-14 |
337 | Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a theoretical framework that sheds light on the memorization process and performance dynamics of transformer-based language models. |
Xueyan Niu; Bo Bai; Lei Deng; Wei Han; | arxiv-cs.LG | 2024-05-14 |
338 | EVDA: Evolving Deepfake Audio Detection Continual Learning Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Continual learning, which acts as an effective tool for detecting newly emerged deepfake audio while maintaining performance on older types, lacks a well-constructed and user-friendly evaluation framework. To address this gap, we introduce EVDA, a benchmark for evaluating continual learning methods in deepfake audio detection. |
Xiaohui Zhang; Jiangyan Yi; Jianhua Tao; | arxiv-cs.SD | 2024-05-14 |
339 | Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work describes a concurrent programming framework for quantitatively analyzing the efficiency challenges in serving multiple long-context requests under limited size of GPU high-bandwidth memory (HBM) regime. |
Yao Fu; | arxiv-cs.LG | 2024-05-14 |
340 | Can GNN Be Good Adapter for LLMs? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, they also ignore the zero-shot inference capabilities of LLMs. Therefore, we propose GraphAdapter, which uses a graph neural network (GNN) as an efficient adapter in collaboration with LLMs to tackle TAGs. |
XUANWEN HUANG et. al. | www | 2024-05-13 |
341 | Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their capabilities in turning visual figures into executable code have not been thoroughly evaluated. To address this, we introduce Plot2Code, a comprehensive visual coding benchmark designed for a fair and in-depth assessment of MLLMs. |
CHENGYUE WU et. al. | arxiv-cs.CL | 2024-05-13 |
342 | Open-vocabulary Auditory Neural Decoding Using FMRI-prompted LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel method, the Brain Prompt GPT (BP-GPT). |
Xiaoyu Chen; Changde Du; Che Liu; Yizhe Wang; Huiguang He; | arxiv-cs.HC | 2024-05-13 |
343 | MacBehaviour: An R Package for Behavioural Experimentation on Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The package offers a comprehensive set of functions designed for LLM experiments, covering experiment design, stimuli presentation, model behaviour manipulation, logging response and token probability. |
Xufeng Duan; Shixuan Li; Zhenguang G. Cai; | arxiv-cs.CL | 2024-05-13 |
344 | Hierarchical Decision Mamba Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce two novel methods, Decision Mamba (DM) and Hierarchical Decision Mamba (HDM), aimed at enhancing the performance of the Transformer models. |
André Correia; Luís A. Alexandre; | arxiv-cs.LG | 2024-05-13 |
345 | COLA: Cross-city Mobility Transformer for Human Trajectory Simulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we are motivated to explore the intriguing problem of mobility transfer across cities, grasping the universal patterns of human trajectories to augment the powerful Transformer with external mobility data. |
Yu Wang; Tongya Zheng; Yuxuan Liang; Shunyu Liu; Mingli Song; | www | 2024-05-13 |
346 | Coding Historical Causes of Death Data with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates the feasibility of using pre-trained generative Large Language Models (LLMs) to automate the assignment of ICD-10 codes to historical causes of death. |
BJØRN PEDERSEN et. al. | arxiv-cs.LG | 2024-05-13 |
347 | Limited Ability of LLMs to Simulate Human Psychological Behaviours: A Psychometric Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we prompt OpenAI’s flagship models, GPT-3.5 and GPT-4, to assume different personas and respond to a range of standardized measures of personality constructs. |
Nikolay B Petrov; Gregory Serapio-García; Jason Rentfrow; | arxiv-cs.CL | 2024-05-12 |
348 | Can Language Models Explain Their Own Classification Behavior? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates whether LLMs can give faithful high-level explanations of their own internal processes. To explore this, we introduce a dataset, ArticulateRules, of few-shot text-based classification tasks generated by simple rules. |
Dane Sherburn; Bilal Chughtai; Owain Evans; | arxiv-cs.LG | 2024-05-12 |
349 | L(u)PIN: LLM-based Political Ideology Nowcasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a method to analyze ideological positions of individual parliamentary representatives by leveraging the latent knowledge of LLMs. |
Ken Kato; Annabelle Purnomo; Christopher Cochrane; Raeid Saqur; | arxiv-cs.CL | 2024-05-12 |
350 | Evaluating Task-based Effectiveness of MLLMs on Charts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore a forward-thinking question: Is GPT-4V effective at low-level data analysis tasks on charts? |
Yifan Wu; Lutao Yan; Yuyu Luo; Yunhai Wang; Nan Tang; | arxiv-cs.CL | 2024-05-11 |
351 | Quite Good, But Not Enough: Nationality Bias in Large Language Models — A Case Study of ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The research covers 195 countries, 4 temperature settings, and 3 distinct prompt types, generating 4,680 discourses about nationality descriptions in Chinese and English. |
Shucheng Zhu; Weikang Wang; Ying Liu; | arxiv-cs.CL | 2024-05-11 |
352 | Retrieval Enhanced Zero-Shot Video Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to take advantage of existing pre-trained large-scale vision and language models to directly generate captions with test time adaptation. |
YUNCHUAN MA et. al. | arxiv-cs.CV | 2024-05-11 |
353 | Automating Code Adaptation for MLOps — A Benchmarking Study on LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the possibilities of the current generation of Large Language Models for incorporating Machine Learning Operations (MLOps) functionalities into ML training code bases. |
HARSH PATEL et. al. | arxiv-cs.LG | 2024-05-10 |
354 | ChatGPTest: Opportunities and Cautionary Tales of Utilizing AI for Questionnaire Pretesting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article explores the use of GPT models as a useful tool for pretesting survey questionnaires, particularly in the early stages of survey design. |
Francisco Olivos; Minhui Liu; | arxiv-cs.CY | 2024-05-10 |
355 | Residual-based Attention Physics-informed Neural Networks for Efficient Spatio-Temporal Lifetime Assessment of Transformers Operated in Renewable Power Plants Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article introduces an efficient spatio-temporal model for transformer winding temperature and ageing estimation, which leverages physics-based partial differential equations (PDEs) with data-driven Neural Networks (NN) in a Physics Informed Neural Networks (PINNs) configuration to improve prediction accuracy and acquire spatio-temporal resolution. |
IBAI RAMIREZ et. al. | arxiv-cs.LG | 2024-05-10 |
356 | A Lightweight Transformer for Remote Sensing Image Change Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing transformer-based RSICC methods face challenges, e.g., high parameter counts and high computational complexity caused by the self-attention operation in the transformer encoder component. To alleviate these issues, this paper proposes a Sparse Focus Transformer (SFT) for the RSICC task. |
Dongwei Sun; Yajie Bao; Xiangyong Cao; | arxiv-cs.CV | 2024-05-10 |
357 | TacoERE: Cluster-aware Compression for Event Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing work mainly focuses on directly modeling the entire document, which cannot effectively handle long-range dependencies and information redundancy. To address these issues, we propose a cluster-aware compression method for improving event relation extraction (TacoERE), which explores a compression-then-extraction paradigm. |
YONG GUAN et. al. | arxiv-cs.CL | 2024-05-10 |
358 | Multimodal LLMs Struggle with Basic Visual Network Analysis: A VNA Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that while GPT-4 consistently outperforms LLaVa, both models struggle with every visual network analysis task we propose. |
Evan M. Williams; Kathleen M. Carley; | arxiv-cs.CV | 2024-05-10 |
359 | Reddit-Impacts: A Named Entity Recognition Dataset for Analyzing Clinical and Social Effects of Substance Use Derived from Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Reddit-Impacts, a challenging Named Entity Recognition (NER) dataset curated from subreddits dedicated to discussions on prescription and illicit opioids, as well as medications for opioid use disorder. |
YAO GE et. al. | arxiv-cs.CL | 2024-05-09 |
360 | Large Language Models Show Human-like Social Desirability Biases in Survey Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We developed an experimental framework using Big Five personality surveys and uncovered a previously undetected social desirability bias in a wide range of LLMs. |
AADESH SALECHA et. al. | arxiv-cs.AI | 2024-05-09 |
361 | From Human Judgements to Predictive Models: Unravelling Acceptability in Code-Mixed Sentences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we construct Cline – a dataset containing human acceptability judgements for English-Hindi (en-hi) code-mixed text. |
PRASHANT KODALI et. al. | arxiv-cs.CL | 2024-05-09 |
362 | Ditto: Quantization-aware Secure Inference of Transformers Upon MPC Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the integration of quantization widely used in plaintext inference into the MPC domain remains unclear. To bridge this gap, we propose the framework named Ditto to enable more efficient quantization-aware secure Transformer inference. |
HAOQI WU et. al. | arxiv-cs.CR | 2024-05-08 |
363 | Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we leverage a decision transformer architecture for topic recommendation in counseling conversations between patients and mental health professionals. |
Aylin Gunal; Baihan Lin; Djallel Bouneffouf; | arxiv-cs.CL | 2024-05-08 |
364 | Open Source Language Models Can Provide Feedback: Evaluating LLMs’ Ability to Help Students Using GPT-4-As-A-Judge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by recent work that has utilised very powerful LLMs, such as GPT-4, to evaluate the outputs produced by less powerful models, we conduct an automated analysis of the quality of the feedback produced by several open source models using a dataset from an introductory programming course. |
CHARLES KOUTCHEME et. al. | arxiv-cs.CL | 2024-05-08 |
365 | Evaluating Text Summaries Generated By Large Language Models Using OpenAI’s GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research examines the effectiveness of OpenAI’s GPT models as independent evaluators of text summaries generated by six transformer-based models from Hugging Face: DistilBART, BERT, ProphetNet, T5, BART, and PEGASUS. |
Hassan Shakil; Atqiya Munawara Mahi; Phuoc Nguyen; Zeydy Ortiz; Mamoun T. Mardini; | arxiv-cs.CL | 2024-05-07 |
366 | Utilizing GPT to Enhance Text Summarization: A Strategy to Minimize Hallucinations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we use the DistilBERT model to generate extractive summaries and the T5 model to generate abstractive summaries. |
Hassan Shakil; Zeydy Ortiz; Grant C. Forbes; | arxiv-cs.CL | 2024-05-07 |
367 | GPT-Enabled Cybersecurity Training: A Tailored Approach for Effective Awareness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the limitations of traditional Cybersecurity Awareness and Training (CSAT) programs and proposes an innovative solution using Generative Pre-Trained Transformers (GPT) to address these shortcomings. |
Nabil Al-Dhamari; Nathan Clarke; | arxiv-cs.CR | 2024-05-07 |
368 | How Does GPT-2 Predict Acronyms? Extracting and Understanding A Circuit Via Mechanistic Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on understanding how GPT-2 Small performs the task of predicting three-letter acronyms. |
Jorge García-Carrasco; Alejandro Maté; Juan Trujillo; | arxiv-cs.LG | 2024-05-07 |
369 | Structured Click Control in Transformer-based Interactive Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To improve the robustness of the response, we propose a structured click intent model based on graph neural networks, which adaptively obtains graph nodes via the global similarity of user-clicked Transformer tokens. |
Long Xu; Yongquan Chen; Rui Huang; Feng Wu; Shiwu Lai; | arxiv-cs.CV | 2024-05-07 |
370 | The Silicon Ceiling: Auditing GPT’s Race and Gender Biases in Hiring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) are increasingly being introduced in workplace settings, with the goals of improving efficiency and fairness. |
Lena Armstrong; Abbey Liu; Stephen MacNeil; Danaë Metaxa; | arxiv-cs.CY | 2024-05-07 |
371 | A Transformer with Stack Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite underpinning remarkably capable large language models, transformers are unable to model many context-free language tasks. In an attempt to address this limitation in the modeling power of transformer-based language models, we propose augmenting them with a differentiable, stack-based attention mechanism. |
Jiaoda Li; Jennifer C. White; Mrinmaya Sachan; Ryan Cotterell; | arxiv-cs.CL | 2024-05-07 |
372 | Anchored Answers: Unravelling Positional Bias in GPT-2’s Multiple-Choice Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This anchored bias challenges the integrity of GPT-2’s decision-making process, as it skews performance based on the position rather than the content of the choices in MCQs. In this study, we utilise the mechanistic interpretability approach to identify the internal modules within GPT-2 models responsible for this bias. |
Ruizhe Li; Yanjun Gao; | arxiv-cs.CL | 2024-05-06 |
373 | MAmmoTH2: Scaling Instructions from The Web IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a paradigm to efficiently harvest 10 million naturally existing instruction data from the pre-training web corpus to enhance LLM reasoning. |
Xiang Yue; Tuney Zheng; Ge Zhang; Wenhu Chen; | arxiv-cs.CL | 2024-05-06 |
374 | Towards A Human-in-the-Loop LLM Approach to Collaborative Discourse Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, LLMs have not yet been used to characterize synergistic learning in students’ collaborative discourse. In this exploratory work, we take a first step towards adopting a human-in-the-loop prompt engineering approach with GPT-4-Turbo to summarize and categorize students’ synergistic learning during collaborative discourse. |
Clayton Cohn; Caitlin Snyder; Justin Montenegro; Gautam Biswas; | arxiv-cs.CL | 2024-05-06 |
375 | Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite their widespread occurrence and potential impacts, our understanding of influence campaigns is limited by manual analysis of messages and subjective interpretation of their observable behavior. In this paper, we explore whether these limitations can be mitigated with large language models (LLMs), using GPT-3.5 as a case-study for coordinated campaign annotation. |
Keith Burghardt; Kai Chen; Kristina Lerman; | arxiv-cs.CL | 2024-05-06 |
376 | Detecting Anti-Semitic Hate Speech Using Transformer-based Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we developed a new data labeling technique and established a proof of concept targeting anti-Semitic hate speech, utilizing a variety of transformer models such as BERT (arXiv:1810.04805), DistilBERT (arXiv:1910.01108), RoBERTa (arXiv:1907.11692), and LLaMA-2 (arXiv:2307.09288), complemented by the LoRA fine-tuning approach (arXiv:2106.09685). |
Dengyi Liu; Minghao Wang; Andrew G. Catlin; | arxiv-cs.CL | 2024-05-06 |
377 | Unraveling The Dominance of Large Language Models Over Transformer Models for Bangla Natural Language Inference: A Comprehensive Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study addresses the underexplored area of evaluating LLMs in low-resourced languages such as Bengali. |
Fatema Tuj Johora Faria; Mukaffi Bin Moin; Asif Iftekher Fahim; Pronay Debnath; Faisal Muhammad Shah; | arxiv-cs.CL | 2024-05-05 |
378 | Traffic Performance GPT (TP-GPT): Real-Time Data Informed Intelligent ChatBot for Transportation Surveillance and Management Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, real-time traffic data access is typically limited due to privacy concerns. To bridge this gap, the integration of Large Language Models (LLMs) into the domain of traffic management presents a transformative approach to addressing the complexities and challenges inherent in modern transportation systems. |
Bingzhang Wang; Muhammad Monjurul Karim; Chenxi Liu; Yinhai Wang; | arxiv-cs.MA | 2024-05-05 |
379 | Can Large Language Models Make The Grade? An Empirical Study Evaluating LLMs Ability to Mark Short Answer Questions in K-12 Education Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper reports on a series of experiments with a novel dataset evaluating how well Large Language Models (LLMs) can mark (i.e. grade) open-text responses to short answer questions. Specifically, we explore how well different combinations of GPT version and prompt engineering strategies performed at marking real student answers to short answer questions across different domain areas (Science and History) and grade levels (spanning ages 5-16), using a new, never-used-before dataset from Carousel, a quizzing platform. |
Owen Henkel; Adam Boxer; Libby Hills; Bill Roberts; | arxiv-cs.CL | 2024-05-05 |
380 | A Combination of BERT and Transformer for Vietnamese Spelling Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, to our knowledge, there is no implementation in Vietnamese yet. Therefore, in this study, a combination of the Transformer architecture (the state-of-the-art encoder-decoder model) and BERT was proposed to deal with Vietnamese spelling correction. |
Hieu Ngo Trung; Duong Tran Ham; Tin Huynh; Kiem Hoang; | arxiv-cs.CL | 2024-05-04 |
381 | U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on self-attention with downsampled tokens, we propose a series of U-shaped DiTs (U-DiTs) in the paper and conduct extensive experiments to demonstrate the extraordinary performance of U-DiT models. |
YUCHUAN TIAN et. al. | arxiv-cs.CV | 2024-05-04 |
382 | REASONS: A Benchmark for REtrieval and Automated CitationS Of ScieNtific Sentences Using Public and Proprietary LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we investigate whether large language models (LLMs) are capable of generating references based on two forms of sentence queries: (a) Direct Queries, where LLMs are asked to provide the author names of a given research article, and (b) Indirect Queries, where LLMs are asked to provide the title of a mentioned article when given a sentence from a different article. |
DEEPA TILWANI et. al. | arxiv-cs.CL | 2024-05-03 |
383 | Analyzing Narrative Processing in Large Language Models (LLMs): Using GPT4 to Test BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the present study, we have performed the first steps to use LLMs as a model to understand fundamental mechanisms of language processing in neural networks, in order to make predictions and generate hypotheses on how the human brain does language processing. |
PATRICK KRAUSS et. al. | arxiv-cs.CL | 2024-05-03 |
384 | Exploring Combinatorial Problem Solving with Large Language Models: A Case Study on The Travelling Salesman Problem Using GPT-3.5 Turbo Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we investigate the potential of LLMs to solve the Travelling Salesman Problem (TSP). |
Mahmoud Masoud; Ahmed Abdelhay; Mohammed Elhenawy; | arxiv-cs.CL | 2024-05-03 |
385 | Structural Pruning of Pre-trained Language Models Via Neural Architecture Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike traditional pruning methods with fixed thresholds, we propose to adopt a multi-objective approach that identifies the Pareto optimal set of sub-networks, allowing for a more flexible and automated compression process. |
Aaron Klein; Jacek Golebiowski; Xingchen Ma; Valerio Perrone; Cedric Archambeau; | arxiv-cs.LG | 2024-05-03 |
386 | Investigating Wit, Creativity, and Detectability of Large Language Models in Domain-Specific Writing Style Adaptation of Reddit’s Showerthoughts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recent Large Language Models (LLMs) have shown the ability to generate content that is difficult or impossible to distinguish from human writing. |
TOLGA BUZ et. al. | arxiv-cs.CL | 2024-05-02 |
387 | The Effectiveness of LLMs As Annotators: A Comparative Overview and Empirical Analysis of Direct Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper provides a comparative overview of twelve studies investigating the potential of LLMs in labelling data. |
Maja Pavlovic; Massimo Poesio; | arxiv-cs.CL | 2024-05-02 |
388 | GroupedMixer: An Entropy Model with Group-wise Token-Mixers for Learned Image Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel transformer-based entropy model called GroupedMixer, which enjoys both faster coding speed and better compression performance than previous transformer-based methods. |
DAXIN LI et. al. | arxiv-cs.CV | 2024-05-02 |
389 | UQA: Corpus for Urdu Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces UQA, a novel dataset for question answering and text comprehension in Urdu, a low-resource language with over 70 million native speakers. |
Samee Arif; Sualeha Farid; Awais Athar; Agha Ali Raza; | arxiv-cs.CL | 2024-05-02 |
390 | Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, they do not possess the ability to evaluate based on custom evaluation criteria, focusing instead on general attributes like helpfulness and harmlessness. To address these issues, we introduce Prometheus 2, a more powerful evaluator LM than its predecessor that closely mirrors human and GPT-4 judgements. |
SEUNGONE KIM et. al. | arxiv-cs.CL | 2024-05-02 |
391 | A Named Entity Recognition and Topic Modeling-based Solution for Locating and Better Assessment of Natural Disasters in Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fully explore the potential of social media content in disaster informatics, access to relevant content and correct geo-location information is critical. In this paper, we propose a three-step solution to tackle these challenges. |
Ayaz Mehmood; Muhammad Tayyab Zamir; Muhammad Asif Ayub; Nasir Ahmad; Kashif Ahmad; | arxiv-cs.CL | 2024-05-01 |
392 | Vision Transformer: To Discover The Four Secrets of Image Patches Related Papers Related Patents Related Grants Related Venues Related Experts View |
TAO ZHOU et. al. | Inf. Fusion | 2024-05-01 |
393 | Chat-GPT; Validating Technology Acceptance Model (TAM) in Education Sector Via Ubiquitous Learning Mechanism Related Papers Related Patents Related Grants Related Venues Related Experts View |
N. SAIF et. al. | Comput. Hum. Behav. | 2024-05-01 |
394 | GRAformer: A Gated Residual Attention Transformer for Multivariate Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Chengcao Yang; Yutian Wang; Binghao Yang; Jun Chen; | Neurocomputing | 2024-05-01 |
395 | Hierarchical Vector Transformer Vehicle Trajectories Prediction with Diffusion Convolutional Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yingjuan Tang; Hongwen He; Yong Wang; | Neurocomputing | 2024-05-01 |
396 | Enhanced Visible-infrared Person Re-identification Based on Cross-attention Multiscale Residual Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Prodip Kumar Sarker; Qingjie Zhao; | Pattern Recognit. | 2024-05-01 |
397 | How Can I Get It Right? Using GPT to Rephrase Incorrect Trainee Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the recent advancements of large language models (LLMs), our study employed the GPT-4 model to build an explanatory feedback system. |
JIONGHAO LIN et. al. | arxiv-cs.CL | 2024-05-01 |
398 | A Novel Approach for Rumor Detection in Social Platforms: Memory-augmented Transformer with Graph Convolutional Networks Related Papers Related Patents Related Grants Related Venues Related Experts View |
Qian Chang; Xia Li; Zhao Duan; | Knowl. Based Syst. | 2024-05-01 |
399 | Learning Multiple Attention Transformer Super-resolution Method for Grape Disease Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
Haibin Jin; Xiaoquan Chu; Jianfang Qi; Jianying Feng; Weisong Mu; | Expert Syst. Appl. | 2024-05-01 |
400 | How Can I Improve? Using GPT to Highlight The Desired and Undesired Parts of Open-ended Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our aim is to equip tutors with actionable, explanatory feedback during online training lessons. |
JIONGHAO LIN et. al. | arxiv-cs.CL | 2024-04-30 |
401 | Do Large Language Models Understand Conversational Implicature — A Case Study with A Chinese Sitcom Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce SwordsmanImp, the first Chinese multi-turn-dialogue-based dataset aimed at conversational implicature, sourced from dialogues in the Chinese sitcom $\textit{My Own Swordsman}$. |
Shisen Yue; Siyuan Song; Xinyuan Cheng; Hai Hu; | arxiv-cs.CL | 2024-04-30 |
402 | RSCaMa: Remote Sensing Image Change Captioning with State Space Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite previous methods progressing in the spatial change perception, there are still weaknesses in joint spatial-temporal modeling. To address this, in this paper, we propose a novel RSCaMa model, which achieves efficient joint spatial-temporal modeling through multiple CaMa layers, enabling iterative refinement of bi-temporal features. |
CHENYANG LIU et. al. | arxiv-cs.CV | 2024-04-29 |
403 | Can GPT-4 Do L2 Analytic Assessment? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we perform a series of experiments using GPT-4 in a zero-shot fashion on a publicly available dataset annotated with holistic scores based on the Common European Framework of Reference and aim to extract detailed information about their underlying analytic components. |
Stefano Bannò; Hari Krishna Vydana; Kate M. Knill; Mark J. F. Gales; | arxiv-cs.CL | 2024-04-29 |
404 | Ethical Reasoning and Moral Value Alignment of LLMs Depend on The Language We Prompt Them in Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, moral values are not universal, but rather influenced by language and culture. This paper explores how three prominent LLMs — GPT-4, ChatGPT, and Llama2-70B-Chat — perform ethical reasoning in different languages and whether their moral judgement depends on the language in which they are prompted. |
Utkarsh Agarwal; Kumar Tanmay; Aditi Khandelwal; Monojit Choudhury; | arxiv-cs.CL | 2024-04-29 |
405 | Time Machine GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new approach: a series of point-in-time LLMs called Time Machine GPT (TiMaGPT), specifically designed to be nonprognosticative. |
Felix Drinkall; Eghbal Rahimikia; Janet B. Pierrehumbert; Stefan Zohren; | arxiv-cs.CL | 2024-04-29 |
406 | GPT-4 Passes Most of The 297 Written Polish Board Certification Examinations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Methods: We developed a software program to download and process PES exams and tested the performance of GPT models using the OpenAI Application Programming Interface. |
Jakub Pokrywka; Jeremi Kaczmarek; Edward Gorzelańczyk; | arxiv-cs.CL | 2024-04-29 |
407 | Can Perplexity Predict Fine-Tuning Performance? An Investigation of Tokenization Effects on Sequential Language Models for Nepali Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, studies of how subwording affects the understanding capacity of language models are few and limited to a handful of languages. To reduce this gap, we used 6 different tokenization schemes to pretrain relatively small language models in Nepali and used the representations learned to finetune on several downstream tasks. |
Nishant Luitel; Nirajan Bekoju; Anand Kumar Sah; Subarna Shakya; | arxiv-cs.CL | 2024-04-28 |
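For reference on entry 407: perplexity is the exponentiated average negative token log-likelihood, and comparisons across tokenization schemes are only meaningful when renormalized by a tokenizer-independent unit such as characters. A minimal sketch, assuming per-token log-probabilities are already available:

```python
import math

def perplexity(token_logprobs):
    """exp of the average negative log-probability per token."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def char_perplexity(token_logprobs, num_chars):
    """Renormalize per character so different tokenizers are comparable."""
    return math.exp(-sum(token_logprobs) / num_chars)

print(round(perplexity([-1.2, -0.4, -2.0]), 2))  # 3.32
```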
408 | PatentGPT: A Large Language Model for Intellectual Property Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the application of large language models in the Intellectual Property (IP) domain is challenging due to the strong need for specialized knowledge, privacy protection, and the processing of extremely long text in this field. In this technical report, we present for the first time a low-cost, standardized procedure for training IP-oriented LLMs, meeting the unique requirements of the IP domain. |
ZILONG BAI et. al. | arxiv-cs.CL | 2024-04-28 |
409 | CLFT: Camera-LiDAR Fusion Transformer for Semantic Segmentation in Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, the vision transformer is the novel ground-breaker that successfully brought the multi-head-attention mechanism to computer vision applications. Therefore, we propose a vision-transformer-based network to carry out camera-LiDAR fusion for semantic segmentation applied to autonomous driving. |
Junyi Gu; Mauro Bellone; Tomáš Pivoňka; Raivo Sell; | arxiv-cs.CV | 2024-04-27 |
410 | GPT for Games: A Scoping Review (2020-2023) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a scoping review of 55 articles to explore GPT’s potential for games, offering researchers a comprehensive understanding of the current applications and identifying both emerging trends and unexplored areas. |
Daijin Yang; Erica Kleinman; Casper Harteveld; | arxiv-cs.HC | 2024-04-27 |
411 | MRScore: Evaluating Radiology Report Generation with LLM-based Reward System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces MRScore, an automatic evaluation metric tailored for radiology report generation by leveraging Large Language Models (LLMs). |
YUNYI LIU et. al. | arxiv-cs.CL | 2024-04-27 |
412 | Detection of Conspiracy Theories Beyond Keyword Bias in German-Language Telegram Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our work addresses the task of detecting conspiracy theories in German Telegram messages. |
Milena Pustet; Elisabeth Steffen; Helena Mihaljević; | arxiv-cs.CL | 2024-04-27 |
413 | Evaluation of Few-Shot Learning for Classification Tasks in The Polish Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a few-shot benchmark consisting of 7 different classification tasks native to the Polish language. |
Tsimur Hadeliya; Dariusz Kajtoch; | arxiv-cs.CL | 2024-04-27 |
414 | CRISPR-GPT: An LLM Agent for Automated Design of Gene-Editing Experiments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce CRISPR-GPT, an LLM agent augmented with domain knowledge and external tools to automate and enhance the design process of CRISPR-based gene-editing experiments. |
KAIXUAN HUANG et. al. | arxiv-cs.AI | 2024-04-27 |
415 | Quantifying Memorization of Domain-Specific Pre-trained Language Models Using Japanese Newspaper and Paywalls Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Especially, few studies focus on domain-specific PLM. In this study, we pre-trained domain-specific GPT-2 models using a limited corpus of Japanese newspaper articles and quantified memorization of training data by comparing them with general Japanese GPT-2 models. |
Shotaro Ishihara; | arxiv-cs.CL | 2024-04-26 |
416 | ChatGPT Is Here to Help, Not to Replace Anybody — An Evaluation of Students’ Opinions On Integrating ChatGPT In CS Courses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, 52 first-year CS students were surveyed in order to assess their views on technologies with code-generation capabilities, both from academic and professional perspectives. |
Bruno Pereira Cipriano; Pedro Alves; | arxiv-cs.ET | 2024-04-26 |
417 | Enhancing Legal Compliance and Regulation Analysis with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With Industry 4.0 revolutionizing the food industry and with the General Data Protection Regulation (GDPR) reshaping privacy policies and data processing agreements, there is a growing gap between regulatory analysis and recent technological advancements. This study aims to bridge this gap by leveraging LLMs, namely BERT and GPT models, to accurately classify legal provisions and automate compliance checks. |
Shabnam Hassani; | arxiv-cs.SE | 2024-04-26 |
418 | Prompting Towards Alleviating Code-Switched Data Scarcity in Under-Resourced Languages with GPT As A Pivot Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: There is therefore a notable opportunity to refine prompting guidelines to yield sentences suitable for the fine-tuning of language models. We propose a framework for augmenting the diversity of synthetically generated code-switched data using GPT and propose leveraging this technology to mitigate data scarcity in low-resourced languages, underscoring the essential role of native speakers in this process. |
Michelle Terblanche; Kayode Olaleye; Vukosi Marivate; | arxiv-cs.CL | 2024-04-26 |
419 | UMass-BioNLP at MEDIQA-M3G 2024: DermPrompt — A Systematic Exploration of Prompt Engineering with GPT-4V for Dermatological Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a novel approach to diagnosing clinical dermatology cases by integrating large multimodal models, specifically leveraging the capabilities of GPT-4V under a retriever and a re-ranker framework. |
PARTH VASHISHT et. al. | arxiv-cs.AI | 2024-04-26 |
420 | TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present TinyChart, an efficient MLLM for chart understanding with only 3B parameters. |
LIANG ZHANG et. al. | arxiv-cs.CV | 2024-04-25 |
421 | Influence of Solution Efficiency and Valence of Instruction on Additive and Subtractive Solution Strategies in Humans and GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explored the addition bias, a cognitive tendency to prefer adding elements over removing them to alter an initial state or structure, by conducting four preregistered experiments examining the problem-solving behavior of both humans and OpenAI’s GPT-4 large language model. |
Lydia Uhler; Verena Jordan; Jürgen Buder; Markus Huff; Frank Papenmeier; | arxiv-cs.CL | 2024-04-25 |
422 | Exploring Internal Numeracy in Language Models: A Case Study on ALBERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It has been found that Transformer-based language models have the ability to perform basic quantitative reasoning. In this paper, we propose a method for studying how these models internally represent numerical data, and use our proposal to analyze the ALBERT family of language models. |
Ulme Wennberg; Gustav Eje Henter; | arxiv-cs.CL | 2024-04-25 |
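A toy version of the kind of probing entry 422 describes (illustrative only, not the authors' method) is to fit a linear read-out from number-word embeddings to their numeric values and check whether it extrapolates to held-out numbers. This sketch assumes the Hugging Face `transformers` library and the public `albert-base-v2` checkpoint:

```python
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("albert-base-v2")
emb = AutoModel.from_pretrained("albert-base-v2").get_input_embeddings()

def vec(word):
    """Mean input-embedding of a word's subword tokens."""
    ids = tok(word, add_special_tokens=False)["input_ids"]
    return emb(torch.tensor(ids)).mean(0).detach().numpy()

train = {"one": 1, "two": 2, "three": 3, "five": 5, "six": 6, "eight": 8}
test = {"four": 4, "seven": 7, "nine": 9}

X = np.stack([vec(w) for w in train])
y = np.array(list(train.values()), dtype=float)
w, *_ = np.linalg.lstsq(X, y, rcond=None)  # minimum-norm linear probe

for word, value in test.items():
    print(word, value, round(float(vec(word) @ w), 2))
```

With only six training words the probe fits the training set trivially; the held-out predictions are the informative part, and results will be noisy.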
423 | Player-Driven Emergence in LLM-Driven Game Narrative Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore how interaction with large language models (LLMs) can give rise to emergent behaviors, empowering players to participate in the evolution of game narratives. |
XIANGYU PENG et. al. | arxiv-cs.CL | 2024-04-25 |
424 | IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We evaluate a wide range of proprietary and open-source LLMs including GPT-3.5, GPT-4, PaLM-2, mT5, Gemma, BLOOM and LLaMA on IndicGenBench in a variety of settings. |
Harman Singh; Nitish Gupta; Shikhar Bharadwaj; Dinesh Tewari; Partha Talukdar; | arxiv-cs.CL | 2024-04-25 |
425 | Make-it-Real: Unleashing Large Multimodal Model for Painting 3D Objects with Realistic Materials Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we exploit advancements in Multimodal Large Language Models (MLLMs), particularly GPT-4V, to present a novel approach, Make-it-Real: 1) We demonstrate that GPT-4V can effectively recognize and describe materials, allowing the construction of a detailed material library. |
YE FANG et. al. | arxiv-cs.CV | 2024-04-25 |
426 | Promoting CNNs with Cross-Architecture Knowledge Distillation for Efficient Monocular Depth Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To effectively distill valuable information from the transformer teacher and bridge the gap between convolution and transformer features, we introduce a method to acclimate the teacher with a ghost decoder. |
Zhimeng Zheng; Tao Huang; Gongsheng Li; Zuyi Wang; | arxiv-cs.CV | 2024-04-25 |
427 | Towards Efficient Patient Recruitment for Clinical Trials: Application of A Prompt-Based Learning Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we aimed to evaluate the performance of a prompt-based large language model for the cohort selection task from unstructured medical notes collected in the EHR. |
Mojdeh Rahmanian; Seyed Mostafa Fakhrahmad; Seyedeh Zahra Mousavi; | arxiv-cs.CL | 2024-04-24 |
428 | GeckOpt: LLM System Efficiency Via Intent-Based Tool Selection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this preliminary study, we investigate a GPT-driven intent-based reasoning approach to streamline tool selection for large language models (LLMs) aimed at system efficiency. By … |
Michael Fore; Simranjit Singh; Dimitrios Stamoulis; | Proceedings of the Great Lakes Symposium on VLSI 2024 | 2024-04-24 |
429 | A Comprehensive Survey on Evaluating Large Language Model Applications in The Medical Industry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since the inception of the Transformer architecture in 2017, Large Language Models (LLMs) such as GPT and BERT have evolved significantly, impacting various industries with their advanced capabilities in language understanding and generation. These models have shown potential to transform the medical field, highlighting the necessity for specialized evaluation frameworks to ensure their effective and ethical deployment. |
Yining Huang; Keke Tang; Meilian Chen; Boyuan Wang; | arxiv-cs.CL | 2024-04-24 |
430 | The Promise and Challenges of Using LLMs to Accelerate The Screening Process of Systematic Reviews Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our objective is to investigate if Large Language Models (LLMs) can accelerate title-abstract screening by simplifying abstracts for human screeners, and automating title-abstract screening. |
Aleksi Huotala; Miikka Kuutila; Paul Ralph; Mika Mäntylä; | arxiv-cs.CL | 2024-04-24 |
431 | Automated Creation of Source Code Variants of A Cryptographic Hash Function Implementation Using Generative Pre-Trained Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study the ability of GPT models to generate novel and correct versions, and notably very insecure versions, of implementations of the cryptographic hash function SHA-1 is examined. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CR | 2024-04-24 |
432 | Transformers Can Represent $n$-gram Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the relationship between transformer LMs and $n$-gram LMs, a simple and historically relevant class of language models. |
Anej Svete; Ryan Cotterell; | arxiv-cs.CL | 2024-04-23 |
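For context on entry 432: an n-gram LM predicts each token from only the previous n-1 tokens. A self-contained maximum-likelihood bigram example of the model class the paper relates to transformer expressivity:

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Maximum-likelihood bigram LM: p(w_i | w_{i-1}) from raw counts."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    return {prev: {w: c / sum(nxt.values()) for w, c in nxt.items()}
            for prev, nxt in counts.items()}

lm = train_bigram(["the cat sat", "the dog sat"])
print(lm["the"])  # {'cat': 0.5, 'dog': 0.5}
```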
433 | Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on a specific use case, pharmaceutical manufacturing investigations, and propose that leveraging historical records of manufacturing incidents and deviations in an organization can be beneficial for addressing and closing new cases, or de-risking new manufacturing campaigns. |
Hossein Salami; Brandye Smith-Goettler; Vijay Yadav; | arxiv-cs.CL | 2024-04-23 |
434 | Pyramid Hierarchical Transformer for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The traditional Transformer model encounters challenges with variable-length input sequences, particularly in Hyperspectral Image Classification (HSIC), leading to efficiency and scalability concerns. To overcome this, we propose a pyramid-based hierarchical transformer (PyFormer). |
Muhammad Ahmad; Muhammad Hassaan Farooq Butt; Manuel Mazzara; Salvatore Distifano; | arxiv-cs.CV | 2024-04-23 |
435 | Combining Retrieval and Classification: Balancing Efficiency and Accuracy in Duplicate Bug Report Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this context, complex models, despite their high accuracy potential, can be time-consuming, while more efficient models may compromise on accuracy. To address this issue, we propose a transformer-based system designed to strike a balance between time efficiency and accuracy performance. |
Qianru Meng; Xiao Zhang; Guus Ramackers; Visser Joost; | arxiv-cs.SE | 2024-04-23 |
436 | PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present the first, end-to-end large-scale empirical evaluation of clinical trial matching using real-world EHRs. |
SHASHI KANT GUPTA et. al. | arxiv-cs.CL | 2024-04-23 |
437 | From Complexity to Clarity: How AI Enhances Perceptions of Scientists and The Public’s Understanding of Science Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper evaluated the effectiveness of using generative AI to simplify science communication and enhance the public’s understanding of science. |
David M. Markowitz; | arxiv-cs.CL | 2024-04-23 |
438 | Marking: Visual Grading with Highlighting Errors and Annotating Missing Bits Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Marking, a novel grading task that enhances automated grading systems by performing an in-depth analysis of student responses and providing students with visual highlights. |
Shashank Sonkar; Naiming Liu; Debshila B. Mallick; Richard G. Baraniuk; | arxiv-cs.CL | 2024-04-22 |
439 | What Do Transformers Know About Government? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates what insights about linguistic features and what knowledge about the structure of natural language can be obtained from the encodings in transformer language models. In particular, we explore how BERT encodes the government relation between constituents in a sentence. |
JUE HOU et. al. | arxiv-cs.CL | 2024-04-22 |
440 | A Multimodal Feature Distillation with CNN-Transformer Network for Brain Tumor Segmentation with Incomplete Modalities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a Multimodal feature distillation with Convolutional Neural Network (CNN)-Transformer hybrid network (MCTSeg) for accurate brain tumor segmentation with missing modalities. |
Ming Kang; Fung Fung Ting; Raphaël C. -W. Phan; Zongyuan Ge; Chee-Ming Ting; | arxiv-cs.CV | 2024-04-22 |
441 | Automated Multi-Language to English Machine Translation Using Generative Pre-Trained Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we examine using local Generative Pretrained Transformer (GPT) models to perform automated zero-shot, black-box, sentence-wise, multi-natural-language translation into English text. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CL | 2024-04-22 |
442 | How Well Can LLMs Echo Us? Evaluating AI Chatbots’ Role-Play Ability with ECHO Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Such an oversight limits the potential for advancements in digital human clones and non-player characters in video games. To bridge this gap, we introduce ECHO, an evaluative framework inspired by the Turing test. |
MAN TIK NG et. al. | arxiv-cs.CL | 2024-04-22 |
443 | Pre-Calc: Learning to Use The Calculator Improves Numeracy in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Pre-Calc, a simple pre-finetuning objective of learning to use the calculator for both encoder-only and encoder-decoder architectures, formulated as a discriminative and generative task respectively. |
Vishruth Veerendranath; Vishwa Shah; Kshitish Ghate; | arxiv-cs.CL | 2024-04-22 |
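One way to picture a calculator-use objective like entry 443's is to rewrite arithmetic spans in training text as explicit tool calls, so the model learns to emit a call rather than guess the result. The tag format, regex, and helper below are illustrative assumptions, not the paper's actual scheme:

```python
import re

def annotate_with_calculator(text):
    """Wrap simple arithmetic spans in calculator-call tags.
    The <calc>...</calc> convention is illustrative only."""
    def repl(match):
        expr = match.group(0)
        # The restricted pattern (digits, spaces, one operator)
        # keeps eval safe here.
        result = eval(expr)
        return f"<calc>{expr}={result}</calc>"
    return re.sub(r"\b\d+\s*[-+*/]\s*\d+\b", repl, text)

print(annotate_with_calculator("She bought 12 + 7 apples."))
# She bought <calc>12 + 7=19</calc> apples.
```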
444 | Assessing GPT-4-Vision’s Capabilities in UML-Based Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The emergence of advanced neural networks has opened up new ways in automated code generation from conceptual models, promising to enhance software development processes. This paper presents a preliminary evaluation of GPT-4-Vision, a state-of-the-art deep learning model, and its capabilities in transforming Unified Modeling Language (UML) class diagrams into fully operating Java class files. |
Gábor Antal; Richárd Vozár; Rudolf Ferenc; | arxiv-cs.SE | 2024-04-22 |
445 | SVGEditBench: A Benchmark Dataset for Quantitative Assessment of LLM’s SVG Editing Capabilities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For quantitative evaluation of LLMs’ ability to edit SVG, we propose SVGEditBench. |
Kunato Nishina; Yusuke Matsui; | arxiv-cs.CV | 2024-04-21 |
446 | Automated Text Mining of Experimental Methodologies from Biomedical Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes a fine-tuned DistilBERT, a methodology-specific, pre-trained generative classification language model for mining biomedical texts. |
Ziqing Guo; | arxiv-cs.CL | 2024-04-21 |
447 | Beyond Accuracy: Investigating Error Types in GPT-4 Responses to USMLE Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, current works use GPT-4 to only predict the correct option without providing any explanation and thus do not provide any insight into the thinking process and reasoning used by GPT-4 or other LLMs. Therefore, we introduce a new domain-specific error taxonomy derived from collaboration with medical students. |
SOUMYADEEP ROY et. al. | arxiv-cs.CL | 2024-04-20 |
448 | Evaluating Subword Tokenization: Alien Subword Composition and OOV Generalization Challenge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As a solution, we propose a combined intrinsic-extrinsic evaluation framework for subword tokenization. |
KHUYAGBAATAR BATSUREN et. al. | arxiv-cs.CL | 2024-04-20 |
449 | Do English Named Entity Recognizers Work Well on Global Englishes? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As such, it is unclear whether they generalize for analyzing use of English globally. To test this, we build a newswire dataset, the Worldwide English NER Dataset, to analyze NER model performance on low-resource English variants from around the world. |
Alexander Shan; John Bauer; Riley Carlson; Christopher Manning; | arxiv-cs.CL | 2024-04-20 |
450 | Transformer-Based Classification Outcome Prediction for Multimodal Stroke Treatment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a multi-modal fusion framework Multitrans based on the Transformer architecture and self-attention mechanism. |
Danqing Ma; Meng Wang; Ao Xiang; Zongqing Qi; Qin Yang; | arxiv-cs.CV | 2024-04-19 |
451 | Linearly-evolved Transformer for Pan-sharpening Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The standard modeling rules within these promising pan-sharpening methods are to roughly stack the transformer variants in a cascaded manner. Despite the remarkable advancement, their success may come at the huge cost of model parameters and FLOPs, preventing application on low-resource satellites. To address this trade-off between favorable performance and expensive computation, we tailor an efficient linearly-evolved transformer variant and employ it to construct a lightweight pan-sharpening framework. |
JUNMING HOU et. al. | arxiv-cs.CV | 2024-04-19 |
452 | Crowdsourcing Public Attitudes Toward Local Services Through The Lens of Google Maps Reviews: An Urban Density-based Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel data source and methodological framework that can be easily adapted to different regions, offering useful insights into public sentiment toward the built environment and shedding light on how planning policies can be designed to handle related challenges. |
Lingyao Li; Songhua Hu; Atiyya Shaw; Libby Hemphill; | arxiv-cs.SI | 2024-04-19 |
453 | Enabling Natural Zero-Shot Prompting on Encoder Models Via Statement-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we propose Statement-Tuning, a technique that models discriminative tasks as a set of finite statements and trains an Encoder model to discriminate between the potential statements to determine the label. |
Ahmed Elshabrawy; Yongxin Huang; Iryna Gurevych; Alham Fikri Aji; | arxiv-cs.CL | 2024-04-19 |
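Concretely, Statement-Tuning-style inference recasts one classification example as one statement per candidate label and lets the encoder pick the most plausible statement. A schematic sketch; the template and scoring function are stand-ins, not the paper's:

```python
def to_statements(text, labels, template="{text} The sentiment is {label}."):
    """Recast one classification example as |labels| binary statements."""
    return [template.format(text=text, label=lab) for lab in labels]

def predict(score_fn, text, labels):
    """score_fn: an encoder head mapping a statement to P(statement is true)."""
    statements = to_statements(text, labels)
    scores = [score_fn(s) for s in statements]
    return labels[scores.index(max(scores))]

def toy_score(statement):
    """Stand-in scorer; a trained encoder classification head goes here."""
    return 1.0 if "positive" in statement else 0.0

print(predict(toy_score, "Great movie!", ["positive", "negative"]))  # positive
```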
454 | TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present our submission to the unconstrained subtask of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages for morphological annotation, POS-tagging, lemmatization, character- and word-level gap-filling. |
Aleksei Dorkin; Kairit Sirts; | arxiv-cs.CL | 2024-04-19 |
455 | Dubo-SQL: Diverse Retrieval-Augmented Generation and Fine Tuning for Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce two new methods, Dubo-SQL v1 and v2. |
Dayton G. Thorpe; Andrew J. Duberstein; Ian A. Kinsey; | arxiv-cs.CL | 2024-04-18 |
456 | Transformer Tricks: Removing Weights for Skipless Transformers Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: He and Hofmann (arXiv:2311.01906) detailed a skipless transformer without the V and P (post-attention projection) linear layers, which reduces the total number of weights. … |
Nils Graef; | arxiv-cs.LG | 2024-04-18 |
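Entry 456 builds on the observation (He and Hofmann, arXiv:2311.01906) that the value (V) and post-attention projection (P) matrices can be removed. A minimal single-head sketch of attention without them; the cited papers additionally merge the remaining weights, which is omitted here:

```python
import math
import torch

def skipless_attention(x, Wq, Wk):
    """Self-attention with no V or P projections: the attention
    weights mix the raw inputs directly."""
    q, k = x @ Wq, x @ Wk
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.shape[-1])
    return torch.softmax(scores, dim=-1) @ x  # values are x itself

x = torch.randn(10, 64)                      # sequence of 10 tokens, d=64
Wq, Wk = torch.randn(64, 64), torch.randn(64, 64)
print(skipless_attention(x, Wq, Wk).shape)   # torch.Size([10, 64])
```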
457 | X-Light: Cross-City Traffic Signal Control Using Transformer on Transformer As Meta Multi-Agent Reinforcement Learner Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, a persisting issue remains: how can we obtain a multi-agent traffic signal control algorithm with remarkable transferability across diverse cities? In this paper, we propose a Transformer on Transformer (TonT) model for cross-city meta multi-agent traffic signal control, named X-Light: we input full Markov Decision Process trajectories, the Lower Transformer aggregates the states, actions, and rewards among the target intersection and its neighbors within a city, and the Upper Transformer learns general decision trajectories across different cities. |
HAOYUAN JIANG et. al. | arxiv-cs.AI | 2024-04-18 |
458 | Augmenting Emotion Features in Irony Detection with Large Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel method for irony detection, applying Large Language Models (LLMs) with prompt-based learning to facilitate emotion-centric text augmentation. |
Yucheng Lin; Yuhan Xia; Yunfei Long; | arxiv-cs.CL | 2024-04-18 |
459 | MIDGET: Music Conditioned 3D Dance Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a MusIc conditioned 3D Dance GEneraTion model, named MIDGET, based on a Dance motion Vector Quantised Variational AutoEncoder (VQ-VAE) model and a Motion Generative Pre-Training (GPT) model, to generate vibrant and high-quality dances that match the music rhythm. |
Jinwu Wang; Wei Mao; Miaomiao Liu; | arxiv-cs.SD | 2024-04-18 |
460 | Large Language Models in Targeted Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we investigate the use of decoder-based generative transformers for extracting sentiment towards the named entities in Russian news articles. |
Nicolay Rusnachenko; Anton Golubev; Natalia Loukachevitch; | arxiv-cs.CL | 2024-04-18 |
461 | EmrQA-msquad: A Medical Dataset Structured with The SQuAD V2.0 Framework, Enriched with EmrQA Medical Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: One key solution involves integrating specialized medical datasets and creating dedicated ones. This strategic approach enhances the accuracy of question answering systems (QAS), contributing to advancements in clinical decision-making and medical research. |
Jimenez Eladio; Hao Wu; | arxiv-cs.CL | 2024-04-18 |
462 | Octopus V3: Technical Report for On-device Sub-billion Multimodal AI Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a multimodal model that incorporates the concept of functional token specifically designed for AI agent applications. |
Wei Chen; Zhiyuan Li; | arxiv-cs.CL | 2024-04-17 |
463 | Cross-Problem Learning for Solving Vehicle Routing Problems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes the cross-problem learning to assist heuristics training for different downstream VRP variants. |
ZHUOYI LIN et. al. | arxiv-cs.AI | 2024-04-17 |
464 | CAUS: A Dataset for Question Generation Based on Human Cognition Leveraging Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Curious About Uncertain Scene (CAUS) dataset, designed to enable Large Language Models, specifically GPT-4, to emulate human cognitive processes for resolving uncertainties. |
Minjung Shin; Donghyun Kim; Jeh-Kwang Ryu; | arxiv-cs.AI | 2024-04-17 |
465 | In-Context Learning State Vector with Inner and Momentum Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the working mechanisms and optimization of these vectors are yet to be thoroughly explored. In this paper, we address this gap by presenting a comprehensive analysis of these compressed vectors, drawing parallels to the parameters trained with gradient descent, and introduce the concept of state vector. |
DONGFANG LI et. al. | arxiv-cs.CL | 2024-04-17 |
466 | CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an attribution-oriented Chain-of-Thought reasoning method to enhance the accuracy of attributions. |
Moshe Berchansky; Daniel Fleischer; Moshe Wasserblat; Peter Izsak; | arxiv-cs.CL | 2024-04-16 |
467 | MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we show how to build small models that have GPT-4-level performance but for 400x lower cost. |
Liyan Tang; Philippe Laban; Greg Durrett; | arxiv-cs.CL | 2024-04-16 |
468 | AIGeN: An Adversarial Approach for Instruction Generation in VLN Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose AIGeN, a novel architecture inspired by Generative Adversarial Networks (GANs) that produces meaningful and well-formed synthetic instructions to improve navigation agents’ performance. |
Niyati Rawal; Roberto Bigazzi; Lorenzo Baraldi; Rita Cucchiara; | arxiv-cs.CV | 2024-04-15 |
469 | Transformers, Contextualism, and Polysemy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, I argue that we can extract from the way the transformer architecture works a picture of the relationship between context and meaning. |
Jumbly Grindrod; | arxiv-cs.CL | 2024-04-15 |
470 | Harnessing GPT-4V(ision) for Insurance: A Preliminary Exploration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore GPT-4V’s capabilities in the insurance domain. |
Chenwei Lin; Hanjia Lyu; Jiebo Luo; Xian Xu; | arxiv-cs.CV | 2024-04-15 |
471 | Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present COME-robot, the first closed-loop framework utilizing the GPT-4V vision-language foundation model for open-ended reasoning and adaptive planning in real-world scenarios. |
PEIYUAN ZHI et. al. | arxiv-cs.RO | 2024-04-15 |
472 | Demonstration of DB-GPT: Next Generation Data Interaction System Empowered By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present DB-GPT, a revolutionary and product-ready Python library that integrates LLMs into traditional data interaction tasks to enhance user experience and accessibility. |
SIQIAO XUE et. al. | arxiv-cs.AI | 2024-04-15 |
473 | Few-shot Named Entity Recognition on StackOverflow Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: StackOverflow, with its vast question repository and limited labeled examples, poses an annotation challenge. We address this gap by proposing RoBERTa+MAML, a few-shot named entity recognition (NER) method leveraging meta-learning. |
Xinwei Chen; Kun Li; Tianyou Song; Jiangjian Guo; | arxiv-cs.CL | 2024-04-14 |
474 | Understanding The Role of Temperature in Diverse Question Generation By GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct a preliminary study of the effect of GPT’s temperature parameter on the diversity of GPT-4-generated questions. |
ARAV AGARWAL et. al. | arxiv-cs.CL | 2024-04-14 |
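Background for entry 474: temperature T rescales the logits before sampling, p_i ∝ exp(z_i / T), so higher values flatten the next-token distribution and should increase variety. A sketch of the kind of sweep such a study might run with the OpenAI Python client; the model name, prompt, and distinctness proxy are placeholder assumptions:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

prompt = "Write one multiple-choice question about recursion."  # placeholder

for temperature in (0.0, 0.7, 1.4):
    resp = client.chat.completions.create(
        model="gpt-4",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
        n=5,  # several samples per temperature setting
    )
    distinct = {choice.message.content for choice in resp.choices}
    # Crude diversity proxy: count exact-duplicate-free samples.
    print(f"T={temperature}: {len(distinct)} distinct questions out of 5")
```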
475 | Large Language Models for Mobile GUI Text Input Generation: An Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We collected 114 UI pages from 62 open-source Android apps and extracted contextual information from the UI pages to construct prompts for LLMs to generate text inputs. |
CHENHUI CUI et. al. | arxiv-cs.SE | 2024-04-13 |
476 | Constrained C-Test Generation Via Mixed-Integer Programming Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work proposes a novel method to generate C-Tests: a variant of cloze tests (a gap-filling exercise) in which only the last part of a word is turned into a gap. |
Ji-Ung Lee; Marc E. Pfetsch; Iryna Gurevych; | arxiv-cs.CL | 2024-04-12 |
477 | Small Models Are (Still) Effective Cross-Domain Argument Extractors Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, detailed explorations of these techniques’ ability to actually enable this transfer are lacking. In this work, we provide such a study, exploring zero-shot transfer using both techniques on six major EAE datasets at both the sentence and document levels. |
William Gantt; Aaron Steven White; | arxiv-cs.CL | 2024-04-12 |
478 | CreativEval: Evaluating Creativity of LLM-Based Hardware Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this research gap, we present CreativEval, a framework for evaluating the creativity of LLMs within the context of generating hardware designs. |
Matthew DeLorenzo; Vasudev Gohil; Jeyavijayan Rajendran; | arxiv-cs.CL | 2024-04-12 |
479 | Is Next Token Prediction Sufficient for GPT? Exploration on Code Logic Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to prove it, we introduce a new task, Logically Equivalent Code Selection, which necessitates the selection of logically equivalent code from a candidate set, given a query code. |
MENGNAN QI et. al. | arxiv-cs.PL | 2024-04-12 |
480 | Pre-training Small Base LMs with Fewer Tokens Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We investigate Inheritune in a slightly different setting where we train small LMs utilizing larger LMs and their full pre-training dataset. |
Sunny Sanyal; Sujay Sanghavi; Alexandros G. Dimakis; | arxiv-cs.CL | 2024-04-12 |
481 | Measuring Geographic Diversity of Foundation Models with A Natural Language–based Geo-guessing Experiment on GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: If we consider the resulting models as knowledge bases in their own right, this may open up new avenues for understanding places through the lens of machines. In this work, we adopt this thinking and select GPT-4, a state-of-the-art representative in the family of multimodal large language models, to study its geographic diversity regarding how well geographic features are represented. |
Zilong Liu; Krzysztof Janowicz; Kitty Currier; Meilin Shi; | arxiv-cs.CY | 2024-04-11 |
482 | Remembering Transformer for Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing data fine-tuning and regularization methods necessitate task identity information during inference and cannot eliminate interference among different tasks, while soft parameter sharing approaches encounter the problem of an increasing model parameter size. To tackle these challenges, we propose the Remembering Transformer, inspired by the brain’s Complementary Learning Systems (CLS). |
Yuwei Sun; Ippei Fujisawa; Arthur Juliani; Jun Sakuma; Ryota Kanai; | arxiv-cs.LG | 2024-04-11 |
483 | From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We analyze how well pre-trained large language models (e.g., Llama2, GPT-4, Claude 3, etc) can do linear and non-linear regression when given in-context examples, without any additional training or gradient updates. |
Robert Vacareanu; Vlad-Andrei Negru; Vasile Suciu; Mihai Surdeanu; | arxiv-cs.CL | 2024-04-11 |
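The setup in entry 483 amounts to serializing (x, y) pairs into a prompt and asking the model to complete the output for a held-out input. A minimal prompt-construction sketch; the pair format is an assumption, and the completion call is whatever LLM client one uses:

```python
def regression_prompt(examples, x_query):
    """Serialize numeric (x, y) pairs for in-context regression."""
    lines = [f"Input: {x}\nOutput: {y}" for x, y in examples]
    lines.append(f"Input: {x_query}\nOutput:")
    return "\n".join(lines)

examples = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2)]  # roughly y = 2x + 1
print(regression_prompt(examples, 4.0))
# A capable LLM completing this prompt should answer near 9.0.
```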
484 | Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore how large language models (LLMs) can be used to generate and evaluate multiple-choice reading comprehension items. |
Andreas Säuberli; Simon Clematide; | arxiv-cs.CL | 2024-04-11 |
485 | On Training Data Influence of GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents GPTfluence, a novel approach that leverages a featurized simulation to assess the impact of training examples on the training dynamics of GPT models. |
QINGYI LIU et. al. | arxiv-cs.CL | 2024-04-11 |
486 | Reflectance Estimation for Proximity Sensing By Vision-Language Models: Utilizing Distributional Semantics for Low-Level Cognition in Robotics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we verify that 1) LLMs such as GPT-3.5 and GPT-4 can estimate an object’s reflectance using only text as input; and 2) VLMs such as CLIP can increase their generalization capabilities in reflectance estimation from images. |
Masashi Osada; Gustavo A. Garcia Ricardez; Yosuke Suzuki; Tadahiro Taniguchi; | arxiv-cs.RO | 2024-04-11 |
487 | LLM Agents Can Autonomously Exploit One-day Vulnerabilities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that LLM agents can autonomously exploit one-day vulnerabilities in real-world systems. |
Richard Fang; Rohan Bindu; Akul Gupta; Daniel Kang; | arxiv-cs.CR | 2024-04-11 |
488 | Simpler Becomes Harder: Do LLMs Exhibit A Coherent Behavior on Simplified Corpora? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Text simplification seeks to improve readability while retaining the original content and meaning. Our study investigates whether pre-trained classifiers also maintain such coherence by comparing their predictions on both original and simplified inputs. |
Miriam Anschütz; Edoardo Mosca; Georg Groh; | arxiv-cs.CL | 2024-04-10 |
489 | Characterizing Multimodal Long-form Summarization: A Case Study on Financial Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a computational framework for characterizing multimodal long-form summarization and investigate the behavior of Claude 2.0/2.1, GPT-4/3.5, and Command. |
Tianyu Cao; Natraj Raman; Danial Dervovic; Chenhao Tan; | arxiv-cs.CL | 2024-04-09 |
490 | Generative Pre-Trained Transformer for Symbolic Regression Based on In-Context Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose FormulaGPT, which trains a GPT using massive sparse-reward learning histories of reinforcement learning-based SR algorithms as training data. |
YANJIE LI et. al. | arxiv-cs.LG | 2024-04-09 |
491 | InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, this research advances the patch division paradigm by introducing a novel extension: dynamic resolution with automatic patch configuration. |
XIAOYI DONG et. al. | arxiv-cs.CV | 2024-04-09 |
492 | PetKaz at SemEval-2024 Task 8: Can Linguistics Capture The Specifics of LLM-generated Text? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present our submission to the SemEval-2024 Task 8 Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, focusing on the detection of machine-generated texts (MGTs) in English. |
Kseniia Petukhova; Roman Kazakov; Ekaterina Kochmar; | arxiv-cs.CL | 2024-04-08 |
493 | OPSD: An Offensive Persian Social Media Dataset and Its Baseline Evaluations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While numerous datasets in the English language exist in this domain, few equivalent resources are available for the Persian language. To address this gap, this paper introduces two offensive datasets. |
MEHRAN SAFAYANI et. al. | arxiv-cs.CL | 2024-04-08 |
494 | Use of A Structured Knowledge Base Enhances Metadata Curation By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates the potential of large language models (LLMs), specifically GPT-4, to improve adherence to metadata standards. |
SOWMYA S. SUNDARAM et. al. | arxiv-cs.AI | 2024-04-08 |
495 | Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While these models demonstrate remarkable performance on general datasets, they can struggle in specialized domains such as medicine, where unique domain-specific terminologies, domain-specific abbreviations, and varying document structures are common. This paper explores strategies for adapting these models to domain-specific requirements, primarily through continuous pre-training on domain-specific data. |
AHMAD IDRISSI-YAGHIR et. al. | arxiv-cs.CL | 2024-04-08 |
496 | Relation Extraction Using Large Language Models: A Case Study on Acupuncture Point Locations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to compare the performance of GPT with traditional deep learning models (Long Short-Term Memory (LSTM) and Bidirectional Encoder Representations from Transformers for Biomedical Text Mining (BioBERT)) in extracting acupoint-related location relations and assess the impact of pretraining and fine-tuning on GPT’s performance. |
YIMING LI et. al. | arxiv-cs.CL | 2024-04-08 |
497 | PagPassGPT: Pattern Guided Password Guessing Via Generative Pretrained Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Amidst the surge in deep learning-based password guessing models, challenges of generating high-quality passwords and reducing duplicate passwords persist. To address these challenges, we present PagPassGPT, a password guessing model constructed on Generative Pretrained Transformer (GPT). |
XINGYU SU et. al. | arxiv-cs.CR | 2024-04-07 |
498 | Clinical Trials Protocol Authoring Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This report embarks on a mission to revolutionize clinical trial protocol development through the integration of advanced AI technologies. |
Morteza Maleki; | arxiv-cs.CE | 2024-04-07 |
499 | RecGPT: Generative Personalized Prompts for Sequential Recommendation Via ChatGPT Training Paradigm Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: ChatGPT has achieved remarkable success in natural language understanding. Considering that recommendation is indeed a conversation between users and the system with items as … |
YABIN ZHANG et. al. | ArXiv | 2024-04-06 |
500 | Joint Visual and Text Prompting for Improved Object-Centric Perception with Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel approach, Joint Visual and Text Prompting (VTPrompt), that employs fine-grained visual information to enhance the capability of MLLMs in VQA, especially for object-oriented perception. |
SONGTAO JIANG et. al. | arxiv-cs.CL | 2024-04-06 |
501 | Scope Ambiguities in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite this, there has been little research into how modern large language models treat them. In this paper, we investigate how different versions of certain autoregressive language models — GPT-2, GPT-3/3.5, Llama 2 and GPT-4 — treat scope-ambiguous sentences, and compare this with human judgments. |
Gaurav Kamath; Sebastian Schuster; Sowmya Vajjala; Siva Reddy; | arxiv-cs.CL | 2024-04-05 |
502 | Evaluating LLMs at Detecting Errors in LLM Responses Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work introduces ReaLMistake, the first error detection benchmark consisting of objective, realistic, and diverse errors made by LLMs. |
RYO KAMOI et. al. | arxiv-cs.CL | 2024-04-04 |
503 | Outlier-Efficient Hopfield Layers for Large Transformer-Based Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an Outlier-Efficient Modern Hopfield Model (termed OutEffHop) and use it to address the outlier inefficiency problem of training gigantic transformer-based models. |
JERRY YAO-CHIEH HU et. al. | arxiv-cs.LG | 2024-04-04 |
504 | Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper studies post-training large language models (LLMs) using preference feedback from a powerful oracle to help a model iteratively improve over itself. The typical … |
CORBY ROSSET et. al. | ArXiv | 2024-04-04 |
505 | Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Various jailbreak attacks have been proposed to red-team Large Language Models (LLMs) and revealed the vulnerable safeguards of LLMs. Besides, some methods are not limited to the … |
SHUO CHEN et. al. | arxiv-cs.LG | 2024-04-04 |
506 | NLP at UC Santa Cruz at SemEval-2024 Task 5: Legal Answer Validation Using Few-Shot Multi-Choice QA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present two approaches to solving the task of legal answer validation, given an introduction to the case, a question and an answer candidate. |
Anish Pahilajani; Samyak Rajesh Jain; Devasha Trivedi; | arxiv-cs.CL | 2024-04-03 |
507 | UTeBC-NLP at SemEval-2024 Task 9: Can LLMs Be Lateral Thinkers? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Through participating in SemEval-2024, task 9, Sentence Puzzle sub-task, we explore prompt engineering methods: chain of thoughts (CoT) and direct prompting, enhancing with informative descriptions, and employing contextualizing prompts using a retrieval augmented generation (RAG) pipeline. |
Pouya Sadeghi; Amirhossein Abaskohi; Yadollah Yaghoobzadeh; | arxiv-cs.CL | 2024-04-03 |
508 | Visual Autoregressive Modeling: Scalable Image Generation Via Next-Scale Prediction IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Visual AutoRegressive modeling (VAR), a new generation paradigm that redefines the autoregressive learning on images as coarse-to-fine next-scale prediction or next-resolution prediction, diverging from the standard raster-scan next-token prediction. |
Keyu Tian; Yi Jiang; Zehuan Yuan; Bingyue Peng; Liwei Wang; | arxiv-cs.CV | 2024-04-03 |
509 | Task Agnostic Architecture for Algorithm Induction Via Implicit Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Taking this trend of generalization to the extreme suggests the possibility of a single deep network architecture capable of solving all tasks. This position paper aims to explore developing such a unified architecture and proposes a theoretical framework of how it could be constructed. |
Sahil J. Sindhi; Ignas Budvytis; | arxiv-cs.LG | 2024-04-03 |
510 | BCAmirs at SemEval-2024 Task 4: Beyond Words: A Multimodal and Multilingual Exploration of Persuasion in Memes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by this, our team engaged in SemEval-2024 Task 4, a hierarchical multi-label classification task designed to identify rhetorical and psychological persuasion techniques embedded within memes. To tackle this problem, we introduced a caption generation step to assess the modality gap and the impact of additional semantic information from images, which improved our result. |
Amirhossein Abaskohi; Amirhossein Dabiriaghdam; Lele Wang; Giuseppe Carenini; | arxiv-cs.CL | 2024-04-03 |
511 | GPT-DETOX: An In-Context Learning-Based Paraphraser for Text Detoxification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose GPT-DETOX as a framework for prompt-based in-context learning for text detoxification using GPT-3.5 Turbo. |
Ali Pesaranghader; Nikhil Verma; Manasa Bharadwaj; | arxiv-cs.CL | 2024-04-03 |
512 | Collapse of Self-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In various fields of knowledge creation, including science, new ideas often build on pre-existing information. In this work, we explore this concept within the context of language models. |
David Herel; Tomas Mikolov; | arxiv-cs.CL | 2024-04-02 |
513 | Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this way, we achieve nearly 100% attack success rate — according to GPT-4 as a judge — on Vicuna-13B, Mistral-7B, Phi-3-Mini, Nemotron-4-340B, Llama-2-Chat-7B/13B/70B, Llama-3-Instruct-8B, Gemma-7B, GPT-3.5, GPT-4, and R2D2 from HarmBench that was adversarially trained against the GCG attack. |
Maksym Andriushchenko; Francesco Croce; Nicolas Flammarion; | arxiv-cs.CR | 2024-04-02 |
514 | Accelerating Transformer Pre-Training with 2:4 Sparsity Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Training large transformers is slow, but recent innovations on GPU architecture give us an advantage. NVIDIA Ampere GPUs can execute a fine-grained 2:4 sparse matrix … |
Yuezhou Hu; Kang Zhao; Wei Huang; Jianfei Chen; Jun Zhu; | ArXiv | 2024-04-02 |
515 | WcDT: World-centric Diffusion Transformer for Traffic Scene Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a novel approach for autonomous driving trajectory generation by harnessing the complementary strengths of diffusion probabilistic models (a.k.a., diffusion models) and transformers. |
Chen Yang; Aaron Xuxiang Tian; Dong Chen; Tianyu Shi; Arsalan Heydarian; | arxiv-cs.CV | 2024-04-02 |
516 | SGSH: Stimulate Large Language Models with Skeleton Heuristics for Knowledge Base Question Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose SGSH, a simple and effective framework to Stimulate GPT-3.5 with Skeleton Heuristics to enhance KBQG. |
SHASHA GUO et. al. | arxiv-cs.CL | 2024-04-02 |
517 | Generative AI-Based Text Generation Methods Using Pre-Trained GPT-2 Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through analysis of greedy search, beam search, top-k sampling, top-p sampling, contrastive searching, and locally typical searching, this work has provided valuable insights into the strengths, weaknesses, and potential applications of each method. |
ROHIT PANDEY et. al. | arxiv-cs.CL | 2024-04-02 |
518 | Release of Pre-Trained Models for The Japanese Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: AI democratization aims to create a world in which the average person can utilize AI techniques. |
KEI SAWADA et. al. | arxiv-cs.CL | 2024-04-02 |
519 | Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first comprehensive benchmarking study of LLMs across diverse Persian language tasks. |
AMIRHOSSEIN ABASKOHI et. al. | arxiv-cs.CL | 2024-04-02 |
520 | METAL: Towards Multilingual Meta-Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a framework for an end-to-end assessment of LLMs as evaluators in multilingual scenarios. |
Rishav Hada; Varun Gumma; Mohamed Ahmed; Kalika Bali; Sunayana Sitaram; | arxiv-cs.CL | 2024-04-02 |
521 | GPT-COPE: A Graph-Guided Point Transformer for Category-Level Object Pose Estimation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Category-level object pose estimation aims to predict the 6D pose and 3D metric size of objects from given categories. Due to significant intra-class shape variations among … |
Lu Zou; Zhangjin Huang; Naijie Gu; Guoping Wang; | IEEE Transactions on Circuits and Systems for Video … | 2024-04-01 |
522 | Syntactic Robustness for LLM-based Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on prompts that ask for code that generates solutions to variables in an equation, when given coefficients of the equation as input. |
Laboni Sarker; Mara Downing; Achintya Desai; Tevfik Bultan; | arxiv-cs.SE | 2024-04-01 |
523 | BERT-Enhanced Retrieval Tool for Homework Plagiarism Detection System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In existing research, detection of high-level plagiarism is still a challenge due to the lack of high-quality datasets. In this paper, we propose a plagiarized text data generation method based on GPT-3.5, which produces 32,927 pairs of text plagiarism detection datasets covering a wide range of plagiarism methods, bridging the gap in this part of research. |
Jiarong Xian; Jibao Yuan; Peiwei Zheng; Dexian Chen; | arxiv-cs.CL | 2024-04-01 |
524 | Condition-Aware Neural Network for Controlled Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. |
HAN CAI et. al. | arxiv-cs.CV | 2024-04-01 |
525 | Leveraging YOLO-World and GPT-4V LMMs for Zero-Shot Person Detection and Action Recognition in Drone Imagery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we explore the potential of zero-shot Large Multimodal Models (LMMs) in the domain of drone perception. |
Christian Limberg; Artur Gonçalves; Bastien Rigault; Helmut Prendinger; | arxiv-cs.CV | 2024-04-01 |
526 | Large Language Model Evaluation Via Multi AI Agents: Preliminary Results Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite extensive efforts to examine LLMs from various perspectives, there is a noticeable lack of multi-agent AI models specifically designed to evaluate the performance of different LLMs. To address this gap, we introduce a novel multi-agent AI model that aims to assess and compare the performance of various LLMs. |
Zeeshan Rasheed; Muhammad Waseem; Kari Systä; Pekka Abrahamsson; | arxiv-cs.SE | 2024-04-01 |
527 | Advancing AI with Integrity: Ethical Challenges and Solutions in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We identify and address ethical issues through empirical studies. |
Richard Kimera; Yun-Seon Kim; Heeyoul Choi; | arxiv-cs.CL | 2024-04-01 |
528 | TWIN-GPT: Digital Twins for Clinical Trials Via Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a large language model-based digital twin creation approach, called TWIN-GPT. |
YUE WANG et. al. | arxiv-cs.LG | 2024-04-01 |
529 | Image Fusion for The Novelty Rotating Synthetic Aperture System Based on Vision Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
YU SUN et. al. | Inf. Fusion | 2024-04-01 |
530 | LLM-RadJudge: Achieving Radiologist-Level Evaluation for X-Ray Report Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a novel evaluation framework using large language models (LLMs) to compare radiology reports for assessment. |
Zilong Wang; Xufang Luo; Xinyang Jiang; Dongsheng Li; Lili Qiu; | arxiv-cs.CL | 2024-04-01 |
531 | CHOPS: CHat with CustOmer Profile Systems for Customer Service with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a practical dataset, the CPHOS-dataset, which includes a database, guiding files, and QA pairs collected from CPHOS, an online platform that facilitates the organization of simulated Physics Olympiads for high school teachers and students. |
JINGZHE SHI et. al. | arxiv-cs.CL | 2024-03-31 |
532 | EvoCodeBench: An Evolving Code Generation Benchmark Aligned with Real-World Code Repositories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a new benchmark, EvoCodeBench, to address the preceding problems; it has three primary advances. |
Jia Li; Ge Li; Xuanming Zhang; Yihong Dong; Zhi Jin; | arxiv-cs.CL | 2024-03-31 |
533 | Transformer Based Pluralistic Image Completion with Reduced Information Loss Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To mitigate these issues, we propose a new transformer-based framework called PUT. The indices of quantized pixels are used as tokens for the inputs and prediction targets of the transformer. |
QIANKUN LIU et. al. | arxiv-cs.CV | 2024-03-30 |
534 | Spread Your Wings: A Radial Strip Transformer for Image Deblurring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Radial Strip Transformer (RST), which is a transformer-based architecture that restores blurred images in a polar coordinate system instead of a Cartesian one. |
DUOSHENG CHEN et. al. | arxiv-cs.CV | 2024-03-30 |
535 | Jetsons at FinNLP 2024: Towards Understanding The ESG Impact of A News Article Using Transformer-based Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we describe the different approaches explored by the Jetsons team for the Multi-Lingual ESG Impact Duration Inference (ML-ESG-3) shared task. |
PARAG PRAVIN DAKLE et. al. | arxiv-cs.CL | 2024-03-30 |
536 | Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite efforts to create open-source models, their limited parameters often result in insufficient multi-step reasoning capabilities required for solving complex medical problems. To address this, we introduce Meerkat, a new family of medical AI systems ranging from 7 to 70 billion parameters. |
HYUNJAE KIM et. al. | arxiv-cs.CL | 2024-03-30 |
537 | A Comprehensive Study on NLP Data Augmentation for Hate Speech Detection: Legacy Methods, BERT, and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In pursuit of suitable data augmentation methods, this study explores both established legacy approaches and contemporary practices such as Large Language Models (LLMs), including GPT, in hate speech detection. |
Md Saroar Jahan; Mourad Oussalah; Djamila Romaissa Beddia; Jhuma kabir Mim; Nabil Arhab; | arxiv-cs.CL | 2024-03-30 |
538 | Edinburgh Clinical NLP at SemEval-2024 Task 2: Fine-tune Your Model Unless You Have Access to GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a PEFT method to improve the consistency of LLMs by merging adapters that were fine-tuned separately using triplet and language modelling objectives. |
Aryo Pradipta Gema; Giwon Hong; Pasquale Minervini; Luke Daines; Beatrice Alex; | arxiv-cs.CL | 2024-03-30 |
539 | Cross-lingual Named Entity Corpus for Slavic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a corpus manually annotated with named entities for six Slavic languages – Bulgarian, Czech, Polish, Slovenian, Russian, and Ukrainian. |
Jakub Piskorski; Michał Marcińczuk; Roman Yangarber; | arxiv-cs.CL | 2024-03-30 |
540 | TRABSA: Interpretable Sentiment Analysis of Tweets Using Attention-based BiLSTM and Twitter-RoBERTa Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose TRABSA, a hybrid framework integrating transformer-based architectures, attention mechanisms, and BiLSTM networks to address this. |
Md Abrar Jahin; Md Sakib Hossain Shovon; M. F. Mridha; Md Rashedul Islam; Yutaka Watanobe; | arxiv-cs.CL | 2024-03-30 |
541 | ChatGPT V.s. Media Bias: A Comparative Study of GPT-3.5 and Fine-tuned Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Can ChatGPT detect media bias? This study seeks to answer this question by leveraging the Media Bias Identification Benchmark (MBIB) to assess ChatGPT’s competency in distinguishing six categories of media bias, juxtaposed against fine-tuned models such as BART, ConvBERT, and GPT-2. |
Zehao Wen; Rabih Younes; | arxiv-cs.CL | 2024-03-29 |
542 | ReALM: Reference Resolution As Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper demonstrates how LLMs can be used to create an extremely effective system to resolve references of various types, by showing how reference resolution can be converted into a language modeling problem, despite involving forms of entities like those on screen that are not traditionally conducive to being reduced to a text-only modality. |
JOEL RUBEN ANTONY MONIZ et. al. | arxiv-cs.CL | 2024-03-29 |
543 | Classifying Conspiratorial Narratives At Scale: False Alarms and Erroneous Connections Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first large-scale classification study using posts from the most active conspiracy-related Reddit forums and find that only one-third of the posts are classified as positive. |
Ahmad Diab; Rr. Nefriana; Yu-Ru Lin; | arxiv-cs.CL | 2024-03-29 |
544 | Shallow Cross-Encoders for Low-Latency Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, keeping search latencies low is important for user satisfaction and energy usage. In this paper, we show that weaker shallow transformer models (i.e., transformers with a limited number of layers) actually perform better than full-scale models when constrained to these practical low-latency settings since they can estimate the relevance of more documents in the same time budget. |
Aleksandr V. Petrov; Sean MacAvaney; Craig Macdonald; | arxiv-cs.IR | 2024-03-29 |
545 | Decision Mamba: Reinforcement Learning Via Sequence Modeling with Selective State Spaces Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work contributes to the advancement of sequential decision-making models, suggesting that the architecture and training methodology of neural networks can significantly impact their performance in complex tasks, and highlighting the potential of Mamba as a valuable tool for improving the efficacy of Transformer-based models in reinforcement learning scenarios. |
Toshihiro Ota; | arxiv-cs.LG | 2024-03-28 |
546 | Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we delve into several mechanisms employed by Transformer-based language models (LLMs) for factual recall tasks. |
ANG LV et. al. | arxiv-cs.CL | 2024-03-28 |
547 | Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that off-the-shelf text-based Transformers, with no additional training, can perform few-shot in-context visual imitation learning, mapping visual observations to action sequences that emulate the demonstrator’s behaviour. We achieve this by transforming visual observations (inputs) and trajectories of actions (outputs) into sequences of tokens that a text-pretrained Transformer (GPT-4 Turbo) can ingest and generate, via a framework we call Keypoint Action Tokens (KAT). |
Norman Di Palo; Edward Johns; | arxiv-cs.RO | 2024-03-28 |
548 | Reshaping Free-Text Radiology Notes Into Structured Reports With Generative Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a pipeline to extract information from free-text radiology reports that fits the items of the reference SR registry proposed by a national society of interventional and medical radiology, focusing on CT staging of patients with lymphoma. |
LAURA BERGOMI et. al. | arxiv-cs.CL | 2024-03-27 |
549 | 3P-LLM: Probabilistic Path Planning Using Large Language Model for Autonomous Robot Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research assesses the feasibility of using an LLM (OpenAI’s GPT-3.5-turbo chatbot) for robotic path planning. |
Ehsan Latif; | arxiv-cs.RO | 2024-03-27 |
550 | The Topos of Transformer Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a theoretical analysis of the expressivity of the transformer architecture through the lens of topos theory. |
Mattia Jacopo Villani; Peter McBurney; | arxiv-cs.LG | 2024-03-27 |
551 | BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Can smaller, more targeted models compete? To address this question, we build and release BioMedLM, a 2.7 billion parameter GPT-style autoregressive model trained exclusively on PubMed abstracts and full articles. |
ELLIOT BOLTON et. al. | arxiv-cs.CL | 2024-03-27 |
552 | SemRoDe: Macro Adversarial Training to Learn Representations That Are Robust to Word-Level Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial Training strategy to enhance the robustness of LMs. |
Brian Formento; Wenjie Feng; Chuan Sheng Foo; Luu Anh Tuan; See-Kiong Ng; | arxiv-cs.CL | 2024-03-27 |
553 | PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a multimodal interactive robot (PhysicsAssistant) built on YOLOv8 object detection, cameras, speech recognition, and chatbot using LLM to provide assistance to students’ physics labs. |
Ehsan Latif; Ramviyas Parasuraman; Xiaoming Zhai; | arxiv-cs.RO | 2024-03-27 |
554 | RankMamba: Benchmarking Mamba’s Document Ranking Performance in The Era of Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we examine Mamba’s efficacy through the lens of a classical IR task — document ranking. |
Zhichao Xu; | arxiv-cs.IR | 2024-03-27 |
555 | Evaluating Large Language Models for Health-Related Text Classification Tasks with Public Social Media Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We developed three approaches for leveraging LLMs for text classification: employing LLMs as zero-shot classifiers, using LLMs as annotators to annotate training data for supervised classifiers, and utilizing LLMs with few-shot examples for augmentation of manually annotated data. |
Yuting Guo; Anthony Ovadje; Mohammed Ali Al-Garadi; Abeed Sarker; | arxiv-cs.CL | 2024-03-27 |
556 | AcTED: Automatic Acquisition of Typical Event Duration for Semi-supervised Temporal Commonsense QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a voting-driven semi-supervised approach to automatically acquire the typical duration of an event and use it as pseudo-labeled data. |
FELIX VIRGO et. al. | arxiv-cs.CL | 2024-03-27 |
557 | A Survey on Large Language Models from Concept to Implementation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent advancements in Large Language Models (LLMs), particularly those built on Transformer architectures, have significantly broadened the scope of natural language processing (NLP) applications, transcending their initial use in chatbot technology. This paper investigates the multifaceted applications of these models, with an emphasis on the GPT series. |
Chen Wang; Jin Zhao; Jiaqi Gong; | arxiv-cs.CL | 2024-03-27 |
558 | Evaluating The Efficacy of Prompt-Engineered Large Multimodal Models Versus Fine-Tuned Vision Transformers in Image-Based Security Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The success of Large Language Models (LLMs) has led to a parallel rise in the development of Large Multimodal Models (LMMs), which have begun to transform a variety of applications. These sophisticated multimodal models are designed to interpret and analyze complex data by integrating multiple modalities such as text and images, thereby opening new avenues for a range of applications. |
Fouad Trad; Ali Chehab; | arxiv-cs.AI | 2024-03-26 |
559 | Enhancing Legal Document Retrieval: A Multi-Phase Approach with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research focuses on maximizing the potential of prompting by placing it as the final phase of the retrieval system, preceded by the support of two phases: BM25 Pre-ranking and BERT-based Re-ranking. |
HAI-LONG NGUYEN et. al. | arxiv-cs.CL | 2024-03-26 |
560 | MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by the empirical findings, we propose a novel LLM-based Multi-Agent framework for GitHub Issue reSolution, MAGIS, consisting of four agents customized for software evolution: Manager, Repository Custodian, Developer, and Quality Assurance Engineer agents. |
WEI TAO et. al. | arxiv-cs.SE | 2024-03-26 |
561 | Verbing Weirds Language (Models): Evaluation of English Zero-Derivation in Five LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We design a task for testing lexical-syntactic flexibility — the degree to which models can generalize over words in a construction with a non-prototypical part of speech. |
David R. Mortensen; Valentina Izrailevitch; Yunze Xiao; Hinrich Schütze; Leonie Weissweiler; | arxiv-cs.CL | 2024-03-26 |
562 | LLMs in HCI Data Work: Bridging The Gap Between Information Retrieval and Responsible Research Practices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Efficient and accurate information extraction from scientific papers is significant for the literature review process in the rapidly developing field of human-computer interaction research. |
Neda Taghizadeh Serajeh; Iman Mohammadi; Vittorio Fuccella; Mattia De Rosa; | arxiv-cs.HC | 2024-03-26 |
563 | Decoding Probing: Revealing Internal Linguistic Structures in Neural Language Models Using Minimal Pairs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by cognitive neuroscience studies, we introduce a novel ‘decoding probing’ method that uses the minimal pairs benchmark (BLiMP) to probe internal linguistic characteristics in neural language models layer by layer. |
Linyang He; Peili Chen; Ercong Nie; Yuanning Li; Jonathan R. Brennan; | arxiv-cs.CL | 2024-03-25 |
564 | State Space Models As Foundation Models: A Control Theoretic Overview Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: In recent years, there has been a growing interest in integrating linear state-space models (SSM) in deep neural network architectures of foundation models. This is exemplified by … |
Carmen Amo Alonso; Jerome Sieber; M. Zeilinger; | ArXiv | 2024-03-25 |
565 | CYGENT: A Cybersecurity Conversational Agent with Log Summarization Powered By GPT-3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response to the escalating cyber-attacks in the modern IT and IoT landscape, we developed CYGENT, a conversational agent framework powered by the GPT-3.5 turbo model, designed to aid system administrators in ensuring optimal performance and uninterrupted resource availability. |
Prasasthy Balasubramanian; Justin Seby; Panos Kostakos; | arxiv-cs.CR | 2024-03-25 |
566 | GPT-4 Understands Discourse at Least As Well As Humans Do Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We test whether a leading AI system GPT-4 understands discourse as well as humans do, using a standardized test of discourse comprehension. Participants are presented with brief … |
Thomas Shultz; Jamie Wise; Ardavan Salehi Nobandegani; | arxiv-cs.CL | 2024-03-25 |
567 | Grammatical Vs Spelling Error Correction: An Investigation Into The Responsiveness of Transformer-based Language Models Using BART and MarianMT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This project aims to analyze the different kinds of errors that occur in text documents. |
Rohit Raju; Peeta Basa Pati; SA Gandheesh; Gayatri Sanjana Sannala; Suriya KS; | arxiv-cs.CL | 2024-03-25 |
568 | Can Language Models Pretend Solvers? Logic Code Simulation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Subsequently, we introduce a pioneering LLM-based code simulation technique, Dual Chains of Logic (DCoL). |
MINYU CHEN et. al. | arxiv-cs.AI | 2024-03-24 |
569 | A Transformer Approach for Electricity Price Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a novel approach to electricity price forecasting (EPF) using a pure Transformer model. |
Oscar Llorente; Jose Portela; | arxiv-cs.LG | 2024-03-24 |
570 | LlamBERT: Large-scale Low-cost Data Annotation in NLP Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present LlamBERT, a hybrid approach that leverages LLMs to annotate a small subset of large, unlabeled databases and uses the results for fine-tuning transformer encoders like BERT and RoBERTa. |
Bálint Csanády; Lajos Muzsai; Péter Vedres; Zoltán Nádasdy; András Lukács; | arxiv-cs.CL | 2024-03-23 |
571 | VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To support future research, CafeBERT is made publicly available. |
Phong Nguyen-Thuan Do; Son Quoc Tran; Phu Gia Hoang; Kiet Van Nguyen; Ngan Luu-Thuy Nguyen; | arxiv-cs.CL | 2024-03-23 |
572 | Using Large Language Models for OntoClean-based Ontology Refinement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the integration of Large Language Models (LLMs) such as GPT-3.5 and GPT-4 into the ontology refinement process, specifically focusing on the OntoClean methodology. |
Yihang Zhao; Neil Vetter; Kaveh Aryan; | arxiv-cs.AI | 2024-03-23 |
573 | ParFormer: Vision Transformer Baseline with Parallel Local Global Token Mixer and Convolution Attention Patch Embedding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work presents ParFormer as an enhanced transformer architecture that allows the incorporation of different token mixers into a single stage, hence improving feature extraction capabilities. |
NOVENDRA SETYAWAN et. al. | arxiv-cs.CV | 2024-03-22 |
574 | MasonTigers at SemEval-2024 Task 8: Performance Analysis of Transformer-based Models on Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the MasonTigers entry to the SemEval-2024 Task 8 – Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection. |
SADIYA SAYARA CHOWDHURY PUSPO et. al. | arxiv-cs.CL | 2024-03-22 |
575 | Technical Report: Masked Skeleton Sequence Modeling for Learning Larval Zebrafish Behavior Latent Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this report, we introduce a novel self-supervised learning method for extracting latent embeddings from behaviors of larval zebrafish. |
Lanxin Xu; Shuo Wang; | arxiv-cs.CV | 2024-03-22 |
576 | Can Large Language Models Explore In-context? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making. |
Akshay Krishnamurthy; Keegan Harris; Dylan J. Foster; Cyril Zhang; Aleksandrs Slivkins; | arxiv-cs.LG | 2024-03-22 |
577 | Comprehensive Evaluation and Insights Into The Use of Large Language Models in The Automation of Behavior-Driven Development Acceptance Test Formulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this manuscript, we propose a novel approach to enhance BDD practices using large language models (LLMs) to automate acceptance test generation. |
SHANTHI KARPURAPU et. al. | arxiv-cs.SE | 2024-03-22 |
578 | GPT-Connect: Interaction Between Text-Driven Human Motion Generator and 3D Scenes in A Training-free Manner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, naively training a separate scene-aware motion generator in a supervised way can require a large number of motion samples to be laboriously collected and annotated across a wide range of different 3D scenes. To handle this task in a more convenient manner, in this paper we propose a novel GPT-connect framework. |
Haoxuan Qu; Ziyan Guo; Jun Liu; | arxiv-cs.CV | 2024-03-22 |
579 | On Zero-Shot Counterspeech Generation By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Counterspeech generation is one such key task where efforts are made to develop generative models by fine-tuning LLMs with hatespeech-counterspeech pairs, but none of these attempts explores the intrinsic properties of large language models in zero-shot settings. In this work, we present a comprehensive analysis of the performances of four LLMs, namely GPT-2, DialoGPT, ChatGPT and FlanT5, in zero-shot settings for counterspeech generation, which is the first of its kind. |
Punyajoy Saha; Aalok Agrawal; Abhik Jana; Chris Biemann; Animesh Mukherjee; | arxiv-cs.CL | 2024-03-22 |
580 | K-Act2Emo: Korean Commonsense Knowledge Graph for Indirect Emotional Expression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In many literary texts, emotions are indirectly conveyed through descriptions of actions, facial expressions, and appearances, necessitating emotion inference for narrative understanding. In this paper, we introduce K-Act2Emo, a Korean commonsense knowledge graph (CSKG) comprising 1,900 indirect emotional expressions and the emotions inferable from them. |
Kyuhee Kim; Surin Lee; Sangah Lee; | arxiv-cs.CL | 2024-03-21 |
581 | LLM-based Extraction of Contradictions from Patents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper goes one step further, as it presents a method to extract TRIZ contradictions from patent texts based on Prompt Engineering using a generative Large Language Model (LLM), namely OpenAI’s GPT-4. |
Stefan Trapp; Joachim Warschat; | arxiv-cs.CL | 2024-03-21 |
582 | Extracting Emotion Phrases from Tweets Using BART Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we applied an approach to sentiment analysis based on a question-answering framework. |
Mahdi Rezapour; | arxiv-cs.CL | 2024-03-20 |
583 | Retina Vision Transformer (RetinaViT): Introducing Scaled Patches Into Vision Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Humans see low and high spatial frequency components at the same time, and combine the information from both to form a visual scene. Drawing on this neuroscientific inspiration, we propose an altered Vision Transformer architecture where patches from scaled down versions of the input image are added to the input of the first Transformer Encoder layer. |
Yuyang Shu; Michael E. Bain; | arxiv-cs.CV | 2024-03-20 |
584 | AUD-TGN: Advancing Action Unit Detection with Temporal Convolution and GPT-2 in Wild Audiovisual Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional methods for integrating such multimodal information often stumble, leading to less-than-ideal outcomes in the task of facial action unit detection. To overcome these shortcomings, we propose a novel approach utilizing audio-visual multimodal data. |
JUN YU et. al. | arxiv-cs.CV | 2024-03-20 |
585 | Open Access NAO (OAN): A ROS2-based Software Framework for HRI Applications with The NAO Robot Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new software framework for HRI experimentation with the sixth version of the common NAO robot produced by the United Robotics Group. |
Antonio Bono; Kenji Brameld; Luigi D’Alfonso; Giuseppe Fedele; | arxiv-cs.RO | 2024-03-20 |
586 | Ax-to-Grind Urdu: Benchmark Dataset for Urdu Fake News Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we curate and contribute the first and largest publicly available dataset for Urdu FND, Ax-to-Grind Urdu, to bridge the identified gaps and limitations of existing Urdu datasets in the literature. |
Sheetal Harris; Jinshuo Liu; Hassan Jalil Hadi; Yue Cao; | arxiv-cs.CL | 2024-03-20 |
587 | End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a neuro-symbolic framework for jointly learning structured states and symbolic policies, whose key idea is to distill the vision foundation model into an efficient perception module and refine it during policy learning. |
LIRUI LUO et. al. | arxiv-cs.AI | 2024-03-19 |
588 | Navigating Compiler Errors with AI Assistance — A Study of GPT Hints in An Introductory Programming Course Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We examined the efficacy of AI-assisted learning in an introductory programming course at the university level by using a GPT-4 model to generate personalized hints for compiler errors within a platform for automated assessment of programming assignments. |
Maciej Pankiewicz; Ryan S. Baker; | arxiv-cs.SE | 2024-03-19 |
589 | Generating Automatic Feedback on UI Mockups with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the potential of using large language models for automatic feedback. |
Peitong Duan; Jeremy Warner; Yang Li; Bjoern Hartmann; | arxiv-cs.HC | 2024-03-19 |
590 | Automated Data Curation for Robust Language Model Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an automated data curation pipeline CLEAR (Confidence-based LLM Evaluation And Rectification) for instruction tuning datasets, that can be used with any LLM and fine-tuning procedure. |
Jiuhai Chen; Jonas Mueller; | arxiv-cs.CL | 2024-03-19 |
591 | Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a new configuration for encoder-decoder models that improves efficiency on structured output and decomposable tasks where multiple outputs are required for a single shared input. |
BO-RU LU et. al. | arxiv-cs.CL | 2024-03-19 |
592 | TT-BLIP: Enhancing Fake News Detection Using BLIP and Tri-Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an end-to-end model called TT-BLIP that applies the bootstrapping language-image pretraining for unified vision-language understanding and generation (BLIP) for three types of information: BERT and BLIP-Txt for text, ResNet and BLIP-Img for images, and bidirectional BLIP encoders for multimodal information. |
Eunjee Choi; Jong-Kook Kim; | arxiv-cs.LG | 2024-03-19 |
593 | Automatic Information Extraction From Employment Tribunal Judgements Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive study on the application of GPT-4, a large language model, for automatic information extraction from UK Employment Tribunal (UKET) cases. |
Joana Ribeiro de Faria; Huiyuan Xie; Felix Steffek; | arxiv-cs.CL | 2024-03-19 |
594 | LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The challenge is that information entropy may be a suboptimal compression metric: (i) it only leverages unidirectional context and may fail to capture all essential information needed for prompt compression; (ii) it is not aligned with the prompt compression objective. To address these issues, we propose a data distillation procedure to derive knowledge from an LLM to compress prompts without losing crucial information, and meantime, introduce an extractive text compression dataset. |
ZHUOSHI PAN et. al. | arxiv-cs.CL | 2024-03-19 |
595 | CICLe: Conformal In-Context Learning for Largescale Multi-Class Food Risk Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We describe the dataset and benchmark naive, traditional, and Transformer models. |
Korbinian Randl; John Pavlopoulos; Aron Henriksson; Tony Lindgren; | arxiv-cs.CL | 2024-03-18 |
596 | Shifting The Lens: Detecting Malware in Npm Ecosystem with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The goal of this study is to assist security analysts in identifying malicious packages through the empirical study of large language models (LLMs) to detect potential malware in the npm ecosystem. |
Nusrat Zahan; Philipp Burckhardt; Mikola Lysenko; Feross Aboukhadijeh; Laurie Williams; | arxiv-cs.CR | 2024-03-18 |
597 | Prompt-based and Fine-tuned GPT Models for Context-Dependent and -Independent Deductive Coding in Social Annotation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: GPT has demonstrated impressive capabilities in executing various natural language processing (NLP) and reasoning tasks, showcasing its potential for deductive coding in social … |
CHENYU HOU et. al. | Proceedings of the 14th Learning Analytics and Knowledge … | 2024-03-18 |
598 | How Far Are We on The Decision-Making of LLMs? Evaluating LLMs’ Gaming Ability in Multi-Agent Environments Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Subsequently, we introduce our framework, GAMA-Bench, including eight classical multi-agent games. |
JEN-TSE HUANG et. al. | arxiv-cs.AI | 2024-03-18 |
599 | Collage Prompting: Budget-Friendly Visual Recognition with GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite its impressive capabilities, the financial cost associated with GPT-4V’s inference presents a substantial barrier for its wide use. To address this challenge, our work introduces Collage Prompting, a budget-friendly prompting approach that concatenates multiple images into a single visual input. |
Siyu Xu; Yunke Wang; Daochang Liu; Chang Xu; | arxiv-cs.CV | 2024-03-18 |
600 | Embracing The Generative AI Revolution: Advancing Tertiary Education in Cybersecurity with GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigated the impact of GPTs, specifically ChatGPT, on tertiary education in cybersecurity, and provided recommendations for universities to adapt their curricula to meet the evolving needs of the industry. |
Raza Nowrozy; David Jam; | arxiv-cs.CY | 2024-03-17 |
601 | Reasoning in Transformers – Mitigating Spurious Correlations and Reasoning Shortcuts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we investigate to what extent transformers can be trained to a) approximate reasoning in propositional logic while b) avoiding known reasoning shortcuts via spurious correlations in the training data. |
Daniel Enström; Viktor Kjellberg; Moa Johansson; | arxiv-cs.LG | 2024-03-17 |
602 | An Empirical Study on JIT Defect Prediction Based on BERT-style Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we perform a systematic empirical study to understand the impact of the settings of the fine-tuning process on BERT-style pre-trained model for JIT defect prediction. |
Yuxiang Guo; Xiaopeng Gao; Bo Jiang; | arxiv-cs.SE | 2024-03-17 |
603 | Using An LLM to Turn Sign Spottings Into Spoken Language Sentences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a hybrid SLT approach, Spotter+GPT, that utilizes a sign spotter and a powerful Large Language Model (LLM) to improve SLT performance. |
Ozge Mercanoglu Sincan; Necati Cihan Camgoz; Richard Bowden; | arxiv-cs.CV | 2024-03-15 |
604 | ATOM: Asynchronous Training of Massive Models for Deep Learning in A Decentralized Environment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce ATOM, a resilient distributed training framework designed for asynchronous training of vast models in a decentralized setting using cost-effective hardware, including consumer-grade GPUs and Ethernet. |
Xiaofeng Wu; Jia Rao; Wei Chen; | arxiv-cs.DC | 2024-03-15 |
605 | From Words to Routes: Applying Large Language Models to Vehicle Routing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The success of LLMs in these tasks leads us to wonder: What is the ability of LLMs to solve vehicle routing problems (VRPs) with natural language task descriptions? In this work, we study this question in three steps. |
Zhehui Huang; Guangyao Shi; Gaurav S. Sukhatme; | arxiv-cs.CL | 2024-03-15 |
606 | GiT: Towards Generalist Vision Transformer Through Universal Language Interface Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a simple, yet effective framework, called GiT, simultaneously applicable for various vision tasks only with a vanilla ViT. |
HAIYANG WANG et. al. | arxiv-cs.CV | 2024-03-14 |
607 | Reality Bites: Assessing The Realism of Driving Scenarios with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large Language Models (LLMs) are demonstrating outstanding potential for tasks such as text generation, summarization, and classification. |
Jiahui Wu; Chengjie Lu; Aitor Arrieta; Tao Yue; Shaukat Ali; | arxiv-cs.SE | 2024-03-14 |
608 | Sabiá-2: A New Generation of Portuguese Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Sabiá-2, a family of large language models trained on Portuguese texts. |
Thales Sales Almeida; Hugo Abonizio; Rodrigo Nogueira; Ramon Pires; | arxiv-cs.CL | 2024-03-14 |
609 | FBPT: A Fully Binary Point Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel Fully Binary Point Cloud Transformer (FBPT) model which has the potential to be widely applied and expanded in the fields of robotics and mobile devices. |
Zhixing Hou; Yuzhang Shang; Yan Yan; | arxiv-cs.CV | 2024-03-14 |
610 | Evaluating LLMs for Gender Disparities in Notable Persons Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study examines the use of Large Language Models (LLMs) for retrieving factual information, addressing concerns over their propensity to produce factually incorrect hallucinated responses or to decline to answer prompts at all. |
Lauren Rhue; Sofie Goethals; Arun Sundararajan; | arxiv-cs.CL | 2024-03-14 |
611 | AI on AI: Exploring The Utility of GPT As An Expert Annotator of AI Publications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our results indicate that with effective prompt engineering, chatbots can be used as reliable data annotators even where subject-area expertise is required. To evaluate the utility of chatbot-annotated datasets on downstream classification tasks, we train a new classifier on GPT-labeled data and compare its performance to the arXiv-trained model. |
Autumn Toney-Wails; Christian Schoeberl; James Dunham; | arxiv-cs.CL | 2024-03-14 |
612 | Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Targeting VL PEFT tasks, we propose a family of operations, called routing functions, to enhance VL alignment in the low-rank bottlenecks. |
Tingyu Qu; Tinne Tuytelaars; Marie-Francine Moens; | arxiv-cs.CV | 2024-03-14 |
613 | ViTCN: Vision Transformer Contrastive Network For Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Zhang et al. proposed a dataset called RAVEN which can be used to test a machine learning model’s abstract reasoning ability. In this paper, we propose the Vision Transformer Contrastive Network, which builds on previous work with the Contrastive Perceptual Inference network (CoPiNet), which set a new benchmark for permutation-invariant models on Raven Progressive Matrices by incorporating contrast effects from psychology, cognition, and education, and extends this foundation by leveraging the cutting-edge Vision Transformer architecture. |
Bo Song; Yuanhao Xu; Yichao Wu; | arxiv-cs.CV | 2024-03-14 |
614 | A Continued Pretrained LLM Approach for Automatic Medical Note Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce HEAL, the first continuously trained 13B LLaMA2-based LLM that is purpose-built for medical conversations and measured on automated scribing. |
DONG YUAN et. al. | arxiv-cs.CL | 2024-03-13 |
615 | GPT, Ontology, and CAABAC: A Tripartite Personalized Access Control Model Anchored By Compliance, Context and Attribute Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents the GPT-Onto-CAABAC framework, integrating Generative Pretrained Transformer (GPT), medical-legal ontologies and Context-Aware Attribute-Based Access Control (CAABAC) to enhance EHR access security. |
Raza Nowrozy; Khandakar Ahmed; Hua Wang; | arxiv-cs.CY | 2024-03-13 |
616 | Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare four of the currently most relevant large, web-crawled corpora (CC100, MaCoCu, mC4 and OSCAR) across eleven lower-resourced European languages. |
RIK VAN NOORD et. al. | arxiv-cs.CL | 2024-03-13 |
617 | In-context Learning Enables Multimodal Large Language Models to Classify Cancer Pathology Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In language processing, in-context learning provides an alternative, where models learn from within prompts, bypassing the need for parameter updates. |
DYKE FERBER et. al. | arxiv-cs.CV | 2024-03-12 |
618 | Towards A Clinically Accessible Radiology Foundation Model: Open-access and Lightweight, with Automated Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For training, we assemble a large dataset of over 697 thousand radiology image-text pairs. For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation. |
JUAN MANUEL ZAMBRANO CHAVES et. al. | arxiv-cs.CL | 2024-03-12 |
619 | Rethinking ASTE: A Minimalist Tagging Scheme Alongside Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing approaches to ASTE often complicate the task with additional structures or external data. In this research, we propose a novel tagging scheme and employ a contrastive learning approach to mitigate these challenges. |
Qiao Sun; Liujia Yang; Minghao Ma; Nanyang Ye; Qinying Gu; | arxiv-cs.CL | 2024-03-12 |
620 | The Future of Document Indexing: GPT and Donut Revolutionize Table of Content Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Industrial projects rely heavily on lengthy, complex specification documents, making tedious manual extraction of structured information a major bottleneck. This paper introduces an innovative approach to automate this process, leveraging the capabilities of two cutting-edge AI models: Donut, a model that extracts information directly from scanned documents without OCR, and OpenAI GPT-3.5 Turbo, a robust large language model. |
Degaga Wolde Feyisa; Haylemicheal Berihun; Amanuel Zewdu; Mahsa Najimoghadam; Marzieh Zare; | arxiv-cs.IR | 2024-03-12 |
621 | GPT-generated Text Detection: Benchmark Dataset and Tensor-based Detection Method Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present GPT Reddit Dataset (GRiD), a novel Generative Pretrained Transformer (GPT)-generated text detection dataset designed to assess the performance of detection models in identifying generated responses from ChatGPT. |
Zubair Qazi; William Shiao; Evangelos E. Papalexakis; | arxiv-cs.CL | 2024-03-12 |
622 | SIFiD: Reassess Summary Factual Inconsistency Detection with LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we reassess summary inconsistency detection with LLMs, comparing the performances of GPT-3.5 and GPT-4. |
JIUDING YANG et. al. | arxiv-cs.CL | 2024-03-12 |
623 | Language, Camera, Autonomy! Prompt-engineered Robot Control for Rapidly Evolving Deployment Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The Context-observant LLM-Enabled Autonomous Robots (CLEAR) platform offers a general solution for large language model (LLM)-enabled robot autonomy. CLEAR-controlled robots use … |
JACOB P. MACDONALD et. al. | Companion of the 2024 ACM/IEEE International Conference on … | 2024-03-11 |
624 | Stealing Part of A Production Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI’s ChatGPT or Google’s PaLM-2. |
NICHOLAS CARLINI et. al. | arxiv-cs.CR | 2024-03-11 |
625 | QuantTune: Optimizing Model Quantization with Adaptive Outlier-Driven Fine Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, significant challenges arise during post-training linear quantization, leading to noticeable reductions in inference accuracy. Our study focuses on uncovering the underlying causes of these accuracy drops and proposing a quantization-friendly fine-tuning method, QuantTune. |
JIUN-MAN CHEN et. al. | arxiv-cs.CV | 2024-03-11 |
626 | Development of A Reliable and Accessible Caregiving Language Model (CaLM) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we focused on caregivers of individuals with Alzheimer’s Disease Related Dementias. |
BAMBANG PARMANTO et. al. | arxiv-cs.CL | 2024-03-11 |
627 | Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We use these in another set of transformer encoder layers to learn the inter-chunk representations. We analyze the adaptability of Large Language Models (LLMs) with multi-billion parameters (GPT-Neo and GPT-J) within the hierarchical framework of MESc and compare them with their standalone performance on legal texts. |
Nishchal Prasad; Mohand Boughanem; Taoufiq Dkaki; | arxiv-cs.CL | 2024-03-11 |
628 | S3L: Spectrum Transformer for Self-Supervised Learning in Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the realm of Earth observation and remote sensing data analysis, the advancement of hyperspectral imaging (HSI) classification technology is of paramount importance. … |
Hufeng Guo; Wenyi Liu; | Remote. Sens. | 2024-03-10 |
629 | LLMs Still Can’t Avoid Instanceof: An Investigation Into GPT-3.5, GPT-4 and Bard’s Capacity to Handle Object-Oriented Programming Assignments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we experimented with three prominent LLMs – GPT-3.5, GPT-4, and Bard – to solve real-world OOP exercises used in educational settings, subsequently validating their solutions using an Automatic Assessment Tool (AAT). |
Bruno Pereira Cipriano; Pedro Alves; | arxiv-cs.SE | 2024-03-10 |
630 | GPT As Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Multimodal large language models (MLLMs) are designed to process and integrate information from multiple sources, such as text, speech, images, and videos. |
HAO LU et. al. | arxiv-cs.CV | 2024-03-09 |
631 | Will GPT-4 Run DOOM? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that GPT-4’s reasoning and planning capabilities extend to the 1993 first-person shooter Doom. |
Adrian de Wynter; | arxiv-cs.CL | 2024-03-08 |
632 | To Err Is Human, But Llamas Can Learn It Too Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores enhancing grammatical error correction (GEC) through artificial error generation (AEG) using language models (LMs). |
Agnes Luhtaru; Taido Purason; Martin Vainikko; Maksym Del; Mark Fishel; | arxiv-cs.CL | 2024-03-08 |
633 | How Well Do Multi-modal LLMs Interpret CT Scans? An Auto-Evaluation Framework for Analyses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this is challenging mainly due to the scarcity of adequate datasets and reference standards for evaluation. This study aims to bridge this gap by introducing a novel evaluation framework, named “GPTRadScore”. |
QINGQING ZHU et. al. | arxiv-cs.AI | 2024-03-08 |
634 | RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We explore how iteratively revising a chain of thoughts with the help of information retrieval significantly improves large language models’ reasoning and generation ability in long-horizon generation tasks, while greatly mitigating hallucination. |
ZIHAO WANG et. al. | arxiv-cs.CL | 2024-03-08 |
635 | The Impact of Quantization on The Robustness of Transformer-based Text Classifiers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the effect of quantization on the robustness of Transformer-based models. |
Seyed Parsa Neshaei; Yasaman Boreshban; Gholamreza Ghassem-Sani; Seyed Abolghasem Mirroshandel; | arxiv-cs.CL | 2024-03-08 |
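For entry 635, the basic experimental ingredient is easy to reproduce with PyTorch's built-in post-training dynamic quantization, which stores `nn.Linear` weights in int8 and dequantizes them on the fly. The toy classifier below merely stands in for the Transformer-based text classifiers the paper actually studies.

```python
import torch
import torch.nn as nn

# Toy stand-in for a Transformer text-classifier head.
model = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, 2))

# Post-training dynamic quantization of all Linear modules to int8.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 768)
print(model(x))      # full-precision logits
print(quantized(x))  # int8-weight logits; robustness is then compared on perturbed inputs
```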
636 | An In-depth Evaluation of GPT-4 in Sentence Simplification with Error-based Human Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We design an error-based human annotation framework to assess the GPT-4’s simplification capabilities. |
Xuanxin Wu; Yuki Arase; | arxiv-cs.CL | 2024-03-07 |
637 | Federated Recommendation Via Hybrid Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose GPT-FedRec, a federated recommendation framework leveraging ChatGPT and a novel hybrid Retrieval Augmented Generation (RAG) mechanism. |
Huimin Zeng; Zhenrui Yue; Qian Jiang; Dong Wang; | arxiv-cs.IR | 2024-03-07 |
638 | A Large Scale RCT on Effective Error Messages in CS1 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, we evaluate the most effective error message types through a large-scale randomized controlled trial conducted in an open-access, online introductory computer … |
Sierra Wang; John C. Mitchell; C. Piech; | Proceedings of the 55th ACM Technical Symposium on Computer … | 2024-03-07 |
639 | Feedback-Generation for Programming Exercises With GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Ever since Large Language Models (LLMs) and related applications have become broadly available, several studies investigated their potential for assisting educators and supporting students in higher education. |
Imen Azaiz; Natalie Kiesler; Sven Strickroth; | arxiv-cs.AI | 2024-03-07 |
640 | Assessing The Aesthetic Evaluation Capabilities of GPT-4 with Vision: Insights from Group and Individual Assessments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, it has been recognized that large language models demonstrate high performance on various intellectual tasks. |
Yoshia Abe; Tatsuya Daikoku; Yasuo Kuniyoshi; | arxiv-cs.AI | 2024-03-06 |
641 | Rapidly Developing High-quality Instruction Data and Evaluation Benchmark for Large Language Models with Minimal Human Effort: A Case Study on Japanese Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Instead of following the popular practice of directly translating existing English resources into Japanese (e.g., Japanese-Alpaca), we propose an efficient self-instruct method based on GPT-4. |
YIKUN SUN et. al. | arxiv-cs.CL | 2024-03-06 |
642 | Whodunit: Classifying Code As Human Authored or GPT-4 Generated — A Case Study on CodeChef Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study shows that code stylometry is a promising approach for distinguishing between GPT-4 generated code and human-authored code. |
Oseremen Joy Idialu; Noble Saji Mathews; Rungroj Maipradit; Joanne M. Atlee; Mei Nagappan; | arxiv-cs.SE | 2024-03-06 |
643 | Probabilistic Topic Modelling with Transformer Representations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose the Transformer-Representation Neural Topic Model (TNTM), which combines the benefits of topic representations in transformer-based embedding spaces and probabilistic modelling. |
Arik Reuter; Anton Thielmann; Christoph Weisser; Benjamin Säfken; Thomas Kneib; | arxiv-cs.LG | 2024-03-06 |
644 | Can Large Language Models Do Analytical Reasoning? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the analytical reasoning abilities of cutting-edge Large Language Models on sports tasks. |
YEBOWEN HU et. al. | arxiv-cs.CL | 2024-03-06 |
645 | Designing Informative Metrics for Few-Shot Example Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a complexity-based prompt selection approach for sequence tagging tasks. |
Rishabh Adiga; Lakshminarayanan Subramanian; Varun Chandrasekaran; | arxiv-cs.CL | 2024-03-06 |
646 | Japanese-English Sentence Translation Exercises Dataset for Automatic Grading Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the task of automatic assessment of Sentence Translation Exercises (STEs), that have been used in the early stage of L2 language learning. |
NAOKI MIURA et. al. | arxiv-cs.CL | 2024-03-05 |
647 | AI Insights: A Case Study on Utilizing ChatGPT Intelligence for Research Paper Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper discusses the effectiveness of leveraging Chatbot: Generative Pre-trained Transformer (ChatGPT) versions 3.5 and 4 for analyzing research papers for effective writing of scientific literature surveys. |
Anjalee De Silva; Janaka L. Wijekoon; Rashini Liyanarachchi; Rrubaa Panchendrarajan; Weranga Rajapaksha; | arxiv-cs.AI | 2024-03-05 |
648 | Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled By GPT-4 for Enhanced Interpretability and Public Engagement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, inquiring into and understanding socio-cultural and institutional factors requires complex techniques, which often hinders the public’s understanding of flood risks. To overcome these challenges, our study introduces an innovative solution: a customized AI Assistant powered by the GPT-4 Large Language Model. |
Rafaela Martelo; Ruo-Qian Wang; | arxiv-cs.AI | 2024-03-05 |
649 | On The Limitations of Fine-tuned Judge Models for LLM Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recently, there has been a growing trend of utilizing Large Language Model (LLM) to evaluate the quality of other LLMs. |
HUI HUANG et. al. | arxiv-cs.CL | 2024-03-05 |
650 | Design2Code: How Far Are We From Automating Front-End Engineering? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This can enable a new paradigm of front-end development, in which multimodal LLMs might directly convert visual designs into code implementations. In this work, we formalize this as a Design2Code task and conduct comprehensive benchmarking. |
Chenglei Si; Yanzhe Zhang; Zhengyuan Yang; Ruibo Liu; Diyi Yang; | arxiv-cs.CL | 2024-03-05 |
651 | InjectTST: A Transformer Method of Injecting Global Information Into Independent Channels for Long Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the problem, this paper proposes InjectTST, a method for injecting global information into channel-independent Transformers. |
CE CHI et. al. | arxiv-cs.LG | 2024-03-05 |
652 | Evolution Transformer: In-Context Evolutionary Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: An alternative promising approach is to leverage data and directly discover powerful optimization principles via meta-optimization. In this work, we follow such a paradigm and introduce Evolution Transformer, a causal Transformer architecture, which can flexibly characterize a family of Evolution Strategies. |
Robert Tjarko Lange; Yingtao Tian; Yujin Tang; | arxiv-cs.AI | 2024-03-05 |
653 | JMI at SemEval 2024 Task 3: Two-step Approach for Multimodal ECAC Using In-context Learning with GPT and Instruction-tuned Llama Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents our system development for SemEval-2024 Task 3: The Competition of Multimodal Emotion Cause Analysis in Conversations. |
Mohammed Abbas Ansari; Chandni Saxena; Tanvir Ahmad; | arxiv-cs.CL | 2024-03-05 |
654 | PHAnToM: Personality Has An Effect on Theory-of-Mind Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In today’s landscape where role-play is a common strategy when using LLMs, our research highlights the need for caution, as models that adopt specific personas with personalities potentially also alter their reasoning abilities in an unexpected manner. |
FIONA ANTING TAN et. al. | arxiv-cs.CL | 2024-03-04 |
655 | Using LLMs for The Extraction and Normalization of Product Attribute Values Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Web Data Commons – Product Attribute Value Extraction (WDC-PAVE) benchmark dataset for our experiments. |
Alexander Brinkmann; Nick Baumann; Christian Bizer; | arxiv-cs.CL | 2024-03-04 |
656 | What Is Missing in Multilingual Visual Reasoning and How to Fix It Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: NLP models today strive for supporting multiple languages and modalities, improving accessibility for diverse users. In this paper, we evaluate their multilingual, multimodal … |
Yueqi Song; Simran Khanuja; Graham Neubig; | ArXiv | 2024-03-03 |
657 | LM4OPT: Unveiling The Potential of Large Language Models in Formulating Mathematical Optimization Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the rapidly evolving field of natural language processing, the translation of linguistic descriptions into mathematical formulation of optimization problems presents a formidable challenge, demanding intricate understanding and processing capabilities from Large Language Models (LLMs). This study compares prominent LLMs, including GPT-3.5, GPT-4, and Llama-2-7b, in zero-shot and one-shot settings for this task. |
Tasnim Ahmed; Salimur Choudhury; | arxiv-cs.CL | 2024-03-02 |
658 | Analysis of Privacy Leakage in Federated Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: With the rapid adoption of Federated Learning (FL) as the training and tuning protocol for applications utilizing Large Language Models (LLMs), recent research highlights the need … |
Minh N. Vu; Truc D. T. Nguyen; Tre’ R. Jeter; My T. Thai; | International Conference on Artificial Intelligence and … | 2024-03-02 |
659 | Improving The Validity of Automatically Generated Feedback Via Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we address both problems of automatically generating and evaluating feedback while considering both correctness and alignment. |
Alexander Scarlatos; Digory Smith; Simon Woodhead; Andrew Lan; | arxiv-cs.CL | 2024-03-02 |
660 | LAB: Large-Scale Alignment for ChatBots Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces LAB (Large-scale Alignment for chatBots), a novel methodology designed to overcome the scalability challenges in the instruction-tuning phase of large language model (LLM) training. |
SHIVCHANDER SUDALAIRAJ et. al. | arxiv-cs.CL | 2024-03-01 |
661 | STP: Self-supervised Transfer Learning Based on Transformer for Noninvasive Blood Pressure Estimation Using Photoplethysmography Related Papers Related Patents Related Grants Related Venues Related Experts View |
CHENBIN MA et. al. | Expert Syst. Appl. | 2024-03-01 |
662 | LCDFormer: Long-term Correlations Dual-graph Transformer for Traffic Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jiongbiao Cai; Chia-Hung Wang; Kun Hu; | Expert Syst. Appl. | 2024-03-01 |
663 | Surveying The Dead Minds: Historical-Psychological Text Analysis with Contextualized Construct Representation (CCR) for Classical Chinese Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we develop a pipeline for historical-psychological text analysis in classical Chinese. |
Yuqi Chen; Sixuan Li; Ying Li; Mohammad Atari; | arxiv-cs.CL | 2024-03-01 |
664 | 2-D Transformer-Based Approach for Process Monitoring of Metal 3-D Printing Via Coaxial High-Speed Imaging Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Defects in the metal 3-D printing process exhibit randomness and low frequency, making them difficult to predict and control. This severely hinders the application of this … |
WEIHAO ZHANG et. al. | IEEE Transactions on Industrial Informatics | 2024-03-01 |
665 | GTMFuse: Group-attention Transformer-driven Multiscale Dense Feature-enhanced Network for Infrared and Visible Image Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View |
LIYE MEI et. al. | Knowl. Based Syst. | 2024-03-01 |
666 | A Systematic Evaluation of Large Language Models for Generating Programming Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In most LeetCode and GeeksforGeeks coding contests evaluated in this study, GPT-4 employing the optimal prompt strategy outperforms 85 percent of human participants. |
Wenpin Hou; Zhicheng Ji; | arxiv-cs.SE | 2024-03-01 |
667 | K-NN Attention-based Video Vision Transformer for Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
Weirong Sun; Yujun Ma; Ruili Wang; | Neurocomputing | 2024-03-01 |
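The k-NN attention named in entry 667 keeps, for each query, only its k most similar keys. A minimal generic sketch follows; the paper's video-specific tokenization and temporal design are not reproduced here.

```python
import torch

def knn_attention(q, k, v, topk=8):
    """Attention restricted to each query's top-k keys; the rest are masked out."""
    scores = q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5   # (..., Lq, Lk)
    kth = scores.topk(topk, dim=-1).values[..., -1:]        # k-th largest score per query
    scores = scores.masked_fill(scores < kth, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(2, 16, 64)
out = knn_attention(q, k, v, topk=4)   # (2, 16, 64)
```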
668 | Multi-modal Person Re-identification Based on Transformer Relational Regularization Related Papers Related Patents Related Grants Related Venues Related Experts View |
XIANGTIAN ZHENG et. al. | Inf. Fusion | 2024-03-01 |
669 | Query-OPT: Optimizing Inference of Large Language Models Via Multi-Query Instructions in Meeting Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, repeated calls to the LLM inference endpoints would significantly increase the costs of using them in production, making LLMs impractical for many real-world use cases. To address this problem, in this paper, we investigate whether combining the queries for the same input context in a single prompt to minimize repeated calls can be successfully used in meeting summarization. |
Md Tahmid Rahman Laskar; Elena Khasanova; Xue-Yong Fu; Cheng Chen; Shashi Bhushan TN; | arxiv-cs.CL | 2024-02-29 |
670 | Here’s A Free Lunch: Sanitizing Backdoored Models with Model Merge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Compared to multiple advanced defensive approaches, our method offers an effective and efficient inference-stage defense against backdoor attacks on classification and instruction-tuned tasks without additional resources or specific knowledge. |
ANSH ARORA et. al. | arxiv-cs.CL | 2024-02-29 |
671 | PeLLE: Encoder-based Language Models for Brazilian Portuguese Based on Open Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we present PeLLE, a family of large language models based on the RoBERTa architecture, for Brazilian Portuguese, trained on curated, open data from the Carolina corpus. |
GUILHERME LAMARTINE DE MELLO et. al. | arxiv-cs.CL | 2024-02-29 |
672 | RL-GPT: Integrating Reinforcement Learning and Code-as-policy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To seamlessly integrate both modalities, we introduce a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent. |
SHAOTENG LIU et. al. | arxiv-cs.AI | 2024-02-29 |
673 | PROC2PDDL: Open-Domain Planning Representations from Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent approaches have used language models to predict a planning domain definition (e.g., PDDL) but have only been evaluated in closed-domain simulated environments. To address this, we present Proc2PDDL, the first dataset containing open-domain procedural texts paired with expert-annotated PDDL representations. |
TIANYI ZHANG et. al. | arxiv-cs.CL | 2024-02-29 |
674 | Can GPT Improve The State of Prior Authorization Via Guideline Based Automated Question Answering? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate whether GPT can validate numerous key factors, in turn helping health plans reach a decision drastically faster. |
Shubham Vatsal; Ayush Singh; Shabnam Tafreshi; | arxiv-cs.CL | 2024-02-28 |
675 | A Language Model Based Framework for New Concept Placement in Ontologies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In all steps, we propose to leverage neural methods, where we apply embedding-based methods and contrastive learning with Pre-trained Language Models (PLMs) such as BERT for edge search, and adapt a BERT fine-tuning-based multi-label Edge-Cross-encoder, and Large Language Models (LLMs) such as GPT series, FLAN-T5, and Llama 2, for edge selection. |
Hang Dong; Jiaoyan Chen; Yuan He; Yongsheng Gao; Ian Horrocks; | arxiv-cs.CL | 2024-02-27 |
676 | STC-ViT: Spatio Temporal Continuous Vision Transformer for Weather Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, transformers are discrete models which limit their ability to learn the continuous spatio-temporal features of the dynamical weather system. We address this issue with STC-ViT, a Spatio-Temporal Continuous Vision Transformer for weather forecasting. |
Hira Saleem; Flora Salim; Cormac Purcell; | arxiv-cs.LG | 2024-02-27 |
677 | Benchmarking GPT-4 on Algorithmic Problems: A Systematic Evaluation of Prompting Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we offer a systematic benchmarking of GPT-4, one of the most advanced LLMs available, on three algorithmic tasks characterized by the possibility to control the problem difficulty with two parameters. |
Flavio Petruzzellis; Alberto Testolin; Alessandro Sperduti; | arxiv-cs.CL | 2024-02-27 |
678 | Variational Learning Is Effective for Large Deep Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show several new use cases of IVON where we improve finetuning and model merging in Large Language Models, accurately predict generalization error, and faithfully estimate sensitivity to data. |
YUESONG SHEN et. al. | arxiv-cs.LG | 2024-02-27 |
679 | Can GPT-4 Identify Propaganda? Annotation and Detection of Propaganda Spans in News Articles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The majority of the recent initiatives targeting medium to low-resource languages produced relatively small annotated datasets, with a skewed distribution, posing challenges for the development of sophisticated propaganda detection models. To address this challenge, we carefully develop the largest propaganda dataset to date, ArPro, comprised of 8K paragraphs from newspaper articles, labeled at the text span level following a taxonomy of 23 propagandistic techniques. |
Maram Hasanain; Fatema Ahmed; Firoj Alam; | arxiv-cs.CL | 2024-02-27 |
680 | Latent Attention for Linear Time Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The time complexity of the standard attention mechanism in a transformer scales quadratically with the length of the sequence. We introduce a method to reduce this to linear scaling with time, based on defining attention via latent vectors. |
Rares Dolga; Marius Cobzarenco; David Barber; | arxiv-cs.CL | 2024-02-27 |
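The construction in entry 680 can be illustrated by routing attention through a small set of m latent vectors: the latents first summarize the sequence, then queries read from the summaries, for O(L·m) rather than O(L²) cost. This is a hedged sketch of the general latent-attention idea, not the paper's exact formulation.

```python
import torch

def latent_attention(q, k, v, latents):
    """Two-stage attention through m latent vectors (m << sequence length L)."""
    d = q.shape[-1] ** 0.5
    # Stage 1: each latent attends over all L keys and summarizes the values.
    summary = torch.softmax(latents @ k.transpose(-2, -1) / d, dim=-1) @ v   # (m, dim)
    # Stage 2: each query attends over the m latent summaries only.
    return torch.softmax(q @ latents.transpose(-2, -1) / d, dim=-1) @ summary

L, m, dim = 128, 16, 64
q = k = v = torch.randn(L, dim)
latents = torch.randn(m, dim)              # learnable parameters in a real model
out = latent_attention(q, k, v, latents)   # (L, dim)
```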
681 | CAPT: Category-level Articulation Estimation from A Single Point Cloud Using Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose CAPT: category-level articulation estimation from a point cloud using Transformer. |
Lian Fu; Ryoichi Ishikawa; Yoshihiro Sato; Takeshi Oishi; | arxiv-cs.CV | 2024-02-27 |
682 | GeoLLM: Extracting Geospatial Knowledge from Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here we explore the question of whether the vast amounts of knowledge found in Internet language corpora, now compressed within large language models (LLMs), can be leveraged for geospatial prediction tasks. |
ROHIN MANVI et. al. | iclr | 2024-02-26 |
683 | An LLM Can Fool Itself: A Prompt-Based Adversarial Attack IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes an efficient tool to audit the LLM’s adversarial robustness via a prompt-based adversarial attack (PromptAttack). |
XILIE XU et. al. | iclr | 2024-02-26 |
684 | Massive Editing for Large Language Models Via Meta Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To mitigate the problem, we propose the MAssive Language Model Editing Network (MALMEN), which formulates the parameter shift aggregation as the least square problem, subsequently updating the LM parameter using the normal equation. |
Chenmien Tan; Ge Zhang; Jie Fu; | iclr | 2024-02-26 |
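Entry 684's least-squares formulation admits a compact illustration: stack the keys of many edits into K and their desired residuals into R, then solve for one aggregated parameter shift W with the (regularized) normal equation. The sketch below shows only that solve; MALMEN's hyper-network that produces K and R is omitted, and all names are illustrative.

```python
import torch

def aggregate_shift(K, R, lam=1e-3):
    """Solve min_W ||K W - R||^2 + lam ||W||^2 via the normal equation."""
    d = K.shape[1]
    A = K.T @ K + lam * torch.eye(d)       # (d, d) regularized Gram matrix
    return torch.linalg.solve(A, K.T @ R)  # W = (K^T K + lam I)^{-1} K^T R

K = torch.randn(100, 32)   # keys from 100 edits
R = torch.randn(100, 16)   # desired output residuals
W = aggregate_shift(K, R)  # (32, 16): one shift applied to the LM parameter
```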
685 | Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Generative pre-trained models have demonstrated remarkable effectiveness in language and vision domains by learning useful representations. In this paper, we extend the scope of this effectiveness by showing that visual robot manipulation can significantly benefit from large-scale video generative pre-training. |
HONGTAO WU et. al. | iclr | 2024-02-26 |
686 | CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we design a special Transformer, i.e., **C**hannel **A**ligned **R**obust Blen**d** Transformer (CARD for short), that addresses key shortcomings of CI type Transformer in time series forecasting. |
XUE WANG et. al. | iclr | 2024-02-26 |
687 | MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large Language Models (LLMs) and Large Multimodal Models (LMMs) exhibit impressive problem-solving skills in many tasks and domains, but their ability in mathematical reasoning in visual contexts has not been systematically studied. To bridge this gap, we present MathVista, a benchmark designed to combine challenges from diverse mathematical and visual tasks. |
PAN LU et. al. | iclr | 2024-02-26 |
688 | MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We believe that the enhanced multi-modal generation capabilities of GPT-4 stem from the utilization of sophisticated large language models (LLM). To examine this phenomenon, we present MiniGPT-4, which aligns a frozen visual encoder with a frozen advanced LLM, Vicuna, using one projection layer. |
Deyao Zhu; Jun Chen; Xiaoqian Shen; Xiang Li; Mohamed Elhoseiny; | iclr | 2024-02-26 |
689 | AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Given that vector graphics are typically encoded using low-level graphics primitives, generating them directly is difficult. To address this, we propose the use of TikZ, a well-known abstract graphics language that can be compiled to vector graphics, as an intermediate representation of scientific figures. |
Jonas Belouadi; Anne Lauscher; Steffen Eger; | iclr | 2024-02-26 |
690 | The Devil Is in The Neurons: Interpreting and Mitigating Social Biases in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we try to unveil the mystery of social bias inside language models by introducing the concept of {\sc Social Bias Neurons}. |
YAN LIU et. al. | iclr | 2024-02-26 |
691 | Transformer-VQ: Linear-Time Transformers Via Vector Quantization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Transformer-VQ, a decoder-only transformer computing softmax-based dense self-attention in linear time. |
Lucas Dax Lingle; | iclr | 2024-02-26 |
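The step that makes entry 691 possible is vector-quantizing the keys so that every key takes one of c codebook values, after which attention statistics can be pooled per codeword in linear time. The sketch below shows only the nearest-codeword assignment; the linear-time attention kernel and codebook training built on top of it are omitted.

```python
import torch

def quantize_keys(k, codebook):
    """Replace each key with its nearest codebook vector (vector quantization)."""
    d2 = torch.cdist(k, codebook) ** 2   # (L, c) squared distances
    codes = d2.argmin(dim=-1)            # nearest codeword index per key
    return codebook[codes], codes        # quantized keys, same shape as k

k = torch.randn(128, 64)
codebook = torch.randn(32, 64)           # learnable in the real model
k_hat, codes = quantize_keys(k, codebook)
```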
692 | Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the effect of code on enhancing LLMs’ reasoning capability by introducing different constraints on the Code Usage Frequency of GPT-4 Code Interpreter. |
AOJUN ZHOU et. al. | iclr | 2024-02-26 |
693 | If in A Crowdsourced Data Annotation Pipeline, A GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies indicated GPT-4 outperforms online crowd workers in data labeling accuracy, notably workers from Amazon Mechanical Turk (MTurk). However, these studies were … |
Zeyu He; Huang Chieh-Yang; C. C. Ding; Shaurya Rohatgi; Ting-Hao ‘Kenneth’ Huang; | ArXiv | 2024-02-26 |
694 | Graph Transformers on EHRs: Better Representation Improves Downstream Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose GT-BEHRT, a new approach that leverages temporal visit embeddings extracted from a graph transformer and uses a BERT-based model to obtain more robust patient representations, especially on longer EHR sequences. |
Raphael Poulain; Rahmatollah Beheshti; | iclr | 2024-02-26 |
695 | The Hedgehog & The Porcupine: Expressive Linear Attentions with Softmax Mimicry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We thus propose Hedgehog, a learnable linear attention that retains the spiky and monotonic properties of softmax attention while maintaining linear complexity. |
Michael Zhang; Kush Bhatia; Hermann Kumbong; Christopher Re; | iclr | 2024-02-26 |
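Entry 695's premise is a learnable feature map phi such that phi(q)·phi(k) imitates softmax attention while keeping linear complexity. Below is a simplified sketch using phi(x) = exp(xW), which is element-wise positive and spiky; Hedgehog's actual mimicry training objective is not included.

```python
import torch

class FeatureMapAttention(torch.nn.Module):
    """Linear attention with a learnable, softmax-mimicking feature map."""
    def __init__(self, dim, feat_dim=64):
        super().__init__()
        self.proj = torch.nn.Linear(dim, feat_dim, bias=False)

    def phi(self, x):
        return torch.exp(self.proj(x))      # positive, spiky features

    def forward(self, q, k, v):
        qf, kf = self.phi(q), self.phi(k)   # (L, f)
        kv = kf.transpose(-2, -1) @ v       # (f, d): computed once, O(L) overall
        z = qf @ kf.sum(dim=-2, keepdim=True).transpose(-2, -1)  # per-query normalizer
        return (qf @ kv) / (z + 1e-6)

attn = FeatureMapAttention(64)
q = k = v = torch.randn(128, 64)
out = attn(q, k, v)                         # (128, 64) without forming any L x L matrix
```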
696 | Looped Transformers Are Better at Learning Learning Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the absence of an inherent iterative structure in the transformer architecture presents a challenge in emulating the iterative algorithms, which are commonly employed in traditional machine learning methods. To address this, we propose the utilization of looped transformer architecture and its associated training methodology, with the aim of incorporating iterative characteristics into the transformer architectures. |
Liu Yang; Kangwook Lee; Robert D Nowak; Dimitris Papailiopoulos; | iclr | 2024-02-26 |
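Entry 696's looped architecture is simple to sketch: one weight-tied block applied T times supplies the iterative structure that ordinary depth-stacked transformers lack. Input injection and loop-count schedules follow the paper, not this sketch.

```python
import torch

# One shared encoder block reused every iteration (weight tying).
block = torch.nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)

def looped_forward(x, T=6):
    for _ in range(T):   # the same parameters are applied T times
        x = block(x)
    return x

x = torch.randn(2, 16, 64)
y = looped_forward(x)    # (2, 16, 64); T controls the number of "algorithm steps"
```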
697 | NOLA: Compressing LoRA Using Linear Combination of Random Basis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce NOLA, which overcomes the rank one lower bound present in LoRA. |
Soroush Abbasi Koohpayegani; Navaneet K L; Parsa Nooralinejad; Soheil Kolouri; Hamed Pirsiavash; | iclr | 2024-02-26 |
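Entry 697 sidesteps LoRA's rank-one lower bound on parameter count by writing the low-rank factors as linear combinations of frozen random basis matrices, so only the mixing coefficients train. A hedged PyTorch sketch, with illustrative rank and basis counts:

```python
import torch

class NolaLinear(torch.nn.Module):
    """LoRA-style adapter whose factors are mixtures of frozen random bases."""
    def __init__(self, base: torch.nn.Linear, rank=4, n_basis=64):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                        # backbone stays frozen
        out_f, in_f = base.weight.shape
        # Frozen random bases (re-generable from a seed, so they need not be stored).
        self.register_buffer("A", torch.randn(n_basis, rank, in_f) * 0.01)
        self.register_buffer("B", torch.randn(n_basis, out_f, rank) * 0.01)
        self.alpha = torch.nn.Parameter(torch.zeros(n_basis))  # trainable coefficients
        self.beta = torch.nn.Parameter(torch.zeros(n_basis))

    def forward(self, x):
        A = torch.einsum("n,nri->ri", self.alpha, self.A)  # (rank, in_f)
        B = torch.einsum("n,nor->or", self.beta, self.B)   # (out_f, rank)
        return self.base(x) + x @ A.T @ B.T

layer = NolaLinear(torch.nn.Linear(64, 64))
out = layer(torch.randn(8, 64))   # only 2 * n_basis scalars are trainable
```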
698 | Xformer: Hybrid X-Shaped Transformer for Image Denoising Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a hybrid X-shaped vision Transformer, named Xformer, which performs notably on image denoising tasks. |
JIALE ZHANG et. al. | iclr | 2024-02-26 |
699 | DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Conventional training-based methods have limitations in flexibility, particularly when adapting to new domains, and they often lack explanatory power. To address this gap, we propose a novel training-free detection strategy called Divergent N-Gram Analysis (DNA-GPT). |
XIANJUN YANG et. al. | iclr | 2024-02-26 |
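Entry 699's detector can be paraphrased as: truncate a suspect text, sample continuations of the prefix from the candidate model, and score n-gram overlap between the original ending and the regenerations; high overlap suggests machine authorship. In this sketch `regenerate` is a hypothetical placeholder for a model call, and the paper's scoring and thresholding details are omitted.

```python
def ngrams(tokens, n):
    """Set of n-grams over a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_score(original_tail, regenerations, n=4):
    """Mean fraction of the original ending's n-grams found in each regeneration."""
    ref = ngrams(original_tail.split(), n)
    if not ref:
        return 0.0
    return sum(len(ref & ngrams(g.split(), n)) / len(ref)
               for g in regenerations) / len(regenerations)

# Hypothetical usage: cut the text, regenerate the suffix 10 times, then score.
# score = overlap_score(text[cut:], [regenerate(text[:cut]) for _ in range(10)])
```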
700 | Masked Distillation Advances Self-Supervised Transformer Architecture Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this paper proposes a masked image modelling (MIM) based self-supervised neural architecture search method specifically designed for vision transformers, termed as MaskTAS, which completely avoids the expensive costs of data labeling inherited from supervised learning. |
CAIXIA YAN et. al. | iclr | 2024-02-26 |
701 | The Reversal Curse: LLMs Trained on “A Is B” Fail to Learn “B Is A” Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It is worth noting, however, that if “_A_ is _B_” appears _in-context_, models can deduce the reverse relationship. We provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on fictitious statements such as “Uriah Hawthorne is the composer of _Abyssal Melodies_” and showing that they fail to correctly answer “Who composed _Abyssal Melodies_?” |
LUKAS BERGLUND et. al. | iclr | 2024-02-26 |
702 | Neuroformer: Multimodal and Multitask Generative Pretraining for Brain Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We used an ablation study to show that joint training on neuronal responses and behavior boosted performance, highlighting the model’s ability to associate behavioral and neural representations in an unsupervised manner. |
Antonis Antoniades; Yiyi Yu; Joe S Canzano; William Yang Wang; Spencer Smith; | iclr | 2024-02-26 |
703 | Quantum Linear Algebra Is All You Need for Transformer Architectures Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative machine learning methods such as large-language models are revolutionizing the creation of text and images. While these models are powerful they also harness a large … |
Naixu Guo; Zhan Yu; Aman Agrawal; P. Rebentrost; | ArXiv | 2024-02-26 |
704 | Is Self-Repair A Silver Bullet for Code Generation? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we analyze Code Llama, GPT-3.5 and GPT-4’s ability to perform self-repair on problems taken from HumanEval and APPS. |
Theo X. Olausson; Jeevana Priya Inala; Chenglong Wang; Jianfeng Gao; Armando Solar-Lezama; | iclr | 2024-02-26 |
705 | MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The recently released GPT-4 Code Interpreter has demonstrated remarkable proficiency in solving challenging math problems, primarily attributed to its ability to seamlessly reason with natural language, generate code, execute code, and continue reasoning based on the execution output. In this paper, we present a method to fine-tune open-source language models, enabling them to use code for modeling and deriving math equations and, consequently, enhancing their mathematical reasoning abilities. |
KE WANG et. al. | iclr | 2024-02-26 |
706 | If in A Crowdsourced Data Annotation Pipeline, A GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies indicated GPT-4 outperforms online crowd workers in data labeling accuracy, notably workers from Amazon Mechanical Turk (MTurk). However, these studies were … |
Zeyu He; Chieh-Yang Huang; Chien-Kuang Cornelia Ding; Shaurya Rohatgi; Ting-Hao ‘Kenneth’ Huang; | arxiv-cs.HC | 2024-02-26 |
707 | Test-Time Training on Nearest Neighbors for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We build a large-scale distributed index based on text embeddings of the Pile dataset. |
Moritz Hardt; Yu Sun; | iclr | 2024-02-26 |
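Entry 707's recipe is: embed the test input, retrieve its nearest neighbors from an index of training texts, and briefly fine-tune on them before predicting. The small-scale sketch below replaces the paper's large distributed index over the Pile with an in-memory tensor; all names are illustrative.

```python
import torch

def test_time_train(model, loss_fn, query_emb, index_embs, index_batches, k=8, lr=1e-5):
    """Adapt `model` on the k nearest neighbors of one test query, then return it."""
    sims = torch.nn.functional.cosine_similarity(query_emb[None, :], index_embs)
    neighbors = sims.topk(k).indices.tolist()        # indices of the closest training texts
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for i in neighbors:                              # one gradient step per neighbor
        inputs, targets = index_batches[i]
        opt.zero_grad()
        loss_fn(model(inputs), targets).backward()
        opt.step()
    return model   # run the actual test-time prediction with the adapted model
```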
708 | Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring The Design of Next-generation Neuromorphic Chips IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a general Transformer-based SNN architecture, termed “Meta-SpikeFormer”, whose goals are: (1) *Low power*: it supports the spike-driven paradigm, with only sparse addition in the network; (2) *Versatility*: it handles various vision tasks; (3) *High performance*: it shows overwhelming performance advantages over CNN-based SNNs; (4) *Meta-architecture*: it provides inspiration for future next-generation Transformer-based neuromorphic chip designs. |
MAN YAO et. al. | iclr | 2024-02-26 |
709 | Towards Open-ended Visual Quality Comparison IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Comparative settings (e.g. pairwise choice, listwise ranking) have been adopted by a wide range of subjective studies for image quality assessment (IQA), as it inherently … |
HAONING WU et. al. | ArXiv | 2024-02-26 |
710 | A Multi-Level Framework for Accelerating Training Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by a set of key observations of inter- and intra-layer similarities among feature maps and attentions that can be identified from typical training processes, we propose a multi-level framework for training acceleration. |
Longwei Zou; Han Zhang; Yangdong Deng; | iclr | 2024-02-26 |
711 | Large Language Model Cascades with Mixture of Thought Representations for Cost-Efficient Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we are motivated to study building an LLM cascade to save the cost of using LLMs, particularly for performing (e.g., mathematical, causal) reasoning tasks. |
Murong Yue; Jie Zhao; Min Zhang; Liang Du; Ziyu Yao; | iclr | 2024-02-26 |
712 | HPE Transformer: Learning to Optimize Multi-Group Multicast Beamforming Under Nonconvex QoS Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To facilitate real-time implementations, this paper proposes a deep learning-based approach, which consists of a beamforming structure assisted problem transformation and a customized neural network architecture named hierarchical permutation equivariance (HPE) transformer. |
Yang Li; Ya-Feng Liu; | arxiv-cs.IT | 2024-02-25 |
713 | From Text to Transformation: A Comprehensive Review of Large Language Models’ Versatility IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This groundbreaking study explores the expanse of Large Language Models (LLMs), such as Generative Pre-Trained Transformer (GPT) and Bidirectional Encoder Representations from Transformers (BERT), across varied domains ranging from technology and finance to healthcare and education. |
PRAVNEET KAUR et. al. | arxiv-cs.CL | 2024-02-25 |
714 | Deep Learning Approaches for Improving Question Answering Systems in Hepatocellular Carcinoma Research IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Models such as BERT and GPT-3, trained on vast amounts of data, have revolutionized language understanding and generation. These pre-trained models serve as robust bases for various tasks including semantic understanding, intelligent writing, and reasoning, paving the way for a more generalized form of artificial intelligence. |
Shuning Huo; Yafei Xiang; Hanyi Yu; Mengran Zhu; Yulu Gong; | arxiv-cs.CL | 2024-02-25 |
715 | SemEval-2024 Task 8: Weighted Layer Averaging RoBERTa for Black-Box Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we lay out how using weighted averages of RoBERTa layers lets us capture information about text that is relevant to machine-generated text detection. |
Ayan Datta; Aryan Chandramania; Radhika Mamidi; | arxiv-cs.CL | 2024-02-24 |
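The pooling in entry 715 is easy to sketch: learn one softmax-normalized weight per hidden layer and average the stacked hidden states. The module below assumes the hidden states arrive stacked in a single tensor, e.g. from a RoBERTa forward pass with output_hidden_states=True.

```python
import torch

class WeightedLayerPooler(torch.nn.Module):
    """Learnable weighted average over a Transformer's hidden layers."""
    def __init__(self, n_layers):
        super().__init__()
        self.w = torch.nn.Parameter(torch.zeros(n_layers))   # one weight per layer

    def forward(self, hidden_states):
        # hidden_states: (n_layers, batch, seq, dim)
        weights = torch.softmax(self.w, dim=0)               # weights sum to 1
        return torch.einsum("l,lbsd->bsd", weights, hidden_states)

pooler = WeightedLayerPooler(n_layers=13)   # e.g. RoBERTa-base: 12 layers + embeddings
hs = torch.randn(13, 2, 16, 768)
pooled = pooler(hs)                         # (2, 16, 768) features for the detector head
```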
716 | Increasing SAM Zero-Shot Performance on Multimodal Medical Images Using GPT-4 Generated Descriptive Prompts Without Human Annotation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study develops and evaluates a novel multimodal medical image zero-shot segmentation algorithm named Text-Visual-Prompt SAM (TV-SAM) without any manual annotations. |
ZEKUN JIANG et. al. | arxiv-cs.CV | 2024-02-24 |
717 | A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate the potential of large language models to generate patient summaries based on doctors’ notes and study the effect of training data on the faithfulness and quality of the generated summaries. |
STEFAN HEGSELMANN et. al. | arxiv-cs.CL | 2024-02-23 |
718 | Self-Supervised Pre-Training for Table Structure Recognition Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we resolve the issue by proposing a self-supervised pre-training (SSP) method for TSR transformers. |
ShengYun Peng; Seongmin Lee; Xiaojing Wang; Rajarajeswari Balasubramaniyan; Duen Horng Chau; | arxiv-cs.CV | 2024-02-23 |
719 | ArabianGPT: Native Arabic GPT-based Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, there is a theoretical and practical imperative for developing LLMs predominantly focused on Arabic linguistic elements. To address this gap, this paper proposes ArabianGPT, a series of transformer-based models within the ArabianLLM suite designed explicitly for Arabic. |
Anis Koubaa; Adel Ammar; Lahouari Ghouti; Omar Najar; Serry Sibaee; | arxiv-cs.CL | 2024-02-23 |
720 | Advancing Parameter Efficiency in Fine-tuning Via Representation Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing PEFT methods pose challenges in hyperparameter selection, such as choosing the rank for LoRA or Adapter, or specifying the length of soft prompts. To address these challenges, we propose a novel fine-tuning approach for neural models, named Representation EDiting (RED), which modifies the representations generated at some layers through the application of scaling and biasing operations. |
MULING WU et. al. | arxiv-cs.LG | 2024-02-23 |
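Entry 720's representation editing boils down to a learned per-dimension scale and bias applied to a frozen layer's hidden states, so each edited layer adds only 2·dim trainable parameters. A minimal sketch:

```python
import torch

class RepresentationEdit(torch.nn.Module):
    """Edit hidden representations as h' = h * scale + bias; the backbone stays frozen."""
    def __init__(self, dim):
        super().__init__()
        self.scale = torch.nn.Parameter(torch.ones(dim))   # identity at initialization
        self.bias = torch.nn.Parameter(torch.zeros(dim))

    def forward(self, h):
        return h * self.scale + self.bias

edit = RepresentationEdit(768)
h = torch.randn(2, 16, 768)   # hidden states from a frozen Transformer layer
h_edited = edit(h)            # same shape; only 2 * 768 parameters are trained
```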
721 | Towards Efficient Active Learning in NLP Via Pretrained Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning Large Language Models (LLMs) is now a common approach for text classification in a wide range of applications. |
Artem Vysogorets; Achintya Gopal; | arxiv-cs.LG | 2024-02-23 |
722 | Multimodal Transformer With A Low-Computational-Cost Guarantee Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, multimodal Transformers significantly suffer from a quadratic complexity of the multi-head attention with the input sequence length, especially as the number of modalities increases. To address this, we introduce Low-Cost Multimodal Transformer (LoCoMT), a novel multimodal attention mechanism that aims to reduce computational cost during training and inference with minimal performance loss. |
Sungjin Park; Edward Choi; | arxiv-cs.LG | 2024-02-23 |
723 | A First Look at GPT Apps: Landscape and Vulnerability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, given its debut, there is a lack of sufficient understanding of this new ecosystem. To fill this gap, this paper presents a first comprehensive longitudinal (5-month) study of the evolution, landscape, and vulnerability of the emerging LLM app ecosystem, focusing on two GPT app stores: \textit{GPTStore.AI} and the official \textit{OpenAI GPT Store}. |
ZEJUN ZHANG et. al. | arxiv-cs.CR | 2024-02-23 |
724 | RoboScript: Code Generation for Free-Form Manipulation Tasks Across Real and Simulation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, previous studies put much effort into the general common-sense reasoning and task-planning capabilities of large-scale language or multi-modal models, but relatively little into ensuring the deployability of generated code on real robots and into other fundamental components of autonomous robot systems, including robot perception, motion planning, and control. To bridge this “ideal-to-real” gap, this paper presents RobotScript, a platform for 1) a deployable robot manipulation pipeline powered by code generation; and 2) a code generation benchmark for robot manipulation tasks in free-form natural language. |
JUNTING CHEN et. al. | arxiv-cs.RO | 2024-02-22 |
725 | Whose LLM Is It Anyway? Linguistic Comparison and LLM Attribution for GPT-3.5, GPT-4 and Bard Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through a comprehensive linguistic analysis, we compare the vocabulary, Part-Of-Speech (POS) distribution, dependency distribution, and sentiment of texts generated by three of the most popular LLMs today (GPT-3.5, GPT-4, and Bard) in response to diverse inputs. |
Ariel Rosenfeld; Teddy Lazebnik; | arxiv-cs.CL | 2024-02-22 |
726 | Tokenization Counts: The Impact of Tokenization on Arithmetic in Frontier LLMs Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Tokenization, the division of input text into input tokens, is an often overlooked aspect of the large language model (LLM) pipeline and could be the source of useful or harmful … |
Aaditya K. Singh; DJ Strouse; | ArXiv | 2024-02-22 |
727 | OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, open-source models often lack the execution capabilities and iterative refinement of advanced systems like the GPT-4 Code Interpreter. To address this, we introduce OpenCodeInterpreter, a family of open-source code systems designed for generating, executing, and iteratively refining code. |
TIANYU ZHENG et. al. | arxiv-cs.SE | 2024-02-22 |
728 | Towards Understanding Counseling Conversations: Domain Knowledge and Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a systematic approach to examine the efficacy of domain knowledge and large language models (LLMs) in better representing conversations between a crisis counselor and a help seeker. |
Younghun Lee; Dan Goldwasser; Laura Schwab Reese; | arxiv-cs.CL | 2024-02-21 |
729 | Beyond Hate Speech: NLP’s Challenges and Opportunities in Uncovering Dehumanizing Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper evaluates the performance of cutting-edge NLP models, including GPT-4, GPT-3.5, and LLAMA-2, in identifying dehumanizing language. |
Hezhao Zhang; Lasana Harris; Nafise Sadat Moosavi; | arxiv-cs.CL | 2024-02-21 |
730 | Hallucinations or Attention Misdirection? The Path to Strategic Value Extraction in Business Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper highlights the best practices of the PGI (Persona, Grouping, and Intelligence) method, a strategic framework that achieved a remarkable error rate of only 3.15 percent across 4,000 responses generated by GPT in response to a real business challenge. |
Aline Ioste; | arxiv-cs.CL | 2024-02-21 |
731 | Knowledge Graph Enhanced Large Language Model Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing editing methods struggle to track and incorporate changes in knowledge associated with edits, which limits the generalization ability of post-edit LLMs in processing edited knowledge. To tackle these problems, we propose a novel model editing method that leverages knowledge graphs for enhancing LLM editing, namely GLAME. |
MENGQI ZHANG et. al. | arxiv-cs.CL | 2024-02-21 |
732 | TransGOP: Transformer-Based Gaze Object Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, this paper introduces Transformer into the fields of gaze object prediction and proposes an end-to-end Transformer-based gaze object prediction method named TransGOP. |
Binglu Wang; Chenxi Guo; Yang Jin; Haisheng Xia; Nian Liu; | arxiv-cs.CV | 2024-02-21 |
733 | On The Expressive Power of A Variant of The Looped Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide theoretical evidence of the expressive power of the AlgoFormer in solving some challenging problems, mirroring human-designed algorithms. |
YIHANG GAO et. al. | arxiv-cs.LG | 2024-02-21 |
734 | An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research paper, we present an optimized, fine-tuned transformer-based DistilBERT model designed for the detection of phishing emails. |
Mohammad Amaz Uddin; Iqbal H. Sarker; | arxiv-cs.LG | 2024-02-21 |
735 | SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we leverage 100B+ GPT variants to act as synthetic feedback experts offering expert-level edit feedback, which is used to reduce hallucinations and align weaker (<10B parameter) LLMs with medical facts using two distinct alignment algorithms (DPO & SALT), endeavoring to narrow the divide between AI-generated content and factual accuracy. |
PRAKAMYA MISHRA et. al. | arxiv-cs.CL | 2024-02-21 |
736 | How Easy Is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The remarkable advancements in Multimodal Large Language Models (MLLMs) have not rendered them immune to challenges, particularly in the context of handling deceptive information in prompts, thus producing hallucinated responses under such conditions. To quantitatively assess this vulnerability, we present MAD-Bench, a carefully curated benchmark that contains 850 test samples divided into 6 categories, such as non-existent objects, count of objects, spatial relationship, and visual confusion. |
Yusu Qian; Haotian Zhang; Yinfei Yang; Zhe Gan; | arxiv-cs.CV | 2024-02-20 |
737 | RhythmFormer: Extracting RPPG Signals Based on Hierarchical Temporal Periodic Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose RhythmFormer, a fully end-to-end transformer-based method for extracting rPPG signals by explicitly leveraging the quasi-periodic nature of rPPG. |
Bochao Zou; Zizheng Guo; Jiansheng Chen; Huimin Ma; | arxiv-cs.CV | 2024-02-20 |
738 | Advancing GenAI Assisted Programming–A Comparative Study on Prompt Efficiency and Code Quality Between GPT-4 and GLM-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to explore the best practices for utilizing GenAI as a programming tool, through a comparative analysis between GPT-4 and GLM-4. |
Angus Yang; Zehan Li; Jie Li; | arxiv-cs.SE | 2024-02-20 |
739 | Are ELECTRA’s Sentence Embeddings Beyond Repair? The Case of Semantic Textual Similarity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We notice a significant drop in performance when using the ELECTRA discriminator’s last layer in comparison to earlier layers. We explore this drop and devise a way to repair ELECTRA’s embeddings, proposing a novel truncated model fine-tuning (TMFT) method. |
Ivan Rep; David Dukić; Jan Šnajder; | arxiv-cs.CL | 2024-02-20 |
740 | Transformer Tricks: Precomputing The First Layer Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: This micro-paper describes a trick to speed up inference of transformers with RoPE (such as LLaMA, Mistral, PaLM, and Gemma). For these models, a large portion of the first … |
Nils Graef; | arxiv-cs.LG | 2024-02-20 |
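The truncated abstract points at a simple observation: in a pre-norm RoPE model, the first layer's Q, K, and V projections see only the (normalized) token embedding, so they can be precomputed once per vocabulary entry and served as lookups, trading memory for matmuls. A minimal NumPy sketch of that reading, with all dimensions and names our own:

```python
import numpy as np

# Hypothetical sizes for a small RoPE transformer.
vocab, d_model = 32000, 512

rng = np.random.default_rng(0)
E = rng.standard_normal((vocab, d_model))     # (normalized) embedding table
Wq = rng.standard_normal((d_model, d_model))  # first-layer projections
Wk = rng.standard_normal((d_model, d_model))
Wv = rng.standard_normal((d_model, d_model))

# Offline: layer-1 Q/K/V depend only on the token identity, so compute
# them once per vocabulary entry (costs roughly 3x the embedding table).
Q1, K1, V1 = E @ Wq, E @ Wk, E @ Wv

def first_layer_qkv(token_ids):
    """At inference, layer 1's projections become table lookups.
    RoPE is applied afterwards, since it depends on position, not content."""
    return Q1[token_ids], K1[token_ids], V1[token_ids]

q, k, v = first_layer_qkv(np.array([17, 42, 7]))
print(q.shape)  # (3, 512)
```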
741 | The Impact of Demonstrations on Multilingual In-Context Learning: A Multidimensional Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Although there have been extensive studies on English in-context learning, multilingual in-context learning remains under-explored, and we lack an in-depth understanding of the role of demonstrations in this context. To address this gap, we conduct a multidimensional analysis of multilingual in-context learning, experimenting with 5 models from different model families, 9 datasets covering classification and generation tasks, and 56 typologically diverse languages. |
MIAORAN ZHANG et. al. | arxiv-cs.CL | 2024-02-20 |
742 | Can Large Language Models Be Used to Provide Psychological Counselling? An Analysis of GPT-4-Generated Responses Using Role-play Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For this study, we collected counseling dialogue data via role-playing scenarios involving expert counselors, and the utterances were annotated with the intentions of the counselors. |
Michimasa Inaba; Mariko Ukiyo; Keiko Takamizo; | arxiv-cs.CL | 2024-02-20 |
743 | Enhancing Large Language Models for Text-to-Testcase Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: In this paper, we introduce a text-to-testcase generation approach based on a large language model (GPT-3.5) that is fine-tuned on our curated dataset with an effective prompt design. |
Saranya Alagarsamy; Chakkrit Tantithamthavorn; Chetan Arora; Aldeida Aleti; | arxiv-cs.SE | 2024-02-19 |
744 | Your Large Language Model Is Secretly A Fairness Proponent and You Should Prompt It Like One Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response to this, we validate that prompting LLMs with specific roles can allow LLMs to express diverse viewpoints. Building on this insight and observation, we develop FairThinking, a pipeline designed to automatically generate roles that enable LLMs to articulate diverse perspectives for fair expressions. |
TIANLIN LI et. al. | arxiv-cs.CL | 2024-02-19 |
745 | Surprising Efficacy of Fine-Tuned Transformers for Fact-Checking Over Larger Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the challenges associated with establishing an end-to-end fact-checking pipeline in a real-world context, covering over 90 languages. |
Vinay Setty; | arxiv-cs.CL | 2024-02-19 |
746 | Enabling Weak LLMs to Judge Response Reliability Via Meta Ranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, to enable weak LLMs to effectively assess the reliability of LLM responses, we propose a novel cross-query-comparison-based method called Meta Ranking (MR). |
ZIJUN LIU et. al. | arxiv-cs.CL | 2024-02-19 |
747 | Evaluation of ChatGPT’s Smart Contract Auditing Capabilities Based on Chain of Thought Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the potential of enhancing smart contract security audits using the GPT-4 model. |
Yuying Du; Xueyan Tang; | arxiv-cs.CR | 2024-02-19 |
748 | Acquiring Clean Language Models from Backdoor Poisoned Datasets By Downscaling Frequency Space Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate the learning mechanisms of backdoor LMs in the frequency space by Fourier analysis. |
Zongru Wu; Zhuosheng Zhang; Pengzhou Cheng; Gongshen Liu; | arxiv-cs.CL | 2024-02-19 |
749 | Query-Based Adversarial Prompt Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We improve on prior work with a query-based attack that leverages API access to a remote language model to construct adversarial examples that cause the model to emit harmful strings with (much) higher probability than with transfer-only attacks. |
Jonathan Hayase; Ema Borevkovic; Nicholas Carlini; Florian Tramèr; Milad Nasr; | arxiv-cs.CL | 2024-02-19 |
750 | Reflect-RL: Two-Player Online RL Fine-Tuning for LMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The benchmarks, dataset, and code involved in this work are publicly available: https://github.com/zhourunlong/Reflect-RL. |
Runlong Zhou; Simon S. Du; Beibin Li; | arxiv-cs.LG | 2024-02-19 |
751 | Tree-Planted Transformers: Unidirectional Transformer Language Models with Implicit Syntactic Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new method dubbed tree-planting: instead of explicitly generating syntactic structures, we plant trees into attention weights of unidirectional Transformer LMs to implicitly reflect syntactic structures of natural language. |
Ryo Yoshida; Taiga Someya; Yohei Oseki; | arxiv-cs.CL | 2024-02-19 |
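The highlight says trees are planted into the attention weights rather than generated explicitly. One plausible reading, sketched below with hypothetical names, is an auxiliary training loss that pulls selected heads' attention distributions toward a target distribution derived from gold syntax trees:

```python
import torch
import torch.nn.functional as F

def tree_planting_loss(attn, tree_target):
    """Hypothetical auxiliary loss. attn and tree_target have shape
    (batch, heads, len, len); each row is a distribution over preceding
    tokens, and tree_target concentrates mass according to syntactic
    proximity in the gold tree."""
    return F.kl_div(attn.clamp_min(1e-9).log(), tree_target,
                    reduction="batchmean")

# The total objective would combine this with the LM loss, e.g.:
# loss = lm_loss + lambda_tp * tree_planting_loss(attn, tree_target)
```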
752 | DeepCode AI Fix: Fixing Security Vulnerabilities with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the effectiveness of LLMs for solving code-repair task. |
BERKAY BERABI et. al. | arxiv-cs.CR | 2024-02-19 |
753 | Shallow Synthesis of Knowledge in GPT-Generated Texts: A Case Study in Automatic Related Work Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an analysis of AI-assisted scholarly writing generated with ScholaCite, a tool we built that is designed for organizing literature and composing Related Work sections for academic papers. |
Anna Martin-Boyle; Aahan Tyagi; Marti A. Hearst; Dongyeop Kang; | arxiv-cs.CL | 2024-02-19 |
754 | A Critical Evaluation of AI Feedback for Aligning Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that the improvements of the RL step are virtually entirely due to the widespread practice of using a weaker teacher model (e.g. GPT-3.5) for SFT data collection than the critic (e.g., GPT-4) used for AI feedback generation. |
ARCHIT SHARMA et. al. | arxiv-cs.LG | 2024-02-19 |
755 | Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a circuit discovery framework alternative to activation patching. |
ZHENGFU HE et. al. | arxiv-cs.LG | 2024-02-19 |
756 | Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Conclusion: In this paper, we show that while GPT-4 is superior to open-source models in zero-shot report labeling, the implementation of few-shot prompting can bring open-source models on par with GPT-4. |
FELIX J. DORFNER et. al. | arxiv-cs.CL | 2024-02-19 |
757 | Creating A Fine Grained Entity Type Taxonomy Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the potential of GPT-4 and its advanced iteration, GPT-4 Turbo, in autonomously developing a detailed entity type taxonomy. |
Michael Gunn; Dohyun Park; Nidhish Kamath; | arxiv-cs.CL | 2024-02-19 |
758 | FinBen: A Holistic Financial Benchmark for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce FinBen, the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks, covering seven critical aspects: information extraction (IE), textual analysis, question answering (QA), text generation, risk management, forecasting, and decision-making. |
QIANQIAN XIE et. al. | arxiv-cs.CL | 2024-02-19 |
759 | A Curious Case of Searching for The Correlation Between Training Data and Adversarial Robustness of Transformer Textual Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we want to prove that there is also a strong correlation between training data and model robustness. |
Cuong Dang; Dung D. Le; Thai Le; | arxiv-cs.LG | 2024-02-18 |
760 | Decoding News Narratives: A Critical Analysis of Large Language Models in Framing Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive analysis of GPT-4, GPT-3.5 Turbo, and FLAN-T5 models in detecting framing in news headlines. |
Valeria Pastorino; Jasivan A. Sivakumar; Nafise Sadat Moosavi; | arxiv-cs.CL | 2024-02-18 |
761 | LongAgent: Scaling Language Models to 128k Context Through Multi-Agent Collaboration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose LongAgent, a method based on multi-agent collaboration, which scales LLMs (e.g., LLaMA) to a context of 128K and demonstrates potential superiority in long-text processing compared to GPT-4. |
JUN ZHAO et. al. | arxiv-cs.CL | 2024-02-18 |
762 | Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, we propose a two-stage instruction tuning framework, in which VLMs are firstly finetuned on Vision-Flan and further tuned on GPT-4 synthesized data. We find this two-stage tuning framework significantly outperforms the traditional single-stage visual instruction tuning framework and achieves the state-of-the-art performance across a wide range of multi-modal evaluation benchmarks. |
ZHIYANG XU et. al. | arxiv-cs.CL | 2024-02-18 |
763 | Can Large Language Models Perform Relation-based Argument Mining? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that general-purpose Large Language Models (LLMs), appropriately primed and prompted, can significantly outperform the best performing (RoBERTa-based) baseline. |
Deniz Gorur; Antonio Rago; Francesca Toni; | arxiv-cs.CL | 2024-02-17 |
764 | Exploring ChatGPT for Next-generation Information Retrieval: Opportunities and Challenges Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike the traditional supervised learning approach in IR tasks, ChatGPT challenges existing paradigms, bringing forth new challenges and opportunities regarding text quality assurance, model bias, and efficiency. This paper seeks to examine the impact of ChatGPT on IR tasks and offer insights into its potential future developments. |
Yizheng Huang; Jimmy Huang; | arxiv-cs.IR | 2024-02-17 |
765 | Reasoning Before Comparison: LLM-Enhanced Semantic Similarity Metrics for Domain Specialized Text Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we leverage LLM to enhance the semantic analysis and develop similarity metrics for texts, addressing the limitations of traditional unsupervised NLP metrics like ROUGE and BLEU. |
SHAOCHEN XU et. al. | arxiv-cs.CL | 2024-02-17 |
766 | Can Separators Improve Chain-of-Thought Prompting? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by human cognition, we introduce COT-SEP, a method that strategically employs separators at the end of each exemplar in CoT prompting. |
Yoonjeong Park; Hyunjin Kim; Chanyeol Choi; Junseong Kim; Jy-yong Sohn; | arxiv-cs.CL | 2024-02-16 |
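Since the method is essentially prompt construction, it is easy to sketch: place an explicit separator after each chain-of-thought exemplar so the model can segment the demonstrations. The separator string and exemplar format below are illustrative, not necessarily the paper's:

```python
SEP = "\n###\n"  # hypothetical separator; the paper compares several choices

def build_cot_prompt(exemplars, question):
    """exemplars: list of (question, step-by-step rationale, answer) triples."""
    parts = []
    for q, rationale, a in exemplars:
        parts.append(f"Q: {q}\nA: {rationale} The answer is {a}.{SEP}")
    parts.append(f"Q: {question}\nA:")
    return "".join(parts)

prompt = build_cot_prompt(
    [("What is 2+2?", "2 plus 2 equals 4.", "4")],
    "What is 3+5?",
)
```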
767 | In Search of Needles in A 11M Haystack: Recurrent Memory Finds What LLMs Miss Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To evaluate different approaches, we introduce BABILong, a new benchmark designed to assess model capabilities in extracting and processing distributed facts within extensive texts. |
YURI KURATOV et. al. | arxiv-cs.CL | 2024-02-16 |
768 | Large Language Models Fall Short: Understanding Complex Relationships in Detective Narratives Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing datasets for narrative understanding often fail to represent the complexity and uncertainty of relationships in real-life social scenarios. To address this gap, we introduce a new benchmark, Conan, designed for extracting and analysing intricate character relation graphs from detective narratives. |
RUNCONG ZHAO et. al. | arxiv-cs.CL | 2024-02-16 |
769 | Enhancing ESG Impact Type Identification Through Early Fusion and Multilingual Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the evolving landscape of Environmental, Social, and Corporate Governance (ESG) impact assessment, the ML-ESG-2 shared task proposes identifying ESG impact types. To address this challenge, we present a comprehensive system leveraging ensemble learning techniques, capitalizing on early and late fusion approaches. |
Hariram Veeramani; Surendrabikram Thapa; Usman Naseem; | arxiv-cs.CL | 2024-02-16 |
770 | Inference to The Best Explanation in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes IBE-Eval, a framework inspired by philosophical accounts on Inference to the Best Explanation (IBE) to advance the interpretation and evaluation of LLMs’ explanations. |
Dhairya Dalal; Marco Valentino; André Freitas; Paul Buitelaar; | arxiv-cs.CL | 2024-02-16 |
771 | WilKE: Wise-Layer Knowledge Editor for Lifelong Knowledge Editing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a knowledge editing approach named Wise-Layer Knowledge Editor (WilKE), which selects the editing layer based on the degree to which the editing knowledge pattern-matches across different layers in language models. |
Chenhui Hu; Pengfei Cao; Yubo Chen; Kang Liu; Jun Zhao; | arxiv-cs.CL | 2024-02-16 |
772 | Fine-tuning Large Language Model (LLM) Artificial Intelligence Chatbots in Ophthalmology and LLM-based Evaluation Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Notably, qualitative analysis and the glaucoma sub-analysis revealed clinical inaccuracies in the LLM-generated responses, which were appropriately identified by the GPT-4 evaluation. |
TING FANG TAN et. al. | arxiv-cs.AI | 2024-02-15 |
773 | L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a language agent with chain-of-3D-thoughts (L3GO), an inference-time approach that can reason about part-based 3D mesh generation of unconventional objects that current data-driven diffusion models struggle with. |
YUTARO YAMADA et. al. | arxiv-cs.AI | 2024-02-14 |
774 | Leveraging Large Language Models for Enhanced NLP Task Performance Through Knowledge Distillation and Optimized Training Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our approach presents a scalable methodology that reduces manual annotation costs and increases efficiency, making it especially pertinent in resource-limited and closed-network environments. |
Yining Huang; Keke Tang; Meilian Chen; | arxiv-cs.CL | 2024-02-14 |
775 | GPT-4’s Assessment of Its Performance in A USMLE-based Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates GPT-4’s assessment of its performance in healthcare applications. |
UTTAM DHAKAL et. al. | arxiv-cs.AI | 2024-02-14 |
776 | Utilizing BERT for Information Retrieval: Survey, Applications, Resources, and Challenges Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent years have witnessed a substantial increase in the use of deep learning to solve various natural language processing (NLP) problems. Early deep learning models were … |
JIAJIA WANG et. al. | ACM Computing Surveys | 2024-02-14 |
777 | Changes By Butterflies: Farsighted Forecasting with Group Reservoir Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Group Reservoir Transformer to predict long-term events more accurately and robustly by overcoming two challenges in Chaos: (1) the extensive historical sequences and (2) the sensitivity to initial conditions. |
Md Kowsher; Abdul Rafae Khan; Jia Xu; | arxiv-cs.LG | 2024-02-14 |
778 | An Analysis of Language Frequency and Error Correction for Esperanto Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current Grammatical Error Correction (GEC) initiatives tend to focus on major languages, with less attention given to low-resource languages like Esperanto. In this article, we begin to bridge this gap by first conducting a comprehensive frequency analysis using the Eo-GP dataset, created explicitly for this purpose. |
Junhong Liang; | arxiv-cs.CL | 2024-02-14 |
779 | API Pack: A Massive Multi-Programming Language Dataset for API Call Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce API Pack, a massive multi-programming language dataset containing more than 1 million instruction-API call pairs to improve the API call generation capabilities of large language models. |
Zhen Guo; Adriana Meza Soria; Wei Sun; Yikang Shen; Rameswar Panda; | arxiv-cs.CL | 2024-02-14 |
780 | Research and Application of Transformer Based Anomaly Detection Model: A Literature Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To inspire research on Transformer-based anomaly detection, this review offers a fresh perspective on the concept of anomaly detection. |
Mingrui Ma; Lansheng Han; Chunjie Zhou; | arxiv-cs.LG | 2024-02-14 |
781 | Predictive Analytics in Mental Health Leveraging LLM Embeddings and Machine Learning Models for Social Media Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The prevalence of stress-related disorders has increased significantly in recent years, necessitating scalable methods to identify affected individuals. This paper proposes a … |
AHMAD RADWAN et. al. | Int. J. Web Serv. Res. | 2024-02-14 |
782 | Grounding LLMs For Robot Task Planning Using Closed-loop State Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an innovative planning algorithm that integrates LLMs into the robotics context, enhancing task-focused execution and success rates. |
Vineet Bhat; Ali Umut Kaypak; Prashanth Krishnamurthy; Ramesh Karri; Farshad Khorrami; | arxiv-cs.RO | 2024-02-13 |
783 | The Last JITAI? The Unreasonable Effectiveness of Large Language Models in Issuing Just-in-Time Adaptive Interventions: Fostering Physical Activity in A Prospective Cardiac Rehabilitation Setting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigated the viability of using Large Language Models (LLMs) for triggering and personalizing content for Just-in-Time Adaptive Interventions (JITAIs) in digital health. |
DAVID HAAG et. al. | arxiv-cs.HC | 2024-02-13 |
784 | Auditing Counterfire: Evaluating Advanced Counterargument Generation with Evidence and Style Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We audited large language models (LLMs) for their ability to create evidence-based and stylistic counter-arguments to posts from the Reddit ChangeMyView dataset. We benchmarked … |
Preetika Verma; Kokil Jaidka; Svetlana Churina; | arxiv-cs.CL | 2024-02-13 |
785 | Measuring and Controlling Instruction (In)Stability in Language Model Dialogs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To combat attention decay and instruction drift, we propose a lightweight method called split-softmax, which compares favorably against two strong baselines. |
KENNETH LI et. al. | arxiv-cs.CL | 2024-02-13 |
786 | Combining Insights From Multiple Large Language Models Improves Diagnostic Accuracy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Background: Large language models (LLMs) such as OpenAI’s GPT-4 or Google’s PaLM 2 are proposed as viable diagnostic support tools or even spoken of as replacements for curbside consults. |
Gioele Barabucci; Victor Shia; Eugene Chu; Benjamin Harack; Nathan Fu; | arxiv-cs.AI | 2024-02-13 |
787 | Addressing Cognitive Bias in Medical Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we developed BiasMedQA, a benchmark for evaluating cognitive biases in LLMs applied to medical tasks. |
SAMUEL SCHMIDGALL et. al. | arxiv-cs.CL | 2024-02-12 |
788 | Lissard: Long and Simple Sequential Reasoning Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Lissard, a benchmark comprising seven tasks whose goal is to assess the ability of models to process and generate wide-range sequence lengths, requiring repetitive procedural execution. |
Mirelle Bueno; Roberto Lotufo; Rodrigo Nogueira; | arxiv-cs.CL | 2024-02-12 |
789 | CyberMetric: A Benchmark Dataset Based on Retrieval-Augmented Generation for Evaluating LLMs in Cybersecurity Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To accurately test the general knowledge of LLMs in cybersecurity, the research community needs a diverse, accurate, and up-to-date dataset. To address this gap, we present CyberMetric-80, CyberMetric-500, CyberMetric-2000, and CyberMetric-10000, which are multiple-choice Q&A benchmark datasets comprising 80, 500, 2000, and 10,000 questions respectively. |
Norbert Tihanyi; Mohamed Amine Ferrag; Ridhi Jain; Tamas Bisztray; Merouane Debbah; | arxiv-cs.AI | 2024-02-12 |
790 | Investigating The Impact of Data Contamination of Large Language Models in Text-to-SQL Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the impact of Data Contamination on the performance of GPT-3.5 in the Text-to-SQL code-generating tasks. |
FEDERICO RANALDI et. al. | arxiv-cs.CL | 2024-02-12 |
791 | Transformer Mechanisms Mimic Frontostriatal Gating Operations When Trained on Human Working Memory Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we analyze the mechanisms that emerge within a vanilla attention-only Transformer trained on a simple sequence modeling task inspired by a task explicitly designed to study working memory gating in computational cognitive neuroscience. |
Aaron Traylor; Jack Merullo; Michael J. Frank; Ellie Pavlick; | arxiv-cs.AI | 2024-02-12 |
792 | Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We next present the M2-BERT retrieval encoder, an 80M parameter state-space encoder model built from the Monarch Mixer architecture, capable of scaling to documents up to 32K tokens long. We describe a pretraining data mixture which allows this encoder to process both short and long context sequences, and a finetuning approach that adapts this base model to retrieval with only single-sample batches. |
Jon Saad-Falcon; Daniel Y. Fu; Simran Arora; Neel Guha; Christopher Ré; | arxiv-cs.IR | 2024-02-12 |
793 | Enhancing Programming Error Messages in Real Time with Generative AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We extend this work by implementing feedback from ChatGPT for all programs submitted to our automated assessment tool, Athene, providing help for compiler, run-time, and logic errors. |
BAILEY KIMMEL et. al. | arxiv-cs.HC | 2024-02-12 |
794 | Enhancing Multi-Criteria Decision Analysis with AI: Integrating Analytic Hierarchy Process and GPT-4 for Automated Decision Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study presents a new framework that incorporates the Analytic Hierarchy Process (AHP) and Generative Pre-trained Transformer 4 (GPT-4) large language model (LLM), bringing novel approaches to cybersecurity Multiple-criteria Decision Making (MCDA). |
Igor Svoboda; Dmytro Lande; | arxiv-cs.AI | 2024-02-11 |
795 | Leveraging AI to Advance Science and Computing Education Across Africa: Challenges, Progress and Opportunities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this chapter, we discuss challenges with using AI to advance education across Africa. |
George Boateng; | arxiv-cs.CY | 2024-02-11 |
796 | Gemini Goes to Med School: Exploring The Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our analysis revealed that Gemini is highly susceptible to hallucinations, overconfidence, and knowledge gaps, which indicate risks if deployed uncritically. |
Ankit Pal; Malaikannan Sankarasubbu; | arxiv-cs.CL | 2024-02-10 |
797 | Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection Via Retrieval-Augmented GPT-4 and LLaMA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study details our approach for the CASE 2024 Shared Task on Climate Activism Stance and Hate Event Detection, focusing on Hate Speech Detection, Hate Speech Target Identification, and Stance Detection as classification challenges. |
MAREK ŠUPPA et. al. | arxiv-cs.CL | 2024-02-09 |
798 | UrbanKGent: A Unified Large Language Model Agent Framework for Urban Knowledge Graph Construction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents UrbanKGent, a unified large language model agent framework, for urban knowledge graph construction. |
Yansong Ning; Hao Liu; | arxiv-cs.AI | 2024-02-09 |
799 | Exploring Visual Culture Awareness in GPT-4V: A Comprehensive Probing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we introduce three vision-related tasks, i.e., caption classification, pairwise captioning, and culture tag selection, to systematically delve into fine-grained visual cultural evaluation. |
YONG CAO et. al. | arxiv-cs.CL | 2024-02-08 |
800 | Sparse-VQ Transformer: An FFN-Free Framework with Vector Quantization for Enhanced Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, time series data are uniquely challenging due to significant distribution shifts and intrinsic noise levels. To address these two challenges, we introduce the Sparse Vector Quantized FFN-Free Transformer (Sparse-VQ). |
YANJUN ZHAO et. al. | arxiv-cs.LG | 2024-02-08 |
801 | FACT-GPT: Fact-Checking Augmentation Via Claim Matching with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the societal challenge, we introduce FACT-GPT, a system leveraging Large Language Models (LLMs) to automate the claim matching stage of fact-checking. |
Eun Cheol Choi; Emilio Ferrara; | arxiv-cs.CL | 2024-02-08 |
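Claim matching here reduces to asking an LLM whether a social-media post repeats a previously fact-checked claim. A hedged, zero-shot prompt sketch follows; the wording and label set are ours, not the system's:

```python
def claim_matching_prompt(post: str, checked_claim: str) -> str:
    """Illustrative prompt for the claim-matching stage of fact-checking."""
    return (
        "You are assisting fact-checkers.\n"
        f"Fact-checked claim: {checked_claim}\n"
        f"Social media post: {post}\n"
        "Does the post make the same claim as the fact-checked claim? "
        "Answer with one of: MATCH, NO_MATCH, UNRELATED."
    )
```

The returned string would be sent to any chat-completion model; FACT-GPT's actual prompts, label space, and few-shot setup may differ.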
802 | Named Entity Recognition for Address Extraction in Speech-to-Text Transcriptions Using Synthetic Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an approach for building a Named Entity Recognition (NER) model upon a Bidirectional Encoder Representations from Transformers (BERT) architecture, specifically utilizing the SlovakBERT model. |
Bibiána Lajčinová; Patrik Valábek; Michal Spišiak; | arxiv-cs.CL | 2024-02-08 |
803 | Limits of Transformer Language Models on Learning to Compose Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we evaluate training LLaMA models and prompting GPT-4 and Gemini on four tasks demanding to learn a composition of several discrete sub-tasks. |
JONATHAN THOMM et. al. | arxiv-cs.LG | 2024-02-08 |
804 | Generative Pre-Trained Transformer (GPT) in Research: A Systematic Review on Data Augmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: GPT (Generative Pre-trained Transformer) represents advanced language models that have significantly reshaped the academic writing landscape. These sophisticated language models … |
F. Sufi; | Inf. | 2024-02-08 |
805 | Traditional Machine Learning Models and Bidirectional Encoder Representations From Transformer (BERT)-Based Automatic Classification of Tweets About Eating Disorders: Algorithm Development and Validation Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: Our goal was to identify efficient machine learning models for categorizing tweets related to eating disorders. |
José Alberto Benítez-Andrades; José-Manuel Alija-Pérez; Maria-Esther Vidal; Rafael Pastor-Vargas; María Teresa García-Ordás; | arxiv-cs.CL | 2024-02-08 |
806 | Model Editing with Canonical Examples Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce model editing with canonical examples, a setting in which (1) a single learning example is provided per desired behavior, (2) evaluation is performed exclusively out-of-distribution, and (3) deviation from an initial model is strictly limited. |
JOHN HEWITT et. al. | arxiv-cs.CL | 2024-02-08 |
807 | Efficient Models for The Detection of Hate, Abuse and Profanity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This is unacceptable in civil discourse. The detection of Hate, Abuse and Profanity in text is a vital component of creating civil and unbiased LLMs, which is needed not only for English, but for all languages. In this article, we briefly describe the creation of HAP detectors and various ways of using them to make models civil and acceptable in the output they generate. |
Christoph Tillmann; Aashka Trivedi; Bishwaranjan Bhattacharjee; | arxiv-cs.CL | 2024-02-08 |
808 | Opening The AI Black Box: Program Synthesis Via Mechanistic Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present MIPS, a novel method for program synthesis based on automated mechanistic interpretability of neural networks trained to perform the desired task, auto-distilling the learned algorithm into Python code. |
ERIC J. MICHAUD et. al. | arxiv-cs.LG | 2024-02-07 |
809 | Improving Cross-Domain Low-Resource Text Generation Through LLM Post-Editing: A Programmer-Interpreter Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, the editing strategies in these methods are not optimally designed for text-generation tasks. To address these limitations, we propose a neural programmer-interpreter approach that preserves the domain generalization ability of LLMs when editing their output. |
Zhuang Li; Levon Haroutunian; Raj Tumuluri; Philip Cohen; Gholamreza Haffari; | arxiv-cs.CL | 2024-02-07 |
810 | Spiking-PhysFormer: Camera-Based Remote Photoplethysmography with Parallel Spike-driven Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To the best of our knowledge, we are the first to introduce SNNs into the realm of rPPG, proposing a hybrid neural network (HNN) model, the Spiking-PhysFormer, aimed at reducing power consumption. |
MINGXUAN LIU et. al. | arxiv-cs.CV | 2024-02-07 |
811 | Behind The Screen: Investigating ChatGPT’s Dark Personality Traits and Conspiracy Beliefs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: ChatGPT is notorious for its opaque behavior. This paper tries to shed light on this, providing an in-depth analysis of the dark personality traits and conspiracy beliefs of GPT-3.5 and GPT-4. |
Erik Weber; Jérôme Rutinowski; Markus Pauly; | arxiv-cs.CL | 2024-02-06 |
812 | The Use of A Large Language Model for Cyberbullying Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Several machine learning (ML) algorithms have been proposed for this purpose. |
Bayode Ogunleye; Babitha Dharmaraj; | arxiv-cs.CL | 2024-02-06 |
813 | The Hedgehog & The Porcupine: Expressive Linear Attentions with Softmax Mimicry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We thus propose Hedgehog, a learnable linear attention that retains the spiky and monotonic properties of softmax attention while maintaining linear complexity. |
Michael Zhang; Kush Bhatia; Hermann Kumbong; Christopher Ré; | arxiv-cs.LG | 2024-02-06 |
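For orientation: linear attention replaces softmax(QK^T)V with phi(Q)(phi(K)^T V), which is linear rather than quadratic in sequence length; Hedgehog's point is to make phi learnable so the resulting weights keep softmax's spiky, monotonic shape. A generic non-causal sketch, where the feature map is a stand-in rather than the paper's exact parameterization:

```python
import torch
import torch.nn as nn

class LearnableFeatureMap(nn.Module):
    """Stand-in learnable feature map: Hedgehog trains phi so that
    phi(q)·phi(k) mimics softmax attention weights."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        return torch.exp(self.proj(x))  # positive features keep weights valid

def linear_attention(q, k, v, phi):
    """Non-causal variant for brevity; causal decoding uses running sums."""
    q, k = phi(q), phi(k)                     # (batch, len, dim)
    kv = torch.einsum("bld,ble->bde", k, v)   # accumulate K^T V once
    z = (q @ k.sum(dim=1, keepdim=True).transpose(1, 2)).clamp_min(1e-6)
    return torch.einsum("bld,bde->ble", q, kv) / z
```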
814 | Chinese Cyberbullying Detection Using XLNet and Deep Bi-LSTM Hybrid Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The popularization of the internet and the widespread use of smartphones have led to a rapid growth in the number of social media users. While information technology has brought … |
Shifeng Chen; Jialin Wang; Ketai He; | Inf. | 2024-02-06 |
815 | Grandmaster-Level Chess Without Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike traditional chess engines that rely on complex heuristics, explicit search, or a combination of both, we train a 270M parameter transformer model with supervised learning on a dataset of 10 million chess games. |
ANIAN RUOSS et. al. | arxiv-cs.LG | 2024-02-06 |
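With action-values in hand, play requires no tree search: score each legal move with one forward pass and take the argmax. A schematic sketch using the python-chess library, with the trained network stubbed out (the real system's board/move encoding and value binning differ):

```python
import chess  # python-chess handles board state; the model is hypothetical

def predict_win_prob(board_fen: str, move_uci: str) -> float:
    """Stand-in for the trained 270M-parameter transformer's
    action-value head: estimated win probability after the move."""
    raise NotImplementedError

def pick_move(board: chess.Board) -> chess.Move:
    # No search: one forward pass per legal move, then argmax.
    return max(board.legal_moves,
               key=lambda m: predict_win_prob(board.fen(), m.uci()))
```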
816 | CEHR-GPT: Generating Electronic Health Records with Chronological Patient Timelines Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on synthetic data generation and demonstrate the capability of training a GPT model using a particular patient representation derived from CEHR-BERT, enabling us to generate patient sequences that can be seamlessly converted to the Observational Medical Outcomes Partnership (OMOP) data format. |
CHAO PANG et. al. | arxiv-cs.LG | 2024-02-06 |
817 | A Survey on Transformer Compression Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer plays a vital role in the realms of natural language processing (NLP) and computer vision (CV), specially for constructing large language models (LLM) and large vision … |
YEHUI TANG et. al. | ArXiv | 2024-02-05 |
818 | Identifying Reasons for Contraceptive Switching from Real-World Data Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We demonstrate that GPT-4 can accurately extract reasons for contraceptive switching, outperforming baseline BERT-based models with microF1 scores of 0.849 and 0.881 for contraceptive start and stop extraction, respectively. |
BRENDA Y. MIAO et. al. | arxiv-cs.CL | 2024-02-05 |
819 | MobilityGPT: Enhanced Human Mobility Modeling with A GPT Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these issues, we reformat human mobility modeling as an autoregressive generation task, leveraging Generative Pre-trained Transformer (GPT). To ensure its controllable generation to alleviate the above challenges, we propose a geospatially-aware generative model, MobilityGPT. |
Ammar Haydari; Dongjie Chen; Zhengfeng Lai; Chen-Nee Chuah; | arxiv-cs.LG | 2024-02-05 |
820 | Self-Discover: Large Language Models Self-Compose Reasoning Structures IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the task-intrinsic reasoning structures to tackle complex reasoning problems that are challenging for typical prompting methods. |
PEI ZHOU et. al. | arxiv-cs.AI | 2024-02-05 |
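As described, SELF-DISCOVER is a two-stage prompting scheme: the model first composes a task-specific reasoning structure out of atomic reasoning modules, then follows that structure to solve the task. A skeletal sketch; call_llm, the module list, and the prompt wording are our assumptions:

```python
REASONING_MODULES = [
    "Break the problem into sub-problems.",
    "Think step by step.",
    "Reflect on potential errors in the reasoning.",
]  # illustrative; the paper draws on a larger seed set of modules

def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around any chat-completion API."""
    raise NotImplementedError

def self_discover(task: str) -> str:
    # Stage 1: the model self-composes a reasoning structure for this task.
    structure = call_llm(
        f"Task: {task}\nSelect, adapt, and compose the useful modules "
        "below into a step-by-step reasoning structure (as JSON):\n"
        + "\n".join(REASONING_MODULES)
    )
    # Stage 2: the model solves the task by following its own structure.
    return call_llm(
        f"Task: {task}\nFollow this reasoning structure, filling in each "
        f"step, to reach a final answer:\n{structure}"
    )
```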
821 | UniMem: Towards A Unified View of Long-Context Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We reformulate 16 existing methods based on UniMem and analyze four representative methods: Transformer-XL, Memorizing Transformer, RMT, and Longformer into equivalent UniMem forms to reveal their design principles and strengths. Based on these analyses, we propose UniMix, an innovative approach that integrates the strengths of these algorithms. |
JUNJIE FANG et. al. | arxiv-cs.CL | 2024-02-05 |
822 | Pard: Permutation-Invariant Autoregressive Diffusion for Graph Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce PARD, a Permutation-invariant Auto Regressive Diffusion model that integrates diffusion models with autoregressive methods. |
Lingxiao Zhao; Xueying Ding; Leman Akoglu; | arxiv-cs.LG | 2024-02-05 |
823 | Conversation Reconstruction Attack Against GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We will responsibly disclose our findings to the suppliers of related large language models. |
Junjie Chu; Zeyang Sha; Michael Backes; Yang Zhang; | arxiv-cs.CR | 2024-02-05 |
824 | Harnessing PubMed User Query Logs for Post Hoc Explanations of Recommended Similar Articles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our major contribution is building PubCLogs by repurposing 5.6 million pairs of co-clicked articles from PubMed’s user query logs. |
Ashley Shin; Qiao Jin; James Anibal; Zhiyong Lu; | arxiv-cs.IR | 2024-02-05 |
825 | SWAG: Storytelling With Action Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Storytelling With Action Guidance (SWAG), a novel approach to storytelling with LLMs. |
Zeeshan Patel; Karim El-Refai; Jonathan Pei; Tianle Li; | arxiv-cs.CL | 2024-02-05 |
826 | Comparing Abstraction in Humans and Large Language Models Using Multimodal Serial Reproduction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To investigate the effect of language on the formation of abstractions, we implement a novel multimodal serial reproduction framework by asking people who receive a visual stimulus to reproduce it in a linguistic format, and vice versa. |
SREEJAN KUMAR et. al. | arxiv-cs.AI | 2024-02-05 |
827 | DenseFormer: Enhancing Information Flow in Transformers Via Depth Weighted Averaging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The transformer architecture by Vaswani et al. (2017) is now ubiquitous across application domains, from natural language processing to speech processing and image understanding. We propose DenseFormer, a simple modification to the standard architecture that improves the perplexity of the model without increasing its size — adding a few thousand parameters for large-scale models in the 100B parameters range. |
Matteo Pagliardini; Amirkeivan Mohtashami; Francois Fleuret; Martin Jaggi; | arxiv-cs.CL | 2024-02-04 |
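The described change is small enough to sketch: after each block, replace the running representation with a learned weighted average of the current output and every earlier depth (embeddings included), adding only a handful of scalars per layer. A minimal PyTorch reading of that idea:

```python
import torch
import torch.nn as nn

class DenseFormerStack(nn.Module):
    """Sketch of depth-weighted averaging (DWA) between ordinary
    transformer blocks; only the alpha scalars are new parameters."""
    def __init__(self, blocks):
        super().__init__()
        self.blocks = nn.ModuleList(blocks)
        # alphas[i] weights representations 0..i+1 after block i.
        self.alphas = nn.ParameterList(
            nn.Parameter(torch.zeros(i + 2)) for i in range(len(blocks)))
        for a in self.alphas:
            a.data[-1] = 1.0  # start as the vanilla residual stream

    def forward(self, x):
        history = [x]  # the embedding output counts as depth 0
        for block, alpha in zip(self.blocks, self.alphas):
            history.append(block(history[-1]))
            # Depth-weighted average over all representations so far.
            history[-1] = sum(w * h for w, h in zip(alpha, history))
        return history[-1]
```

Initializing the newest weight to 1 and the rest to 0 makes the stack start out exactly as a standard transformer, which seems consistent with the paper's claim of adding only a few thousand parameters at the 100B scale.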
828 | Evaluating Large Language Models in Analysing Classroom Dialogue Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the application of Large Language Models (LLMs), specifically GPT-4, in the analysis of classroom dialogue, a crucial research task for both teaching diagnosis and quality improvement. |
Yun Long; Haifeng Luo; Yu Zhang; | arxiv-cs.CL | 2024-02-04 |
829 | Improving Assessment of Tutoring Practices Using Retrieval-Augmented Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: One-on-one tutoring is an effective instructional method for enhancing learning, yet its efficacy hinges on tutor competencies. Novice math tutors often prioritize … |
ZIFEI HAN et. al. | ArXiv | 2024-02-04 |
830 | Data Quality Matters: Suicide Intention Detection on Social Media Posts Using A RoBERTa-CNN Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on identifying suicidal intentions in SuicideWatch Reddit posts and present a novel approach to suicide detection using the cutting-edge RoBERTa-CNN model, a variant of RoBERTa (Robustly optimized BERT approach). |
Emily Lin; Jian Sun; Hsingyu Chen; Mohammad H. Mahoor; | arxiv-cs.CL | 2024-02-03 |
831 | Spin: An Efficient Secure Computation Framework with GPU Acceleration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose optimized protocols for non-linear functions that are critical for machine learning, as well as several novel optimizations specific to attention, the fundamental unit of Transformer models, allowing Spin to perform non-trivial CNN training and Transformer inference without sacrificing security. |
WUXUAN JIANG et. al. | arxiv-cs.CR | 2024-02-03 |
832 | GPT-4V As Traffic Assistant: An In-depth Look at Vision Language Model on Complex Traffic Events Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The advent of large vision-language models (VLMs) such as GPT-4V, has introduced innovative approaches to addressing this issue. In this paper, we explore the ability of GPT-4V with a set of representative traffic incident videos and delve into the model’s capacity of understanding these complex traffic situations. |
Xingcheng Zhou; Alois C. Knoll; | arxiv-cs.CV | 2024-02-03 |
833 | User Intent Recognition and Satisfaction with Large Language Models: A User Study with ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on a fine-grained intent taxonomy and intent-based prompt reformulations, we analyze (1) the quality of intent recognition and (2) user satisfaction with answers from intent-based prompt reformulations for two recent ChatGPT models, GPT-3.5 Turbo and GPT-4 Turbo. |
Anna Bodonhelyi; Efe Bozkir; Shuo Yang; Enkelejda Kasneci; Gjergji Kasneci; | arxiv-cs.HC | 2024-02-03 |
834 | ScribFormer: Transformer Makes CNN Work Better for Scribble-based Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite its multiple benefits, this framework can generally capture only short-range feature dependencies, since convolutional layers have local receptive fields, which makes it difficult to learn global shape information from the limited cues provided by scribble annotations. To address this issue, this paper proposes a new CNN-Transformer hybrid solution for scribble-supervised medical image segmentation called ScribFormer. |
ZIHAN LI et. al. | arxiv-cs.CV | 2024-02-02 |
835 | LLM-Detector: Improving AI-Generated Chinese Text Detection with Open-Source LLM Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: ChatGPT and other general large language models (LLMs) have achieved remarkable success, but they have also raised concerns about the misuse of AI-generated texts. |
RONGSHENG WANG et. al. | arxiv-cs.CL | 2024-02-02 |
836 | MasonPerplexity at Multimodal Hate Speech Event Detection 2024: Hate Speech and Target Detection Using Transformer Ensembles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the MasonPerplexity submission for the Shared Task on Multimodal Hate Speech Event Detection at CASE 2024 at EACL 2024. |
AMRITA GANGULY et. al. | arxiv-cs.CL | 2024-02-02 |
837 | Faster Inference of Integer SWIN Transformer By Removing The GELU Activation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we improve upon the inference latency of the state-of-the-art methods by removing the floating-point operations, which are associated with the GELU activation in Swin Transformer. |
Mohammadreza Tayaranian; Seyyed Hasan Mozafari; James J. Clark; Brett Meyer; Warren Gross; | arxiv-cs.CV | 2024-02-02 |
838 | COMET: Generating Commit Messages Using Delta Graph Context Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Comet (Context-Aware Commit Message Generation), a novel approach that captures context of code changes using a graph-based representation and leverages a transformer-based model to generate high-quality commit messages. |
Abhinav Reddy Mandli; Saurabhsingh Rajput; Tushar Sharma; | arxiv-cs.SE | 2024-02-02 |
839 | Ensemble of Ghost Convolution Block with Nested Transformer Encoder for Dense Object Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
Ponduri Vasanthi; L. Mohan; | Biomed. Signal Process. Control. | 2024-02-01 |
840 | Ultra Fast Transformers on FPGAs for Particle Physics Experiments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we have implemented critical components of a transformer model, such as multi-head attention and softmax layers. |
ZHIXING JIANG et. al. | arxiv-cs.LG | 2024-02-01 |
841 | Rail Surface Defect Detection Using A Transformer-based Network Related Papers Related Patents Related Grants Related Venues Related Experts View |
Feng Guo; Jian Liu; Yu Qian; Quanyi Xie; | J. Ind. Inf. Integr. | 2024-02-01 |
842 | Intelligent Fault Diagnosis of Consumer Electronics Sensor in IoE Via Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The IoE era is coming with the development of information and communication technology. As a typical representative of the IoE intelligent era, consumer electronics products have … |
Wen-Chieh Lin; | IEEE Transactions on Consumer Electronics | 2024-02-01 |
843 | COA-GPT: Generative Pre-trained Transformers for Accelerated Course of Action Development in Military Operations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The development of Courses of Action (COAs) in military operations is traditionally a time-consuming and intricate process. Addressing this challenge, this study introduces COA-GPT, a novel algorithm employing Large Language Models (LLMs) for rapid and efficient generation of valid COAs. |
Vinicius G. Goecks; Nicholas Waytowich; | arxiv-cs.AI | 2024-02-01 |
844 | Comparative Study of Large Language Model Architectures on Frontier Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Employing the same materials science text corpus and a comprehensive end-to-end pipeline, we conduct a comparative analysis of their training and downstream performance. |
Junqi Yin; Avishek Bose; Guojing Cong; Isaac Lyngaas; Quentin Anthony; | arxiv-cs.DC | 2024-02-01 |
845 | HARDSEA: Hybrid Analog-ReRAM Clustering and Digital-SRAM In-Memory Computing Accelerator for Dynamic Sparse Self-Attention in Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Self-attention-based transformers have outperformed recurrent and convolutional neural networks (RNN/ CNNs) in many applications. Despite the effectiveness, calculating … |
SHIWEI LIU et. al. | IEEE Transactions on Very Large Scale Integration (VLSI) … | 2024-02-01 |
846 | Understanding The Expressive Power and Mechanisms of Transformer for Sequence Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct a systematic study of the approximation properties of Transformer for sequence modeling with long, sparse and complicated memory. |
Mingze Wang; Weinan E; | arxiv-cs.LG | 2024-02-01 |
847 | Masked Siamese Prompt Tuning for Few-Shot Natural Language Understanding Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recently, prompt-based learning has shown excellent performance on few-shot scenarios. Using frozen language models to tune trainable continuous prompt embeddings has become a … |
Shiwen Ni; Hung-Yu Kao; | IEEE Transactions on Artificial Intelligence | 2024-02-01 |
848 | RobinNet: A Multimodal Speech Emotion Recognition System With Speaker Recognition for Social Interactions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: It is essential to understand the underlying emotions that are imparted through speech in order to study social communications as well as to generate seamless human–computer … |
Yash Khurana; Swamita Gupta; R. Sathyaraj; S. Raja; | IEEE Transactions on Computational Social Systems | 2024-02-01 |
849 | Lesion Identification in Fundus Images Via Convolutional Neural Network-vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jian Lian; Tianyu Liu; | Biomed. Signal Process. Control. | 2024-02-01 |
850 | HTC-Net: A Hybrid CNN-transformer Framework for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
HUI TANG et. al. | Biomed. Signal Process. Control. | 2024-02-01 |
851 | Generation, Distillation and Evaluation of Motivational Interviewing-Style Reflections with A Foundational Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a method for distilling the generation of reflections from a Foundational Language Model (GPT-4) into smaller models. |
ANDREW BROWN et. al. | arxiv-cs.CL | 2024-02-01 |
852 | Self-Supervised Contrastive Pre-Training for Multivariate Point Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a new paradigm for self-supervised learning for multivariate point processes using a transformer encoder. |
Xiao Shou; Dharmashankar Subramanian; Debarun Bhattacharjya; Tian Gao; Kristin P. Bennet; | arxiv-cs.LG | 2024-02-01 |
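The highlight names the ingredients, a transformer encoder over event sequences plus self-supervised contrastive pre-training, but not the exact objective. For orientation only, here is a generic InfoNCE loss of the kind such pre-training typically uses; how positives are formed for multivariate point processes is the paper's contribution and is not reproduced here:

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, temperature=0.1):
    """Generic InfoNCE: row i of `positive` is the positive view of row i
    of `anchor` (e.g., two views of the same event sequence); every other
    row in the batch acts as a negative."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature          # (batch, batch) similarities
    labels = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, labels)
```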
853 | Dendritic Learning-Incorporated Vision Transformer for Image Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View |
Zhiming Zhang; Zhenyu Lei; M. Omura; Hideyuki Hasegawa; Shangce Gao; | IEEE CAA J. Autom. Sinica | 2024-02-01 |
854 | Towards Integrated and Fine-grained Traffic Forecasting: A Spatio-Temporal Heterogeneous Graph Transformer Approach Related Papers Related Patents Related Grants Related Venues Related Experts View |
GUANGYUE LI et. al. | Inf. Fusion | 2024-02-01 |
855 | Mitigating The Problem of Strong Priors in LMs with Context Extrapolation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We apply it to eleven models including GPT-2, GPT-3, Llama 2, and Mistral on four tasks, and find improvements in 41 of the 44 model-task combinations. |
Raymond Douglas; Andis Draguns; Tomáš Gavenčiak; | arxiv-cs.CL | 2024-01-31 |
856 | Evaluating The Capabilities of LLMs for Supporting Anticipatory Impact Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we demonstrate the potential for generating high-quality and diverse impacts of AI in society by fine-tuning completion models (GPT-3 and Mistral-7B) on a diverse sample of articles from news media and comparing those outputs to the impacts generated by instruction-based (GPT-4 and Mistral-7B-Instruct) models. |
Mowafak Allaham; Nicholas Diakopoulos; | arxiv-cs.CL | 2024-01-31 |
857 | Paramanu: A Family of Novel Efficient Indic Generative Foundation Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Gyan AI Paramanu (atom), a family of novel language models for Indian languages. |
Mitodru Niyogi; Arnab Bhattacharya; | arxiv-cs.CL | 2024-01-31 |
858 | Global-Liar: Factuality of LLMs Over Time and Geographic Regions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce ‘Global-Liar,’ a dataset uniquely balanced in terms of geographic and temporal representation, facilitating a more nuanced evaluation of LLM biases. |
Shujaat Mirza; Bruno Coelho; Yuyuan Cui; Christina Pöpper; Damon McCoy; | arxiv-cs.CL | 2024-01-31 |
859 | Evaluating The Effectiveness of GPT-4 Turbo in Creating Defeaters for Assurance Cases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Identifying defeaters, arguments that refute these ACs, is essential for improving the robustness and confidence in ACs. To automate this task, we introduce a novel method that leverages the capabilities of GPT-4 Turbo, an advanced Large Language Model (LLM) developed by OpenAI, to identify defeaters within ACs formalized using the Eliminative Argumentation (EA) notation. |
KIMYA KHAKZAD SHAHANDASHTI et. al. | arxiv-cs.SE | 2024-01-31 |
860 | Fine-Tuning and Prompt Engineering for Large Language Models-based Code Review Automation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: We aim to investigate the performance of LLMs-based code review automation based on two contexts, i.e., when LLMs are leveraged by fine-tuning and prompting. |
Chanathip Pornprasit; Chakkrit Tantithamthavorn; | arxiv-cs.SE | 2024-01-31 |
861 | Spatial-Spectral BERT for Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Several deep learning and transformer models have been recommended in previous research to deal with the classification of hyperspectral images (HSIs). Among them, one of the most … |
MAHMOOD ASHRAF et. al. | Remote. Sens. | 2024-01-31 |
862 | ConSmax: Hardware-Friendly Alternative Softmax with Learnable Parameters Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Apart from the non-linearity, the low arithmetic intensity greatly reduces the processing parallelism, which becomes the bottleneck especially when dealing with a longer context. To address this challenge, we propose Constant Softmax (ConSmax), a software-hardware co-design as an efficient Softmax alternative. |
SHIWEI LIU et. al. | arxiv-cs.AR | 2024-01-31 |
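The entry above describes replacing softmax's row-wise reductions with learnable constants. As a rough illustration only (the paper's exact parameterization and hardware mapping are not reproduced here), a learnable-constant softmax substitute might look like this in PyTorch, with `beta` standing in for the per-row max and `gamma` for the normalizing sum:

```python
import torch
import torch.nn as nn

class ConstSoftmax(nn.Module):
    """Softmax-like activation with learnable scalar constants.

    Sketch of the general ConSmax idea: the per-row max subtraction and
    per-row sum of standard softmax are replaced by learnable scalars,
    so each element is computed independently, with no row-wise
    reduction at inference time. Details differ from the paper.
    """
    def __init__(self):
        super().__init__()
        self.beta = nn.Parameter(torch.zeros(1))   # stands in for max(x)
        self.gamma = nn.Parameter(torch.ones(1))   # stands in for sum(exp(x))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.exp(x - self.beta) / self.gamma

scores = torch.randn(2, 4, 8)           # e.g. attention logits
print(ConstSoftmax()(scores).shape)     # torch.Size([2, 4, 8])
```

Because the constants are learned rather than computed per row, the outputs are no longer guaranteed to sum to one; that relaxation is exactly what removes the reduction bottleneck the highlight mentions.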
863 | Towards AI-Assisted Synthesis of Verified Dafny Methods Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we demonstrate how to improve two pretrained models’ proficiency in the Dafny verification-aware language. |
Md Rakib Hossain Misu; Cristina V. Lopes; Iris Ma; James Noble; | arxiv-cs.SE | 2024-01-31 |
864 | Human-mediated Large Language Models for Robotic Intervention in Children with Autism Spectrum Disorders Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This practice restricts the use of robots to limited, pre-mediated instructional curricula. In this paper, we increase robot autonomy in one such robotic intervention for children with ASD by implementing perspective-taking teaching. |
Ruchik Mishra; Karla Conn Welch; Dan O Popa; | arxiv-cs.RO | 2024-01-31 |
865 | Lightweight Transformer Image Feature Extraction Network IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In recent years, the image feature extraction method based on Transformer has become a research hotspot. However, when using Transformer for image feature extraction, the model’s … |
Wenfeng Zheng; Siyu Lu; Youshuai Yang; Zhengtong Yin; Lirong Yin; | PeerJ Comput. Sci. | 2024-01-31 |
866 | BurstGPT: A Real-world Workload Dataset to Optimize LLM Serving Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work presents BurstGPT, an LLM serving workload with 5.29 million traces from regional Azure OpenAI GPT services over 121 days. |
YUXIN WANG et. al. | arxiv-cs.DC | 2024-01-31 |
867 | Scavenging Hyena: Distilling Transformers Into Long Convolution Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a pioneering approach to address the efficiency concerns associated with LLM pre-training, proposing the use of knowledge distillation for cross-architecture transfer. |
Tokiniaina Raharison Ralambomihanta; Shahrad Mohammadzadeh; Mohammad Sami Nur Islam; Wassim Jabbour; Laurence Liang; | arxiv-cs.CL | 2024-01-30 |
868 | SAL-PIM: A Subarray-level Processing-in-Memory Architecture with LUT-based Linear Interpolation for Transformer-based Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SAL-PIM, a subarray-level, HBM-based processing-in-memory architecture for the end-to-end acceleration of transformer-based text generation. |
Wontak Han; Hyunjun Cho; Donghyuk Kim; Joo-Young Kim; | arxiv-cs.AR | 2024-01-30 |
869 | Arabic Tweet Act: A Weighted Ensemble Pre-Trained Transformer Model for Classifying Arabic Speech Acts on Twitter Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a Twitter dialectal Arabic speech act classification approach based on a transformer deep learning neural network. |
Khadejaa Alshehri; Areej Alhothali; Nahed Alowidi; | arxiv-cs.CL | 2024-01-30 |
870 | Fine-tuning Transformer-based Encoder for Turkish Language Understanding Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we provide a Transformer-based model and a baseline benchmark for the Turkish Language. |
Savas Yildirim; | arxiv-cs.CL | 2024-01-30 |
871 | More Than Meets The AI: Evaluating The Performance of GPT-4 on Computer Graphics Assessment Questions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies have showcased the exceptional performance of LLMs (Large Language Models) on assessment questions across various discipline areas. This can be helpful if used to … |
Tony Haoran Feng; Paul Denny; Burkhard C. Wünsche; Andrew Luxton-Reilly; Steffan Hooper; | Proceedings of the 26th Australasian Computing Education … | 2024-01-29 |
872 | Response Generation for Cognitive Behavioral Therapy with Large Language Models: Comparative Study with Socratic Questioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we construct dialogue modules based on a CBT scenario focused on conventional Socratic questioning using two kinds of LLMs: a Transformer-based dialogue model further trained with a social media empathetic counseling dataset, provided by Osaka Prefecture (OsakaED), and GPT-4, a state-of-the art LLM created by OpenAI. |
KENTA IZUMI et. al. | arxiv-cs.CL | 2024-01-29 |
873 | TQCompressor: Improving Tensor Decomposition Methods in Neural Networks Via Permutations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce TQCompressor, a novel method for neural network model compression with improved tensor decompositions. |
V. ABRONIN et. al. | arxiv-cs.LG | 2024-01-29 |
874 | Breaking Free Transformer Models: Task-specific Context Attribution Promises Improved Generalizability Without Fine-tuning Pre-trained LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a framework that allows for maintaining generalizability, and enhances the performance on the downstream task by utilizing task-specific context attribution. |
Stepan Tytarenko; Mohammad Ruhul Amin; | arxiv-cs.CL | 2024-01-29 |
875 | PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the lack of a specialized, high-quality benchmark has impeded their development and precise evaluation. To address this, we introduce PathMMU, the largest and highest-quality expert-validated pathology benchmark for Large Multimodal Models (LMMs). |
YUXUAN SUN et. al. | arxiv-cs.CV | 2024-01-29 |
876 | 3DG: A Framework for Using Generative AI for Handling Sparse Learner Performance Data From Intelligent Tutoring Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Learning performance data (e.g., quiz scores and attempts) is significant for understanding learner engagement and knowledge mastery level. However, the learning performance data … |
Liang Zhang; Jionghao Lin; Conrad Borchers; Meng Cao; Xiangen Hu; | ArXiv | 2024-01-29 |
877 | You Tell Me: A Dataset of GPT-4-Based Behaviour Change Support Conversations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Conversational agents are increasingly used to address emotional needs on top of information needs. One use case of increasing interest are counselling-style mental health and … |
Selina Meyer; David Elsweiler; | arxiv-cs.HC | 2024-01-29 |
878 | Security Code Review By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: (4) GPT-4 is more adept at identifying security defects in code files that have fewer tokens, contain functional logic, and are written by developers with less involvement in the project. |
JIAXIN YU et. al. | arxiv-cs.SE | 2024-01-29 |
879 | Leveraging Professional Radiologists’ Expertise to Enhance LLMs’ Evaluation for Radiology Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Experimental results show that our Detailed GPT-4 (5-shot) model achieves a 0.48 score, outperforming the METEOR metric by 0.19, while our Regressed GPT-4 model shows even greater alignment with expert evaluations, exceeding the best existing metric by a 0.35 margin. |
QINGQING ZHU et. al. | arxiv-cs.CL | 2024-01-29 |
880 | Contrastive Learning and Mixture of Experts Enables Precise Vector Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We apply a novel Mixture of Experts (MoE) extension pipeline to pretrained BERT models, where every multi-layer perceptron section is enlarged and copied into multiple distinct experts. |
Logan Hallee; Rohan Kapur; Arjun Patel; Jason P. Gleghorn; Bohdan Khomtchouk; | arxiv-cs.LG | 2024-01-28 |
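The pipeline described above, copying each pretrained feed-forward block into several experts behind a router, can be sketched as follows. This is a hypothetical illustration of the stated idea, not the authors' code; `mlp_to_moe` and its dense (non-sparse) mixing are assumptions:

```python
import copy

import torch
import torch.nn as nn

def mlp_to_moe(mlp: nn.Module, hidden: int, n_experts: int = 4):
    """Copy a pretrained MLP block into distinct experts and mix their
    outputs per token with a learned router (dense mixture for brevity)."""
    experts = nn.ModuleList(copy.deepcopy(mlp) for _ in range(n_experts))
    router = nn.Linear(hidden, n_experts)

    def forward(x):                                      # x: (batch, seq, hidden)
        weights = torch.softmax(router(x), dim=-1)       # (batch, seq, E)
        outs = torch.stack([e(x) for e in experts], -1)  # (batch, seq, hidden, E)
        return (outs * weights.unsqueeze(2)).sum(-1)

    return forward

ffn = nn.Sequential(nn.Linear(16, 64), nn.GELU(), nn.Linear(64, 16))
moe = mlp_to_moe(ffn, hidden=16)
print(moe(torch.randn(2, 5, 16)).shape)  # torch.Size([2, 5, 16])
```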
881 | Evaluating LLM-Generated Multimodal Diagnosis from Medical Images and Symptom Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) constitute a breakthrough state-of-the-art Artificial Intelligence technology which is rapidly evolving and promises to aid in medical diagnosis. … |
Dimitrios P. Panagoulias; M. Virvou; G. Tsihrintzis; | ArXiv | 2024-01-28 |
882 | Identifying and Improving Disability Bias in GPT-Based Resume Screening Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, without examining the potential of bias, this may negatively impact marginalized populations, including people with disabilities. To address this important concern, we present a resume audit study, in which we ask ChatGPT (specifically, GPT-4) to rank a resume against the same resume enhanced with an additional leadership award, scholarship, panel presentation, and membership that are disability related. |
Kate Glazko; Yusuf Mohammed; Ben Kosa; Venkatesh Potluri; Jennifer Mankoff; | arxiv-cs.CY | 2024-01-28 |
883 | UnMASKed: Quantifying Gender Biases in Masked Language Models Through Linguistically Informed Job Market Prompts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluated six prominent models: BERT, RoBERTa, DistilBERT, BERT-multilingual, XLM-RoBERTa, and DistilBERT-multilingual. |
Iñigo Parra; | arxiv-cs.CL | 2024-01-28 |
884 | Semantics of Multiword Expressions in Transformer-Based Models: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Addressing this gap, we provide the first in-depth survey of MWE processing with transformer models. Overall, we find that they capture MWE semantics inconsistently, as shown by reliance on surface patterns and memorized information. |
Filip Miletić; Sabine Schulte im Walde; | arxiv-cs.CL | 2024-01-27 |
885 | Prompting Diverse Ideas: Increasing AI Idea Variance Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Unlike routine tasks where consistency is prized, in creativity and innovation the goal is to create a diverse set of ideas. This paper delves into the burgeoning interest in … |
Lennart Meincke; Ethan R. Mollick; Christian Terwiesch; | ArXiv | 2024-01-27 |
886 | A New Method for Vehicle Logo Recognition Based on Swin Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we implement real-time VLR using Swin Transformer and fine-tune it for optimal performance. |
Yang Li; Doudou Zhang; Jianli Xiao; | arxiv-cs.CV | 2024-01-27 |
887 | Large Language Model for Vulnerability Detection: Emerging Results and Future Directions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the effectiveness of LLMs in detecting software vulnerabilities is largely unexplored. This paper aims to bridge this gap by exploring how LLMs perform with various prompts, particularly focusing on two state-of-the-art LLMs: GPT-3.5 and GPT-4. |
Xin Zhou; Ting Zhang; David Lo; | arxiv-cs.SE | 2024-01-27 |
888 | Scalable Qualitative Coding with LLMs: Chain-of-Thought Reasoning Matches Human Performance in Some Hermeneutic Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Importantly, we find that coding fidelity improves considerably when the LLM is prompted to give rationale justifying its coding decisions (chain-of-thought reasoning). We present these and other findings along with a set of best practices for adapting traditional codebooks for LLMs. |
Zackary Okun Dunivin; | arxiv-cs.CL | 2024-01-26 |
889 | Dynamic Transformer Architecture for Continual Learning of Multimodal Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a transformer-based CL framework focusing on learning tasks that involve both vision and language, known as Vision-and-Language (VaL) tasks. |
Yuliang Cai; Mohammad Rostami; | arxiv-cs.CV | 2024-01-26 |
890 | From GPT-4 to Gemini and Beyond: Assessing The Landscape of MLLMs on Generalizability, Trustworthiness and Causality Through Four Modalities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Multi-modal Large Language Models (MLLMs) have shown impressive abilities in generating reasonable responses with respect to multi-modal content. |
CHAOCHAO LU et. al. | arxiv-cs.CV | 2024-01-26 |
891 | Investigate-Consolidate-Exploit: A General Strategy for Inter-Task Agent Self-Evolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Investigate-Consolidate-Exploit (ICE), a novel strategy for enhancing the adaptability and flexibility of AI agents through inter-task self-evolution. |
CHENG QIAN et. al. | arxiv-cs.CL | 2024-01-25 |
892 | Relative Value Biases in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Studies of reinforcement learning in humans and animals have demonstrated a preference for options that yielded relatively better outcomes in the past, even when those options are associated with lower absolute reward. |
William M. Hayes; Nicolas Yax; Stefano Palminteri; | arxiv-cs.CL | 2024-01-25 |
893 | Evaluating GPT-3.5’s Awareness and Summarization Abilities for European Constitutional Texts with Shared Topics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, using the renowned GPT-3.5, we leverage generative large language models to understand constitutional passages that transcend national boundaries. |
Candida M. Greco; A. Tagarelli; | arxiv-cs.CL | 2024-01-25 |
894 | MEDs for PETs: Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We train a multilingual transformer model (XLM-RoBERTa) to disambiguate potentially euphemistic terms (PETs) in multilingual and cross-lingual settings. |
PATRICK LEE et. al. | arxiv-cs.CL | 2024-01-25 |
895 | (Chat)GPT V BERT: Dawn of Justice for Semantic Change Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we specifically focus on the temporal problem of semantic change, and evaluate their ability to solve two diachronic extensions of the Word-in-Context (WiC) task: TempoWiC and HistoWiC. |
Francesco Periti; Haim Dubossarsky; Nina Tahmasebi; | arxiv-cs.CL | 2024-01-25 |
896 | An In-Depth Review of ChatGPT’s Pros and Cons for Learning and Teaching in Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: As technology progresses, there has been an increasing interest in using Chatbot GPT (Generative Pre-trained Transformer) in education. Chatbot GPT, or ChatGPT, gained one million … |
A. Samala; Xiaoming Zhai; Kumiko Aoki; Ljubiša Bojić; Simona Žikić; | Int. J. Interact. Mob. Technol. | 2024-01-25 |
897 | Unmasking and Quantifying Racial Bias of Large Language Models in Medical Report Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite a few attempts in the past, the precise impact and extent of these biases remain uncertain. Through both qualitative and quantitative analyses, we find that these models tend to project higher costs and longer hospitalizations for White populations and exhibit optimistic views in challenging medical scenarios with much higher survival rates. |
Yifan Yang; Xiaoyu Liu; Qiao Jin; Furong Huang; Zhiyong Lu; | arxiv-cs.CL | 2024-01-24 |
898 | ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce ConTextual, a novel dataset featuring human-crafted instructions that require context-sensitive reasoning for text-rich images. |
Rohan Wadhawan; Hritik Bansal; Kai-Wei Chang; Nanyun Peng; | arxiv-cs.CV | 2024-01-24 |
899 | Automated Root Causing of Cloud Incidents Using In-Context Learning with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the high cost of fine-tuning LLM, we propose an in-context learning approach for automated root causing, which eliminates the need for fine-tuning. |
XUCHAO ZHANG et. al. | arxiv-cs.CL | 2024-01-24 |
900 | Evaluation of General Large Language Models in Contextually Assessing Semantic Concepts Extracted from Adult Critical Care Electronic Health Record Notes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: We sought to evaluate the performance of LLMs in the complex clinical context of adult critical care medicine using systematic and comprehensible analytic methods, including clinician annotation and adjudication. |
DARREN LIU et. al. | arxiv-cs.CL | 2024-01-24 |
901 | A Comparative Study of Zero-shot Inference with Large Language Models and Supervised Modeling in Breast Cancer Pathology Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explored whether recent LLMs can reduce the need for large-scale data annotations. |
MADHUMITA SUSHIL et. al. | arxiv-cs.CL | 2024-01-24 |
902 | Discovering Mathematical Formulas from Data Via GPT-guided Monte Carlo Tree Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To optimize the trade-off between efficiency and versatility, we introduce SR-GPT, a novel algorithm for symbolic regression that integrates Monte Carlo Tree Search (MCTS) with a Generative Pre-Trained Transformer (GPT). |
YANJIE LI et. al. | arxiv-cs.LG | 2024-01-24 |
903 | Can GPT-3.5 Generate and Code Discharge Summaries? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We report micro- and macro-F1 scores on the full codeset, generation codes, and their families. |
MATÚŠ FALIS et. al. | arxiv-cs.CL | 2024-01-24 |
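For readers unfamiliar with the two aggregations reported above: micro-F1 pools all label decisions before computing the score, while macro-F1 averages per-class scores equally, so rare codes weigh more under macro. A quick scikit-learn check (toy labels, not the paper's data):

```python
from sklearn.metrics import f1_score

y_true = [0, 1, 2, 2, 1, 0]  # toy gold codes
y_pred = [0, 2, 2, 2, 1, 1]  # toy predictions

print(f1_score(y_true, y_pred, average="micro"))  # pooled over all decisions
print(f1_score(y_true, y_pred, average="macro"))  # unweighted mean over classes
```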
904 | Convolutional Initialization for Data-Efficient Vision Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In contrast, convolutional neural networks (CNNs) can achieve state-of-the-art performance by leveraging their architectural inductive bias. In this paper, we investigate whether this inductive bias can be reinterpreted as an initialization bias within a vision transformer network. |
Jianqiao Zheng; Xueqian Li; Simon Lucey; | arxiv-cs.CV | 2024-01-23 |
905 | TAT-LLM: A Specialized Language Model for Discrete Reasoning Over Tabular and Textual Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address question answering (QA) over a hybrid of tabular and textual data that are very common content on the Web (e.g. SEC filings), where discrete reasoning capabilities are often required. |
FENGBIN ZHU et. al. | arxiv-cs.CL | 2024-01-23 |
906 | Fine-tuning Large Language Models for Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on Subtask A & B. Each subtask is supported by three datasets for training, development, and testing. |
FENG XIONG et. al. | arxiv-cs.CL | 2024-01-22 |
907 | Contrastive Learning in Distilled Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these models have yet to perform well on Semantic Textual Similarity and may be too large to be deployed as lightweight edge applications. We seek to apply a suitable contrastive learning method based on the SimCSE paper to a model architecture adapted from a knowledge-distillation-based model, DistilBERT, to address these two issues. |
Valerie Lim; Kai Wen Ng; Kenneth Lim; | arxiv-cs.CL | 2024-01-22 |
908 | Enhancing In-context Learning Via Linear Probe Calibration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This approach uses prompts that include in-context demonstrations to generate the corresponding output for a new query input. |
MOMIN ABBAS et. al. | arxiv-cs.CL | 2024-01-22 |
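One plausible reading of probe-based calibration is fitting a small linear model on the LLM's label probabilities for a few labeled examples, then using it to re-map probabilities at test time. The sketch below assumes that setup; `train_probs` stands in for the LLM's next-token distribution over verbalized labels, and the numbers are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Label probabilities the LLM assigned to a handful of labeled examples.
train_probs = np.array([[0.9, 0.1], [0.7, 0.3], [0.4, 0.6], [0.2, 0.8]])
train_labels = np.array([0, 0, 1, 1])

probe = LogisticRegression().fit(train_probs, train_labels)  # linear probe

test_probs = np.array([[0.55, 0.45]])        # raw in-context prediction
print(probe.predict_proba(test_probs))       # calibrated label distribution
```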
909 | Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a fault protection mechanism that incurs zero space cost. |
BINGBING LI et. al. | arxiv-cs.LG | 2024-01-21 |
910 | Freely Long-Thinking Transformer (FraiLT) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Freely Long-Thinking Transformer (FraiLT) is an improved transformer model designed to enhance processing capabilities without scaling up size. It utilizes a recursive approach, iterating over a subset of layers multiple times, and introduces iteration encodings to maintain awareness across these cycles. |
Akbay Tabak; | arxiv-cs.LG | 2024-01-21 |
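The recursion-with-iteration-encodings idea reads naturally as reusing one block several times while tagging each pass. A minimal PyTorch sketch under that reading (where the encoding is injected, and how many layers recur, are assumptions):

```python
import torch
import torch.nn as nn

class RecurrentBlock(nn.Module):
    """One transformer layer applied n_iters times with the same
    weights; a learned iteration encoding is added before each pass so
    the model can tell the cycles apart."""
    def __init__(self, d_model=64, n_head=4, n_iters=3):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(
            d_model, n_head, dim_feedforward=128, batch_first=True)
        self.iter_emb = nn.Embedding(n_iters, d_model)
        self.n_iters = n_iters

    def forward(self, x):
        for i in range(self.n_iters):
            x = x + self.iter_emb.weight[i]  # iteration encoding
            x = self.layer(x)                # same weights reused
        return x

print(RecurrentBlock()(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```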
911 | Revolutionizing Finance with LLMs: An Overview of Applications and Insights IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we provide a comprehensive overview of the emerging integration of LLMs into various financial tasks. |
HUAQIN ZHAO et. al. | arxiv-cs.CL | 2024-01-21 |
912 | CheX-GPT: Harnessing Large Language Models for Enhanced Chest X-ray Report Labeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Free-text radiology reports present a rich data source for various medical tasks, but effectively labeling these texts remains challenging. |
JAWOOK GU et. al. | arxiv-cs.CL | 2024-01-21 |
913 | Unfair TOS: An Automated Approach Using Customized BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research paper, we present SOTA (state-of-the-art) results on unfair clause detection from ToS documents based on unprecedented custom BERT fine-tuning in conjunction with an SVC (Support Vector Classifier). |
Bathini Sai Akash; Akshara Kupireddy; Lalita Bhanu Murthy; | arxiv-cs.CL | 2024-01-20 |
914 | Drop Your Decoder: Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The usage of additional Transformer-based decoders also incurs significant computational costs. In this study, we aim to shed light on this issue by revealing that masked auto-encoder (MAE) pre-training with enhanced decoding significantly improves the term coverage of input tokens in dense representations, compared to vanilla BERT checkpoints. |
Guangyuan Ma; Xing Wu; Zijia Lin; Songlin Hu; | arxiv-cs.IR | 2024-01-20 |
915 | Visualization Generation with Large Language Models: An Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we evaluate the capability of a large language model to generate visualization specifications on the task of natural language to visualization (NL2VIS). |
GUOZHENG LI et. al. | arxiv-cs.HC | 2024-01-20 |
916 | Enhancing Large Language Models for Clinical Decision Support By Incorporating Clinical Practice Guidelines Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Methods We develop three distinct methods for incorporating CPGs into LLMs: Binary Decision Tree (BDT), Program-Aided Graph Construction (PAGC), and Chain-of-Thought-Few-Shot Prompting (CoT-FSP). |
DAVID ONIANI et. al. | arxiv-cs.CL | 2024-01-20 |
917 | Evaluating and Enhancing Large Language Models Performance in Domain-specific Medicine: Osteoarthritis Management with DocOA Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The efficacy of large language models (LLMs) in domain-specific medicine, particularly for managing complex diseases such as osteoarthritis (OA), remains largely unexplored. This … |
XI CHEN et. al. | ArXiv | 2024-01-20 |
918 | DB-GPT: Large Language Model Meets Database Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xuanhe Zhou; Zhaoyan Sun; Guoliang Li; | Data Sci. Eng. | 2024-01-19 |
919 | Custom Developer GPT for Ethical AI Solutions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main goal of this project is to create a new software artefact: a custom Generative Pre-trained Transformer (GPT) for developers to discuss and solve ethical issues through AI engineering. |
Lauren Olson; | arxiv-cs.SE | 2024-01-19 |
920 | Cross-lingual Editing in Multilingual Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For more comprehensive information, the dataset used in this research and the associated code are publicly available at the following URL: https://github.com/lingo-iitgn/XME. |
Himanshu Beniwal; Kowsik Nandagopan D; Mayank Singh; | arxiv-cs.CL | 2024-01-19 |
921 | Mining Experimental Data from Materials Science Literature with Large Language Models: An Evaluation Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel methodology for the comparative analysis of intricate material expressions, emphasising the standardisation of chemical formulas to tackle the complexities inherent in materials science information assessment. |
Luca Foppiano; Guillaume Lambard; Toshiyuki Amagasa; Masashi Ishii; | arxiv-cs.CL | 2024-01-19 |
922 | Speech Swin-Transformer: Exploring A Hierarchical Transformer with Shifted Windows for Speech Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In speech signals, emotional information is distributed across different scales of speech features, e.g., word, phrase, and utterance. Drawing on this inspiration, this paper presents a hierarchical speech Transformer with shifted windows to aggregate multi-scale emotion features for speech emotion recognition (SER), called Speech Swin-Transformer. |
YONG WANG et. al. | arxiv-cs.CL | 2024-01-19 |
923 | Emotion Classification In Software Engineering Texts: A Comparative Analysis of Pre-trained Transformers Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a comparative analysis of state-of-the-art Pre-trained Language Models (PTMs) for fine-grained emotion classification on two benchmark datasets from GitHub and Stack Overflow. |
Mia Mohammad Imran; | arxiv-cs.SE | 2024-01-19 |
924 | ChatQA: Surpassing GPT-4 on Conversational QA and RAG Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce ChatQA, a suite of models that outperform GPT-4 on retrieval-augmented generation (RAG) and conversational question answering (QA). |
ZIHAN LIU et. al. | arxiv-cs.CL | 2024-01-18 |
925 | Improving The Accuracy of Analog-Based In-Memory Computing Accelerators Post-Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose two Post-Training (PT) optimization methods to improve accuracy after training is performed. |
COREY LAMMIE et. al. | arxiv-cs.ET | 2024-01-18 |
926 | Gender Bias in Machine Translation and The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This chapter examines the role of Machine Translation in perpetuating gender bias, highlighting the challenges posed by cross-linguistic settings and statistical dependencies. |
Eva Vanmassenhove; | arxiv-cs.CL | 2024-01-18 |
927 | GPT in Sheep’s Clothing: The Risk of Customized GPTs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We aim to raise awareness of the fact that GPTs can be used maliciously, posing privacy and security risks to their users. |
SAGIV ANTEBI et. al. | arxiv-cs.CR | 2024-01-17 |
928 | Machines Do See Color: A Guideline to Classify Different Forms of Racist Discourse in Large Corpora Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article provides a step-by-step generalizable guideline to identify and classify different forms of racist discourse in large corpora. In our approach, we start by conceptualizing racism and its different manifestations. |
Diana Davila Gordillo; Joan Timoneda; Sebastian Vallejo Vera; | arxiv-cs.CL | 2024-01-17 |
929 | Efficient Slot Labelling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a lightweight method which performs on par with or better than the state-of-the-art PLM-based methods, while having almost 10x fewer trainable parameters. |
Vladimir Vlasov; | arxiv-cs.CL | 2024-01-17 |
930 | Land Cover Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We compare convolutional neural networks (CNN) against transformer-based methods, showcasing their applications and advantages in LC studies. |
Antonio Rangel; Juan Terven; Diana M. Cordova-Esparza; E. A. Chavez-Urbiola; | arxiv-cs.CV | 2024-01-17 |
931 | RAG Vs Fine-tuning: Pipelines, Tradeoffs, and A Case Study on Agriculture IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a pipeline for fine-tuning and RAG, and present the tradeoffs of both for multiple popular LLMs, including Llama2-13B, GPT-3.5, and GPT-4. |
ANGELS BALAGUER et. al. | arxiv-cs.CL | 2024-01-16 |
932 | Human Vs. LMMs: Exploring The Discrepancy in Emoji Interpretation and Usage in Digital Communication Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Leveraging Large Multimodal Models (LMMs) to simulate human behaviors when processing multimodal information, especially in the context of social media, has garnered immense interest due to its broad potential and far-reaching implications. |
Hanjia Lyu; Weihong Qi; Zhongyu Wei; Jiebo Luo; | arxiv-cs.CV | 2024-01-16 |
933 | Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This communication bottleneck exacerbates the already complex computational landscape, hindering the efficient utilization of high-performance computing resources. In this paper, we propose a lightweight optimization technique called ExFlow, to largely accelerate the inference of these MoE models. |
Jinghan Yao; Quentin Anthony; Aamir Shafi; Hari Subramoni; Dhabaleswar K.; | arxiv-cs.LG | 2024-01-16 |
934 | Hidden Flaws Behind Expert-Level Accuracy of GPT-4 Vision in Medicine Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we discovered that GPT-4V frequently presents flawed rationales in cases where it makes the correct final choices (35.5%), most prominently in image comprehension (27.2%). Despite GPT-4V’s high accuracy in multiple-choice questions, our findings emphasize the necessity for further in-depth evaluations of its rationales before integrating such multimodal AI models into clinical workflows. |
QIAO JIN et. al. | arxiv-cs.CV | 2024-01-16 |
935 | Enhancing Robustness of LLM-Synthetic Text Detectors for Academic Writing: A Comprehensive Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a comprehensive analysis of the impact of prompts on the text generated by LLMs and highlight the potential lack of robustness in one of the current state-of-the-art GPT detectors. |
Zhicheng Dou; Yuchen Guo; Ching-Chun Chang; Huy H. Nguyen; Isao Echizen; | arxiv-cs.CL | 2024-01-15 |
936 | The Chronicles of RAG: The Retriever, The Chunk and The Generator Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given all these challenges, every day a new technique to improve RAG appears, making it unfeasible to experiment with all combinations for your problem. In this context, this paper presents good practices to implement, optimize, and evaluate RAG for the Brazilian Portuguese language, focusing on the establishment of a simple pipeline for inference and experiments. |
PAULO FINARDI et. al. | arxiv-cs.LG | 2024-01-15 |
937 | Leveraging External Knowledge Resources to Enable Domain-Specific Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building on existing work, we introduce a method using Multi-Layer Perceptrons (MLPs) for aligning and integrating embeddings extracted from knowledge graphs with the embeddings spaces of pre-trained language models (LMs). |
Saptarshi Sengupta; Connor Heaton; Prasenjit Mitra; Soumalya Sarkar; | arxiv-cs.CL | 2024-01-15 |
938 | Cascaded Cross-Modal Transformer for Audio-Textual Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To attain superior classification performance, we propose to harness the inherent value of multimodal representations by transcribing speech using automatic speech recognition (ASR) models and translating the transcripts into different languages via pretrained translation models. |
Nicolae-Catalin Ristea; Andrei Anghel; Radu Tudor Ionescu; | arxiv-cs.CL | 2024-01-15 |
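The cascade described, ASR transcription followed by machine translation of the transcript, can be mocked up with off-the-shelf pipelines. Model names and the audio path below are placeholders, not the ones used in the paper:

```python
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny")
translate = pipeline("translation_en_to_fr", model="t5-small")

text = asr("sample.wav")["text"]                   # placeholder audio file
text_fr = translate(text)[0]["translation_text"]   # second language view
print(text, "->", text_fr)
# Both text views (plus audio features) would then feed the classifier.
```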
939 | Exploiting GPT-4 Vision for Zero-shot Point Cloud Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we tackle the challenge of classifying the object category in point clouds, which previous works like PointCLIP struggle to address due to the inherent limitations of the CLIP architecture. |
Qi Sun; Xiao Cui; Wengang Zhou; Houqiang Li; | arxiv-cs.CV | 2024-01-15 |
940 | Selene: Pioneering Automated Proof in Software Verification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Selene in this paper, which is the first project-level automated proof benchmark constructed based on the real-world industrial-level operating system microkernel, seL4. |
Lichen Zhang; Shuai Lu; Nan Duan; | arxiv-cs.SE | 2024-01-15 |
941 | Interference-Robust Millimeter-Wave Radar-Based Dynamic Hand Gesture Recognition Using 2-D CNN-Transformer Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Dynamic gesture recognition using millimeter-wave radar has a broad application prospect in the industrial Internet of Things (IoT) field. However, the existing methods in the … |
Biao Jin; Xiao Ma; Zhenkai Zhang; Zhuxian Lian; Biao Wang; | IEEE Internet of Things Journal | 2024-01-15 |
942 | Transformer-based Approach for Ethereum Price Prediction Using Crosscurrency Correlation and Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The model employs a transformer architecture for several setups from single-feature scenarios to complex configurations incorporating volume, sentiment, and correlated cryptocurrency prices. |
Shubham Singh; Mayur Bhat; | arxiv-cs.LG | 2024-01-15 |
943 | SemEval-2017 Task 4: Sentiment Analysis in Twitter Using BERT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper uses the BERT model, which is a transformer-based architecture, to solve subtask 4A (English) of the SemEval-2017 Sentiment Analysis in Twitter task. |
Rupak Kumar Das; Dr. Ted Pedersen; | arxiv-cs.CL | 2024-01-15 |
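As a generic illustration of the kind of BERT-based sentiment classification the paper describes (not the authors' training setup; the classification head below is randomly initialized until fine-tuned):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)  # negative / neutral / positive

batch = tok(["great movie!", "awful service"], padding=True,
            return_tensors="pt")
with torch.no_grad():
    logits = model(**batch).logits
print(logits.softmax(-1))  # class probabilities (untrained head)
```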
944 | Active Learning for NLP with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work investigates the accuracy and cost of using LLMs (GPT-3.5 and GPT-4) to label samples on 3 different datasets. |
Xuesong Wang; | arxiv-cs.CL | 2024-01-14 |
945 | Leveraging The Power of Transformers for Guilt Detection in Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we explore the applicability of three transformer-based language models for detecting guilt in text and compare their performance for general emotion detection and guilt detection. |
Abdul Gafar Manuel Meque; Jason Angel; Grigori Sidorov; Alexander Gelbukh; | arxiv-cs.CL | 2024-01-14 |
946 | Killer Apps: Low-Speed, Large-Scale AI Weapons Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the concept of AI weapons, their deployment, detection, and potential countermeasures. |
Philip Feldman; Aaron Dant; James R. Foulds; | arxiv-cs.CY | 2024-01-14 |
947 | Harnessing Large Language Models Over Transformer Models for Detecting Bengali Depressive Social Media Text: A Comprehensive Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our work provides full architecture details for each model and a methodical way to assess their performance in Bengali depressive text categorization using zero-shot and few-shot learning techniques. |
AHMADUL KARIM CHOWDHURY et. al. | arxiv-cs.CL | 2024-01-14 |
948 | MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a novel map-guided GPT-based agent, dubbed MapGPT, which introduces an online linguistic-formed map to encourage global exploration. |
JIAQI CHEN et. al. | arxiv-cs.AI | 2024-01-14 |
949 | Fuzzy Swin Transformer for Land Use/Land Cover Change Detection Using LISS-III Satellite Data Related Papers Related Patents Related Grants Related Venues Related Experts View |
Sam Navin Mohanrajan; L. Agilandeeswari; P. Manoharan; Farhan A. Alenizi; | Earth Science Informatics | 2024-01-13 |
950 | A Novel Multi-Stage Prompting Approach for Language Agnostic MCQ Generation Using GPT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a multi-stage prompting approach (MSP) for the generation of multiple choice questions (MCQs), harnessing the capabilities of GPT models such as text-davinci-003 and GPT-4, renowned for their excellence across various NLP tasks. |
Subhankar Maity; Aniket Deroy; Sudeshna Sarkar; | arxiv-cs.CL | 2024-01-13 |
951 | Comparing GPT-4 and Open-Source Language Models in Misinformation Mitigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that Zephyr-7b presents a consistently viable alternative, overcoming key limitations of commonly used approaches like Llama-2 and GPT-3.5. |
Tyler Vergho; Jean-Francois Godbout; Reihaneh Rabbany; Kellin Pelrine; | arxiv-cs.CL | 2024-01-12 |
952 | How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety By Humanizing LLMs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As large language models (LLMs) become increasingly common and competent, non-expert users can also impose risks during daily interactions. This paper introduces a new perspective to jailbreak LLMs as human-like communicators, to explore this overlooked intersection between everyday language interaction and AI safety. |
YI ZENG et. al. | arxiv-cs.CL | 2024-01-12 |
953 | DevEval: Evaluating Code Generation in Practical Software Projects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new benchmark named DevEval, aligned with Developers’ experiences in practical projects. |
JIA LI et. al. | arxiv-cs.SE | 2024-01-12 |
954 | Mission: Impossible Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we develop a set of synthetic impossible languages of differing complexity, each designed by systematically altering English data with unnatural word orders and grammar rules. |
Julie Kallini; Isabel Papadimitriou; Richard Futrell; Kyle Mahowald; Christopher Potts; | arxiv-cs.CL | 2024-01-12 |
955 | Transformer for Object Re-Identification: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Considering the trending unsupervised Re-ID, we propose a new Transformer baseline, UntransReID, achieving state-of-the-art performance on both single-/cross modal tasks. |
MANG YE et. al. | arxiv-cs.CV | 2024-01-12 |
956 | Intention Analysis Makes LLMs A Good Jailbreak Defender Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we present a simple yet highly effective defense strategy, i.e., Intention Analysis ($\mathbb{IA}$). |
Yuqi Zhang; Liang Ding; Lefei Zhang; Dacheng Tao; | arxiv-cs.CL | 2024-01-12 |
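The defense reads as a two-stage prompt: first elicit the intention behind a query, then answer with that analysis in context. A toy sketch under that reading; the prompt wording and the `llm` stub are illustrative assumptions, not the paper's templates:

```python
def llm(prompt: str) -> str:
    """Stand-in for a real chat-model call."""
    return f"(model response to: {prompt[:48]}...)"

def intention_analysis(user_query: str) -> str:
    # Stage 1: ask the model to surface the request's intention.
    intent = llm("Identify the essential intention of this request, "
                 "noting anything unsafe:\n" + user_query)
    # Stage 2: answer with the intention analysis in context.
    return llm("Given this intention analysis: " + intent +
               "\nRespond safely and helpfully to:\n" + user_query)

print(intention_analysis("How do I secure my home Wi-Fi?"))
```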
957 | Mapping Transformer Leveraged Embeddings for Cross-Lingual Document Representation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This research focuses on representing documents across languages by using Transformer Leveraged Document Representations (TLDRs) that are mapped to a cross-lingual domain. |
Tsegaye Misikir Tashu; Eduard-Raul Kontos; Matthia Sabatelli; Matias Valdenegro-Toro; | arxiv-cs.CL | 2024-01-12 |
958 | An Investigation of Structures Responsible for Gender Bias in BERT and DistilBERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A crucial issue is the fairness of the predictions made by both PLMs and their distilled counterparts. In this paper, we propose an empirical exploration of this problem by formalizing two questions: (1) Can we identify the neural mechanism(s) responsible for gender bias in BERT (and by extension DistilBERT)? |
Thibaud Leteno; Antoine Gourru; Charlotte Laclau; Christophe Gravier; | arxiv-cs.CL | 2024-01-12 |
959 | Prompt-based Mental Health Screening from Social Media Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article presents a method for prompt-based mental health screening from a large and noisy dataset of social media text. |
Wesley Ramos dos Santos; Ivandre Paraboni; | arxiv-cs.CL | 2024-01-11 |
960 | OTAS: An Elastic Transformer Serving System Via Token Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce OTAS, the first elastic serving system specially tailored for transformer models by exploring lightweight token management. |
JINYU CHEN et. al. | arxiv-cs.DC | 2024-01-10 |
961 | The Benefits of A Concise Chain of Thought on Problem-Solving in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Concise Chain-of-Thought (CCoT) prompting. |
Matthew Renze; Erhan Guven; | arxiv-cs.CL | 2024-01-10 |
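In spirit, CCoT swaps the usual open-ended step-by-step instruction for one that also asks for brevity. The contrast below is illustrative; the exact instructions used in the paper are not reproduced:

```python
question = "A train travels 60 km in 1.5 hours. What is its average speed?"

standard_cot = f"Q: {question}\nA: Let's think step by step."
concise_cot = (f"Q: {question}\nA: Let's think step by step, "
               "keeping each step as brief as possible.")

print(standard_cot)
print(concise_cot)  # same reasoning request, fewer output tokens
```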
962 | EmMixformer: Mix Transformer for Eye Movement Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although deep neural networks, such as convolutional neural network (CNN), have recently achieved promising performance, current solutions fail to capture local and global temporal dependencies within eye movement data. To overcome this problem, we propose in this paper a mixed transformer termed EmMixformer to extract time and frequency domain information for eye movement recognition. |
HUAFENG QIN et. al. | arxiv-cs.CV | 2024-01-10 |
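The time-and-frequency-domain framing is easy to picture: the same sequence yields a raw temporal view and a spectral view. A minimal sketch of producing both inputs (the mixed attention that fuses them is omitted):

```python
import torch

signal = torch.randn(2, 256)          # (batch, time steps) eye-movement traces
freq = torch.fft.rfft(signal).abs()   # magnitude spectrum, the frequency view
print(signal.shape, freq.shape)       # torch.Size([2, 256]) torch.Size([2, 129])
```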
963 | Reinforcement Learning for Optimizing RAG for Domain Chatbots Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: With the advent of Large Language Models (LLM), conversational assistants have become prevalent for domain use cases. LLMs acquire the ability to contextual question answering … |
Mandar Kulkarni; Praveen Tangarajan; Kyung Kim; Anusua Trivedi; | ArXiv | 2024-01-10 |
964 | Monte Carlo Tree Search for Recipe Generation Using GPT-2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose RecipeMC, a text generation method using GPT-2 that relies on Monte Carlo Tree Search (MCTS). |
Karan Taneja; Richard Segal; Richard Goodwin; | arxiv-cs.CL | 2024-01-10 |
965 | DepressionEmo: A Novel Dataset for Multilabel Classification of Depression Emotions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a novel dataset named DepressionEmo, designed to detect 8 emotions associated with depression and comprising 6037 examples of long Reddit user posts. |
Abu Bakar Siddiqur Rahman; Hoang-Thang Ta; Lotfollah Najjar; Azad Azadmanesh; Ali Saffet Gönül; | arxiv-cs.CL | 2024-01-09 |
966 | An Assessment on Comprehending Mental Health Through Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: On the other hand, across various applications, an outstanding question involves the capacity of large language models to comprehend expressions of human mental health conditions in natural language. This study presents an initial evaluation of large language models in addressing this gap. |
Mihael Arcan; David-Paul Niland; Fionn Delahunty; | arxiv-cs.CL | 2024-01-09 |
967 | Can AI Keep You Safe? A Study of Large Language Models for Phishing Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Phishing attacks continue to be a pervasive challenge in cybersecurity, with threat actors constantly developing new strategies to penetrate email inboxes and compromise sensitive … |
Robin Chataut; P. Gyawali; Yusuf Usman; | 2024 IEEE 14th Annual Computing and Communication Workshop … | 2024-01-08 |
968 | MARG: Multi-Agent Review Generation for Scientific Papers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We study the ability of LLMs to generate feedback for scientific papers and develop MARG, a feedback generation approach using multiple LLM instances that engage in internal discussion. |
Mike D’Arcy; Tom Hope; Larry Birnbaum; Doug Downey; | arxiv-cs.CL | 2024-01-08 |
969 | LLM4PLC: Harnessing Large Language Models for Verifiable Programming of PLCs in Industrial Control Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Although Large Language Models (LLMs) have established predominance in automated code generation, they are not devoid of shortcomings. The pertinent issues primarily relate to … |
MOHAMAD FAKIH et. al. | 2024 IEEE/ACM 46th International Conference on Software … | 2024-01-08 |
970 | Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Audio and video are the two most common modalities on mainstream media platforms, e.g., YouTube. To learn from multimodal videos effectively, in this work, we propose a novel … |
Wentao Zhu; | ArXiv | 2024-01-08 |
971 | Distortions in Judged Spatial Relations in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a benchmark for assessing the capability of Large Language Models (LLMs) to discern intercardinal directions between geographic locations and apply it to three prominent LLMs: GPT-3.5, GPT-4, and Llama-2. |
Nir Fulman; Abdulkadir Memduhoğlu; Alexander Zipf; | arxiv-cs.CL | 2024-01-08 |
972 | Mixtral of Experts IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model. |
ALBERT Q. JIANG et. al. | arxiv-cs.LG | 2024-01-08 |
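Mixtral's sparse MoE layer routes each token to 2 of 8 feed-forward experts and mixes their outputs by gate weights renormalized over the selected pair. A simplified single-file sketch of that routing (not the reference implementation; real systems batch tokens per expert for efficiency):

```python
import torch
import torch.nn as nn

class SparseMoE(nn.Module):
    def __init__(self, d=32, n_experts=8, k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d, 4 * d), nn.SiLU(), nn.Linear(4 * d, d))
            for _ in range(n_experts))
        self.gate = nn.Linear(d, n_experts)
        self.k = k

    def forward(self, x):                         # x: (tokens, d)
        weights, idx = torch.topk(self.gate(x), self.k, dim=-1)
        weights = torch.softmax(weights, dim=-1)   # renormalize over top-k
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e           # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

print(SparseMoE()(torch.randn(6, 32)).shape)  # torch.Size([6, 32])
```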
973 | MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: At the same time, Mixture of Experts (MoE) has significantly improved Transformer-based LLMs, including recent state-of-the-art open-source models. We propose that to unlock the potential of SSMs for scaling, they should be combined with MoE. |
Maciej Pióro; Kamil Ciebiera; Krystian Król; Jan Ludziejewski; Sebastian Jaszczur; | arxiv-cs.LG | 2024-01-08 |
974 | Effectiveness of ChatGPT in Coding: A Comparative Analysis of Popular Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study explores the effectiveness and efficiency of the popular OpenAI model ChatGPT, powered by GPT-3.5 and GPT-4, in programming tasks to understand its impact on … |
Carlos Eduardo Andino Coello; Mohammed Nazeh Alimam; Rand Kouatly; | Digit. | 2024-01-08 |
975 | GloTSFormer: Global Video Text Spotting Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel Global Video Text Spotting Transformer GloTSFormer to model the tracking problem as global associations and utilize the Gaussian Wasserstein distance to guide the morphological correlation between frames. |
Han Wang; Yanjie Wang; Yang Li; Can Huang; | arxiv-cs.CV | 2024-01-08 |
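For reference, the closed-form squared 2-Wasserstein distance between Gaussians $\mathcal{N}(m_1, \Sigma_1)$ and $\mathcal{N}(m_2, \Sigma_2)$ is $W_2^2 = \lVert m_1 - m_2 \rVert_2^2 + \mathrm{Tr}\big(\Sigma_1 + \Sigma_2 - 2(\Sigma_2^{1/2} \Sigma_1 \Sigma_2^{1/2})^{1/2}\big)$; methods in this vein typically model each text box as a 2-D Gaussian before applying it, though the paper's exact usage may differ.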
976 | InFoBench: Evaluating Instruction Following Ability in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces the Decomposed Requirements Following Ratio (DRFR), a new metric for evaluating Large Language Models’ (LLMs) ability to follow instructions. |
YIWEI QIN et. al. | arxiv-cs.CL | 2024-01-07 |
977 | CharPoet: A Chinese Classical Poetry Generation System Based on Token-free LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) improve content control by allowing unrestricted user instructions, but the token-by-token generation process frequently introduces format errors. Motivated by this, we propose CharPoet, a Chinese classical poetry generation system based on a token-free LLM, which provides effective control over both format and content. |
Chengyue Yu; Lei Zang; Jiaotuan Wang; Chenyi Zhuang; Jinjie Gu; | arxiv-cs.CL | 2024-01-07 |
978 | RoBERTurk: Adjusting RoBERTa for Turkish Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We pretrain RoBERTa on Turkish corpora using a BPE tokenizer. Our model outperforms BERTurk family models on the BOUN dataset for the POS task while resulting in underperformance … |
Nuri Tas; | arxiv-cs.CL | 2024-01-07 |
979 | Using Large Language Models to Assess Tutors’ Performance in Reacting to Students Making Math Errors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the capacity of generative AI to evaluate real-life tutors’ performance in responding to students making math errors. |
Sanjit Kakarla; Danielle Thomas; Jionghao Lin; Shivang Gupta; Kenneth R. Koedinger; | arxiv-cs.HC | 2024-01-06 |
980 | PIXAR: Auto-Regressive Language Modeling in Pixel Space Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce PIXAR, the first pixel-based autoregressive LLM that performs text generation. |
Yintao Tai; Xiyang Liao; Alessandro Suglia; Antonio Vergari; | arxiv-cs.CL | 2024-01-06 |
981 | PeFoMed: Parameter Efficient Fine-tuning of Multimodal Large Language Models for Medical Imaging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a parameter efficient framework for fine-tuning MLLMs, specifically validated on medical visual question answering (Med-VQA) and medical report generation (MRG) tasks, using public benchmark datasets. |
GANG LIU et. al. | arxiv-cs.CL | 2024-01-05 |
982 | TinyLlama: An Open-Source Small Language Model IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for approximately 3 epochs. |
Peiyuan Zhang; Guangtao Zeng; Tianduo Wang; Wei Lu; | arxiv-cs.CL | 2024-01-04 |
983 | Re-evaluating The Memory-balanced Pipeline Parallelism: BPipe Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, it suffers from imbalanced memory consumption, leading to insufficient memory utilization. The BPipe technique was proposed to address this issue and has proven effective in the GPT-3 model. |
MINCONG HUANG et. al. | arxiv-cs.LG | 2024-01-04 |
984 | Exploring Boundary of GPT-4V on Marine Analysis: A Preliminary Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we carry out a preliminary yet comprehensive case study of utilizing GPT-4V for marine analysis. |
ZIQIANG ZHENG et. al. | arxiv-cs.CL | 2024-01-04 |
985 | Shayona@SMM4H23: COVID-19 Self Diagnosis Classification Using BERT and LightGBM Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes approaches and results for shared Tasks 1 and 4 of SMM4H-23 by Team Shayona. |
Rushi Chavda; Darshan Makwana; Vraj Patel; Anupam Shukla; | arxiv-cs.CL | 2024-01-04 |
986 | Iterative Mask Filling: An Effective Text Augmentation Method Using Masked Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel text augmentation method that leverages the Fill-Mask feature of the transformer-based BERT model. |
Himmet Toprak Kesgin; Mehmet Fatih Amasyali; | arxiv-cs.CL | 2024-01-03 |
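The method as described maps directly onto the Hugging Face fill-mask pipeline: mask one position at a time, take the model's proposal, and carry the edited sentence to the next position. A sketch under that reading (top-1 selection and left-to-right order are assumptions):

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

words = "the quick brown fox jumps over the lazy dog".split()
for i in range(len(words)):
    masked = words.copy()
    masked[i] = fill.tokenizer.mask_token         # mask the i-th word
    words[i] = fill(" ".join(masked), top_k=1)[0]["token_str"]
print(" ".join(words))  # augmented variant of the original sentence
```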
987 | Close to Human-Level Agreement: Tracing Journeys of Violent Speech in Incel Posts with GPT-4-Enhanced Annotations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the prevalence of violent language on incels.is. |
Daniel Matter; Miriam Schirmer; Nir Grinberg; Jürgen Pfeffer; | arxiv-cs.SI | 2024-01-03 |
988 | MULTI-CASE: A Transformer-based Ethics-aware Multimodal Investigative Intelligence Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To tackle the challenge of multimodal analytics, we present MULTI-CASE, a holistic visual analytics framework tailored towards ethics-aware and multimodal intelligence exploration, designed in collaboration with domain experts. |
Maximilian T. Fischer; Yannick Metz; Lucas Joos; Matthias Miller; Daniel A. Keim; | arxiv-cs.HC | 2024-01-03 |
989 | MLPs Compass: What Is Learned When MLPs Are Combined with PLMs? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by recent efforts showing that Multilayer-Perceptron (MLP) modules achieve robust structural capture capabilities, even outperforming Graph Neural Networks (GNNs), this paper aims to quantify whether simple MLPs can further enhance the already potent ability of PLMs to capture linguistic information. |
LI ZHOU et. al. | arxiv-cs.CL | 2024-01-03 |
990 | A Transformer-based Network Intrusion Detection Approach for Cloud Security Related Papers Related Patents Related Grants Related Venues Related Experts View |
ZHENYUE LONG et. al. | J. Cloud Comput. | 2024-01-02 |
991 | Combating Fake News on Social Media: A Fusion Approach for Improved Detection and Interpretability Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The proliferation of fake news on social media prompted research groups to develop statistical and learning methods to combat this menace. Deep learning techniques could not model … |
Yasmine Khalid Zamil; N. M. Charkari; | IEEE Access | 2024-01-01 |
992 | SMTF: Sparse Transformer with Multiscale Contextual Fusion for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
XICHU ZHANG et. al. | Biomed. Signal Process. Control. | 2024-01-01 |
993 | Completed Part Transformer for Person Re-Identification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recently, part information of pedestrian images has been demonstrated to be effective for person re-identification (ReID), but the part interaction is ignored when using … |
Zhong Zhang; Di He; Shuang Liu; Baihua Xiao; T. Durrani; | IEEE Transactions on Multimedia | 2024-01-01 |
994 | Skip Connection Aggregation Transformer for Occluded Person Reidentification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The occlusion problem is a significant challenge for person reidentification. Recently, transformer-based methods have been introduced to solve the occlusion problem and achieve … |
Huijie Fan; Xiaotong Wang; Qiang Wang; Shengpeng Fu; Yandong Tang; | IEEE Transactions on Industrial Informatics | 2024-01-01 |
995 | MSGformer: A Multi-scale Grid Transformer Network for 12-lead ECG Arrhythmia Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
CHANGQING JI et. al. | Biomed. Signal Process. Control. | 2024-01-01 |
996 | Devising and Detecting Phishing Emails Using Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: AI programs, built using large language models, make it possible to automatically create phishing emails based on a few data points about a user. The V-Triad is a set of rules for … |
Fredrik Heiding; Bruce Schneier; Arun Vishwanath; Jeremy Bernstein; Peter S. Park; | IEEE Access | 2024-01-01 |
997 | Dual Attention Transformer Network for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhenqiu Shu; Yuyang Wang; Zhengtao Yu; | Eng. Appl. Artif. Intell. | 2024-01-01 |
998 | Transformer Fusion and Pixel-Level Contrastive Learning for RGB-D Salient Object Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Current RGB-D salient object detection (RGB-D SOD) methods mainly develop a generalizable model trained by binary cross-entropy (BCE) loss based on convolutional or Transformer … |
Jiesheng Wu; Fangwei Hao; Weiyun Liang; Jing Xu; | IEEE Transactions on Multimedia | 2024-01-01 |
999 | A & B == B & A: Triggering Logical Reasoning Failures in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce LogicAsker, an automatic approach that comprehensively evaluates and improves the logical reasoning abilities of LLMs under a set of atomic reasoning skills based on propositional and predicate logic. |
YUXUAN WAN et. al. | arxiv-cs.SE | 2024-01-01 |
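The premise of the entry above, that atomic propositional laws such as commutativity can be turned into automatically checkable test cases, can be sketched in a few lines. The question template and yes/no answer format are illustrative assumptions; the ground truth comes from truth-table enumeration.

```python
# Hedged sketch: generating atomic propositional-logic test cases with
# machine-checkable ground truth. The prompt wording and skill taxonomy
# are illustrative, not LogicAsker's exact design.
import itertools
import random

OPS = {"&": lambda a, b: a and b, "|": lambda a, b: a or b}

def make_case() -> tuple[str, bool]:
    """Build a claim like 'A & B is equivalent to B | A' plus its truth value."""
    op1, op2 = random.choice(list(OPS)), random.choice(list(OPS))
    equivalent = all(
        OPS[op1](a, b) == OPS[op2](b, a)
        for a, b in itertools.product([True, False], repeat=2)
    )
    question = f"Is 'A {op1} B' logically equivalent to 'B {op2} A'? Answer yes or no."
    return question, equivalent

for _ in range(3):
    q, truth = make_case()
    print(q, "->", "yes" if truth else "no")
```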
1000 | Improving Visual Grounding with Multi-scale Discrepancy Information and Centralized-transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jie Wu; Chunlei Wu; Fuyan Wang; Leiquan Wang; Yiwei Wei; | Expert Syst. Appl. | 2024-01-01 |
1001 | Improving RGB-infrared Object Detection with Cascade Alignment-guided Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Maoxun Yuan; Xiaorong Shi; Nan Wang; Yinyan Wang; Xingxing Wei; | Inf. Fusion | 2024-01-01 |
1002 | Counting-Stars: A Simple, Efficient, and Reasonable Strategy for Evaluating Long-Context Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: While recent research endeavors have concentrated on developing Large Language Models (LLMs) with robust long-context capabilities, due to the lack of appropriate evaluation … |
Mingyang Song; Mao Zheng; Xuan Luo; | ArXiv | 2024-01-01 |
1003 | MSViT: Training Multiscale Vision Transformers for Image Retrieval Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The recently developed vision transformer (ViT) has achieved promising results on image retrieval compared to convolutional neural networks. However, most of these vision … |
Xue Li; Jiong Yu; Shaochen Jiang; Hongchun Lu; Ziyang Li; | IEEE Transactions on Multimedia | 2024-01-01 |
1004 | GraphGST: Graph Generative Structure-Aware Transformer for Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Transformer holds significance in deep learning (DL) research. Node embedding (NE) and positional encoding (PE) are usually two indispensable components in a Transformer. The … |
MENGYING JIANG et. al. | IEEE Transactions on Geoscience and Remote Sensing | 2024-01-01 |
1005 | Co-Training Transformer for Remote Sensing Image Classification, Segmentation, and Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Several fundamental remote sensing (RS) image processing tasks, including classification, segmentation, and detection, have been set to serve for manifold applications. In the RS … |
Qingyun Li; Yushi Chen; Xingyu He; Lin Huang; | IEEE Transactions on Geoscience and Remote Sensing | 2024-01-01 |
1006 | WF-Transformer: Learning Temporal Features for Accurate Anonymous Traffic Identification By Using Transformer Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Website Fingerprinting (WF) is a network traffic mining technique for anonymous traffic identification, which enables a local adversary to identify the target website that an … |
Qiang Zhou; Liangmin Wang; Huijuan Zhu; Tong Lu; Victor S. Sheng; | IEEE Transactions on Information Forensics and Security | 2024-01-01 |
1007 | Hierarchical Attention Transformer for Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Hyperspectral image (HSI) data contain rich spectral–spatial information, which can be useful for various applications. Many methods have been proposed to classify the HSIs. … |
Tahir Arshad; Junping Zhang; | IEEE Geoscience and Remote Sensing Letters | 2024-01-01 |
1008 | HAU-Net: Hybrid CNN-transformer for Breast Ultrasound Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
HUAIKUN ZHANG et. al. | Biomed. Signal Process. Control. | 2024-01-01 |
1009 | U²-Former: Nested U-Shaped Transformer for Image Restoration Via Multi-View Contrastive Learning IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: While Transformer has achieved remarkable performance in various high-level vision tasks, it is still challenging to exploit the full potential of Transformer in image … |
XIN FENG et. al. | IEEE Transactions on Circuits and Systems for Video … | 2024-01-01 |
1010 | Progressive Source-Aware Transformer for Generalized Source-Free Domain Adaptation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Source-free domain adaptation (SFDA) tends to forget the source domain, suffering from limitations in real-world scenarios. Recently, generalized source-free domain adaptation … |
SONG TANG et. al. | IEEE Transactions on Multimedia | 2024-01-01 |
1011 | FET-FGVC: Feature-enhanced Transformer for Fine-grained Visual Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
HUAZHEN CHEN et. al. | Pattern Recognit. | 2024-01-01 |
1012 | Transformer-Based High-Fidelity Facial Displacement Completion for Detailed 3D Face Reconstruction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, we tackle a special face completion task, facial displacement completion, which can offer a key component for many single image 3D face reconstruction systems. To … |
Renshuai Liu; Yao Cheng; Sifei Huang; Chengyang Li; Xuan Cheng; | IEEE Transactions on Multimedia | 2024-01-01 |
1013 | SecFormer: Towards Fast and Accurate Privacy-Preserving Inference for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This is largely due to the multitude of nonlinear operations in the Transformer architecture, which are not well-suited to SMPC and difficult to circumvent or optimize effectively. To address this concern, we introduce an advanced optimization framework called SecFormer, to achieve fast and accurate PPI for Transformer models. |
JINGLONG LUO et. al. | arxiv-cs.LG | 2024-01-01 |
1014 | Temporal-spatial Transformer Based Motor Imagery Classification for BCI Using Independent Component Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View |
ADEL HAMEED et. al. | Biomed. Signal Process. Control. | 2024-01-01 |
1015 | Multiscale 3-D–2-D Mixed CNN and Lightweight Attention-Free Transformer for Hyperspectral and LiDAR Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The effective combination of hyperspectral image (HSI) and light detection and ranging (LiDAR) data can be used for land cover classification. Recently, deep-learning-based … |
Le Sun; Xinyu Wang; Yuhui Zheng; Zebin Wu; Liyong Fu; | IEEE Transactions on Geoscience and Remote Sensing | 2024-01-01 |
1016 | TCCU-Net: Transformer and CNN Collaborative Unmixing Network for Hyperspectral Image Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In recent years, deep-learning-based hyperspectral unmixing techniques have garnered increasing attention and made significant advancements. However, relying solely on the use of … |
JIANFENG CHEN et. al. | IEEE Journal of Selected Topics in Applied Earth … | 2024-01-01 |
1017 | DBCTNet: Double Branch Convolution-Transformer Network for Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Currently, deep learning (DL) methods represented by convolutional neural networks (CNNs) or Transformers are of great interest in hyperspectral image (HSI) classification. And … |
RUI XU et. al. | IEEE Transactions on Geoscience and Remote Sensing | 2024-01-01 |
1018 | MRIformer: A Multi-resolution Interactive Transformer for Wind Speed Multi-step Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View |
Chengqing Yu; Guangxi Yan; Chengming Yu; Xinwei Liu; Xiwei Mi; | Inf. Sci. | 2024-01-01 |
1019 | Advancing Fake News Detection: Hybrid Deep Learning With FastText and Explainable AI Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The widespread propagation of misinformation on social media platforms poses a significant concern, prompting substantial endeavors within the research community to develop robust … |
Ehtesham Hashmi; Sule YAYILGAN YILDIRIM; M. Yamin; Subhan Ali; Mohamed Abomhara; | IEEE Access | 2024-01-01 |
1020 | ViTFSL-Baseline: A Simple Baseline of Vision Transformer Network for Few-Shot Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Few-shot image classification, whose goal is to generalize to unseen tasks with scarce labeled data, has developed rapidly over the years. However, in traditional few-shot … |
GUANGPENG WANG et. al. | IEEE Access | 2024-01-01 |
1021 | RISTRA: Recursive Image Super-Resolution Transformer With Relativistic Assessment Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Many recent image restoration methods use Transformer as the backbone network and redesign the Transformer blocks. Differently, we explore the parameter-sharing mechanism over … |
Xiaoqiang Zhou; Huaibo Huang; Zilei Wang; Ran He; | IEEE Transactions on Multimedia | 2024-01-01 |
1022 | Efficient Classification of Malicious URLs: M-BERT—A Modified BERT Variant for Enhanced Semantic Understanding Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Malicious websites present a substantial threat to the security and privacy of individuals using the internet. Traditional approaches for identifying these malicious sites have … |
BOYANG YU et. al. | IEEE Access | 2024-01-01 |
1023 | A Dual-Branch Multiscale Transformer Network for Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In recent years, convolutional neural networks (CNNs) have achieved great success in hyperspectral image (HSI) classification tasks. CNNs focus more on the local features of HSIs. … |
Cuiping Shi; Shuheng Yue; Liguo Wang; | IEEE Transactions on Geoscience and Remote Sensing | 2024-01-01 |
1024 | Refining One-class Representation: A Unified Transformer for Unsupervised Time-series Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
Guoxiang Zhong; Fagui Liu; Jun Jiang; Bin Wang; C.L. Philip Chen; | Inf. Sci. | 2024-01-01 |
1025 | Multilevel Class Token Transformer With Cross TokenMixer for Hyperspectral Images Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The transformer has become a prominent technique for hyperspectral image (HSI) classification, attributed to its capability to model global dependencies between features. … |
LEIQUAN WANG et. al. | IEEE Transactions on Geoscience and Remote Sensing | 2024-01-01 |
1026 | Modified Distance Protection for Transmission Line with Hexagonal Phase-shifting Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
F. Aboshady; | International Journal of Electrical Power & Energy Systems | |
1027 | Opening A Pandora’s Box: Things You Should Know in The Era of Custom GPTs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conduct a comprehensive analysis of the security and privacy issues arising from the custom GPT platform. |
GUANHONG TAO et. al. | arxiv-cs.CR | 2023-12-31 |
1028 | A Two-stream Hybrid CNN-Transformer Network for Skeleton-based Human Interaction Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a Two-stream Hybrid CNN-Transformer Network (THCT-Net), which exploits the local specificity of CNN and models global dependencies through the Transformer. |
Ruoqi Yin; Jianqin Yin; | arxiv-cs.CV | 2023-12-31 |
1029 | GraphGPT: Graph Learning with Generative Pre-trained Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce GraphGPT, a novel model for Graph learning by self-supervised Generative Pre-training Transformers. |
Qifang Zhao; Weidong Ren; Tianyu Li; Xiaoxiao Xu; Hong Liu; | arxiv-cs.LG | 2023-12-31 |
1030 | Trace and Edit Relation Associations in GPT Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study introduces a novel approach for analyzing and modifying entity relationships in GPT models, diverging from ROME’s entity-focused methods. We develop a relation tracing … |
Jiahang Li; Taoyu Chen; Yuanli Wang; | ArXiv | 2023-12-30 |
1031 | Advancing TTP Analysis: Harnessing The Power of Encoder-Only and Decoder-Only Language Models with Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: State-of-the-art LLMs have been shown to be prone to hallucination, providing inaccurate information, which is problematic in critical domains like cybersecurity. Therefore, we propose the use of Retrieval Augmented Generation (RAG) techniques to extract relevant contexts for each cyberattack procedure for decoder-only LLMs (without fine-tuning). |
Reza Fayyazi; Rozhina Taghdimi; Shanchieh Jay Yang; | arxiv-cs.CR | 2023-12-30 |
1032 | Red Teaming for Large Language Models At Scale: Tackling Hallucinations on Mathematics Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a framework to procedurally generate numerical questions and puzzles, and compare the results with and without the application of several red teaming techniques. |
ALEKSANDER BUSZYDLIK et. al. | arxiv-cs.CL | 2023-12-30 |
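A hedged sketch of the procedural-generation step described in the entry above: questions are instantiated from a seeded template so the ground truth is known and model answers can be scored automatically. The template and difficulty parameters below are illustrative, not the paper's.

```python
# Sketch of procedurally generating numerical questions with known
# ground truth, so hallucinated answers can be flagged automatically.
# The template and number ranges are illustrative assumptions.
import random

def make_question(seed: int) -> tuple[str, int]:
    rng = random.Random(seed)  # seeding makes every question reproducible
    a, b, c = rng.randint(2, 99), rng.randint(2, 99), rng.randint(2, 9)
    question = f"Compute ({a} + {b}) * {c}."
    return question, (a + b) * c

for s in range(3):
    q, answer = make_question(s)
    print(q, "ground truth:", answer)
```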
1033 | MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we introduce MosaicBERT, a BERT-style encoder architecture and training recipe that is empirically optimized for fast pretraining. |
JACOB PORTES et. al. | arxiv-cs.CL | 2023-12-29 |
1034 | FlashVideo: A Framework for Swift Inference in Text-to-Video Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces FlashVideo, a novel framework tailored for swift Text-to-Video generation. |
Bin Lei; Le Chen; Caiwen Ding; | arxiv-cs.CV | 2023-12-29 |
1035 | DB-GPT: Empowering Database Interactions with Private Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present DB-GPT, a revolutionary and production-ready project that integrates LLMs with traditional database systems to enhance user experience and accessibility. |
SIQIAO XUE et. al. | arxiv-cs.DB | 2023-12-28 |
1036 | SentinelLMs: Encrypted Input Adaptation and Fine-tuning of Language Models for Private and Secure Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, this introduces two fundamental risks: (a) the transmission of user inputs to the server via the network gives rise to interception vulnerabilities, and (b) privacy concerns emerge as organizations that deploy such models store user data with restricted context. To address this, we propose a novel method to adapt and fine-tune transformer-based language models on passkey-encrypted user-specific text. |
Abhijit Mishra; Mingda Li; Soham Deo; | arxiv-cs.CR | 2023-12-28 |
1037 | ClST: A Convolutional Transformer Framework for Automatic Modulation Recognition By Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, insufficient training signal data in complicated channel environments and large-scale DL models are critical factors that make DL methods difficult to deploy in practice. Aiming at these problems, we propose a novel neural network named convolution-linked signal transformer (ClST) and a novel knowledge distillation method named signal knowledge distillation (SKD). |
Dongbin Hou; Lixin Li; Wensheng Lin; Junli Liang; Zhu Han; | arxiv-cs.LG | 2023-12-28 |
1038 | BEAST: Online Joint Beat and Downbeat Tracking Based on Streaming Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose BEAt tracking Streaming Transformer (BEAST), an online joint beat and downbeat tracking system based on the streaming Transformer. |
Chih-Cheng Chang; Li Su; | arxiv-cs.SD | 2023-12-28 |
1039 | On The Rate of Convergence of An Over-parametrized Transformer Classifier Learned By Gradient Descent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here it is not only important what kind of models these networks can approximate, or how they can generalize their knowledge learned by choosing the best possible approximation to a concrete data set, but also how well optimization of such transformer networks based on a concrete data set works. In this article we consider all these three different aspects simultaneously and show a theoretical upper bound on the misclassification probability of a transformer network fitted to the observed data. |
Michael Kohler; Adam Krzyzak; | arxiv-cs.LG | 2023-12-28 |
1040 | SCTNet: Single-Branch CNN with Transformer Semantic Information for Real-Time Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the additional branch incurs undesirable computational overhead and slows inference speed. To eliminate this dilemma, we propose SCTNet, a single branch CNN with transformer semantic information for real-time segmentation. |
ZHENGZE XU et. al. | arxiv-cs.CV | 2023-12-28 |
1041 | Evaluating The Performance of Large Language Models for Spanish Language in Undergraduate Admissions Exams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates the performance of large language models, specifically GPT-3.5 and BARD (supported by Gemini Pro model), in undergraduate admissions exams proposed by the National Polytechnic Institute in Mexico. |
Sabino Miranda; Obdulia Pichardo-Lagunas; Bella Martínez-Seis; Pierre Baldi; | arxiv-cs.CL | 2023-12-28 |
1042 | Gemini Pro Defeated By GPT-4V: Evidence from Education IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study compared the classification performance of Gemini Pro and GPT-4V in educational settings. Employing visual question answering (VQA) techniques, the study examined both … |
Gyeong-Geon Lee; Ehsan Latif; Lehong Shi; Xiaoming Zhai; | ArXiv | 2023-12-27 |
1043 | Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4 IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces 26 guiding principles designed to streamline the process of querying and prompting large language models. |
Sondos Mahmoud Bsharat; Aidar Myrzakhan; Zhiqiang Shen; | arxiv-cs.CL | 2023-12-26 |
1044 | Transformer-based Few-shot Object Detection in Traffic Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View |
Erjun Sun; Di Zhou; Yan Tian; Zhaocheng Xu; Xun Wang; | Appl. Intell. | 2023-12-26 |
1045 | SecQA: A Concise Question-Answering Dataset for Evaluating Large Language Models in Computer Security Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce SecQA, a novel dataset tailored for evaluating the performance of Large Language Models (LLMs) in the domain of computer security. |
Zefang Liu; | arxiv-cs.CL | 2023-12-25 |
1046 | On The Promises and Challenges of Multimodal Foundation Models for Geographical, Environmental, Agricultural, and Urban Planning Applications Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The advent of large language models (LLMs) has heightened interest in their potential for multimodal applications that integrate language and vision. This paper explores the … |
CHENJIAO TAN et. al. | ArXiv | 2023-12-23 |
1047 | Fairness-Aware Structured Pruning in Transformers Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: The increasing size of large language models (LLMs) has introduced challenges in their training and inference. Removing model components is perceived as a solution to tackle the … |
Abdelrahman Zayed; Goncalo Mordido; Samira Shabanian; Ioana Baldini; Sarath Chandar; | arxiv-cs.CL | 2023-12-23 |
1048 | Building Real-World Meeting Summarization Systems Using Large Language Models: A Practical Perspective IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper studies how to effectively build meeting summarization systems for real-world usage using large language models (LLMs). |
Md Tahmid Rahman Laskar; Xue-Yong Fu; Cheng Chen; Shashi Bhushan TN; | emnlp | 2023-12-22 |
1049 | Zero-Shot Multi-Label Topic Inference with Sentence Encoders and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conducted a comprehensive study with the latest Sentence Encoders and Large Language Models (LLMs) on the challenging task of "definition-wild zero-shot topic inference", where users define or provide the topics of interest in real-time. |
Souvika Sarkar; Dongji Feng; Shubhra Kanti Karmaker Santu; | emnlp | 2023-12-22 |
1050 | Large Language Models Are Complex Table Parsers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to incorporate GPT-3.5 to address such challenges, in which complex tables are reconstructed into tuples and specific prompt designs are employed for dialogues. |
BOWEN ZHAO et. al. | emnlp | 2023-12-22 |
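The "tables reconstructed into tuples" step in the entry above can be sketched simply: flatten a table into (row, column, value) triples and embed them in the prompt. The tuple schema, prompt wording, and toy data below are assumptions for illustration, not the paper's exact design.

```python
# Minimal sketch of the "table -> tuples" idea before prompting an LLM.
# The schema and the dummy data are illustrative assumptions.
def table_to_tuples(header: list[str], rows: list[list[str]]) -> list[tuple]:
    tuples = []
    for row in rows:
        row_name = row[0]
        for col_name, value in zip(header[1:], row[1:]):
            tuples.append((row_name, col_name, value))
    return tuples

header = ["model", "params", "score"]          # dummy table
rows = [["model-A", "1.1B", "52.9"], ["model-B", "1.5B", "48.3"]]

prompt = (
    "The table below is given as (row, column, value) tuples:\n"
    + "\n".join(map(str, table_to_tuples(header, rows)))
    + "\nQuestion: which model has more parameters?"
)
print(prompt)
```

Linearizing the table this way removes the layout ambiguity (merged cells, nested headers) that makes complex tables hard for an LLM to parse from raw text.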
1051 | Document-Level Machine Translation with Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The study focuses on three aspects: 1) Effects of Context-Aware Prompts, where we investigate the impact of different prompts on document-level translation quality and discourse phenomena; 2) Comparison of Translation Models, where we compare the translation performance of ChatGPT with commercial MT systems and advanced document-level MT methods; 3) Analysis of Discourse Modelling Abilities, where we further probe discourse knowledge encoded in LLMs and shed light on impacts of training techniques on discourse modeling. |
LONGYUE WANG et. al. | emnlp | 2023-12-22 |
1052 | Beware of Model Collapse! Fast and Stable Test-time Adaptation for Robust Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we delve into why TTA causes model collapse and find that the imbalanced label distribution inherent in QA is the reason for it. |
Yi Su; Yixin Ji; Juntao Li; Hai Ye; Min Zhang; | emnlp | 2023-12-22 |
1053 | Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4 IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we carry out a data archaeology to infer books that are known to ChatGPT and GPT-4 using a name cloze membership inference query. |
Kent Chang; Mackenzie Cramer; Sandeep Soni; David Bamman; | emnlp | 2023-12-22 |
1054 | Gemini Vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: The rapidly evolving sector of Multi-modal Large Language Models (MLLMs) is at the forefront of integrating linguistic and visual processing in artificial intelligence. This paper … |
ZHANGYANG QI et. al. | ArXiv | 2023-12-22 |
1055 | Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Dynosaur, a dynamic growth paradigm for the automatic curation of instruction-tuning data. |
DA YIN et. al. | emnlp | 2023-12-22 |
1056 | Argue with Me Tersely: Towards Sentence-Level Counter-Argument Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present the ArgTersely benchmark for sentence-level counter-argument generation, drawing from a manually annotated dataset from the ChangeMyView debate forum. |
JIAYU LIN et. al. | emnlp | 2023-12-22 |
1057 | API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, three pivotal questions remain unanswered: (1) How effective are current LLMs in utilizing tools? (2) How can we enhance LLMs' ability to utilize tools? (3) What obstacles need to be overcome to leverage tools? To address these questions, we introduce API-Bank, a groundbreaking benchmark, specifically designed for tool-augmented LLMs. |
MINGHAO LI et. al. | emnlp | 2023-12-22 |
1058 | Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the advancements of T2I models, a common issue encountered by users is the need for repetitive editing of input prompts in order to receive a satisfactory image, which is time-consuming and labor-intensive. Given the demonstrated text generation power of large-scale language models, such as GPT-k, we investigate the potential of utilizing such models to improve the prompt editing process for T2I generation. |
WANRONG ZHU et. al. | emnlp | 2023-12-22 |
1059 | Cabbage Sweeter Than Cake? Analysing The Potential of Large Language Models for Learning Conceptual Spaces Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These quality dimensions are usually learned from human judgements, which means that applications of conceptual spaces tend to be limited to narrow domains (e.g., modelling colour or taste). Encouraged by recent findings about the ability of Large Language Models (LLMs) to learn perceptually grounded representations, we explore the potential of such models for learning conceptual spaces. |
Usashi Chatterjee; Amit Gajbhiye; Steven Schockaert; | emnlp | 2023-12-22 |
1060 | LINC: A Neurosymbolic Approach for Logical Reasoning By Combining Language Models with First-Order Logic Provers IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While many prompting-based strategies have been proposed to enable Large Language Models (LLMs) to do such reasoning more effectively, they still appear unsatisfactory, often failing in subtle and unpredictable ways. In this work, we investigate the validity of instead reformulating such tasks as modular neurosymbolic programming, which we call LINC: Logical Inference via Neurosymbolic Computation. |
THEO OLAUSSON et. al. | emnlp | 2023-12-22 |
1061 | Do Transformers Parse While Predicting The Masked Word? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Some doubts have been raised as to whether the models are doing parsing or only some computation weakly correlated with it. Concretely: (a) Is it possible to explicitly describe transformers with realistic embedding dimensions, number of heads, etc. that are capable of doing parsing, or even approximate parsing? (b) Why do pre-trained models capture parsing structure? This paper takes a step toward answering these questions in the context of generative modeling with PCFGs. We show that masked language models like BERT or RoBERTa of moderate sizes can approximately execute the Inside-Outside algorithm for the English PCFG (Marcus et al., 1993). |
Haoyu Zhao; Abhishek Panigrahi; Rong Ge; Sanjeev Arora; | emnlp | 2023-12-22 |
1062 | The Shifted and The Overlooked: A Task-oriented Investigation of User-GPT Interactions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper provides a comprehensive analysis of the divergence between academic research in NLP and the needs of real-world NLP applications via a large-scale collection of user-GPT conversations. |
SIRU OUYANG et. al. | emnlp | 2023-12-22 |
1063 | Conceptor-Aided Debiasing of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose two methods of applying conceptors: (1) bias subspace projection via post-processing with the conceptor NOT operation; and (2) a new architecture, conceptor-intervened BERT (CI-BERT), which explicitly incorporates the conceptor projection into all layers during training. |
Li Yifei; Lyle Ungar; João Sedoc; | emnlp | 2023-12-22 |
1064 | Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Thus, it is still an open question: shall we pretrain large autoregressive LMs with retrieval? To answer it, we perform a comprehensive study on a scalable pre-trained retrieval-augmented LM (i.e., RETRO) compared with standard GPT and retrieval-augmented GPT incorporated at fine-tuning or inference stages. |
BOXIN WANG et. al. | emnlp | 2023-12-22 |
1065 | INSTRUCTSCORE: Towards Explainable Text Generation Evaluation with Automatic Feedback IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although recent learned metrics show high correlation with human judgement, these metrics do not provide explicit explanation of their verdict, nor associate the scores with defects in the generated text. To address this limitation, we present INSTRUCTSCORE, a fine-grained explainable evaluation metric for text generation. |
WENDA XU et. al. | emnlp | 2023-12-22 |
1066 | Let GPT Be A Math Tutor: Teaching Math Word Problem Solvers with Customized Exercise Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel approach for distilling math word problem solving capabilities from large language models (LLMs) into smaller, more efficient student models. |
ZHENWEN LIANG et. al. | emnlp | 2023-12-22 |
1067 | Deep Natural Language Feature Learning for Interpretable Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a general method to break down a complex main task into a set of easier intermediary sub-tasks, which are formulated in natural language as binary questions related to the final target task. |
Felipe Urrutia; Cristian Calderon; Valentin Barriere; | emnlp | 2023-12-22 |
1068 | GPT-RE: In-context Learning for Relation Extraction Using Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose GPT-RE to successfully address the aforementioned issues by (1) incorporating task-aware representations in demonstration retrieval; and (2) enriching the demonstrations with gold label-induced reasoning logic. |
ZHEN WAN et. al. | emnlp | 2023-12-22 |
1069 | JASMINE: Arabic GPT Models for Few-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using our novel benchmark, we evaluate JASMINE extensively showing powerful performance intrinsically as well as in few-shot learning on a wide range of NLP tasks. We aim to responsibly release our models and evaluation benchmark with interested researchers, along with code for experimenting with them. |
El Moatez Billah Nagoudi; Muhammad Abdul-Mageed; AbdelRahim Elmadany; Alcides Inciarte; Md Tawkat Islam Khondaker; | emnlp | 2023-12-22 |
1070 | MRedditSum: A Multimodal Abstractive Summarization Dataset of Reddit Threads with Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we present mRedditSum, the first multimodal discussion summarization dataset. |
Keighley Overbay; Jaewoo Ahn; Fatemeh Pesaran zadeh; Joonsuk Park; Gunhee Kim; | emnlp | 2023-12-22 |
1071 | NAIL: Lexical Retrieval Indices with Efficient Non-Autoregressive Decoders Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the best models require dedicated hardware for serving, which is costly and often not feasible. To avoid this serving-time requirement, we present a method of capturing up to 86% of the gains of a Transformer cross-attention model with a lexicalized scoring function that only requires 10^-6% of the Transformer's FLOPs per document and can be served using commodity CPUs. |
Livio Soares; Daniel Gillick; Jeremy Cole; Tom Kwiatkowski; | emnlp | 2023-12-22 |
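The economics of a lexicalized scoring function can be sketched as follows: per-token document scores are computed once offline (in NAIL, by a non-autoregressive decoder), and query-time scoring is a cheap lookup-and-sum on CPU. The log-count weighting below is a crude stand-in for the learned, model-predicted token scores.

```python
# Sketch of a lexicalized scoring function: index documents as bags of
# per-token weights, then score queries by summation. The log-count
# weighting is a placeholder for NAIL's learned scores.
import math
from collections import Counter

def index_document(text: str) -> dict[str, float]:
    counts = Counter(text.lower().split())
    return {tok: 1.0 + math.log(c) for tok, c in counts.items()}

def score(query: str, doc_index: dict[str, float]) -> float:
    # Query-time cost is one dictionary lookup per query token.
    return sum(doc_index.get(tok, 0.0) for tok in query.lower().split())

doc = index_document("transformers for retrieval retrieval with efficient decoders")
print(score("efficient retrieval", doc))
```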
1072 | Investigating Table-to-Text Generation Capabilities of Large Language Models in Real-World Information Seeking Scenarios IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the table-to-text capabilities of different LLMs using four datasets within two real-world information seeking scenarios. |
YILUN ZHAO et. al. | emnlp | 2023-12-22 |
1073 | TheoremQA: A Theorem-driven Question Answering Dataset IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce TheoremQA, the first theorem-driven question-answering dataset designed to evaluate AI models' capabilities to apply theorems to solve challenging science problems. |
WENHU CHEN et. al. | emnlp | 2023-12-22 |
1074 | Unnatural Error Correction: GPT-4 Can Almost Perfectly Handle Unnatural Scrambled Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we present novel experimental insights into the resilience of LLMs, particularly GPT-4, when subjected to extensive character-level permutations. |
Qi Cao; Takeshi Kojima; Yutaka Matsuo; Yusuke Iwasawa; | emnlp | 2023-12-22 |
1075 | Automatic Transcription of Handwritten Old Occitan Language Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an innovative HTR approach that leverages the Transformer architecture for recognizing handwritten Old Occitan language. |
Esteban Arias; Vallari Pai; Matthias Schöffel; Christian Heumann; Matthias Aßenmacher; | emnlp | 2023-12-22 |
1076 | Towards Detecting Cascades of Biased Medical Claims on Twitter Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a machine learning framework that uses two models in tandem: RoBERTa to detect medical claims and DistilBERT to classify bias. |
Libby Tiderman; Juan Sanchez Mercedes; Fiona Romanoschi; Fabricio Murai; | arxiv-cs.SI | 2023-12-22 |
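The two-model tandem in the entry above reads directly as two chained text-classification pipelines. In the sketch below the checkpoint names and label convention are placeholders; the paper's fine-tuned RoBERTa and DistilBERT weights would be substituted in practice.

```python
# Sketch of a tandem pipeline: one classifier flags medical claims, a
# second classifies bias. The base checkpoints and label names are
# placeholders, not the paper's fine-tuned models.
from transformers import pipeline

claim_detector = pipeline("text-classification", model="roberta-base")              # placeholder
bias_classifier = pipeline("text-classification", model="distilbert-base-uncased")  # placeholder

def analyze(tweet: str) -> dict:
    claim = claim_detector(tweet)[0]
    if claim["label"] != "LABEL_1":  # assumed "is a medical claim" label
        return {"claim": False}
    bias = bias_classifier(tweet)[0]
    return {"claim": True, "bias_label": bias["label"], "score": bias["score"]}

print(analyze("Vitamin C cures the flu, doctors don't want you to know."))
```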
1077 | Bootstrapping Small & High Performance Language Models with Unmasking-Removal Training Policy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: BabyBERTa, a language model trained on small-scale child-directed speech while none of the words are unmasked during training, has been shown to achieve a level of grammaticality comparable to that of RoBERTa-base, which is trained on 6,000 times more words and 15 times more parameters. Relying on this promising result, we explore in this paper the performance of BabyBERTa-based models in downstream tasks, focusing on Semantic Role Labeling (SRL) and two Extractive Question Answering tasks, with the aim of building more efficient systems that rely on less data and smaller models. |
Yahan Yang; Elior Sulem; Insup Lee; Dan Roth; | emnlp | 2023-12-22 |
1078 | Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This ubiquitous layer of language models is often overlooked. We find that similarities between these input embeddings are highly interpretable and that the geometry of these embeddings differs between model families. |
Andrea Wen-Yi; David Mimno; | emnlp | 2023-12-22 |
1079 | Disentangling Transformer Language Models As Superposed Topic Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, due to its hypothesised superposition properties, the final logits originating from the residual path are considered uninterpretable. Therefore, we posit that we can interpret TLM as superposed NTM by proposing a novel weight-based, model-agnostic and corpus-agnostic approach to search and disentangle decoder-only TLM, potentially mapping individual neurons to multiple coherent topics. |
Jia Peng Lim; Hady Lauw; | emnlp | 2023-12-22 |
1080 | Evaluation Metrics in The Era of GPT-4: Reliably Evaluating Large Language Models on Sequence to Sequence Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We aim to improve the understanding of current models' performance by providing a preliminary and hybrid evaluation on a range of open and closed-source generative LLMs on three NLP benchmarks: text summarisation, text simplification and grammatical error correction (GEC), using both automatic and human evaluation. |
Andrea Sottana; Bin Liang; Kai Zou; Zheng Yuan; | emnlp | 2023-12-22 |
1081 | Knowledge Rumination for Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, despite the promising outcome, we empirically observe that PLMs may have already encoded rich knowledge in their pre-trained parameters but fail to fully utilize it when applied to knowledge-intensive tasks. In this paper, we propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize that related latent knowledge without retrieving it from the external corpus. |
YUNZHI YAO et. al. | emnlp | 2023-12-22 |
1082 | Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs Without Fine-tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Inference-time Policy Adapters (IPA), which efficiently tailors a language model such as GPT-3 without fine-tuning it. |
XIMING LU et. al. | emnlp | 2023-12-22 |
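The decoding-time mechanism in the entry above can be sketched as combining next-token logits from a large frozen model with those of a small adapter policy. The model pair, the additive combination rule, and greedy decoding below are illustrative assumptions; IPA additionally trains the adapter with reinforcement learning, which is not shown.

```python
# Hedged sketch of inference-time policy adaptation: steer a large
# frozen LM by adding scaled logits from a small adapter LM at each
# decoding step. GPT-2 variants are used because they share a vocabulary.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
base = AutoModelForCausalLM.from_pretrained("gpt2-large").eval()  # frozen base policy
adapter = AutoModelForCausalLM.from_pretrained("gpt2").eval()     # small tailoring policy

@torch.no_grad()
def generate(prompt: str, steps: int = 20, alpha: float = 1.0) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids
    for _ in range(steps):
        base_logits = base(ids).logits[:, -1, :]
        adapter_logits = adapter(ids).logits[:, -1, :]
        combined = base_logits + alpha * adapter_logits  # product of experts in prob space
        next_id = combined.argmax(dim=-1, keepdim=True)  # greedy step for simplicity
        ids = torch.cat([ids, next_id], dim=-1)
    return tok.decode(ids[0])

print(generate("The meeting was about"))
```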
1083 | Large Language Models Are Biased to Overestimate Profoundness Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We found a significant statement-to-statement correlation between the LLMs and humans, irrespective of the type of statements and the prompting technique used. |
Eugenio Herrera-Berg; Tomás Browne; Pablo León-Villagrá; Marc-Lluís Vives; Cristian Calderon; | emnlp | 2023-12-22 |
1084 | VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces VIEScore, a Visual Instruction-guided Explainable metric for evaluating any conditional image generation tasks. |
Max Ku; Dongfu Jiang; Cong Wei; Xiang Yue; Wenhu Chen; | arxiv-cs.CV | 2023-12-22 |
1085 | Editing Common Sense in Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate whether commonsense judgments are causally associated with localized, editable parameters in Transformers, and we provide an affirmative answer. |
ANSHITA GUPTA et. al. | emnlp | 2023-12-22 |
1086 | EELBERT: Tiny Models Through Dynamic Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce EELBERT, an approach for compression of transformer-based models (e.g., BERT), with minimal impact on the accuracy of downstream tasks. |
Gabrielle Cohn; Rishika Agarwal; Deepanshu Gupta; Siddharth Patwardhan; | emnlp | 2023-12-22 |
1087 | Refining GPT-3 Embeddings with A Siamese Structure for Technical Post Duplicate Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we attempt to employ and refine the GPT-3 embeddings for the duplicate detection task. |
Xingfang Wu; Heng Li; Nobukazu Yoshioka; Hironori Washizaki; Foutse Khomh; | arxiv-cs.SE | 2023-12-22 |
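A minimal PyTorch sketch of the Siamese refinement idea from the entry above: a shared tower projects two fixed embeddings and a similarity-based loss pulls duplicates together. The embedding dimension (1536, as with OpenAI's ada-002 embeddings), the tower shape, and the loss are all assumptions, not the paper's exact configuration.

```python
# Sketch of refining fixed LLM embeddings with a Siamese tower for
# duplicate detection. Dimensions, loss, and data are illustrative.
import torch
import torch.nn as nn

class SiameseRefiner(nn.Module):
    def __init__(self, dim: int = 1536, hidden: int = 256):
        super().__init__()
        # One shared projection is applied to both posts' embeddings.
        self.tower = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden))

    def forward(self, e1: torch.Tensor, e2: torch.Tensor) -> torch.Tensor:
        return nn.functional.cosine_similarity(self.tower(e1), self.tower(e2))

model = SiameseRefiner()
loss_fn = nn.BCEWithLogitsLoss()  # treating scaled similarity as a logit is an assumption
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# Dummy batch: pairs of precomputed embeddings with duplicate labels.
e1, e2 = torch.randn(8, 1536), torch.randn(8, 1536)
labels = torch.randint(0, 2, (8,)).float()

opt.zero_grad()
loss = loss_fn(model(e1, e2) * 5.0, labels)  # scale cosine into a wider logit range
loss.backward()
opt.step()
print(float(loss))
```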
1088 | IAG: Induction-Augmented Generation Framework for Answering Reasoning Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an Induction-Augmented Generation (IAG) framework that utilizes inductive knowledge along with the retrieved documents for implicit reasoning. |
ZHEBIN ZHANG et. al. | emnlp | 2023-12-22 |
1089 | Effects of Sub-word Segmentation on Performance of Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare GPT and BERT models trained with the statistical segmentation algorithm BPE vs. two unsupervised algorithms for morphological segmentation, Morfessor and StateMorph. |
Jue Hou; Anisia Katinskaia; Anh-Duc Vu; Roman Yangarber; | emnlp | 2023-12-22 |
1090 | Towards Reliable Misinformation Mitigation: Generalization, Uncertainty, and GPT-4 IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose focusing on generalization, uncertainty, and how to leverage recent large language models, in order to create more practical tools to evaluate information veracity in contexts where perfect classification is impossible. |
KELLIN PELRINE et. al. | emnlp | 2023-12-22 |
1091 | Explicit Planning Helps Language Models in Logical Reasoning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose LEAP, a novel system that uses language models to perform multi-step logical reasoning and incorporates explicit planning into the inference procedure. |
Hongyu Zhao; Kangrui Wang; Mo Yu; Hongyuan Mei; | emnlp | 2023-12-22 |
1092 | Unsupervised Auditory and Semantic Entrainment Models with Deep Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an unsupervised deep learning framework that derives meaningful representation from textual features for developing semantic entrainment. |
Jay Kejriwal; Stefan Benus; Lina M. Rojas-Barahona; | arxiv-cs.CL | 2023-12-22 |
1093 | SAMRank: Unsupervised Keyphrase Extraction Using Self-Attention Map in BERT and GPT-2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel unsupervised keyphrase extraction approach, called SAMRank, which uses only a self-attention map in a pre-trained language model (PLM) to determine the importance of phrases. |
Byungha Kang; Youhyun Shin; | emnlp | 2023-12-22 |
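The core of the approach above, using only a self-attention map as an importance signal, can be sketched by scoring each token by the attention it receives. SAMRank's actual global/proportional scoring and phrase-level aggregation are more involved; the layer choice and head-averaging below are assumptions.

```python
# Sketch of attention-based keyword scoring: rank tokens by the total
# attention they receive in BERT's last layer. Aggregation choices are
# illustrative, not SAMRank's full method.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True).eval()

text = "Unsupervised keyphrase extraction with self-attention maps"
inputs = tok(text, return_tensors="pt")
with torch.no_grad():
    attentions = model(**inputs).attentions  # tuple of (1, heads, seq, seq), one per layer

# Average heads in the last layer, then sum over the "from" axis to get
# the attention each token receives.
received = attentions[-1].mean(dim=1)[0].sum(dim=0)
tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
for t, s in sorted(zip(tokens, received.tolist()), key=lambda x: -x[1])[:5]:
    print(f"{t}\t{s:.3f}")
```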
1094 | Lost in Translation, Found in Spans: Identifying Claims in Multilingual Social Media Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite its importance to journalists and human fact-checkers, it remains a severely understudied problem, and the scarce research on this topic so far has only focused on English. Here we aim to bridge this gap by creating a novel dataset, X-CLAIM, consisting of 7K real-world claims collected from numerous social media platforms in five Indian languages and English. |
Shubham Mittal; Megha Sundriyal; Preslav Nakov; | emnlp | 2023-12-22 |
1095 | Sparse Universal Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This is mainly because scaling UT parameters is more compute- and memory-intensive than scaling up a VT. This paper proposes the Sparse Universal Transformer (SUT), which leverages Sparse Mixture of Experts (SMoE) to reduce UT's computation complexity while retaining its parameter efficiency and generalization ability. |
Shawn Tan; Yikang Shen; Zhenfang Chen; Aaron Courville; Chuang Gan; | emnlp | 2023-12-22 |
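A compact sketch of the Sparse Mixture-of-Experts ingredient named above: a router sends each token to its top-k expert feed-forward networks and mixes their outputs, so only k of the experts run per token. The depth-wise weight sharing that makes the model "universal", and SUT's specific router and auxiliary losses, are not reproduced.

```python
# Minimal top-k mixture-of-experts feed-forward layer. Sizes and the
# simple softmax router are illustrative, not SUT's exact design.
import torch
import torch.nn as nn

class MoEFeedForward(nn.Module):
    def __init__(self, d_model: int = 64, d_ff: int = 128, n_experts: int = 4, k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.router = nn.Linear(d_model, n_experts)
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d_model)
        gate_logits = self.router(x)                      # (tokens, n_experts)
        weights, idx = gate_logits.topk(self.k, dim=-1)   # route each token to top-k experts
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():  # only the selected experts do any work
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

layer = MoEFeedForward()
print(layer(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```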
1096 | LLM-powered Data Augmentation for Enhanced Cross-lingual Performance Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper explores the potential of leveraging Large Language Models (LLMs) for data augmentation in multilingual commonsense reasoning datasets where the available training data is extremely limited. |
Chenxi Whitehouse; Monojit Choudhury; Alham Aji; | emnlp | 2023-12-22 |
1097 | Fighting Fire with Fire: The Dual Role of LLMs in Crafting and Detecting Elusive Disinformation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: (i.e., generating large-scale harmful and misleading content). To combat this emerging risk of LLMs, we propose a novel "Fighting Fire with Fire" (F3) strategy that harnesses modern LLMs' generative and emergent reasoning capabilities to counter human-written and LLM-generated disinformation. |
JASON LUCAS et. al. | emnlp | 2023-12-22 |
1098 | Harnessing Black-Box Control to Boost Commonsense in LM's Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a computation-efficient framework that steers a frozen Pre-Trained Language Model (PTLM) towards more commonsensical generation (i.e., producing a plausible output that incorporates a list of concepts in a meaningful way). |
Yufei Tian; Felix Zhang; Nanyun Peng; | emnlp | 2023-12-22 |
1099 | Axiomatic Preference Modeling for Longform Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The contributions of this work include: training a standalone preference model that can score human- and LLM-generated answers on the same scale; developing an axiomatic framework for generating training data pairs tailored to certain principles; and showing that a small amount of axiomatic signals can help small models outperform GPT-4 in preference scoring. |
Corby Rosset; Guoqing Zheng; Victor Dibia; Ahmed Awadallah; Paul Bennett; | emnlp | 2023-12-22 |
1100 | Exploring The Boundaries of GPT-4 in Radiology IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on assessing the performance of GPT-4, the most capable LLM so far, on the text-based applications for radiology reports, comparing against state-of-the-art (SOTA) radiology-specific models. |
QIANCHU LIU et. al. | emnlp | 2023-12-22 |
1101 | Retrofitting Light-weight Language Models for Emotions Using Supervised Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a novel retrofitting method to induce emotion aspects into pre-trained language models (PLMs) such as BERT and RoBERTa. |
Sapan Shah; Sreedhar Reddy; Pushpak Bhattacharyya; | emnlp | 2023-12-22 |
1102 | ChatGPT As A Commenter to The News: Can LLMs Generate Human-like Opinions? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this research we investigate to what extent GPT-3.5 can generate human-like comments on Dutch news articles. |
Rayden Tseng; Suzan Verberne; Peter van der Putten; | arxiv-cs.CL | 2023-12-21 |
1103 | Exploiting Novel GPT-4 APIs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that fine-tuning a model on as few as 15 harmful examples or 100 benign examples can remove core safeguards from GPT-4, enabling a range of harmful outputs. |
Kellin Pelrine; Mohammad Taufeeque; Michał Zając; Euan McLean; Adam Gleave; | arxiv-cs.CR | 2023-12-21 |
1104 | Generative Pretraining at Scale: Transformer-Based Encoding of Transactional Behavior for Fraud Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce an innovative autoregressive model leveraging Generative Pretrained Transformer (GPT) architectures, tailored for fraud detection in payment systems. |
Ze Yu Zhao; Zheng Zhu; Guilin Li; Wenhan Wang; Bo Wang; | arxiv-cs.LG | 2023-12-21 |
1105 | Efficacy of Machine-Generated Instructions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large instruction-tuned language models (i.e., finetuned to respond to instructions) have demonstrated a remarkable ability to generalize zero-shot to new tasks. |
Samaksh Gulati; Anshit Verma; Manoj Parmar; Palash Chaudhary; | arxiv-cs.CL | 2023-12-21 |
1106 | Automated DevOps Pipeline Generation for Code Repositories Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a detailed investigation into the use of Large Language Models (LLMs), specifically GPT-3.5 and GPT-4, to generate and evaluate GitHub Action workflows for DevOps tasks. |
Deep Mehta; Kartik Rawool; Subodh Gujar; Bowen Xu; | arxiv-cs.SE | 2023-12-20 |
1107 | AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite their advancements, challenges in balancing code snippet generation with effective test case generation and execution persist. To address these issues, this paper introduces Multi-Agent Assistant Code Generation (AgentCoder), a novel solution comprising a multi-agent framework with specialized agents: the programmer agent, the test designer agent, and the test executor agent. |
DONG HUANG et. al. | arxiv-cs.CL | 2023-12-20 |
1108 | Benchmarking and Analyzing In-context Learning, Fine-tuning and Supervised Learning for Biomedical Knowledge Curation: A Focused Study on Chemical Entities of Biological Interest Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the era of foundational language models, this study compares and analyzes three NLP paradigms for curation tasks: in-context learning (ICL), fine-tuning (FT), and supervised learning (ML). |
EMILY GROVES et. al. | arxiv-cs.LG | 2023-12-20 |
1109 | HW-V2W-Map: Hardware Vulnerability to Weakness Mapping Framework for Root Cause Analysis with GPT-assisted Mitigation Suggestion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As the number of hardware attacks on Internet of Things (IoT) devices continues to rapidly increase, we present the Hardware Vulnerability to Weakness Mapping (HW-V2W-Map) Framework, which is a Machine Learning (ML) framework focusing on hardware vulnerabilities and IoT security. |
YU-ZHENG LIN et. al. | arxiv-cs.CR | 2023-12-20 |
1110 | Scaling Down to Scale Up: A Cost-Benefit Analysis of Replacing OpenAI’s LLM with Open Source SLMs in Production Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a systematic evaluation methodology and a characterization of modern open-source SLMs and their trade-offs when replacing proprietary LLMs for a real-world product feature. |
CHANDRA IRUGALBANDARA et. al. | arxiv-cs.SE | 2023-12-20 |
1111 | Advancing SQL Injection Detection for High-Speed Data Centers: A Novel Approach Using Cascaded NLP Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel cascade SQLi detection method, blending classical and transformer-based NLP models, achieving a 99.86% detection accuracy with significantly lower computational demands (20 times faster than using transformer-based models alone). |
KASIM TASDEMIR et. al. | arxiv-cs.CR | 2023-12-20 |
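The cascade in entry 1111 is a general cost-saving pattern: a cheap first-stage classifier answers the easy cases and defers only low-confidence inputs to the transformer. Below is a minimal sketch of that control flow with toy data; the `heavy_model` stub, threshold, and features are illustrative assumptions, not the paper's pipeline.

```python
# Two-stage cascade (assumed design): a cheap TF-IDF + logistic regression
# stage answers confident cases; the rest fall through to a heavier
# (here stubbed) transformer stage.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

train_queries = ["SELECT name FROM users WHERE id = 1",
                 "' OR '1'='1' --",
                 "UPDATE items SET price = 10 WHERE sku = 'a'",
                 "admin'; DROP TABLE users; --"]
train_labels = [0, 1, 0, 1]  # 0 = benign, 1 = SQL injection

vec = TfidfVectorizer(analyzer="char", ngram_range=(2, 4))
cheap = LogisticRegression().fit(vec.fit_transform(train_queries), train_labels)

def heavy_model(query: str) -> int:
    """Hypothetical stand-in for the expensive transformer stage."""
    return int("--" in query or "'" in query)

def cascade_predict(query: str, threshold: float = 0.9) -> int:
    proba = cheap.predict_proba(vec.transform([query]))[0]
    if proba.max() >= threshold:   # cheap stage is confident: stop here
        return int(proba.argmax())
    return heavy_model(query)      # otherwise defer to the heavy stage

print(cascade_predict("SELECT id FROM orders WHERE total > 100"))
```

Raising the threshold routes more queries to the heavy stage, trading the speedup for accuracy.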
1112 | A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a preliminary exploration of Gemini Pro’s visual understanding proficiency, which comprehensively covers four domains: fundamental perception, advanced cognition, challenging vision tasks, and various expert capacities. |
CHAOYOU FU et. al. | arxiv-cs.CV | 2023-12-19 |
1113 | Large Language Models in Medical Term Classification and Unexpected Misalignment Between Response and Reasoning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study assesses the ability of state-of-the-art large language models (LLMs) including GPT-3.5, GPT-4, Falcon, and LLaMA 2 to identify patients with mild cognitive impairment … |
XIAODAN ZHANG et. al. | ArXiv | 2023-12-19 |
1114 | Founder-GPT: Self-play to Evaluate The Founder-Idea Fit Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research introduces an innovative evaluation method for the founder-idea fit in early-stage startups, utilizing advanced large language model techniques to assess founders’ profiles against their startup ideas to enhance decision-making. |
Sichao Xiong; Yigit Ihlamur; | arxiv-cs.CL | 2023-12-19 |
1115 | Can Transformers Learn Sequential Function Classes In Context? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel sliding window sequential function class and employ toy-sized transformers with a GPT-2 architecture to conduct our experiments. |
Ryan Campbell; Emma Guo; Evan Hu; Reya Vir; Ethan Hsiao; | arxiv-cs.LG | 2023-12-19 |
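Entry 1115 leaves the sliding-window function class abstract; one plausible toy instantiation (an assumption, not the paper's definition) makes each label a fixed linear function of the previous k inputs, so an in-context learner must infer the shared weights from the prefix alone.

```python
# Toy generator for a sliding-window sequential function class (assumed
# form): y_t = w . x[t-k:t], with weights w shared across the sequence.
import numpy as np

def make_sequence(length: int = 64, k: int = 4, seed: int = 0):
    rng = np.random.default_rng(seed)
    w = rng.normal(size=k)              # weights fixed for the whole sequence
    x = rng.normal(size=length)
    y = np.array([w @ x[t - k:t] if t >= k else 0.0 for t in range(length)])
    return x, y

x, y = make_sequence()
print(x[:6].round(2), y[:6].round(2))
```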
1116 | MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, most existing methods neglect the crucial significance of LLMs utilizing external tools and model collaboration. To address these challenges, we introduce MAC-SQL, a novel LLM-based multi-agent collaborative framework. |
BING WANG et. al. | arxiv-cs.CL | 2023-12-18 |
1117 | Stronger Graph Transformer with Regularized Attention Scores Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel version of edge regularization technique that alleviates the need for Positional Encoding and ultimately alleviate GT’s out of memory issue. |
Eugene Ku; | arxiv-cs.LG | 2023-12-18 |
1118 | A Heterogeneous Chiplet Architecture for Accelerating End-to-End Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we leverage chiplet-based heterogeneous integration (HI) to design a high-performance and energy-efficient multi-chiplet platform to accelerate transformer workloads. |
Harsh Sharma; Pratyush Dhingra; Janardhan Rao Doppa; Umit Ogras; Partha Pratim Pande; | arxiv-cs.AR | 2023-12-18 |
1119 | An In-depth Look at Gemini’s Language Abilities IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we do an in-depth exploration of Gemini’s language abilities, making two contributions. First, we provide a third-party, objective comparison of the abilities of the OpenAI GPT and Google Gemini models with reproducible code and fully transparent results. |
SYEDA NAHIDA AKTER et. al. | arxiv-cs.CL | 2023-12-18 |
1120 | APIDocBooster: An Extract-Then-Abstract Framework Leveraging Large Language Models for Augmenting API Documentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce APIDocBooster, an extract-then-abstract framework that seamlessly fuses the advantages of both extractive (i.e., enabling faithful summaries without length limitation) and abstractive summarization (i.e., producing coherent and concise summaries). |
CHENGRAN YANG et. al. | arxiv-cs.SE | 2023-12-18 |
1121 | Evaluating and Enhancing Large Language Models for Conversational Reasoning on Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we evaluate the conversational reasoning capabilities of the current state-of-the-art LLM (GPT-4) on knowledge graphs (KGs). |
Yuxuan Huang; Lida Shi; Anqi Liu; Hao Xu; | arxiv-cs.CL | 2023-12-18 |
1122 | Time-Transformer: Integrating Local and Global Features for Better Time Series Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Most existing generative models have failed to effectively learn both the local and global properties of time series data. To address this open problem, we propose a novel time series generative model named ‘Time-Transformer AAE’, which consists of an adversarial autoencoder (AAE) and a newly designed architecture named ‘Time-Transformer’ within the decoder. |
YUANSAN LIU et. al. | arxiv-cs.LG | 2023-12-18 |
1123 | T2M-HiFiGPT: Generating High Quality Human Motion from Textual Descriptions with Residual Discrete Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce T2M-HiFiGPT, a novel conditional generative framework for synthesizing human motion from textual descriptions. |
Congyi Wang; | arxiv-cs.CV | 2023-12-17 |
1124 | Decoding Concerns: Multi-label Classification of Vaccine Sentiments in Social Media Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the situation involves a mix of perspectives, with skepticism towards vaccines prevailing for various reasons such as political dynamics, apprehensions about side effects, and more. The paper addresses the challenge of comprehensively understanding and categorizing these diverse concerns expressed in the context of vaccination. |
Somsubhra De; Shaurya Vats; | arxiv-cs.SI | 2023-12-17 |
1125 | Multi-Label Classification of COVID-Tweets Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We tried three different models: (a) a supervised BERT-large-uncased model, (b) a supervised HateXplain model, and (c) a zero-shot GPT-3.5 Turbo model. |
Aniket Deroy; Subhankar Maity; | arxiv-cs.CL | 2023-12-17 |
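The zero-shot arm of entry 1125 reduces to a prompt that lists candidate labels and asks for the applicable subset. A sketch with a hypothetical label set and a stubbed model reply; the study's actual labels and prompt are not reproduced.

```python
# Zero-shot multi-label classification via prompting (label set and
# reply below are invented placeholders).
import json

LABELS = ["side_effects", "politics", "efficacy", "mandates"]

def zero_shot_prompt(tweet: str) -> str:
    return (f"Labels: {', '.join(LABELS)}\n"
            f"Tweet: {tweet}\n"
            "Return a JSON list containing every label that applies.")

fake_reply = '["side_effects", "efficacy"]'   # stand-in for the API call
predicted = [label for label in json.loads(fake_reply) if label in LABELS]
print(predicted)
```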
1126 | AI Gender Bias, Disparities, and Fairness: Does Training Data Matter? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The study employs three distinct techniques for bias analysis: Scoring accuracy difference to evaluate bias, mean score gaps by gender (MSG) to evaluate disparity, and Equalized Odds (EO) to evaluate fairness. |
Ehsan Latif; Xiaoming Zhai; Lei Liu; | arxiv-cs.CY | 2023-12-17 |
1127 | An Evaluation of GPT-4V and Gemini in Online VQA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct fine-grained analysis by generating seven types of metadata for nearly 2,000 visual questions, such as image type and the required image processing capabilities. |
Mengchen Liu; Chongyan Chen; Danna Gurari; | arxiv-cs.CV | 2023-12-17 |
1128 | Can Persistent Homology Whiten Transformer-based Black-box Models? A Case Study on BERT Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we propose Optimus BERT Compression and Explainability (OBCE), a methodology to bring explainability to BERT models using persistent homology, aiming to measure the importance of each neuron by studying the topological characteristics of their outputs. |
Luis Balderas; Miguel Lastra; José M. Benítez; | arxiv-cs.LG | 2023-12-17 |
1129 | Cross-Domain Robustness of Transformer-based Keyphrase Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore the effectiveness of abstractive text summarization models for keyphrase selection. |
Anna Glazkova; Dmitry Morozov; | arxiv-cs.CL | 2023-12-17 |
1130 | DeepArt: A Benchmark to Advance Fidelity Research in AI-Generated Content Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper explores the image synthesis capabilities of GPT-4, a leading multi-modal large language model. |
Wentao Wang; Xuanyao Huang; Tianyang Wang; Swalpa Kumar Roy; | arxiv-cs.CV | 2023-12-16 |
1131 | Cross-Linguistic Offensive Language Detection: BERT-Based Analysis of Bengali, Assamese, & Bodo Conversational Hateful Content from Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we used BERT models, including XLM-RoBERTa, L3-cube, IndicBERT, BanglaBERT, and BanglaHateBERT. |
Jhuma Kabir Mim; Mourad Oussalah; Akash Singhal; | arxiv-cs.CL | 2023-12-16 |
1132 | A Comparative Analysis of Large Language Models for Code Documentation Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive comparative analysis of Large Language Models (LLMs) for generation of code documentation. |
Shubhang Shekhar Dvivedi; Vyshnav Vijay; Sai Leela Rahul Pujari; Shoumik Lodh; Dhruv Kumar; | arxiv-cs.SE | 2023-12-16 |
1133 | SPT: Fine-Tuning Transformer-based Language Models Efficiently with Sparsification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose the SPT system to fine-tune Transformer-based models efficiently by introducing sparsity. |
Yuntao Gui; Xiao Yan; Peiqi Yin; Han Yang; James Cheng; | arxiv-cs.DC | 2023-12-16 |
1134 | Exploring Automatic Text Simplification of German Narrative Documents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we apply transformer-based Natural Language Generation (NLG) techniques to the problem of text simplification. |
Thorben Schomacker; Tillmann Dönicke; Marina Tropmann-Frick; | arxiv-cs.CL | 2023-12-15 |
1135 | Red AI? Inconsistent Responses from GPT3.5 Models on Political Issues in The US and China Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We posed the same questions about high-profile political issues in the United States and China to GPT in both English and Simplified Chinese. Our analysis of the bilingual responses revealed that the political knowledge (content) and political attitudes (sentiment) of GPT’s bilingual models are significantly more inconsistent on political issues in China. |
Di Zhou; Yinxian Zhang; | arxiv-cs.CL | 2023-12-15 |
1136 | Towards Automated Regulatory Compliance Verification in Financial Auditing with Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The auditing of financial documents, historically a labor-intensive process, stands on the precipice of transformation. AI-driven solutions have made inroads into streamlining … |
ARMIN BERGER et. al. | 2023 IEEE International Conference on Big Data (BigData) | 2023-12-15 |
1137 | Distilling Large Language Models for Matching Patients to Clinical Trials Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there are significant challenges associated with using closed-source proprietary LLMs like GPT-3.5 in practical healthcare applications, such as cost, privacy and reproducibility concerns. To address these issues, this study presents the first systematic examination of the efficacy of both proprietary (GPT-3.5, and GPT-4) and open-source LLMs (LLAMA 7B,13B, and 70B) for the task of patient-trial matching. |
Mauro Nievas; Aditya Basu; Yanshan Wang; Hrituraj Singh; | arxiv-cs.AI | 2023-12-15 |
1138 | LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the possibility of Language Adaptation for LLaMA models, explicitly focusing on addressing the challenge of Italian Language coverage. |
PIERPAOLO BASILE et. al. | arxiv-cs.CL | 2023-12-15 |
1139 | Algorithms for Automatic Intents Extraction and Utterances Classification for Goal-oriented Dialogue Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A method for preprocessing dialog data sets in JSON format is described. |
Leonid Legashev; Alexander Shukhman; Vadim Badikov; | arxiv-cs.AI | 2023-12-15 |
1140 | Transformer-based LLMs in Cybersecurity: An In-depth Study on Log Anomaly Detection and Conversational Defense Mechanisms Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: With the advancement of conversational AI and Large Language Models (LLMs), interactive chatbots are emerging as pivotal assets for connecting with users across various sectors, … |
Prasasthy Balasubramanian; Justin Seby; Panos Kostakos; | 2023 IEEE International Conference on Big Data (BigData) | 2023-12-15 |
1141 | Binary Code Summarization: Benchmarking ChatGPT/GPT-4 and Other Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we present BinSum, a comprehensive benchmark and dataset of over 557K binary functions and introduce a novel method for prompt synthesis and optimization. |
Xin Jin; Jonathan Larson; Weiwei Yang; Zhiqiang Lin; | arxiv-cs.CR | 2023-12-15 |
1142 | 3DAxiesPrompts: Unleashing The 3D Spatial Task Capabilities of GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a new visual prompting method called 3DAxiesPrompts (3DAP) to unleash the capabilities of GPT-4V in performing 3D spatial tasks. |
DINGNING LIU et. al. | arxiv-cs.AI | 2023-12-15 |
1143 | Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, future superhuman models will behave in complex ways too difficult for humans to reliably evaluate; humans will only be able to weakly supervise superhuman models. We study an analogy to this problem: can weak model supervision elicit the full capabilities of a much stronger model? |
COLLIN BURNS et. al. | arxiv-cs.CL | 2023-12-14 |
1144 | Weaving Pathways for Justice with GPT: LLM-driven Automated Drafting of Interactive Legal Applications Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we describe 3 approaches to automating the completion of court forms: a generative AI approach that uses GPT-3 to iteratively prompt the user to answer questions, a constrained template-driven approach that uses GPT-4-turbo to generate a draft of questions that are subject to human review, and a hybrid method. |
Quinten Steenhuis; David Colarusso; Bryce Willey; | arxiv-cs.AI | 2023-12-14 |
1145 | TinyGSM: Achieving >80% on GSM8k with Small Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce TinyGSM, a synthetic dataset of 12.3M grade school math problems paired with Python solutions, generated fully by GPT-3.5. |
BINGBIN LIU et. al. | arxiv-cs.LG | 2023-12-14 |
1146 | VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Vision-Language Generative Pre-trained Transformer (VL-GPT), a transformer model proficient at concurrently perceiving and generating visual and linguistic data. |
JINGUO ZHU et. al. | arxiv-cs.CV | 2023-12-14 |
1147 | No-Skim: Towards Efficiency Robustness Evaluation on Skimming-based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose No-Skim, a general framework to help the owners of skimming-based LLM to understand and measure the robustness of their acceleration scheme. |
Shengyao Zhang; Mi Zhang; Xudong Pan; Min Yang; | arxiv-cs.CR | 2023-12-14 |
1148 | Heterogeneous Graph Neural Architecture Search with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a new GPT-4 based HGNAS model to improve the search efficiency and search accuracy of HGNAS. |
Haoyuan Dong; Yang Gao; Haishuai Wang; Hong Yang; Peng Zhang; | arxiv-cs.AI | 2023-12-14 |
1149 | Inter-Layer Scheduling Space Exploration for Multi-model Inference on Heterogeneous Chiplets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address increasing compute demand from recent multi-model workloads with heavy models like large language models, we propose to deploy heterogeneous chiplet-based multi-chip module (MCM)-based accelerators. |
Mohanad Odema; Hyoukjun Kwon; Mohammad Abdullah Al Faruque; | arxiv-cs.AR | 2023-12-14 |
1150 | Prompt Engineering-assisted Malware Dynamic Analysis Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Further, the concept drift phenomenon of API calls is prominent. To tackle these issues, we introduce a prompt engineering-assisted malware dynamic analysis using GPT-4. |
Pei Yan; Shunquan Tan; Miaohui Wang; Jiwu Huang; | arxiv-cs.CR | 2023-12-13 |
1151 | Assessing GPT4-V on Structured Reasoning Tasks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Multi-modality promises to unlock further uses for large language models. Recently, the state-of-the-art language model GPT-4 was enhanced with vision capabilities. We carry out a … |
Mukul Singh; J. Cambronero; Sumit Gulwani; Vu Le; Gust Verbruggen; | ArXiv | 2023-12-13 |
1152 | Dietary Assessment with Multimodal ChatGPT: A Systematic Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explore the application of multimodal ChatGPT within the realm of dietary assessment. |
FRANK P. -W. LO et. al. | arxiv-cs.CV | 2023-12-13 |
1153 | Native Language Identification with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first experiments on Native Language Identification (NLI) using LLMs such as GPT-4. |
Wei Zhang; Alexandre Salle; | arxiv-cs.CL | 2023-12-12 |
1154 | Towards Equipping Transformer with The Ability of Systematic Compositionality Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We tentatively provide a successful implementation of a multi-layer CAT on the basis of the especially popular BERT. |
Chen Huang; Peixin Qin; Wenqiang Lei; Jiancheng Lv; | arxiv-cs.CL | 2023-12-12 |
1155 | Abusive Span Detection for Vietnamese Narrative Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there are limited studies on applying natural language processing (NLP) in this field in Vietnam. Therefore, we aim to contribute by building a human-annotated Vietnamese dataset for detecting abusive content in Vietnamese narrative texts. |
Nhu-Thanh Nguyen; Khoa Thi-Kim Phan; Duc-Vu Nguyen; Ngan Luu-Thuy Nguyen; | arxiv-cs.CL | 2023-12-12 |
1156 | How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our findings delineate GPT-4V’s capability boundaries in distribution shifts, shedding light on its strengths and limitations across various scenarios. Importantly, this investigation contributes to our understanding of how AI foundation models generalize to distribution shifts, offering pivotal insights into their adaptability and robustness. |
ZHONGYI HAN et. al. | arxiv-cs.LG | 2023-12-12 |
1157 | Exploring Large Language Models to Facilitate Variable Autonomy for Human-Robot Teaming Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recognizing the significance of this development, this paper explores the integration of Large Language Models (LLMs) like the Generative Pre-trained Transformer (GPT) into human-robot teaming environments to facilitate variable autonomy through the means of verbal human-robot communication. In this paper, we introduce a novel framework for such a GPT-powered multi-robot testbed environment, based on a Unity Virtual Reality (VR) setting. |
Younes Lakhnati; Max Pascher; Jens Gerken; | arxiv-cs.HC | 2023-12-12 |
1158 | Scaling Culture in Blockchain Gaming: Generative AI and Pseudonymous Engagement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Managing rapidly growing decentralized gaming communities brings unique challenges at the nexus of cultural economics and technology. This paper introduces a streamlined analytical framework that utilizes Large Language Models (LLMs), in this instance open-access generative pre-trained transformer (GPT) models, offering an efficient solution with deeper insights into community dynamics. |
Henrik Axelsen; Sebastian Axelsen; Valdemar Licht; Jason Potts; | arxiv-cs.HC | 2023-12-12 |
1159 | Transformer Attractors for Robust and Efficient End-to-End Neural Diarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to replace EDA with a transformer-based attractor calculation (TA) module. |
Lahiru Samarakoon; Samuel J. Broughton; Marc Harkönen; Ivan Fung; | arxiv-cs.SD | 2023-12-11 |
1160 | Genixer: Empowering Multimodal Large Language Models As A Powerful Data Generator Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Genixer, a holistic data generation pipeline consisting of four key steps: (i) instruction data collection, (ii) instruction template design, (iii) empowering MLLMs, and (iv) data generation and filtering. |
Henry Hengyuan Zhao; Pan Zhou; Mike Zheng Shou; | arxiv-cs.CV | 2023-12-11 |
1161 | Pre-Trained Models for Intent Classification in Chatbot: Comparative Study and Critical Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The emergence of pre-trained models based on deep learning has considerably enhanced the development of many applications, such as chatbots. These models can be refined for … |
Adnane Souha; Charaf Ouaddi; Lamya Benaddi; Abdeslam Jakimi; | 2023 6th International Conference on Advanced Communication … | 2023-12-11 |
1162 | Revisiting The Role of Label Smoothing in Enhanced Text Sentiment Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fill in the gap, this article performs a set of in-depth analyses on eight datasets for text sentiment classification and three deep learning architectures: TextCNN, BERT, and RoBERTa, under two learning schemes: training from scratch and fine-tuning. |
Yijie Gao; Shijing Si; Hua Luo; Haixia Sun; Yugui Zhang; | arxiv-cs.CL | 2023-12-11 |
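For readers reproducing the comparison in entry 1162: label smoothing is a one-argument change in PyTorch, redistributing a small fraction of each target's probability mass over the other classes. A minimal sketch follows; the study's datasets and architectures are not reproduced here.

```python
# Label smoothing in PyTorch (the label_smoothing argument has been
# built into CrossEntropyLoss since torch 1.10).
import torch
import torch.nn as nn

logits = torch.randn(8, 3)             # batch of 8, 3 sentiment classes
targets = torch.randint(0, 3, (8,))

hard_loss = nn.CrossEntropyLoss()(logits, targets)
smooth_loss = nn.CrossEntropyLoss(label_smoothing=0.1)(logits, targets)
print(float(hard_loss), float(smooth_loss))
```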
1163 | U-MixFormer: UNet-like Transformer with Mix-Attention for Efficient Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we propose a novel transformer decoder, U-MixFormer, built upon the U-Net structure, designed for efficient semantic segmentation. |
Seul-Ki Yeom; Julian von Klitzing; | arxiv-cs.CV | 2023-12-11 |
1164 | AI Control: Improving Safety Despite Intentional Subversion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we develop and evaluate pipelines of safety techniques (protocols) that are robust to intentional subversion. |
Ryan Greenblatt; Buck Shlegeris; Kshitij Sachan; Fabien Roger; | arxiv-cs.LG | 2023-12-11 |
1165 | From Text to Motion: Grounding GPT-4 in A Humanoid Robot Alter3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We report the development of Alter3, a humanoid robot capable of generating spontaneous motion using a Large Language Model (LLM), specifically GPT-4. |
Takahide Yoshida; Atsushi Masumori; Takashi Ikegami; | arxiv-cs.RO | 2023-12-11 |
1166 | SM70: A Large Language Model for Medical Devices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We are introducing SM70, a 70 billion-parameter Large Language Model that is specifically designed for SpassMed’s medical devices under the brand name ‘JEE1’ (pronounced as G1 and means ‘Life’). |
Anubhav Bhatti; Surajsinh Parmar; San Lee; | arxiv-cs.CL | 2023-12-11 |
1167 | Generative Large Language Models Are All-purpose Text Analytics Engines: Text-to-text Learning Is All Your Need Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: To solve major clinical natural language processing (NLP) tasks using a unified text-to-text learning architecture based on a generative large language model (LLM) via prompt tuning. Methods: We formulated 7 key clinical NLP tasks as text-to-text learning and solved them using one unified generative clinical LLM, GatorTronGPT, developed using GPT-3 architecture and trained with up to 20 billion parameters. |
CHENG PENG et. al. | arxiv-cs.CL | 2023-12-10 |
1168 | Image and Data Mining in Reticular Chemistry Using GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study demonstrates the remarkable ability of GPT-4V to navigate and obtain complex data for metal-organic frameworks, especially from graphical sources. |
ZHILING ZHENG et. al. | arxiv-cs.AI | 2023-12-09 |
1169 | FP8-BERT: Post-Training Quantization for Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we empirically validate the effectiveness of FP8 as a way to do Post-Training Quantization without significant loss of accuracy, with a simple calibration and format conversion process. |
Jianwei Li; Tianchi Zhang; Ian En-Hsu Yen; Dongkuan Xu; | arxiv-cs.AI | 2023-12-09 |
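Entry 1169's flow can be simulated without FP8 hardware: calibrate a per-tensor scale, round to an FP8-like grid, and scale back. The sketch below is a heavily simplified E4M3-style round-trip (no subnormals or special values) and is not the paper's exact calibration procedure.

```python
import numpy as np

def fake_fp8_e4m3(x: np.ndarray) -> np.ndarray:
    """Round to an E4M3-like grid: 3 mantissa bits, clamped to +/-448."""
    x = np.clip(x, -448.0, 448.0)
    out = np.zeros_like(x)
    nz = x != 0
    e = np.floor(np.log2(np.abs(x[nz])))   # per-element exponent
    step = 2.0 ** (e - 3)                  # grid spacing with 3 mantissa bits
    out[nz] = np.round(x[nz] / step) * step
    return out

def ptq(tensor: np.ndarray) -> np.ndarray:
    """Per-tensor calibration: map the max to FP8 max, cast, rescale."""
    scale = 448.0 / np.abs(tensor).max()
    return fake_fp8_e4m3(tensor * scale) / scale

w = np.random.randn(4, 4).astype(np.float32)
print(np.abs(w - ptq(w)).max())   # round-trip error stays small
```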
1170 | GPT-4 and Safety Case Generation: An Exploratory Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While their potential is undeniable across various domains, this paper sets out on a captivating expedition to investigate their uncharted territory, the exploration of generating safety cases. In this paper, our primary objective is to delve into the existing knowledge base of GPT-4, focusing specifically on its understanding of the Goal Structuring Notation (GSN), a well-established notation allowing to visually represent safety cases. |
Mithila Sivakumar; Alvine Boaye Belle; Jinjun Shan; Kimya Khakzad Shahandashti; | arxiv-cs.SE | 2023-12-09 |
1171 | A Review of Hybrid and Ensemble in Deep Learning for Natural Language Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This review presents a comprehensive exploration of hybrid and ensemble deep learning models within Natural Language Processing (NLP), shedding light on their transformative potential across diverse tasks such as Sentiment Analysis, Named Entity Recognition, Machine Translation, Question Answering, Text Classification, Generation, Speech Recognition, Summarization, and Language Modeling. |
Jianguo Jia; Wen Liang; Youzhi Liang; | arxiv-cs.AI | 2023-12-09 |
1172 | Sim-GPT: Text Similarity Via GPT Annotated Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Due to the lack of a large collection of high-quality labeled sentence pairs with textual similarity scores, existing approaches for Semantic Textual Similarity (STS) mostly rely on unsupervised techniques or training signals that are only partially correlated with textual similarity, e.g., NLI-based datasets. To tackle this issue, in this paper, we propose the strategy of measuring text similarity via GPT annotated data (Sim-GPT for short). |
SHUHE WANG et. al. | arxiv-cs.CL | 2023-12-09 |
1173 | Illicit Darkweb Classification Via Natural-language Processing: Classifying Illicit Content of Webpages Based on Textual Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work aims to extend previous work on classifying illegal activities, proceeding in three distinct steps. |
Giuseppe Cascavilla; Gemma Catolino; Mirella Sangiovanni; | arxiv-cs.IR | 2023-12-08 |
1174 | Exploring The Limits of ChatGPT in Software Security Applications Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) have undergone rapid evolution and achieved remarkable results in recent times. OpenAI’s ChatGPT, backed by GPT-3.5 or GPT-4, has gained instant … |
FANGZHOU WU et. al. | ArXiv | 2023-12-08 |
1175 | Hijacking Context in Large Multi-modal Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we identify a new limitation of off-the-shelf LMMs where a small fraction of incoherent images or text descriptions mislead LMMs to only generate biased output about the hijacked context, not the originally intended context. To address this, we propose a pre-filtering method that removes irrelevant contexts via GPT-4V, based on its robustness towards distribution shift within the contexts. |
Joonhyun Jeong; | arxiv-cs.AI | 2023-12-07 |
1176 | On Sarcasm Detection with OpenAI GPT-based Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the applications of the Generative Pretrained Transformer (GPT) models, including GPT-3, InstructGPT, GPT-3.5, and GPT-4, in detecting sarcasm in natural language. |
Montgomery Gole; Williams-Paul Nwadiugwu; Andriy Miranskyy; | arxiv-cs.CL | 2023-12-07 |
1177 | Leveraging Transformer-based Language Models to Automate Requirements Satisfaction Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we leverage recent advances in natural language processing to deliver significantly more accurate results. |
Amrit Poudel; Jinfeng Lin; Jane Cleland-Huang; | arxiv-cs.SE | 2023-12-07 |
1178 | Enhancing Medical Task Performance in GPT-4V: A Comprehensive Study on Prompt Engineering Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: From our comprehensive evaluations, we distilled 10 effective prompt engineering techniques, each fortifying GPT-4V’s medical acumen. |
PENGCHENG CHEN et. al. | arxiv-cs.CL | 2023-12-07 |
1179 | User-Aware Prefix-Tuning Is A Good Learner for Personalized Image Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Therefore, they need to update all caption model parameters when encountering new samples, which is time-consuming and computation-intensive. To address this challenge, we propose a novel personalized image captioning framework that leverages user context to consider personality factors. |
Xuan Wang; Guanhong Wang; Wenhao Chai; Jiayu Zhou; Gaoang Wang; | arxiv-cs.CV | 2023-12-07 |
1180 | GPT-4V with Emotion: A Zero-shot Benchmark for Generalized Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recently, GPT-4 with Vision (GPT-4V) has demonstrated remarkable visual capabilities across various tasks, but its performance in emotion recognition has not been fully evaluated. To bridge this gap, we present the quantitative evaluation results of GPT-4V on 21 benchmark datasets covering 6 tasks: visual sentiment analysis, tweet sentiment analysis, micro-expression recognition, facial emotion recognition, dynamic facial emotion recognition, and multimodal emotion recognition. |
ZHENG LIAN et. al. | arxiv-cs.CV | 2023-12-07 |
1181 | JAMMIN-GPT: Text-based Improvisation Using LLMs in Ableton Live Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a system that allows users of Ableton Live to create MIDI-clips by naming them with musical descriptions. |
Sven Hollowell; Tashi Namgyal; Paul Marshall; | arxiv-cs.SD | 2023-12-06 |
1182 | DocBinFormer: A Two-Level Transformer Network for Effective Document Image Binarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose DocBinFormer (Document Binarization Transformer), a novel two-level vision transformer (TL-ViT) architecture based on vision transformers for effective document image binarization. |
Risab Biswas; Swalpa Kumar Roy; Ning Wang; Umapada Pal; Guang-Bin Huang; | arxiv-cs.CV | 2023-12-06 |
1183 | Exploring The Reversal Curse and Other Deductive Logical Reasoning in BERT and GPT-Based Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In our study, we examined a bidirectional LLM, BERT, and found that it is immune to the reversal curse. |
Da Wu; Jingye Yang; Kai Wang; | arxiv-cs.CL | 2023-12-06 |
1184 | Transformer-Powered Surrogates Close The ICF Simulation-Experiment Gap with Extremely Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel transformer-powered approach for enhancing prediction accuracy in multi-modal output scenarios, where sparse experimental data is supplemented with simulation data. |
MATTHEW L. OLSON et. al. | arxiv-cs.LG | 2023-12-06 |
1185 | A Text-to-Text Model for Multilingual Offensive Language Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the majority of these models are limited in their capabilities due to their encoder-only architecture, which restricts the number and types of labels in downstream tasks. Addressing these limitations, this study presents the first pre-trained model with encoder-decoder architecture for offensive language identification with text-to-text transformers (T5) trained on two large offensive language identification datasets: SOLID and CCTK. |
Tharindu Ranasinghe; Marcos Zampieri; | arxiv-cs.CL | 2023-12-06 |
1186 | Net-GPT: A LLM-Empowered Man-in-the-Middle Chatbot for Unmanned Aerial Vehicle Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the dynamic realm of AI, integrating Large Language Models (LLMs) with security systems reshape cybersecurity. LLMs bolster defense against cyber threats but also introduce … |
BRETT PIGGOTT et. al. | 2023 IEEE/ACM Symposium on Edge Computing (SEC) | 2023-12-06 |
1187 | Enhancing Novelty in ChatGPT Responses: Incorporating Random Word Brainstorming Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents a new prompting approach for increasing the novelty in ChatGPT responses. ChatGPT has proven to be effective in generating natural language responses; however, … |
Pittawat Taveekitworachai; R. Thawonmas; | Proceedings of the 13th International Conference on … | 2023-12-06 |
1188 | Empathy and Distress Detection Using Ensembles of Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents our approach for the WASSA 2023 Empathy, Emotion and Personality Shared Task. |
Tanmay Chavan; Kshitij Deshpande; Sheetal Sonawane; | arxiv-cs.CL | 2023-12-05 |
1189 | Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, it raises the concern that the current research findings only hold for GPT models but not LLMs in general. In this work, we lift this pre-condition and build, for the first time, effective listwise rerankers without any form of dependency on GPT. |
Xinyu Zhang; Sebastian Hofstätter; Patrick Lewis; Raphael Tang; Jimmy Lin; | arxiv-cs.CL | 2023-12-05 |
1190 | RankZephyr: Effective and Robust Zero-Shot Listwise Reranking Is A Breeze! IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the gap between open-source and closed models persists, with reliance on proprietary, non-transparent models constraining reproducibility. Addressing this gap, we introduce RankZephyr, a state-of-the-art, open-source LLM for listwise zero-shot reranking. |
Ronak Pradeep; Sahel Sharifymoghaddam; Jimmy Lin; | arxiv-cs.IR | 2023-12-05 |
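Entries 1189 and 1190 share the same listwise interface: the prompt shows a query plus numbered passages, and the model returns an ordering such as [2] > [1] > [3]. Below is a sketch of that prompt-and-parse loop with the LLM call stubbed out; the prompt wording is an assumption, not RankZephyr's actual template.

```python
import re

def listwise_prompt(query: str, passages: list[str]) -> str:
    numbered = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (f"Rank the passages below by relevance to the query.\n"
            f"Query: {query}\n{numbered}\n"
            f"Answer with identifiers only, e.g. [2] > [1] > [3].")

def parse_ranking(response: str, n: int) -> list[int]:
    ids = [int(m) - 1 for m in re.findall(r"\[(\d+)\]", response)]
    seen = list(dict.fromkeys(i for i in ids if 0 <= i < n))
    return seen + [i for i in range(n) if i not in seen]  # keep omissions

passages = ["cats purr", "transformers rerank text", "LLMs as rerankers"]
fake_response = "[3] > [2] > [1]"   # stand-in for an actual LLM call
print([passages[i] for i in parse_ranking(fake_response, len(passages))])
```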
1191 | A Comparative Study of AI-Generated (GPT-4) and Human-crafted MCQs in Programming Education Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We analyzed the capability of GPT-4 to produce multiple-choice questions (MCQs) aligned with specific learning objectives (LOs) from Python programming classes in higher education. |
JACOB DOUGHTY et. al. | arxiv-cs.CY | 2023-12-05 |
1192 | Jellyfish: A Large Language Model for Data Preprocessing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Whereas the use of LLMs has sparked interest in devising universal solutions to DP, recent initiatives in this domain typically rely on GPT APIs, raising inevitable data breach concerns. Unlike these approaches, we consider instruction-tuning local LLMs (7-13B models) as universal DP task solvers that operate on a local, single, and low-priced GPU, ensuring data security and enabling further customization. |
Haochen Zhang; Yuyang Dong; Chuan Xiao; Masafumi Oyamada; | arxiv-cs.AI | 2023-12-04 |
1193 | Panini: A Transformer-based Grammatical Error Correction Method for Bangla Related Papers Related Patents Related Grants Related Venues Related Experts View |
Nahid Hossain; Mehedi Hasan Bijoy; Salekul Islam; Swakkhar Shatabda; | Neural Comput. Appl. | 2023-12-04 |
1194 | Explore, Select, Derive, and Recall: Augmenting LLM with Human-like Memory for Mobile Task Automation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, due to the inherent unreliability and high operational cost of LLMs, their practical applicability is quite limited. To address these issues, this paper introduces MobileGPT, an innovative LLM-based mobile task automator equipped with a human-like app memory. |
SUNJAE LEE et. al. | arxiv-cs.HC | 2023-12-04 |
1195 | On Significance of Subword Tokenization for Low Resource and Efficient Named Entity Recognition: A Case Study in Marathi Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on NER for low-resource language and present our case study in the context of the Indian language Marathi. |
HARSH CHAUDHARI et. al. | arxiv-cs.CL | 2023-12-03 |
1196 | A Ripple in Time: A Discontinuity in American History Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this note we use the State of the Union Address (SOTU) dataset from Kaggle to make some surprising (and some not so surprising) observations pertaining to the general timeline of American history, and the character and nature of the addresses themselves. |
Alexander Kolpakov; Igor Rivin; | arxiv-cs.CL | 2023-12-02 |
1197 | Harnessing The Power of Prompt-based Techniques for Generating School-Level Questions Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach that utilizes prompt-based techniques to generate descriptive and reasoning-based questions. |
Subhankar Maity; Aniket Deroy; Sudeshna Sarkar; | arxiv-cs.CL | 2023-12-02 |
1198 | From Voices to Validity: Leveraging Large Language Models (LLMs) for Textual Analysis of Policy Stakeholder Interviews Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the integration of Large Language Models (LLMs)–like GPT-4–with human expertise to enhance text analysis of stakeholder interviews regarding K-12 education policy within one U.S. state. |
Alex Liu; Min Sun; | arxiv-cs.HC | 2023-12-02 |
1199 | Swarm-GPT: Combining Large Language Models with Safe Motion Planning for Robot Choreography Design Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents Swarm-GPT, a system that integrates large language models (LLMs) with safe swarm motion planning – offering an automated and novel approach to deployable drone swarm choreography. |
AORAN JIAO et. al. | arxiv-cs.RO | 2023-12-02 |
1200 | An Information Fusion Based Approach to Context-based Fine-tuning of GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View |
Toan Nguyen Mau; Anh-Cuong Le; Duc-Hong Pham; Van-Nam Huynh; | Inf. Fusion | 2023-12-01 |
1201 | Instruction-ViT: Multi-modal Prompts for Instruction Learning in Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
ZHE XIAO et. al. | Inf. Fusion | 2023-12-01 |
1202 | GenAI4Sustainability: GPT and Its Potentials For Achieving UN’s Sustainable Development Goals Related Papers Related Patents Related Grants Related Venues Related Experts View |
Rui Wang; Chaojie Li; Xiangyu Li; Rong Deng; Z. Dong; | IEEE CAA J. Autom. Sinica | 2023-12-01 |
1203 | Transformer-based Hierarchical Dynamic Decoders for Salient Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
QINGPING ZHENG et. al. | Knowl. Based Syst. | 2023-12-01 |
1204 | Graphformer: Adaptive Graph Correlation Transformer for Multivariate Long Sequence Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yijie Wang; Hao Long; Linjiang Zheng; Jiaxing Shang; | Knowl. Based Syst. | 2023-12-01 |
1205 | GhostFormer: Efficiently Amalgamated CNN-transformer Architecture for Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xin Xie; Dengquan Wu; Mingye Xie; Zixi Li; | Pattern Recognit. | 2023-12-01 |
1206 | MCRformer: Morphological Constraint Reticular Transformer for 3D Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
JUN YU LI et. al. | Expert Syst. Appl. | 2023-12-01 |
1207 | TNN-IDS: Transformer Neural Network-based Intrusion Detection System for MQTT-enabled IoT Networks Related Papers Related Patents Related Grants Related Venues Related Experts View |
SAFI ULLAH et. al. | Comput. Networks | 2023-12-01 |
1208 | Dual-resolution Transformer Combined with Multi-layer Separable Convolution Fusion Network for Real-time Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Kaidi Hu; Zongxia Xie; Qinghua Hu; | Comput. Graph. | 2023-12-01 |
1209 | Quick Back-Translation for Unsupervised Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a two-for-one improvement to Transformer back-translation: Quick Back-Translation (QBT). |
Benjamin Brimacombe; Jiawei Zhou; | arxiv-cs.CL | 2023-12-01 |
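Entry 1209 speeds up back-translation; the generic loop being accelerated pairs target-side monolingual text with machine-generated sources for training the forward model. The sketch shows only that generic pattern with a placeholder reverse model, not QBT's specific encoder-based shortcut.

```python
def translate_tgt_to_src(sentence: str) -> str:
    """Placeholder reverse model; a real system runs a trained translator."""
    return " ".join(reversed(sentence.split()))

def back_translate(monolingual_tgt: list[str]) -> list[tuple[str, str]]:
    """Build synthetic (source, target) pairs for forward-model training."""
    return [(translate_tgt_to_src(t), t) for t in monolingual_tgt]

print(back_translate(["the cat sat", "on the mat"]))
```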
1210 | AI Chatbot for Tourist Recommendations: A Case Study in Vietnam Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Living standards are rising due to a more developed society, and recreation, particularly tourism, is becoming more critical. Expanding the tourist industry is one of the most … |
Hai Thanh Nguyen; Thien Thanh Tran; Phat Tan Nham; Nhi Uyen Bui Nguyen; A. D. Le; | Applied Computer Systems | 2023-12-01 |
1211 | Transformer-based Multi-attention Hybrid Networks for Skin Lesion Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhiwei Dong; Jinjiang Li; Zhen Hua; | Expert Syst. Appl. | 2023-12-01 |
1212 | QTN: Quaternion Transformer Network for Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Numerous state-of-the-art transformer-based techniques with self-attention mechanisms have recently been demonstrated to be quite effective in the classification of hyperspectral … |
Xiaofei Yang; Weijia Cao; Yao Lu; Yicong Zhou; | IEEE Transactions on Circuits and Systems for Video … | 2023-12-01 |
1213 | Strawberry Ripeness Detection Based on YOLOv8 Algorithm Fused with LW-Swin Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
Shizhong Yang; Wei Wang; Sheng Gao; Zhaopeng Deng; | Comput. Electron. Agric. | 2023-12-01 |
1214 | AMDGT: Attention Aware Multi-modal Fusion Using A Dual Graph Transformer for Drug-disease Associations Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View |
JUNKAI LIU et. al. | Knowl. Based Syst. | 2023-12-01 |
1215 | GIFT: Generative Interpretable Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Generative Interpretable Fine-Tuning (GIFT) for parameter-efficient fine-tuning of pretrained Transformer backbones, which can be formulated as a simple factorized matrix multiplication in the parameter space or equivalently in the activation/representation space, and thus embraces built-in interpretability. |
Chinmay Savadikar; Xi Song; Tianfu Wu; | arxiv-cs.CV | 2023-12-01 |
1216 | Autonomous Agents in Software Development: A Vision Paper Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We have shown in our initial experimental analysis for simple software (e.g., Snake Game, Tic-Tac-Toe, Notepad) that multiple GPT agents can produce high-quality code and document it carefully. |
ZEESHAN RASHEED et. al. | arxiv-cs.SE | 2023-11-30 |
1217 | Applying Large Language Models and Chain-of-Thought for Automatic Scoring IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the application of large language models (LLMs), specifically GPT-3.5 and GPT-4, with Chain-of-Thought (CoT) in the automatic scoring of student-written responses to science assessments. |
Gyeong-Geon Lee; Ehsan Latif; Xuansheng Wu; Ninghao Liu; Xiaoming Zhai; | arxiv-cs.CL | 2023-11-30 |
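The setup in entry 1217 amounts to a rubric-grounded prompt that asks the model to reason before emitting a score. A sketch with invented rubric text and score scale; the study's actual prompts are not reproduced.

```python
def cot_scoring_prompt(rubric: str, response: str) -> str:
    """Build a chain-of-thought scoring prompt (format is an assumption)."""
    return ("You are grading a student science response.\n"
            f"Rubric: {rubric}\n"
            f"Student response: {response}\n"
            "Reason step by step about which rubric criteria are met, "
            "then output a final line of the form 'Score: <0-3>'.")

print(cot_scoring_prompt(
    rubric="Award points for naming a variable, a control, and a prediction.",
    response="I would change the water amount and keep light the same.",
))  # send the printed prompt to GPT-3.5/4 via your API client of choice
```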
1218 | CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Since the natural language processing (NLP) community started to make large language models (LLMs), such as GPT-4, act as a critic to evaluate the quality of generated texts, most … |
PEI KE et. al. | ArXiv | 2023-11-30 |
1219 | MultiResFormer: Transformer with Adaptive Multi-Resolution Modeling for General Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose MultiResFormer, which dynamically models temporal variations by adaptively choosing optimal patch lengths. |
LINFENG DU et. al. | arxiv-cs.LG | 2023-11-30 |
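Entry 1219's adaptive patch lengths presuppose a basic patching operation over the input series; below is a minimal sketch of that operation at several candidate resolutions (the paper's rule for choosing among resolutions is not reproduced).

```python
import numpy as np

def patch(series: np.ndarray, patch_len: int) -> np.ndarray:
    """Split a 1-D series into non-overlapping patches, dropping the tail."""
    n = len(series) // patch_len
    return series[: n * patch_len].reshape(n, patch_len)

series = np.sin(np.linspace(0, 8 * np.pi, 96))
for plen in (8, 16, 24):                 # candidate patch lengths
    print(plen, patch(series, plen).shape)
```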
1220 | TransOpt: Transformer-based Representation Learning for Optimization Problem Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a representation of optimization problem instances using a transformer-based neural network architecture trained for the task of problem classification of the 24 problem classes from the Black-box Optimization Benchmarking (BBOB) benchmark. |
Gjorgjina Cenikj; Gašper Petelin; Tome Eftimov; | arxiv-cs.LG | 2023-11-29 |
1221 | Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This ubiquitous layer of language models is often overlooked. We find that similarities between these input embeddings are highly interpretable and that the geometry of these embeddings differs between model families. |
Andrea W Wen-Yi; David Mimno; | arxiv-cs.CL | 2023-11-29 |
1222 | Extrapolatable Transformer Pre-training for Ultra Long Time-Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present Timely Generative Pre-trained Transformer (TimelyGPT). |
Ziyang Song; Qincheng Lu; Hao Xu; David L. Buckeridge; Yue Li; | arxiv-cs.LG | 2023-11-29 |
1223 | TurkishBERTweet: Fast and Reliable Large Language Model for Social Media Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Wide use of this language on social media platforms such as Twitter, Instagram, or TikTok, together with the country's strategic position in world politics, makes it appealing to social network researchers and industry. To address this need, we introduce TurkishBERTweet, the first large-scale pre-trained language model for Turkish social media, built using almost 900 million tweets. |
Ali Najafi; Onur Varol; | arxiv-cs.CL | 2023-11-29 |
1224 | LLVMs4Protest: Harnessing The Power of Large Language and Vision Models for Deciphering Protests in The News Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The UCLA-protest project contains labeled imagery data with annotations such as protest, violence, and signs. |
Yongjun Zhang; | arxiv-cs.CV | 2023-11-29 |
1225 | Improving The Robustness of Transformer-based Large Language Models with Dynamic Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel method called dynamic attention, tailored for the transformer architecture, to enhance the inherent robustness of the model itself against various adversarial attacks. |
LUJIA SHEN et. al. | arxiv-cs.CL | 2023-11-29 |
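Entry 1225's highlight does not spell out the mechanism; one simplified reading (an assumption, not the paper's method) drops a random fraction of attention weights at inference and renormalizes each row, so that no single token can dominate.

```python
import numpy as np

def dynamic_attention(scores: np.ndarray, drop_frac: float = 0.2,
                      seed: int = 0) -> np.ndarray:
    """Toy row-wise attention with random masking (illustrative only)."""
    rng = np.random.default_rng(seed)
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)               # ordinary softmax
    w = w * (rng.random(w.shape) > drop_frac)   # drop ~20% of entries
    w += 1e-12                                  # guard fully masked rows
    return w / w.sum(-1, keepdims=True)         # renormalize rows

scores = np.random.randn(4, 4)
print(dynamic_attention(scores).sum(-1))        # each row sums to 1
```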
1226 | RoKEPG: RoBERTa and Knowledge Enhancement for Prescription Generation of Traditional Chinese Medicine Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a RoBERTa and Knowledge Enhancement model for Prescription Generation of Traditional Chinese Medicine (RoKEPG). |
Hua Pu; Jiacong Mi; Shan Lu; Jieyue He; | arxiv-cs.CL | 2023-11-28 |
1227 | Biomedical Knowledge Graph-optimized Prompt Generation for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we introduce a token-optimized and robust Knowledge Graph-based Retrieval Augmented Generation (KG-RAG) framework by leveraging a massive biomedical KG (SPOKE) with LLMs such as Llama-2-13b, GPT-3.5-Turbo and GPT-4, to generate meaningful biomedical text rooted in established knowledge. |
KARTHIK SOMAN et. al. | arxiv-cs.CL | 2023-11-28 |
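Entry 1227's flow retrieves triples for the entities mentioned in a question and packs them into the prompt under a token budget. A sketch with a toy two-triple graph standing in for SPOKE; the retrieval, budgeting, and prompt wording are all illustrative assumptions.

```python
# Minimal KG-RAG prompt assembly (toy graph, assumed prompt format).
KG = {
    "metformin": [("metformin", "treats", "type 2 diabetes"),
                  ("metformin", "interacts_with", "AMPK")],
}

def kg_context(entity: str, budget: int = 2) -> str:
    triples = KG.get(entity.lower(), [])[:budget]   # cap triples for tokens
    return "\n".join(f"{s} {p} {o}." for s, p, o in triples)

def build_prompt(question: str, entity: str) -> str:
    return (f"Context:\n{kg_context(entity)}\n\n"
            "Answer using only the context above.\n"
            f"Question: {question}")

print(build_prompt("What does metformin treat?", "metformin"))
```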
1228 | General-Purpose Vs. Domain-Adapted Large Language Models for Extraction of Structured Data from Chest Radiology Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, variability in style limits usage. This study compares a system using a domain-adapted language model (RadLing) with a general-purpose LLM (GPT-4) in extracting relevant features from chest radiology reports and standardizing them to common data elements (CDEs). |
ALI H. DHANALIWALA et. al. | arxiv-cs.CL | 2023-11-28 |
1229 | SEED-Bench-2: Benchmarking Multimodal Large Language Models IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Multimodal large language models (MLLMs), building upon the foundation of powerful large language models (LLMs), have recently demonstrated exceptional capabilities in generating … |
BOHAO LI et. al. | ArXiv | 2023-11-28 |
1230 | GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper does not present a novel method. |
WENHAO WU et. al. | arxiv-cs.CV | 2023-11-27 |
1231 | BERT Goes Off-Topic: Investigating The Domain Transfer Challenge Using Genre Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For example, a genre classifier trained on political topics often fails when tested on documents about sport or medicine. In this work, we quantify this phenomenon empirically with a large corpus and a large set of topics. |
Dmitri Roussinov; Serge Sharoff; | arxiv-cs.CL | 2023-11-27 |
1232 | MEDITRON-70B: Scaling Medical Pretraining for Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we improve access to large-scale medical LLMs by releasing MEDITRON: a suite of open-source LLMs with 7B and 70B parameters adapted to the medical domain. |
ZEMING CHEN et. al. | arxiv-cs.CL | 2023-11-27 |
1233 | OccWorld: Learning A 3D Occupancy World Model for Autonomous Driving IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore a new framework of learning a world model, OccWorld, in the 3D Occupancy space to simultaneously predict the movement of the ego car and the evolution of the surrounding scenes. |
WENZHAO ZHENG et. al. | arxiv-cs.CV | 2023-11-27 |
1234 | Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Medprompt, based on a composition of several prompting strategies. |
HARSHA NORI et. al. | arxiv-cs.CL | 2023-11-27 |
1235 | Comparative Analysis of ChatGPT, GPT-4, and Microsoft Bing Chatbots for GRE Test Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research paper presents an analysis of how well three artificial intelligence chatbots (Bing, ChatGPT, and GPT-4) perform when answering questions from standardized tests. |
Mohammad Abu-Haifa; Bara’a Etawi; Huthaifa Alkhatatbeh; Ayman Ababneh; | arxiv-cs.CL | 2023-11-26 |
1236 | Machine-Generated Text Detection Using Deep Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our research focuses on the crucial challenge of discerning text produced by Large Language Models (LLMs) from human-generated text, which holds significance for various applications. With ongoing discussions about attaining a model with such functionality, we present supporting evidence regarding the feasibility of such models. |
Raghav Gaggar; Ashish Bhagchandani; Harsh Oza; | arxiv-cs.CL | 2023-11-26 |
1237 | FlowMind: Automatic Workflow Generation with LLMs Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The rapidly evolving field of Robotic Process Automation (RPA) has made significant strides in automating repetitive processes, yet its effectiveness diminishes in scenarios … |
ZHEN ZENG et. al. | Proceedings of the Fourth ACM International Conference on … | 2023-11-25 |
1238 | AutoEval-Video: An Automatic Benchmark for Assessing Large Vision Language Models in Open-Ended Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel and challenging benchmark, AutoEval-Video, to comprehensively evaluate large vision-language models in open-ended video question answering. |
Xiuyuan Chen; Yuan Lin; Yuchen Zhang; Weiran Huang; | arxiv-cs.CV | 2023-11-24 |
1239 | CMed-GPT: Prompt Tuning for Entity-Aware Chinese Medical Dialogue Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing medical dialogue models are mostly based on BERT and pre-trained on English corpora, but there is a lack of high-performing models on the task of Chinese medical dialogue generation. To solve the above problem, this paper proposes CMed-GPT, which is the GPT pre-training language model based on Chinese medical domain text. |
Zhijie Qu; Juan Li; Zerui Ma; Jianqiang Li; | arxiv-cs.CL | 2023-11-24 |
1240 | LLamol: A Dynamic Multi-Conditional Generative Transformer for De Novo Molecular Design Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Generative models have demonstrated substantial promise in Natural Language Processing (NLP) and have found application in designing molecules, as seen in General Pretrained Transformer (GPT) models. In our efforts to develop such a tool for exploring the organic chemical space in search of potentially electro-active compounds, we present LLamol, a single novel generative transformer model based on the LLama 2 architecture, which was trained on a 13M superset of organic compounds drawn from diverse public sources. |
Niklas Dobberstein; Astrid Maass; Jan Hamaekers; | arxiv-cs.LG | 2023-11-24 |
1241 | GPT Struct Me: Probing GPT Models on Narrative Entity Extraction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Such effectiveness raises a pertinent question: Can these models be leveraged for the extraction of structured information? In this work, we address this question by evaluating the capabilities of two state-of-the-art language models — GPT-3 and GPT-3.5, commonly known as ChatGPT — in the extraction of narrative entities, namely events, participants, and temporal expressions. |
Hugo Sousa; Nuno Guimarães; Alípio Jorge; Ricardo Campos; | arxiv-cs.CL | 2023-11-24 |
1242 | GPT-4V Takes The Wheel: Promises and Challenges for Pedestrian Behavior Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate GPT-4V(ision) on publicly available pedestrian datasets: JAAD and WiDEVIEW. |
Jia Huang; Peng Jiang; Alvika Gautam; Srikanth Saripalli; | arxiv-cs.CV | 2023-11-24 |
1243 | Benchmarking Large Language Models for Log Analysis, Security, and Interpretation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The best-performing fine-tuned sequence classification model (DistilRoBERTa) outperforms the current state-of-the-art, with an average F1-Score of 0.998 across six datasets from both web application and system log sources. To achieve this, we propose and implement a new experimentation pipeline (LLM4Sec) which leverages LLMs for log analysis experimentation, evaluation, and analysis. |
Egil Karlsen; Xiao Luo; Nur Zincir-Heywood; Malcolm Heywood; | arxiv-cs.NI | 2023-11-24 |
1244 | Evaluating GPT-4’s Vision Capabilities on Brazilian University Admission Exams Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing studies often overlook questions that require the integration of visual comprehension, thus compromising the full spectrum and complexity inherent in real-world scenarios. To address this gap, we present a comprehensive framework to evaluate language models on entrance exams, which incorporates both textual and visual elements. |
Ramon Pires; Thales Sales Almeida; Hugo Abonizio; Rodrigo Nogueira; | arxiv-cs.CL | 2023-11-23 |
1245 | Towards Explainable Strategy Templates Using NLP Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By leveraging traditional Natural Language Processing (NLP) techniques and Large Language Models (LLMs) equipped with Transformers, we outline how DRL strategies composed of parts within strategy templates can be transformed into user-friendly, human-like English narratives. To achieve this, we present a top-level algorithm that involves parsing mathematical expressions of strategy templates, semantically interpreting variables and structures, generating rule-based primary explanations, and utilizing a Generative Pre-trained Transformer (GPT) model to refine and contextualize these explanations. |
Pallavi Bagga; Kostas Stathis; | arxiv-cs.AI | 2023-11-23 |
1246 | Forecasting Cryptocurrency Prices Using Deep Learning: Integrating Financial, Blockchain, and Text Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper explores the application of Machine Learning (ML) and Natural Language Processing (NLP) techniques in cryptocurrency price forecasting, specifically Bitcoin (BTC) and … |
Vincent Gurgul; Stefan Lessmann; W. Härdle; | ArXiv | 2023-11-23 |
1247 | Cultural Bias and Cultural Alignment of Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct a disaggregated evaluation of cultural bias for five widely used large language models (OpenAI’s GPT-4o/4-turbo/4/3.5-turbo/3) by comparing the models’ responses to nationally representative survey data. |
Yan Tao; Olga Viberg; Ryan S. Baker; Rene F. Kizilcec; | arxiv-cs.CL | 2023-11-23 |
1248 | MLLM-Bench, Evaluating Multi-modal LLMs Using GPT-4V Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: In the pursuit of Artificial General Intelligence (AGI), the integration of vision in language models has marked a significant milestone. The advent of vision-language models … |
WENTAO GE et. al. | ArXiv | 2023-11-23 |
1249 | A Cross Attention Approach to Diagnostic Explainability Using Clinical Practice Guidelines for Depression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an application-specific language model called ProcesS knowledge-infused cross ATtention (PSAT), which incorporates CPGs when computing attention. |
SUMIT DALAL et. al. | arxiv-cs.AI | 2023-11-23 |
1250 | Current Topological and Machine Learning Applications for Bias Detection in Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, few previous studies investigate large language model embeddings and geometric models of biased text data to understand geometry’s impact on bias modeling accuracy. To overcome this issue, this study utilizes the RedditBias database to analyze textual biases. |
COLLEEN FARRELLY et. al. | arxiv-cs.CY | 2023-11-22 |
1251 | Surpassing GPT-4 Medical Coding with A Two-Stage Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the GPT-4 LLM predicts an excessive number of ICD codes for medical coding tasks, leading to high recall but low precision. To tackle this challenge, we introduce LLM-codex, a two-stage approach to predict ICD codes that first generates evidence proposals using an LLM and then employs an LSTM-based verification stage. |
Zhichao Yang; Sanjit Singh Batra; Joel Stremmel; Eran Halperin; | arxiv-cs.CL | 2023-11-22 |
1252 | Transformer-based Named Entity Recognition in Construction Supply Chain Risk Management in Australia Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper employs different transformer models and trains them for Named Entity Recognition (NER) in the context of Australian construction SCRM. |
Milad Baghalzadeh Shishehgarkhaneh; Robert C. Moehler; Yihai Fang; Amer A. Hijazi; Hamed Aboutorab; | arxiv-cs.CL | 2023-11-22 |
1253 | Comparison of Pipeline, Sequence-to-sequence, and GPT Models for End-to-end Relation Extraction: Experiments with The Rare Disease Use-case Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we aim to compare three prevailing paradigms for E2ERE using a complex dataset focused on rare diseases involving discontinuous and nested entities. |
Shashank Gupta; Xuguang Ai; Ramakanth Kavuluru; | arxiv-cs.CL | 2023-11-22 |
1254 | Detecting Out-of-distribution Text Using Topological Features of Transformer-based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We evaluate our proposed TDA-based approach for out-of-distribution detection on BERT, a transformer-based language model, and compare it to a more traditional OOD approach based on BERT CLS embeddings. We found that our TDA approach outperforms the CLS embedding approach at distinguishing in-distribution data (politics and entertainment news articles from HuffPost) from far out-of-domain samples (IMDB reviews), but its effectiveness deteriorates with near out-of-domain (CNN/Dailymail) or same-domain (business news articles from HuffPost) datasets. |
Andres Pollano; Anupam Chaudhuri; Anj Simmons; | arxiv-cs.CL | 2023-11-21 |
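A minimal sketch of the CLS-embedding baseline that entry 1254 compares against, assuming a standard bert-base-uncased checkpoint and a simple cosine-distance-to-centroid score (the paper's exact scoring rule may differ):

```python
# Score a text by how far its BERT [CLS] embedding lies from the centroid of
# in-distribution [CLS] embeddings; a higher score suggests out-of-distribution.
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased").eval()

def cls_embedding(texts):
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = bert(**batch)
    return out.last_hidden_state[:, 0]  # [CLS] vectors, shape (n, 768)

# placeholder in-distribution texts (e.g. politics/entertainment articles)
centroid = cls_embedding(["a politics article", "an entertainment article"]).mean(0, keepdim=True)

def ood_score(text):
    return 1 - torch.cosine_similarity(cls_embedding([text]), centroid).item()

print(ood_score("a film review that reads like it came from IMDB"))
```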
1255 | AI and Veterinary Medicine: Performance of Large Language Models on The North American Licensing Examination Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study aimed to assess the performance of Large Language Models on the North American Veterinary Licensing Examination (NAVLE) and to analyze the impact of artificial … |
MIRANA ANGEL et. al. | 2023 Tenth International Conference on Social Networks … | 2023-11-21 |
1256 | NERIF: GPT-4V for Automatic Scoring of Drawn Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We randomly selected a set of balanced data (N = 900) that includes student-drawn models for six modeling assessment tasks. |
Gyeong-Geon Lee; Xiaoming Zhai; | arxiv-cs.AI | 2023-11-21 |
1257 | Visual Analytics for Generative Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a novel visual analytical framework to support the analysis of transformer-based generative networks. |
RAYMOND LI et. al. | arxiv-cs.CL | 2023-11-21 |
1258 | Interpretation of The Transformer and Improvement of The Extractor Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: It has been over six years since the Transformer architecture was put forward. Surprisingly, the vanilla Transformer architecture is still widely used today. One reason is that … |
Zhe Chen; | arxiv-cs.LG | 2023-11-21 |
1259 | InterPrompt: Interpretable Prompting for Interrelated Interpersonal Risk Factors in Reddit Posts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce an Interpretable Prompting (InterPrompt) method to boost the attention mechanism by fine-tuning the GPT-3 model. |
MSVPJ Sathvik; Surjodeep Sarkar; Chandni Saxena; Sunghwan Sohn; Muskan Garg; | arxiv-cs.CL | 2023-11-21 |
1260 | From Concept to Manufacturing: Evaluating Vision-Language Models for Engineering Design IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This gap is addressed with the release of multimodal vision language models, such as GPT-4V, enabling AI to impact many more types of tasks. In light of these advancements, this paper presents a comprehensive evaluation of GPT-4V, a vision language model, across a wide spectrum of engineering design tasks, categorized into four main areas: Conceptual Design, System-Level and Detailed Design, Manufacturing and Inspection, and Engineering Education Tasks. |
CYRIL PICARD et. al. | arxiv-cs.AI | 2023-11-21 |
1261 | GPT4Motion: Scripting Physical Motions in Text-to-Video Generation Via Blender-Oriented GPT Planning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they usually encounter high computational costs and often struggle to produce videos with coherent physical motions. To tackle these issues, we propose GPT4Motion, a training-free framework that leverages the planning capability of large language models such as GPT, the physical simulation strength of Blender, and the excellent image generation ability of text-to-image diffusion models to enhance the quality of video synthesis. |
JIAXI LV et. al. | arxiv-cs.CV | 2023-11-21 |
1262 | Looped Transformers Are Better at Learning Learning Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the absence of an inherent iterative structure in the transformer architecture presents a challenge in emulating the iterative algorithms, which are commonly employed in traditional machine learning methods. To address this, we propose the utilization of looped transformer architecture and its associated training methodology, with the aim of incorporating iterative characteristics into the transformer architectures. |
Liu Yang; Kangwook Lee; Robert Nowak; Dimitris Papailiopoulos; | arxiv-cs.LG | 2023-11-21 |
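Entry 1262's looping idea reduces to reusing one weight-tied transformer block across depth while re-injecting the input, which gives the forward pass an explicitly iterative structure. A minimal sketch with hypothetical dimensions, not the authors' training methodology:

```python
import torch
import torch.nn as nn

class LoopedTransformer(nn.Module):
    """One weight-tied block applied n_loops times, emulating an iterative algorithm."""
    def __init__(self, d_model=64, nhead=4, n_loops=12):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.n_loops = n_loops

    def forward(self, x):
        h = torch.zeros_like(x)
        for _ in range(self.n_loops):
            h = self.block(h + x)  # input injection at every iteration
        return h

out = LoopedTransformer()(torch.randn(2, 10, 64))  # (batch, seq, d_model)
print(out.shape)
```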
1263 | Extracting Definienda in Mathematical Scholarly Articles with Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the development of transformer-based natural language processing applications, we pose the problem as (a) a token-level classification task using fine-tuned pre-trained transformers; and (b) a question-answering task using a generalist large language model (GPT). |
Shufan Jiang; Pierre Senellart; | arxiv-cs.AI | 2023-11-21 |
1264 | PhayaThaiBERT: Enhancing A Pretrained Thai Language Model with Unassimilated Loanwords Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While WangchanBERTa has become the de facto standard in transformer-based Thai language modeling, it still has shortcomings in regard to the understanding of foreign words, most notably English words, which are often borrowed without orthographic assimilation into Thai in many contexts. We identify the lack of foreign vocabulary in WangchanBERTa’s tokenizer as the main source of these shortcomings. |
Panyut Sriwirote; Jalinee Thapiang; Vasan Timtong; Attapol T. Rutherford; | arxiv-cs.CL | 2023-11-21 |
1265 | White-Box Transformers Via Sparse Rate Reduction: Compression Is All There Is? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we contend that a natural objective of representation learning is to compress and transform the distribution of the data, say sets of tokens, towards a low-dimensional Gaussian mixture supported on incoherent subspaces. |
YAODONG YU et. al. | arxiv-cs.LG | 2023-11-21 |
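For context on entry 1265, the rate-reduction quantities behind this line of work take roughly the following form; this is a reconstruction from the rate-reduction literature, and the paper's exact objective and notation may differ:

```latex
% Coding rate of token representations Z (d x n), and the sparse
% rate reduction objective over a representation map f with Z = f(X):
\[
R(Z) = \tfrac{1}{2}\log\det\!\Big(I + \tfrac{d}{n\epsilon^{2}}\,ZZ^{\top}\Big),
\qquad
\max_{f}\; \underbrace{R(Z) - R^{c}\big(Z;\,U_{[K]}\big)}_{\Delta R} \;-\; \lambda\,\lVert Z\rVert_{0},
\]
% Delta R measures how much Z is compressed toward the K incoherent
% subspaces U_[K]; the l0 penalty promotes sparse representations.
```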
1266 | A Novel Transformer-based Approach for Soil Temperature Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a novel approach using transformer models for soil temperature forecasting. |
Muhammet Mucahit Enes Yurtsever; Ayhan Kucukmanisa; Zeynep Hilal Kilimci; | arxiv-cs.LG | 2023-11-20 |
1267 | Which AI Technique Is Better to Classify Requirements? An Experiment with SVM, LSTM, and ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, Large Language Models like ChatGPT have demonstrated remarkable proficiency in various Natural Language Processing tasks. |
Abdelkarim El-Hajjami; Nicolas Fafin; Camille Salinesi; | arxiv-cs.AI | 2023-11-20 |
1268 | Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Transformer-based Large Language Models (LLMs) have been applied in diverse areas such as knowledge bases, human interfaces, and dynamic agents, marking a stride towards achieving Artificial General Intelligence (AGI). |
YUNPENG HUANG et. al. | arxiv-cs.CL | 2023-11-20 |
1269 | GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a pipeline that enhances a general-purpose Vision Language Model, GPT-4V(ision), to facilitate one-shot visual teaching for robotic manipulation. |
Naoki Wake; Atsushi Kanehira; Kazuhiro Sasabuchi; Jun Takamatsu; Katsushi Ikeuchi; | arxiv-cs.RO | 2023-11-20 |
1270 | Financial Sentiment Analysis: Classic Methods Vs. Deep Learning Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Sentiment Analysis, also known as Opinion Mining, gained prominence in the early 2000s alongside the emergence of internet forums, blogs, and social media platforms. Researchers … |
A. Karanikola; Gregory Davrazos; C. M. Liapis; S. Kotsiantis; | Intell. Decis. Technol. | 2023-11-20 |
1271 | GPT in Data Science: A Practical Exploration of Model Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our objective is to elucidate and express the factors and assumptions guiding GPT-4’s model selection recommendations. |
Nathalia Nascimento; Cristina Tavares; Paulo Alencar; Donald Cowan; | arxiv-cs.AI | 2023-11-19 |
1272 | Assessing Prompt Injection Risks in 200+ Custom GPTs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Through prompt injection, an adversary can not only extract the customized system prompts but also access the uploaded files. This paper provides a first-hand analysis of the prompt injection, alongside the evaluation of the possible mitigation of such attacks. |
JIAHAO YU et. al. | arxiv-cs.CR | 2023-11-19 |
1273 | GPT for The Metaverse in Smart Cities Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The metaverse is a virtual space that blends elements of augmented reality, virtual reality, and many other technologies, offering a tailored and immersive experience where … |
RAMALINGAM M et. al. | 2023 26th International Symposium on Wireless Personal … | 2023-11-19 |
1274 | Zero-Shot Question Answering Over Financial Documents Using Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We introduce a large language model (LLM) based approach to answer complex questions requiring multi-hop numerical reasoning over financial reports. While LLMs have exhibited … |
Karmvir Singh Phogat; Chetan Harsha; Sridhar Dasaratha; Shashishekar Ramakrishna; Sai Akhil Puranam; | ArXiv | 2023-11-19 |
1275 | Inspecting Explainability of Transformer Models with Additional Statistical Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The Transformer has become more popular in the vision domain in recent years, so there is a need for an effective way to interpret the Transformer model by visualizing it. |
Hoang C. Nguyen; Haeil Lee; Junmo Kim; | arxiv-cs.CV | 2023-11-19 |
1276 | Bias A-head? Analyzing Bias in Transformer-Based Language Model Attention Heads Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on attention heads, a major component of the Transformer architecture, and propose a bias analysis framework to explore and identify a small set of biased heads that are found to contribute to a PLM’s stereotypical bias. |
Yi Yang; Hanyu Duan; Ahmed Abbasi; John P. Lalor; Kar Yan Tam; | arxiv-cs.CL | 2023-11-17 |
1277 | A Full-Scale Connected CNN-Transformer Network for Remote Sensing Image Change Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies have introduced transformer modules into convolutional neural networks (CNNs) to solve the inherent limitations of CNNs in global modeling and have achieved … |
MIN CHEN et. al. | Remote. Sens. | 2023-11-16 |
1278 | On Retrieval Augmentation and The Limitations of Language Model Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we rule out one previously posited possibility — the softmax bottleneck. |
TING-RUI CHIANG et. al. | arxiv-cs.CL | 2023-11-16 |
1279 | Diagnosing and Debiasing Corpus-Based Political Bias and Insults in GPT2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We aim to contribute to the ongoing effort of investigating the ethical and social implications of human-AI interaction. |
Ambri Ma; Arnav Kumar; Brett Zeligson; | arxiv-cs.CL | 2023-11-16 |
1280 | Towards Autonomous Hypothesis Verification Via Language Models with Minimal Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Research automation efforts usually employ AI as a tool to automate specific tasks within the research process. |
Shiro Takagi; Ryutaro Yamauchi; Wataru Kumagai; | arxiv-cs.AI | 2023-11-16 |
1281 | Self-Contradictory Reasoning Evaluation and Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate self-contradictory (Self-Contra) reasoning, where the model reasoning does not support predictions. |
Ziyi Liu; Isabelle Lee; Yongkang Du; Soumya Sanyal; Jieyu Zhao; | arxiv-cs.CL | 2023-11-16 |
1282 | ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Despite Large Language Models (LLMs) like GPT-4 achieving impressive results in function-level code generation, they struggle with repository-scale code understanding (e.g., … |
XIANGRU TANG et. al. | arxiv-cs.CL | 2023-11-16 |
1283 | TransCrimeNet: A Transformer-Based Model for Text-Based Crime Prediction in Criminal Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents TransCrimeNet, a novel transformer-based model for predicting future crimes in criminal networks from textual data. Criminal network analysis has become vital … |
Chen Yang; | ArXiv | 2023-11-16 |
1284 | Understanding The Effectiveness of Large Language Models in Detecting Security Vulnerabilities IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Security vulnerabilities in modern software are prevalent and harmful. While automated vulnerability detection tools have made promising progress, their scalability and … |
AVISHREE KHARE et. al. | ArXiv | 2023-11-16 |
1285 | Advancements in Generative AI: A Comprehensive Review of GANs, GPT, Autoencoders, Diffusion Model, and Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This advancement in Generative AI presents a wealth of exciting opportunities and, simultaneously, unprecedented challenges. Throughout this paper, we have explored these state-of-the-art models, the diverse array of tasks they can accomplish, the challenges they pose, and the promising future of Generative Artificial Intelligence. |
STAPHORD BENGESI et. al. | arxiv-cs.LG | 2023-11-16 |
1286 | Multi-View Spectrogram Transformer for Respiratory Sound Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, a Multi-View Spectrogram Transformer (MVST) is proposed to embed different views of time-frequency characteristics into the vision transformer. |
Wentao He; Yuchen Yan; Jianfeng Ren; Ruibin Bai; Xudong Jiang; | arxiv-cs.SD | 2023-11-16 |
1287 | Generative AI for Hate Speech Detection: Evaluation and Findings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this chapter, we provide a review of relevant methods, experimental setups and evaluation of this approach. |
Sagi Pendzel; Tomer Wullach; Amir Adler; Einat Minkov; | arxiv-cs.CL | 2023-11-16 |
1288 | We Demand Justice!: Towards Social Context Grounding of Political Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose two challenging datasets that require an understanding of the real-world context of the text. |
Rajkumar Pujari; Chengfei Wu; Dan Goldwasser; | arxiv-cs.CL | 2023-11-15 |
1289 | Empowering Propaganda Detection in Resource-Restraint Languages: A Transformer-Based Framework for Classifying Hindi News Articles Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Misinformation, fake news, and various propaganda techniques are increasingly used in digital media. It becomes challenging to uncover propaganda as it works with the systematic … |
Deptii D. Chaudhari; A. V. Pawar; | Big Data Cogn. Comput. | 2023-11-15 |
1290 | Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present a holistic end-to-end solution for annotating the factuality of LLM-generated responses, which encompasses a multi-stage annotation scheme designed to yield detailed labels concerning the verifiability and factual inconsistencies found in LLM outputs. |
YUXIA WANG et. al. | arxiv-cs.CL | 2023-11-15 |
1291 | MELA: Multilingual Evaluation of Linguistic Acceptability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present the largest benchmark to date on linguistic acceptability: Multilingual Evaluation of Linguistic Acceptability — MELA, with 46K samples covering 10 languages from a diverse set of language families. |
ZIYIN ZHANG et. al. | arxiv-cs.CL | 2023-11-15 |
1292 | LOKE: Linked Open Knowledge Extraction for Automated Knowledge Graph Construction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through this analysis and a qualitative analysis of sentence extractions via all methods, we found that LOKE-GPT extractions are of high utility for the KGC task and suitable for use in semi-automated extraction settings. |
Jamie McCusker; | arxiv-cs.CL | 2023-11-15 |
1293 | Llamas Know What GPTs Don’t Show: Surrogate Models for Confidence Estimation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: To maintain user trust, large language models (LLMs) should signal low confidence on examples where they are incorrect, instead of misleading the user. The standard approach of … |
Vaishnavi Shrivastava; Percy Liang; Ananya Kumar; | arxiv-cs.CL | 2023-11-15 |
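One concrete reading of entry 1293: when a black-box API exposes no usable token probabilities, score its answer with an open surrogate model's probability of that answer. A minimal sketch, where ask_gpt is a hypothetical stand-in for the black-box call and GPT-2 plays the surrogate:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
surrogate = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def ask_gpt(question: str) -> str:
    return "Paris"  # stand-in for a call to a black-box LLM API

def surrogate_confidence(question: str, answer: str) -> float:
    prompt_ids = tok(f"Q: {question}\nA:", return_tensors="pt").input_ids
    answer_ids = tok(" " + answer, return_tensors="pt").input_ids
    ids = torch.cat([prompt_ids, answer_ids], dim=1)
    with torch.no_grad():
        logprobs = torch.log_softmax(surrogate(ids).logits[0, :-1], dim=-1)
    # mean log-probability of the answer tokens under the surrogate
    span = range(prompt_ids.shape[1] - 1, ids.shape[1] - 1)
    token_lp = torch.stack([logprobs[p, ids[0, p + 1]] for p in span])
    return token_lp.mean().exp().item()

q = "What is the capital of France?"
a = ask_gpt(q)
print(a, surrogate_confidence(q, a))
```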
1294 | Jailbreaking GPT-4V Via Self-Adversarial Attacks with System Prompts IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through carefully designed dialogue, we successfully extract the internal system prompts of GPT-4V. This finding indicates potential exploitable security risks in MLLMs; 2) Based on the acquired system prompts, we propose a novel MLLM jailbreaking attack method termed SASP (Self-Adversarial Attack via System Prompt). |
Yuanwei Wu; Xiang Li; Yixin Liu; Pan Zhou; Lichao Sun; | arxiv-cs.CR | 2023-11-15 |
1295 | Transformers in The Service of Description Logic-based Contexts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we construct the natural language dataset DELTA_D using the description logic language ALCQ. |
Angelos Poulis; Eleni Tsalapati; Manolis Koubarakis; | arxiv-cs.CL | 2023-11-15 |
1296 | Can Large Language Models Follow Concept Annotation Guidelines? A Case Study on Scientific and Financial Domains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Importantly, only proprietary models such as GPT-3.5 and GPT-4 can recognize nonsensical guidelines, which we hypothesize is due to more sophisticated alignment methods. |
Marcio Fonseca; Shay B. Cohen; | arxiv-cs.CL | 2023-11-15 |
1297 | How Good Are Large Language Models on African Languages? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an analysis of four popular large language models (mT0, Aya, LLaMa 2, and GPT-4) on six tasks (topic classification, sentiment classification, machine translation, summarization, question answering, and named entity recognition) across 60 African languages, spanning different language families and geographical regions. |
Jessica Ojo; Kelechi Ogueji; Pontus Stenetorp; David Ifeoluwa Adelani; | arxiv-cs.CL | 2023-11-14 |
1298 | Secure Transformer Inference Protocol Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Drawing insights from our hands-on experience in developing two real-world Transformer-based services, we identify the inherent efficiency bottleneck in the two-party assumption. To overcome this limitation, we propose a novel three-party threat model. |
Mu Yuan; Lan Zhang; Xiang-Yang Li; | arxiv-cs.CR | 2023-11-14 |
1299 | Automated Title and Abstract Screening for Scoping Reviews Using The GPT-4 Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This manuscript introduces GPTscreenR, a package for the R statistical programming language that uses the GPT-4 Large Language Model (LLM) to automatically screen sources. |
David Wilkins; | arxiv-cs.CL | 2023-11-14 |
1300 | Exploring Semi-supervised Hierarchical Stacked Encoder for Legal Judgement Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The lengthy and non-uniform document structure poses an even greater challenge in extracting information for decision prediction. In this work, we explore and propose a two-level classification mechanism, both supervised and unsupervised: domain-specific pre-trained BERT extracts sentence embeddings from long documents, a transformer encoder layer further processes them, and unsupervised clustering extracts hidden labels from these embeddings to better predict the judgment of a legal case. |
Nishchal Prasad; Mohand Boughanem; Taoufiq Dkaki; | arxiv-cs.CL | 2023-11-14 |
1301 | Memory-efficient Stochastic Methods for Memory-based Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel two-phase training mechanism and a novel regularization technique to improve the training efficiency of memory-based transformers, which are often used for long-range context problems. |
Vishwajit Kumar Vishnu; C. Chandra Sekhar; | arxiv-cs.LG | 2023-11-14 |
1302 | Transformer Network with Decoupled Spatial–temporal Embedding for Traffic Flow Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View |
WEI SUN et. al. | Applied Intelligence | 2023-11-13 |
1303 | Language Model-In-The-Loop: Data Optimal Approach to Learn-To-Recommend Actions in Text Games Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore and evaluate updating the LLM used for candidate recommendation during the learning of the text-based game, to mitigate the reliance on human-annotated gameplays, which are costly to acquire. |
Arjun Vaithilingam Sudhakar; Prasanna Parthasarathi; Janarthanan Rajendran; Sarath Chandar; | arxiv-cs.CL | 2023-11-13 |
1304 | Towards Understanding The Geospatial Skills of ChatGPT: Taking A Geographic Information Systems (GIS) Exam Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper examines the performance of ChatGPT, a large language model (LLM), in a geographic information systems (GIS) exam. As LLMs like ChatGPT become increasingly prevalent in … |
Peter Mooney; Wencong Cui; Boyuan Guan; L. Juhász; | Proceedings of the 6th ACM SIGSPATIAL International … | 2023-11-13 |
1305 | Evaluating LLMs on Document-Based QA: Exact Answer Selection and Numerical Extraction Using Cogtale Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While some existing works focus on evaluating large language models' performance on retrieving and answering questions from documents, their performance on QA types that require exact answer selection from predefined options and numerical extraction has yet to be fully assessed. In this paper, we specifically focus on this underexplored context and conduct empirical analysis of LLMs (GPT-4 and GPT-3.5) on question types, including single-choice, yes-no, multiple-choice, and number extraction questions from documents in zero-shot setting. |
ZAFARYAB RASOOL et. al. | arxiv-cs.IR | 2023-11-13 |
1306 | Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the abstract reasoning abilities of text-only and multimodal versions of GPT-4, using the ConceptARC benchmark [10], which is designed to evaluate robust understanding and reasoning with core-knowledge concepts. |
Melanie Mitchell; Alessandro B. Palmarini; Arseny Moskvichev; | arxiv-cs.AI | 2023-11-13 |
1307 | A Comprehensive Evaluation of GPT-4V on Knowledge-Intensive Visual Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, the true challenge lies in the domain of knowledge-intensive VQA tasks, which necessitate not just recognition of visual elements, but also a deep comprehension of the visual information in conjunction with a vast repository of learned knowledge. To uncover such capabilities of MLMs, particularly the newly introduced GPT-4V, we provide an in-depth evaluation from three perspectives: 1) Commonsense Knowledge, which assesses how well models can understand visual cues and connect to general knowledge; 2) Fine-grained World Knowledge, which tests the model’s skill in reasoning out specific knowledge from images, showcasing their proficiency across various specialized fields; 3) Comprehensive Knowledge with Decision-making Rationales, which examines model’s capability to provide logical explanations for its inference, facilitating a deeper analysis from the interpretability perspective. |
YUNXIN LI et. al. | arxiv-cs.CL | 2023-11-13 |
1308 | Interaction Is All You Need? A Study of Robots Ability to Understand and Execute Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper aims to address a critical challenge in robotics, which is enabling them to operate seamlessly in human environments through natural language interactions. |
Kushal Koshti; Nidhir Bhavsar; | arxiv-cs.RO | 2023-11-13 |
1309 | Speech-based Slot Filling Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the potential application of LLMs to slot filling with noisy ASR transcriptions, via both in-context learning and task-specific fine-tuning. |
GUANGZHI SUN et. al. | arxiv-cs.CL | 2023-11-13 |
1310 | MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Several new LLMs have been introduced recently, necessitating their evaluation on non-English languages. This study aims to perform a thorough evaluation of the non-English capabilities of SoTA LLMs (GPT-3.5-Turbo, GPT-4, PaLM2, Gemini-Pro, Mistral, Llama2, and Gemma) by comparing them on the same set of multilingual datasets. |
SANCHIT AHUJA et. al. | arxiv-cs.CL | 2023-11-13 |
1311 | The Impact of Large Language Models on Scientific Discovery: A Preliminary Study Using GPT-4 IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this report, we delve into the performance of LLMs within the context of scientific discovery, focusing on GPT-4, the state-of-the-art language model. |
Microsoft Research AI4Science; Microsoft Azure Quantum; | arxiv-cs.CL | 2023-11-13 |
1312 | GPT-4V(ision) As A Social Media Analysis Engine IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore GPT-4V(ision)’s capabilities for social multimedia analysis. |
HANJIA LYU et. al. | arxiv-cs.CV | 2023-11-13 |
1313 | GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present MM-Navigator, a GPT-4V-based agent for the smartphone graphical user interface (GUI) navigation task. |
AN YAN et. al. | arxiv-cs.CV | 2023-11-13 |
1314 | LT-ViT: A Vision Transformer for Multi-label Chest X-ray Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we have developed LT-ViT, a transformer that utilizes combined attention between image tokens and randomly initialized auxiliary tokens that represent labels. |
Umar Marikkar; Sara Atito; Muhammad Awais; Adam Mahdi; | arxiv-cs.CV | 2023-11-13 |
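The label-token mechanism described in entry 1314 can be sketched as follows, assuming hypothetical dimensions (e.g. 14 chest X-ray labels) rather than the authors' implementation: K learnable label tokens are concatenated with the image tokens so that attention mixes the two streams, and each label token yields one multi-label logit.

```python
import torch
import torch.nn as nn

class LabelTokenHead(nn.Module):
    def __init__(self, d_model=192, nhead=3, n_labels=14, depth=4):
        super().__init__()
        self.label_tokens = nn.Parameter(torch.randn(1, n_labels, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.to_logit = nn.Linear(d_model, 1)  # shared head across label tokens

    def forward(self, img_tokens):  # img_tokens: (B, N, d_model) patch tokens
        lab = self.label_tokens.expand(img_tokens.shape[0], -1, -1)
        z = self.encoder(torch.cat([img_tokens, lab], dim=1))
        lab_out = z[:, -lab.shape[1]:]             # read back the K label tokens
        return self.to_logit(lab_out).squeeze(-1)  # (B, K) multi-label logits

logits = LabelTokenHead()(torch.randn(2, 196, 192))  # e.g. 14x14 patches from a ViT stem
print(logits.shape)  # torch.Size([2, 14])
```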
1315 | SpectralGPT: Spectral Remote Sensing Foundation Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While most foundation models are tailored to effectively process RGB images for various visual tasks, there is a noticeable gap in research focused on spectral data, which offers valuable information for scene understanding, especially in remote sensing (RS) applications. To fill this gap, we created for the first time a universal RS foundation model, named SpectralGPT, which is purpose-built to handle spectral RS images using a novel 3D generative pretrained transformer (GPT). |
DANFENG HONG et. al. | arxiv-cs.CV | 2023-11-13 |
1316 | Evaluation of GPT-4 for Chest X-ray Impression Generation: A Reader Study on Performance and Perception Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our study we explored and analyzed the generative abilities of GPT-4 for Chest X-ray impression generation. |
SEBASTIAN ZIEGELMAYER et. al. | arxiv-cs.CL | 2023-11-12 |
1317 | Retrieval and Generative Approaches for A Pregnancy Chatbot in Nepali with Stemmed and Non-Stemmed Data : A Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To provide pregnancy-related information, a health-domain chatbot has been proposed, and this work explores two different NLP-based approaches for developing it. |
Sujan Poudel; Nabin Ghimire; Bipesh Subedi; Saugat Singh; | arxiv-cs.CL | 2023-11-12 |
1318 | TransHSI: A Hybrid CNN-Transformer Method for Disjoint Sample-Based Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Hyperspectral images’ (HSIs) classification research has seen significant progress with the use of convolutional neural networks (CNNs) and Transformer blocks. However, these … |
Ping Zhang; Haiyang Yu; Pengao Li; Ruili Wang; | Remote. Sens. | 2023-11-12 |
1319 | NewsGPT: ChatGPT Integration for Robot-Reporter Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper a novel system is proposed that integrates AI’s generative pretrained transformer (GPT) model with the Pepper robot, with the aim of improving the robot’s natural language understanding and response generation capabilities for enhanced social interactions. |
Abdelhadi Hireche; Abdelkader Nasreddine Belkacem; Sadia Jamil; Chao Chen; | arxiv-cs.RO | 2023-11-11 |
1320 | Traffic Sign Recognition Using Local Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a new novel model that blends the advantages of both convolutional and transformer-based networks for traffic sign recognition. |
Ali Farzipour; Omid Nejati Manzari; Shahriar B. Shokouhi; | arxiv-cs.CV | 2023-11-11 |
1321 | Controllable Topic-Focused Abstractive Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new Transformer-based architecture capable of producing topic-focused summaries. |
Seyed Ali Bahrainian; Martin Jaggi; Carsten Eickhoff; | arxiv-cs.CL | 2023-11-11 |
1322 | ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose ChiMed-GPT, a new benchmark LLM designed explicitly for the Chinese medical domain, with context length enlarged to 4,096 tokens, which undergoes a comprehensive training regime of pre-training, SFT, and RLHF. |
Yuanhe Tian; Ruyi Gan; Yan Song; Jiaxing Zhang; Yongdong Zhang; | arxiv-cs.CL | 2023-11-10 |
1323 | Hiformer: Heterogeneous Feature Interactions Learning with Transformers for Recommender Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose to leverage a Transformer-based architecture with attention layers to automatically capture feature interactions. |
HUAN GUI et. al. | arxiv-cs.IR | 2023-11-10 |
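Entry 1323's core move, attention over per-field feature embeddings, can be sketched as below with hypothetical field and vocabulary sizes; the paper's heterogeneous attention layer is more elaborate than this homogeneous stand-in:

```python
import torch
import torch.nn as nn

class AttentionInteraction(nn.Module):
    def __init__(self, field_vocab_sizes, d_model=32, nhead=4):
        super().__init__()
        # one embedding table per categorical field
        self.embeds = nn.ModuleList(nn.Embedding(v, d_model) for v in field_vocab_sizes)
        self.attn = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.out = nn.Linear(d_model, 1)

    def forward(self, x):  # x: (B, n_fields) integer feature ids
        tokens = torch.stack([emb(x[:, i]) for i, emb in enumerate(self.embeds)], dim=1)
        z = self.attn(tokens)  # attention captures cross-field interactions
        return self.out(z.mean(dim=1)).squeeze(-1)  # CTR-style logit

model = AttentionInteraction([1000, 50, 7])  # e.g. user id, item category, weekday
print(torch.sigmoid(model(torch.tensor([[3, 10, 2], [42, 5, 6]]))))
```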
1324 | Heaps’ Law in GPT-Neo Large Language Model Emulated Corpora Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While this law has been validated in diverse human-authored text corpora, its applicability to large language model generated text remains unexplored. This study addresses this gap, focusing on the emulation of corpora using the suite of GPT-Neo large language models. |
Uyen Lai; Gurjit S. Randhawa; Paul Sheridan; | arxiv-cs.CL | 2023-11-10 |
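Heaps' law, the subject of entry 1324, states that vocabulary size grows as V(n) ≈ K·n^β in the number of tokens n. A self-contained sketch of the standard log-log fit, run here on a toy token stream rather than the study's GPT-Neo-emulated corpora:

```python
import numpy as np

def heaps_curve(tokens):
    """Vocabulary size after each of the first n tokens."""
    seen, curve = set(), []
    for t in tokens:
        seen.add(t)
        curve.append(len(seen))
    return np.arange(1, len(tokens) + 1), np.array(curve)

tokens = ("the cat sat on the mat and the dog sat on the log " * 200).split()
n, v = heaps_curve(tokens)
beta, log_k = np.polyfit(np.log(n), np.log(v), 1)  # fit log V = log K + beta log n
print(f"K ~= {np.exp(log_k):.2f}, beta ~= {beta:.2f}")  # toy corpus saturates, so beta is small
```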
1325 | Argumentation Element Annotation Modeling Using XLNet Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study demonstrates the effectiveness of XLNet, a transformer-based language model, for annotating argumentative elements in persuasive essays. |
Christopher Ormerod; Amy Burkhardt; Mackenzie Young; Sue Lottridge; | arxiv-cs.CL | 2023-11-10 |
1326 | Holistic Evaluation of GPT-4V for Biomedical Imaging Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, we present a large-scale evaluation probing GPT-4V’s capabilities and limitations for biomedical image analysis. GPT-4V represents a breakthrough in artificial … |
ZHENG LIU et. al. | ArXiv | 2023-11-10 |
1327 | Fine-tuning Pretrained Transformer Encoders for Sequence-to-sequence Learning Related Papers Related Patents Related Grants Related Venues Related Experts View |
HANGBO BAO et. al. | Int. J. Mach. Learn. Cybern. | 2023-11-10 |
1328 | Establishing Performance Baselines in Fine-Tuning, Retrieval-Augmented Generation and Soft-Prompting for Non-Specialist LLM Users Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we tested an unmodified version of GPT 3.5, a fine-tuned version, and the same unmodified model when given access to a vectorised RAG database, both in isolation and in combination with a basic, non-algorithmic soft prompt. |
JENNIFER DODGSON et. al. | arxiv-cs.IR | 2023-11-10 |
1329 | On The Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We explore the model’s abilities to understand and reason about driving scenes, make decisions, and ultimately act in the capacity of a driver. |
LICHENG WEN et. al. | arxiv-cs.CV | 2023-11-09 |
1330 | Accuracy of A Vision-Language Model on Challenging Medical Cases IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Background: General-purpose large language models that utilize both text and images have not been evaluated on a diverse array of challenging medical cases. |
Thomas Buckley; James A. Diao; Adam Rodman; Arjun K. Manrai; | arxiv-cs.CV | 2023-11-09 |
1331 | LogShield: A Transformer-based APT Detection System Leveraging Self-Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, existing state-of-the-art techniques that use system provenance graphs lack a data processing framework generalized across datasets for optimal performance. To mitigate this limitation and to explore the effectiveness of transformer-based language models, this paper proposes LogShield, a framework designed to detect APT attack patterns by leveraging the power of self-attention in transformers. |
Sihat Afnan; Mushtari Sadia; Shahrear Iqbal; Anindya Iqbal; | arxiv-cs.CR | 2023-11-09 |
1332 | Deep Natural Language Feature Learning for Interpretable Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a general method to break down a main complex task into a set of intermediary easier sub-tasks, which are formulated in natural language as binary questions related to the final target task. |
Felipe Urrutia; Cristian Buc; Valentin Barriere; | arxiv-cs.CL | 2023-11-09 |
1333 | Large Language Models and Prompt Engineering for Biomedical Query Focused Multi-Document Summarisation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper reports on the use of prompt engineering and GPT-3.5 for biomedical query-focused multi-document summarisation. |
Diego Mollá; | arxiv-cs.CL | 2023-11-09 |
1334 | Large GPT-like Models Are Bad Babies: A Closer Look at The Relationship Between Linguistic Competence and Psycholinguistic Measures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find a positive correlation between LM size and performance on all three challenge tasks, with different preferences for model width and depth in each of the tasks. |
Julius Steuer; Marius Mosbach; Dietrich Klakow; | arxiv-cs.CL | 2023-11-08 |
1335 | NLQxform: A Language Model-based Question to SPARQL Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: In recent years, scholarly data has grown dramatically in terms of both scale and complexity. It becomes increasingly challenging to retrieve information from scholarly knowledge … |
Ruijie Wang; Zhiruo Zhang; Luca Rossetto; Florian Ruosch; Abraham Bernstein; | ArXiv | 2023-11-08 |
1336 | Future Lens: Anticipating Subsequent Tokens from A Single Hidden State IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: More concretely, in this paper we ask: Given a hidden (internal) representation of a single token at position t in an input, can we reliably anticipate the tokens that will appear at positions ≥ t + 2? |
Koyena Pal; Jiuding Sun; Andrew Yuan; Byron C. Wallace; David Bau; | arxiv-cs.CL | 2023-11-08 |
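The question posed in entry 1336 can be made concrete with a probe. The sketch below is an assumed setup, not the paper's method: it fits a linear map from GPT-2's final hidden state at position t to the identity of the token at t+2, overfitting a single demo sentence.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

ids = tok("The quick brown fox jumps over the lazy dog", return_tensors="pt").input_ids
with torch.no_grad():
    hidden = lm(ids, output_hidden_states=True).hidden_states[-1][0]  # (L, 768)

X, y = hidden[:-2], ids[0, 2:]  # hidden state at t -> token id at t+2
probe = torch.nn.Linear(hidden.shape[-1], lm.config.vocab_size)
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
for _ in range(100):  # tiny demo fit
    loss = torch.nn.functional.cross_entropy(probe(X), y)
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())
```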
1337 | GeoFormer: Predicting Human Mobility Using Generative Pre-trained Transformer (GPT) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present the GeoFormer, a decoder-only transformer model adapted from the GPT architecture to forecast human mobility. |
Aivin V. Solatorio; | arxiv-cs.LG | 2023-11-08 |
1338 | Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We extend this research by analyzing and comparing circuits for similar sequence continuation tasks, which include increasing sequences of Arabic numerals, number words, and months. |
Michael Lan; Phillip Torr; Fazl Barez; | arxiv-cs.CL | 2023-11-07 |
1339 | Modelling Sentiment Analysis: LLMs and Data Augmentation Techniques Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper provides different approaches for a binary sentiment classification on a small training dataset. |
Guillem Senabre Prades; | arxiv-cs.CL | 2023-11-07 |
1340 | Evaluating Multiple Large Language Models in Pediatric Ophthalmology Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: DESIGN, SETTING, AND PARTICIPANTS This survey study assessed three LLMs, namely ChatGPT (GPT-3.5), GPT-4, and PaLM2, alongside three human cohorts: medical students, postgraduate students, and attending physicians, in their ability to answer questions related to pediatric ophthalmology. |
JASON HOLMES et. al. | arxiv-cs.CL | 2023-11-07 |
1341 | Neuro-GPT: Towards A Foundation Model for EEG Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To handle the scarcity and heterogeneity of electroencephalography (EEG) data for Brain-Computer Interface (BCI) tasks, and to harness the power of large publicly available data sets, we propose Neuro-GPT, a foundation model consisting of an EEG encoder and a GPT model. |
WENHUI CUI et. al. | arxiv-cs.LG | 2023-11-07 |
1342 | Exploring Recommendation Capabilities of GPT-4V(ision): A Preliminary Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our objective is to explore the potential of extending LMMs from vision and language tasks to recommendation tasks. |
PEILIN ZHOU et. al. | arxiv-cs.IR | 2023-11-07 |
1343 | DeepInception: Hypnotize Large Language Model to Be Jailbreaker IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, inspired by the Milgram experiment on the power of authority to incite harmful behavior, we disclose a lightweight method, termed DeepInception, which can hypnotize an LLM into being a jailbreaker. |
XUAN LI et. al. | arxiv-cs.LG | 2023-11-06 |
1344 | Nexus at ArAIEval Shared Task: Fine-Tuning Arabic Language Models for Propaganda and Disinformation Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The ArAIEval shared task aims to further research on these particular issues within the context of the Arabic language. In this paper, we discuss our participation in these shared tasks. |
Yunze Xiao; Firoj Alam; | arxiv-cs.CL | 2023-11-06 |
1345 | Towards A Transformer-Based Reverse Dictionary Model for Quality Estimation of Definitions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare different transformer-based models for solving the reverse dictionary task and explore their use in the context of a serious game called The Dictionary Game. |
Julien Guité-Vinet; Alexandre Blondin Massé; Fatiha Sadat; | arxiv-cs.CL | 2023-11-06 |
1346 | Unraveling Downstream Gender Bias from Large Language Models: A Study on AI Educational Writing Assistance Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate how bias transfers through an AI writing support pipeline. |
THIEMO WAMBSGANSS et. al. | arxiv-cs.CL | 2023-11-06 |
1347 | Holistic Analysis of Hallucination in GPT-4V(ision): Bias and Interference Challenges IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While GPT-4V(ision) impressively models both visual and textual information simultaneously, its hallucination behavior has not been systematically assessed. To bridge this gap, we introduce a new benchmark, namely, the Bias and Interference Challenges in Visual Language Models (Bingo). |
CHENHANG CUI et. al. | arxiv-cs.LG | 2023-11-06 |
1348 | A Simple Yet Efficient Ensemble Approach for AI-generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, it is essential to build automated approaches capable of distinguishing between artificially generated text and human-authored text. In this paper, we propose a simple yet efficient solution to this problem by ensembling predictions from multiple constituent LLMs. |
HARIKA ABBURI et. al. | arxiv-cs.CL | 2023-11-06 |
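The ensembling in entry 1348 reduces, in its simplest form, to averaging each constituent classifier's machine-generated probability. A minimal sketch with placeholder checkpoint names; the paper's constituent LLMs and combination scheme may differ:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CHECKPOINTS = ["detector-a", "detector-b"]  # placeholders for real detector models

def machine_prob(ckpt: str, text: str) -> float:
    tok = AutoTokenizer.from_pretrained(ckpt)
    clf = AutoModelForSequenceClassification.from_pretrained(ckpt).eval()
    with torch.no_grad():
        logits = clf(**tok(text, return_tensors="pt", truncation=True)).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()  # assumes class 1 = machine

def ensemble_prob(text: str) -> float:
    return sum(machine_prob(c, text) for c in CHECKPOINTS) / len(CHECKPOINTS)

# usage: ensemble_prob("a passage to score") -> probability in [0, 1]
```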
1349 | Exploring Grounding Potential of VQA-oriented GPT-4V for Zero-shot Anomaly Detection IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Multimodal Model (LMM) GPT-4V(ision) endows GPT-4 with visual grounding capabilities, making it possible to handle certain tasks through the Visual Question Answering (VQA) … |
JIANGNING ZHANG et. al. | ArXiv | 2023-11-05 |
1350 | Evaluating The Potential of Leading Large Language Models in Reasoning Biology Questions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent advances in Large Language Models (LLMs) have presented new opportunities for integrating Artificial General Intelligence (AGI) into biological research and education. This … |
XINYU GONG et. al. | ArXiv | 2023-11-05 |
1351 | Extraction of Atypical Aspects from Customer Reviews: Datasets and Experiments with Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Correspondingly, in this paper we introduce the task of detecting atypical aspects in customer reviews. |
Smita Nannaware; Erfan Al-Hossami; Razvan Bunescu; | arxiv-cs.CL | 2023-11-05 |
1352 | GPT-4V-AD: Exploring Grounding Potential of VQA-oriented GPT-4V for Zero-shot Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper explores the potential of VQA-oriented GPT-4V in the recently popular visual Anomaly Detection (AD) and is the first to conduct qualitative and quantitative evaluations on the popular MVTec AD and VisA datasets. Considering that this task requires both image-/pixel-level evaluations, the proposed GPT-4V-AD framework contains three components: 1) Granular Region Division, 2) Prompt Designing, and 3) Text2Segmentation for easy quantitative evaluation; we have also made several different attempts at comparative analysis. |
JIANGNING ZHANG et. al. | arxiv-cs.CV | 2023-11-05 |
1353 | Tailoring Self-Rationalizers with Multi-Reward Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we enable small-scale LMs (approx. 200x smaller than GPT-3) to generate rationales that not only improve downstream task performance, but are also more plausible, consistent, and diverse, assessed both by automatic and human evaluation. |
SAHANA RAMNATH et. al. | arxiv-cs.CL | 2023-11-05 |
1354 | Towards Generic Anomaly Detection and Understanding: Large-scale Visual-linguistic Model (GPT-4V) Takes The Lead IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study explores the use of GPT-4V(ision), a powerful visual-linguistic model, to address anomaly detection tasks in a generic manner. |
Yunkang Cao; Xiaohao Xu; Chen Sun; Xiaonan Huang; Weiming Shen; | arxiv-cs.CV | 2023-11-05 |
1355 | Rotation Invariant Transformer for Recognizing Object in UAVs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In particular, our solution won first place in the UAV-based person re-recognition track of the Multi-Modal Video Reasoning and Analyzing Competition held at ICCV 2021. |
Shuoyi Chen; Mang Ye; Bo Du; | arxiv-cs.CV | 2023-11-04 |
1356 | Ultra-Long Sequence Distributed Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel and efficient distributed training method, the Long Short-Sequence Transformer (LSS Transformer), for training transformers with long sequences. |
XIAO WANG et. al. | arxiv-cs.DC | 2023-11-04 |
1357 | Grounded Intuition of GPT-Vision’s Abilities with Scientific Images Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We use our technique to examine alt text generation for scientific figures, finding that GPT-Vision is particularly sensitive to prompting, counterfactual text in images, and relative spatial relationships. |
Alyssa Hwang; Andrew Head; Chris Callison-Burch; | arxiv-cs.CL | 2023-11-03 |
1358 | Simplifying Transformer Blocks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we ask: to what extent can the standard transformer block be simplified? |
Bobby He; Thomas Hofmann; | arxiv-cs.LG | 2023-11-03 |
1359 | An Empirical Study of Benchmarking Chinese Aspect Sentiment Quad Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To expand capacity, we construct two large Chinese ASQP datasets crawled from multiple online platforms. |
Junxian Zhou; Haiqin Yang; Ye Junpeng; Yuxuan He; Hao Mou; | arxiv-cs.CL | 2023-11-03 |
1360 | TCM-GPT: Efficient Pre-training of Large Language Models for Domain Adaptation in Traditional Chinese Medicine Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, their effectiveness in specialized domains, such as Traditional Chinese Medicine, requires comprehensive evaluation. To address the above issues, we propose TCMDA (TCM Domain Adaptation), a novel domain-specific approach based on efficient pre-training with a domain-specific corpus. |
Guoxing Yang; Jianyu Shi; Zan Wang; Xiaohong Liu; Guangyu Wang; | arxiv-cs.CL | 2023-11-03 |
1361 | Not All Layers Are Equally As Important: Every Layer Counts BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel modification of the transformer architecture, tailored for the data-efficient pretraining of language models. |
Lucas Georges Gabriel Charpentier; David Samuel; | arxiv-cs.CL | 2023-11-03 |
1362 | Inclusiveness Matters: A Large-Scale Analysis of User Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we leverage user feedback from three popular online sources, Reddit, Google Play Store, and Twitter, for 50 of the most popular apps in the world to reveal the inclusiveness-related concerns from end users. |
Nowshin Nawar Arony; Ze Shi Li; Bowen Xu; Daniela Damian; | arxiv-cs.SE | 2023-11-02 |
1363 | Measuring Five Accountable Talk Moves to Improve Instruction at Scale Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Providing consistent, individualized feedback to teachers on their instruction can improve student learning outcomes. Such feedback can especially benefit novice instructors who … |
Ashlee Kupor; Candice Morgan; Dorottya Demszky; | ArXiv | 2023-11-02 |
1364 | Multi-scale Feature Flow Alignment Fusion with Transformer for The Microscopic Images Segmentation of Activated Sludge Related Papers Related Patents Related Grants Related Venues Related Experts View |
LIJIE ZHAO et. al. | Signal Image Video Process. | 2023-11-02 |
1365 | GPT-4V(ision) As A Generalist Evaluator for Vision-Language Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We employ two evaluation methods, single-answer grading and pairwise comparison, using GPT-4V. |
XINLU ZHANG et. al. | arxiv-cs.CV | 2023-11-02 |
1366 | Copilot4D: Learning Unsupervised World Models for Autonomous Driving Via Discrete Diffusion IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, we propose Copilot4D, a novel world modeling approach that first tokenizes sensor observations with VQVAE, then predicts the future via discrete diffusion. |
LUNJUN ZHANG et. al. | arxiv-cs.CV | 2023-11-02 |
1367 | Efficient Vision Transformer for Accurate Traffic Sign Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this task is complicated by suboptimal traffic images affected by factors such as camera movement, adverse weather conditions, and inadequate lighting. This study specifically focuses on traffic sign detection methods and introduces the application of the Transformer model, particularly the Vision Transformer variants, to tackle this task. |
Javad Mirzapour Kaleybar; Hooman Khaloo; Avaz Naghipour; | arxiv-cs.CV | 2023-11-02 |
1368 | MGen: A Framework for Energy-Efficient In-ReRAM Acceleration of Multi-Task BERT Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recently, multiple transformer models, such as BERT, have been utilized together to support multiple natural language processing (NLP) tasks in a system, also known as multi-task … |
Myeonggu Kang; Hyein Shin; Junkyum Kim; L. Kim; | IEEE Transactions on Computers | 2023-11-01 |
1369 | Vibration-Signal-Based Deep Noisy Filtering Model for Online Transformer Diagnosis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Machine learning methods are effective for the diagnosis of power transformer faults. However, influenced by uncertainty and noise in data, machine-learning-based diagnostic … |
ZHIKAI XING et. al. | IEEE Transactions on Industrial Informatics | 2023-11-01 |
1370 | VPCFormer: A Transformer-based Multi-view Finger Vein Recognition Model and A New Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View |
PENGYANG ZHAO et. al. | Pattern Recognit. | 2023-11-01 |
1371 | An Adaptive N-gram Transformer for Multi-scale Scene Text Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xueming Yan; Zhihang Fang; Yaochu Jin; | Knowl. Based Syst. | 2023-11-01 |
1372 | Enhancing Social Network Hate Detection Using Back Translation and GPT-3 Augmentations During Training and Test-time Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View |
SEFFI COHEN et. al. | Inf. Fusion | 2023-11-01 |
1373 | Offline Handwritten Mathematical Expression Recognition with Graph Encoder and Transformer Decoder Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jianglong Tang; Hong-Yu Guo; Jin-Wen Wu; Fei Yin; Lin-Lin Huang; | Pattern Recognit. | 2023-11-01 |
1374 | A Vision Transformer-based Automated Human Identification Using Ear Biometrics Related Papers Related Patents Related Grants Related Venues Related Experts View |
Ravishankar Mehta; Sindhuja Shukla; Jitesh Pradhan; K. K. Singh; Abhinav Kumar; | J. Inf. Secur. Appl. | 2023-11-01 |
1375 | A Feature Selection and Ensemble Learning Based Methodology for Transformer Fault Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts View |
Shaowei Rao; Guoping Zou; Shiyou Yang; S. Barmada; | Appl. Soft Comput. | 2023-11-01 |
1376 | DPENet: Dual-path Extraction Network Based on CNN and Transformer for Accurate Building and Road Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View |
ZI-XING CHEN et. al. | Int. J. Appl. Earth Obs. Geoinformation | 2023-11-01 |
1377 | FTransCNN: Fusing Transformer and A CNN Based on Fuzzy Logic for Uncertain Medical Image Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
WEIPING DING et. al. | Inf. Fusion | 2023-11-01 |
1378 | Dyformer: A Dynamic Transformer-based Architecture for Multivariate Time Series Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Chao Yang; Xianzhi Wang; L. Yao; Guodong Long; Guandong Xu; | Inf. Sci. | 2023-11-01 |
1379 | Optimal Model Partitioning with Low-Overhead Profiling on The PIM-based Platform for Deep Learning Inference Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recently, Processing-in-Memory (PIM) has become a promising solution to achieve energy-efficient computation in data-intensive applications by placing computation near or inside … |
S. KIM et. al. | ACM Transactions on Design Automation of Electronic Systems | 2023-11-01 |
1380 | Fine-tuning GPT-3 for Legal Rule Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Davide Liga; Livio Robaldo; | Comput. Law Secur. Rev. | 2023-11-01 |
1381 | XAI Transformer Based Approach for Interpreting Depressed and Suicidal User Behavior on Online Social Networks Related Papers Related Patents Related Grants Related Venues Related Experts View |
Anshu Malhotra; Rajni Jindal; | Cognitive Systems Research | 2023-11-01 |
1382 | KD-Former: Kinematic and Dynamic Coupled Transformer Network for 3D Human Motion Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View |
JU DAI et. al. | Pattern Recognit. | 2023-11-01 |
1383 | Causal-ViT: Robust Vision Transformer By Causal Intervention Related Papers Related Patents Related Grants Related Venues Related Experts View |
Wei Li; Zhixin Li; Xiwei Yang; Huifang Ma; | Eng. Appl. Artif. Intell. | 2023-11-01 |
1384 | External Knowledge-assisted Transformer for Image Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhixin Li; Qiang Su; Tianyu Chen; | Image Vis. Comput. | 2023-11-01 |
1385 | ADCT-Net: Adaptive Traffic Forecasting Neural Network Via Dual-graphic Cross-fused Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
JIANLEI KONG et. al. | Inf. Fusion | 2023-11-01 |
1386 | Power Transformer Fault Diagnosis Based on A Self-strengthening Offline Pre-training Model Related Papers Related Patents Related Grants Related Venues Related Experts View |
MINGWEI ZHONG et. al. | Eng. Appl. Artif. Intell. | 2023-11-01 |
1387 | Robust Facial Expression Recognition with Transformer Block Enhancement Module Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yuanlun Xie; Wenhong Tian; Zitong Yu; | Eng. Appl. Artif. Intell. | 2023-11-01 |
1388 | Advances in Embodied Navigation Using Large Language Models: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: A comprehensive list of studies in this survey is available at https://github.com/Rongtao-Xu/Awesome-LLM-EN. |
JINZHOU LIN et. al. | arxiv-cs.AI | 2023-11-01 |
1389 | Reciprocal Transformer for Hyperspectral and Multispectral Image Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View |
Qing Ma; Junjun Jiang; Xianming Liu; Jiayi Ma; | Inf. Fusion | 2023-11-01 |
1390 | Window-based Transformer Generative Adversarial Network for Autonomous Underwater Image Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View |
MEHNAZ UMMAR et. al. | Eng. Appl. Artif. Intell. | 2023-11-01 |
1391 | Attention Alignment and Flexible Positional Embeddings Improve Transformer Length Extrapolation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To alleviate the issue, we propose two attention alignment strategies via temperature scaling. |
Ta-Chung Chi; Ting-Han Fan; Alexander I. Rudnicky; | arxiv-cs.CL | 2023-11-01 |
1392 | Are Large Language Models Reliable Judges? A Study on The Factuality Evaluation Capabilities of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we delve into the potential of LLMs as reliable assessors of factual consistency in summaries generated by text-generation models. |
Xue-Yong Fu; Md Tahmid Rahman Laskar; Cheng Chen; Shashi Bhushan TN; | arxiv-cs.CL | 2023-11-01 |
1393 | A Large Scale Digital Elevation Model Super-resolution Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
ZHUOXIAO LI et. al. | Int. J. Appl. Earth Obs. Geoinformation | 2023-11-01 |
1394 | Diagnosis of Photovoltaic Faults Using Digital Twin and PSO-optimized Shifted Window Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View |
Ying-Yi Hong; Rolando A. Pula; | Appl. Soft Comput. | 2023-11-01 |
1395 | Breaking The Token Barrier: Chunking and Convolution for Efficient Long Text Classification with BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a relatively simple extension to the vanilla BERT architecture, called ChunkBERT, that allows finetuning of any pretrained model to perform inference on arbitrarily long text. |
Aman Jaiswal; Evangelos Milios; | arxiv-cs.CL | 2023-10-31 |
1396 | Increasing The Performance of Cognitively Inspired Data-Efficient Language Models Via Implicit Structure Building Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we describe our submission to the BabyLM Challenge 2023 shared task on data-efficient language model (LM) pretraining (Warstadt et al., 2023). |
Omar Momen; David Arps; Laura Kallmeyer; | arxiv-cs.CL | 2023-10-31 |
1397 | A Systematic Review for Transformer-based Long-term Series Forecasting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The emergence of deep learning has yielded noteworthy advancements in time series forecasting (TSF). Transformer architectures, in particular, have witnessed broad utilization and … |
LIYILEI SU et. al. | arxiv-cs.LG | 2023-10-31 |
1398 | Is GPT Powerful Enough to Analyze The Emotions of Memes? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This project aims to explore the capabilities of GPT-3.5, a leading example of LLMs, in performing sentiment analysis of Internet memes. |
Jingjing Wang; Joshua Luo; Grace Yang; Allen Hong; Feng Luo; | arxiv-cs.CL | 2023-10-31 |
1399 | Does GPT-4 Pass The Turing Test? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: AI models with the ability to masquerade as humans could have widespread societal consequences, and we analyse the effectiveness of different strategies and criteria for judging humanlikeness. |
Cameron R. Jones; Benjamin K. Bergen; | arxiv-cs.AI | 2023-10-31 |
1400 | A Systematic Evaluation of GPT-4V’s Multimodal Capability for Medical Image Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work conducts an evaluation of GPT-4V’s multimodal capability for medical image analysis, with a focus on three representative tasks of radiology report generation, medical visual question answering, and medical visual grounding. |
YINGSHU LI et. al. | arxiv-cs.CV | 2023-10-31 |
1401 | Do Large Language Models Solve Verbal Analogies Like Children Do? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates whether large language models (LLMs) solve verbal analogies in A:B::C:? |
Claire E. Stevenson; Mathilde ter Veen; Rochelle Choenni; Han L. J. van der Maas; Ekaterina Shutova; | arxiv-cs.CL | 2023-10-31 |
1402 | Syntactic Inductive Bias in Transformer Language Models: Especially Helpful for Low-Resource Languages? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: But such methods are often tested for high-resource languages such as English. In this work, we investigate whether these methods can compensate for data sparseness in low-resource languages, hypothesizing that they ought to be more effective for low-resource languages. |
Luke Gessler; Nathan Schneider; | arxiv-cs.CL | 2023-10-31 |
1403 | TS-Fastformer: Fast Transformer for Time-series Forecasting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Many real-world applications require precise and fast time-series forecasting. Recent trends in time-series forecasting models are shifting from LSTM-based models to … |
Sangwon Lee; Junho Hong; Ling Liu; Wonik Choi; | ACM Transactions on Intelligent Systems and Technology | 2023-10-30 |
1404 | Efficient Classification of Student Help Requests in Programming Courses Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The accurate classification of student help requests with respect to the type of help being sought can enable the tailoring of effective responses. Automatically classifying such requests is non-trivial, but large language models (LLMs) appear to offer an accessible, cost-effective solution. |
Jaromir Savelka; Paul Denny; Mark Liffiton; Brad Sheese; | arxiv-cs.CY | 2023-10-30 |
1405 | Partial Tensorized Transformers for Natural Language Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the effect of tensor-train decomposition on improving the accuracy and compressing transformer vision-language neural networks, namely BERT and ViT. |
Subhadra Vadlamannati; Ryan Solgi; | arxiv-cs.CL | 2023-10-30 |
1406 | MM-VID: Advancing Video Understanding with GPT-4V(ision) IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We present MM-VID, an integrated system that harnesses the capabilities of GPT-4V, combined with specialized tools in vision, audio, and speech, to facilitate advanced video … |
KEVIN LIN et. al. | ArXiv | 2023-10-30 |
1407 | Extracting User Needs with Chat-GPT for Dialogue Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In most conventional interactive recommendation systems, the language model is used only as a dialogue model, and there is a separate recommendation system. |
Yugen Sato; Taisei Nakajima; Tatsuki Kawamoto; Tomohiro Takagi; | arxiv-cs.CY | 2023-10-30 |
1408 | Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a new pipeline using ChatGPT instead of human experts to generate high-quality feedback data for improving factual consistency in the clinical note summarization task. |
PRAKAMYA MISHRA et. al. | arxiv-cs.CL | 2023-10-30 |
1409 | MIST: Medical Image Segmentation Transformer with Convolutional Attention Mixing (CAM) Decoder Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite being successful in medical image segmentation, transformers face limitations in capturing local contexts of pixels in multimodal dimensions. We propose a Medical Image Segmentation Transformer (MIST) incorporating a novel Convolutional Attention Mixing (CAM) decoder to address this issue. |
Md Motiur Rahman; Shiva Shokouhmand; Smriti Bhatt; Miad Faezipour; | arxiv-cs.CV | 2023-10-30 |
1410 | Program Synthesis with Generative Pre-trained Transformers and Grammar-Guided Genetic Programming Grammar Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Grammar-Guided Genetic Programming (G3P) is widely recognised as one of the most successful approaches to program synthesis. Using a set of input/output tests, G3P evolves … |
Ning Tao; Anthony Ventresque; Takfarinas Saber; | 2023 IEEE Latin American Conference on Computational … | 2023-10-29 |
1411 | Multimodal ChatGPT for Medical Applications: An Experimental Study of GPT-4V IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we critically evaluate the capabilities of the state-of-the-art multimodal large language model, i.e., GPT-4 with Vision (GPT-4V), on the Visual Question Answering (VQA) task. |
ZHILING YAN et. al. | arxiv-cs.CV | 2023-10-29 |
1412 | Bipartite Graph Pre-training for Unsupervised Extractive Summarization with Graph Convolutional Auto-Encoders Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To do so, we propose a novel graph pre-training auto-encoder to obtain sentence embeddings by explicitly modelling intra-sentential distinctive features and inter-sentential cohesive features through sentence-word bipartite graphs. |
QIANREN MAO et. al. | arxiv-cs.CL | 2023-10-29 |
1413 | From Chatbots to PhishBots? – Preventing Phishing Scams Created Using ChatGPT, Google Bard and Claude Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The advanced capabilities of Large Language Models (LLMs) have made them invaluable across various applications, from conversational agents and content creation to data analysis, … |
S. Roy; Poojitha Thota; Krishna Vamsi Naragam; Shirin Nilizadeh; | ArXiv | 2023-10-29 |
1414 | From Chatbots to PhishBots? — Preventing Phishing Scams Created Using ChatGPT, Google Bard and Claude Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the potential of using four popular commercially available LLMs, i.e., ChatGPT (GPT 3.5 Turbo), GPT 4, Claude, and Bard, to generate functional phishing attacks using a series of malicious prompts. |
Sayak Saha Roy; Poojitha Thota; Krishna Vamsi Naragam; Shirin Nilizadeh; | arxiv-cs.CR | 2023-10-29 |
1415 | Prompt-Engineering and Transformer-based Question Generation and Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we finetuned a pretrained distilBERT model on the SQuAD question answering dataset to generate questions. |
Rubaba Amyeen; | arxiv-cs.CL | 2023-10-28 |
1416 | Data Ambiguity Strikes Back: How Documentation Improves GPT’s Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We have identified prevalent data ambiguities of value consistency, data coverage, and data granularity that affect tasks. |
Zezhou Huang; Pavan Kalyan Damalapati; Eugene Wu; | arxiv-cs.DB | 2023-10-28 |
1417 | CViT: A Convolution Vision Transformer for Video Abnormal Behavior Detection and Localization Related Papers Related Patents Related Grants Related Venues Related Experts View |
Sanjay Roka; M. Diwakar; | SN Computer Science | 2023-10-28 |
1418 | OpinSummEval: Revisiting Automated Evaluation for Opinion Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present OpinSummEval, a dataset comprising human judgments and outputs from 14 opinion summarization models. |
Yuchen Shen; Xiaojun Wan; | arxiv-cs.CL | 2023-10-27 |
1419 | OffMix-3L: A Novel Code-Mixed Dataset in Bangla-English-Hindi for Offensive Language Identification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce OffMix-3L, a novel offensive language identification dataset containing code-mixed data from three different languages. |
Dhiman Goswami; Md Nishat Raihan; Antara Mahmud; Antonios Anastasopoulos; Marcos Zampieri; | arxiv-cs.CL | 2023-10-27 |
1420 | OffMix-3L: A Novel Code-Mixed Test Dataset in Bangla-English-Hindi for Offensive Language Identification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Code-mixing is a well-studied linguistic phenomenon in which two or more languages are mixed in text or speech. Several works have been conducted on building datasets and performing … |
Dhiman Goswami; Md. Nishat Raihan; Antara Mahmud; Antonios Anastasopoulos; Marcos Zampieri; | ArXiv | 2023-10-27 |
1421 | FP8-LM: Training FP8 Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore FP8 low-bit data formats for efficient training of large language models (LLMs). |
HOUWEN PENG et. al. | arxiv-cs.LG | 2023-10-27 |
1422 | SentMix-3L: A Novel Code-Mixed Test Dataset in Bangla-English-Hindi for Sentiment Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Code-mixing is a well-studied linguistic phenomenon in which two or more languages are mixed in text or speech. Several datasets have been built with the goal of training … |
Md. Nishat Raihan; Dhiman Goswami; Antara Mahmud; Antonios Anastasopoulos; Marcos Zampieri; | ArXiv | 2023-10-27 |
1423 | GPT-4 Vision on Medical Image Classification — A Case Study on COVID-19 Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This technical report delves into the application of GPT-4 Vision (GPT-4V) in the nuanced realm of COVID-19 image classification, leveraging the transformative potential of … |
RUIBO CHEN et. al. | arxiv-cs.CV | 2023-10-27 |
1424 | GPT-4 Vision on Medical Image Classification – A Case Study on COVID-19 Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This technical report delves into the application of GPT-4 Vision (GPT-4V) in the nuanced realm of COVID-19 image classification, leveraging the transformative potential of … |
RUIBO CHEN et. al. | ArXiv | 2023-10-27 |
1425 | MultiScale Spectral-Spatial Convolutional Transformer for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a multiscale spectral-spatial convolutional Transformer (MultiscaleFormer) for hyperspectral image classification. |
Zhiqiang Gong; Xian Zhou; Wen Yao; | arxiv-cs.CV | 2023-10-27 |
1426 | SentMix-3L: A Bangla-English-Hindi Code-Mixed Dataset for Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce SentMix-3L, a novel dataset for sentiment analysis containing code-mixed data across three languages: Bangla, English, and Hindi. |
Md Nishat Raihan; Dhiman Goswami; Antara Mahmud; Antonios Anastasopoulos; Marcos Zampieri; | arxiv-cs.CL | 2023-10-27 |
1427 | Large Language Models for Aspect-based Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large language models (LLMs) offer unprecedented text completion capabilities. |
Paul F. Simmering; Paavo Huoviala; | arxiv-cs.CL | 2023-10-27 |
1428 | Harnessing GPT-3.5-turbo for Rhetorical Role Prediction in Legal Cases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a comprehensive study of one-stage elicitation techniques for querying a large pre-trained generative transformer (GPT-3.5-turbo) in the rhetorical role prediction task of legal cases. |
Anas Belfathi; Nicolas Hernandez; Laura Monceaux; | arxiv-cs.CL | 2023-10-26 |
1429 | ZeroQuant-HERO: Hardware-Enhanced Robust Optimized Post-Training Quantization Framework for W8A8 Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing solutions, such as ZeroQuant, offer dynamic quantization for models like BERT and GPT but overlook crucial memory-bounded operators and the complexities of per-token quantization. Addressing these gaps, we present a novel, fully hardware-enhanced robust optimized post-training W8A8 quantization framework, ZeroQuant-HERO. |
ZHEWEI YAO et. al. | arxiv-cs.LG | 2023-10-26 |
1430 | An Ensemble Method Based on The Combination of Transformers with Convolutional Neural Networks to Detect Artificially Generated Text Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Thanks to the state-of-the-art Large Language Models (LLMs), language generation has reached outstanding levels. These models are capable of generating high quality content, thus … |
Vijini Liyanage; Davide Buscaldi; | arxiv-cs.CL | 2023-10-26 |
1431 | DecoderTracker: Decoder-Only Method for Multiple-Object Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Decoder-only models, such as GPT, have demonstrated superior performance in many areas compared to traditional encoder-decoder transformer models. |
Liao Pan; Yang Feng; Wu Di; Liu Bo; Zhang Xingle; | arxiv-cs.CV | 2023-10-26 |
1432 | LightLM: A Lightweight Deep and Narrow Language Model for Generative Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents LightLM, a lightweight Transformer-based language model for generative recommendation. |
Kai Mei; Yongfeng Zhang; | arxiv-cs.IR | 2023-10-26 |
1433 | Can Large Language Models Replace Humans in The Systematic Review Process? Evaluating GPT-4’s Efficacy in Screening and Extracting Data from Peer-reviewed and Grey Literature in Multiple Languages IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Systematic reviews are vital for guiding practice, research, and policy, yet they are often slow and labour-intensive. Large language models (LLMs) could offer a way to speed up and automate systematic reviews, but their performance in such tasks has not been comprehensively evaluated against humans, and no study has tested GPT-4, the biggest LLM so far. |
Qusai Khraisha; Sophie Put; Johanna Kappenberg; Azza Warraitch; Kristin Hadfield; | arxiv-cs.CL | 2023-10-26 |
1434 | BOOST: Harnessing Black-Box Control to Boost Commonsense in LMs’ Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a computation-efficient framework that steers a frozen Pre-Trained Language Model (PTLM) towards more commonsensical generation (i.e., producing a plausible output that incorporates a list of concepts in a meaningful way). |
Yufei Tian; Felix Zhang; Nanyun Peng; | arxiv-cs.CL | 2023-10-25 |
1435 | Improving Clothing Product Quality and Reducing Waste Based on Consumer Review Using RoBERTa and BERTopic Language Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The disposability of clothing has emerged as a critical concern, precipitating waste accumulation due to product quality degradation. Such consequences exert significant pressure … |
A. Alamsyah; Nadhif Ditertian Girawan; | Big Data Cogn. Comput. | 2023-10-25 |
1436 | Divide Et Impera: Multi-Transformer Architectures for Complex NLP-Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an approach in which complex tasks are divided into simpler subtasks. |
Solveig Helland; Elena Gavagnin; Alexandre de Spindler; | arxiv-cs.CL | 2023-10-25 |
1437 | Exploring OCR Capabilities of GPT-4V(ision) : A Quantitative and In-depth Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a comprehensive evaluation of the Optical Character Recognition (OCR) capabilities of the recently released GPT-4V(ision), a Large Multimodal Model (LMM). |
YONGXIN SHI et. al. | arxiv-cs.CV | 2023-10-25 |
1438 | CLEX: Continuous Length Extrapolation for Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Length extrapolation methods, although theoretically capable of extending the context window beyond the training sequence length, often underperform in practical long-context applications. To address these challenges, we propose Continuous Length EXtrapolation (CLEX) for LLMs. |
Guanzheng Chen; Xin Li; Zaiqiao Meng; Shangsong Liang; Lidong Bing; | arxiv-cs.CL | 2023-10-25 |
1439 | Can GPT Models Follow Human Summarization Guidelines? Evaluating ChatGPT and GPT-4 for Dialogue Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the capabilities of prompt-driven Large Language Models (LLMs) like ChatGPT and GPT-4 in adhering to human guidelines for dialogue summarization. |
Yongxin Zhou; Fabien Ringeval; François Portet; | arxiv-cs.CL | 2023-10-25 |
1440 | Decoding Stumpers: Large Language Models Vs. Human Problem-Solvers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the problem-solving capabilities of Large Language Models (LLMs) by evaluating their performance on stumpers, unique single-step intuition problems that pose challenges for human solvers but are easily verifiable. |
Alon Goldstein; Miriam Havin; Roi Reichart; Ariel Goldstein; | arxiv-cs.CL | 2023-10-25 |
1441 | Data Augmentation for Emotion Detection in Small Imbalanced Text Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Certain existing datasets are small, follow different emotion taxonomies and display imbalance in their emotion distribution. In this work, we studied the impact of data augmentation techniques precisely when applied to small imbalanced datasets, for which current state-of-the-art models (such as RoBERTa) under-perform. |
Anna Koufakou; Diego Grisales; Ragy Costa de jesus; Oscar Fox; | arxiv-cs.CL | 2023-10-25 |
1442 | How Well Can Machine-generated Texts Be Identified and Can Language Models Be Trained to Avoid Identification? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We refined five separate language models to generate synthetic tweets, uncovering that shallow learning classification algorithms, like Naive Bayes, achieve detection accuracy between 0.6 and 0.8. |
Sinclair Schneider; Florian Steuber; Joao A. G. Schneider; Gabi Dreo Rodosek; | arxiv-cs.CL | 2023-10-25 |
1443 | Multiscale Convolutional Neural-based Transformer Network for Time Series Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View |
Zhixing Wang; Yepeng Guan; | Signal Image Video Process. | 2023-10-25 |
1444 | Enhancing Document Information Analysis with Multi-Task Pre-training: A Robust Approach for Information Extraction in Visually-Rich Documents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a deep learning model tailored for document information analysis, emphasizing document classification, entity relation extraction, and document visual question answering. |
Tofik Ali; Partha Pratim Roy; | arxiv-cs.CV | 2023-10-25 |
1445 | An Early Evaluation of GPT-4V(ision) IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we evaluate different abilities of GPT-4V including visual understanding, language understanding, visual puzzle solving, and understanding of other modalities such as depth, thermal, video, and audio. |
YANG WU et. al. | arxiv-cs.CL | 2023-10-25 |
1446 | Fast Attention Requires Bounded Entries IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate whether faster algorithms are possible by implicitly making use of the matrix $A$. |
Josh Alman; Zhao Song; | nips | 2023-10-24 |
1447 | From Parameter-Efficient to Memory-Efficient Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we first investigate what makes existing PEFT methods successful, and find that it is essential to preserve the PLM’s starting point when initializing a PEFT method. Building on this finding, we propose memory-efficient fine-tuning (MEFT), which inserts adapters into a PLM, preserving the PLM’s starting point and making it reversible without additional pre-training. |
Baohao Liao; Shaomu Tan; Christof Monz; | nips | 2023-10-24 |
1448 | Dissecting In-Context Learning of Translations in GPTs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we try to better understand the role of demonstration attributes for the in-context learning of translations through perturbations of high-quality, in-domain demonstrations. |
Vikas Raunak; Hany Hassan Awadalla; Arul Menezes; | arxiv-cs.CL | 2023-10-24 |
1449 | An LLM-based Framework for Fingerprinting Internet-connected Devices Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, we propose the use of large language models (LLMs) for characterizing, clustering, and fingerprinting raw text obtained from network measurements. To this end, we … |
Armin Sarabi; Tongxin Yin; Mingyan Liu; | Proceedings of the 2023 ACM on Internet Measurement … | 2023-10-24 |
1450 | Towards Efficient Pre-Trained Language Model Via Feature Correlation Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, our analysis indicates that the different relations within self-attention, as adopted in other works, involve more computational complexity and can easily be constrained by the number of heads, potentially leading to suboptimal solutions. To address these issues, we propose a novel approach that builds relationships directly from output features. |
Kun Huang; Xin Guo; Meng Wang; | nips | 2023-10-24 |
1451 | ZipLM: Inference-Aware Structured Pruning of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The breakthrough performance of large language models (LLMs) comes with major computational footprints and high deployment costs. In this paper, we progress towards resolving this problem by proposing a novel structured compression approach for LLMs, called ZipLM. |
Eldar Kurtić; Elias Frantar; Dan Alistarh; | nips | 2023-10-24 |
1452 | TART: A Plug-and-play Transformer Module for Task-agnostic Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This raises an intriguing question: Are LLMs actually capable of learning how to reason in a task-agnostic manner? We answer this in the affirmative and, as a proof of concept, propose TART, which generically improves an LLM’s reasoning abilities using a synthetically trained reasoning module. |
Kush Bhatia; Avanika Narayan; Christopher De Sa; Christopher Ré; | nips | 2023-10-24 |
1453 | LoTR: Logic-Guided Transformer Reasoner for Human-Object Interaction Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present LoTR: Logic-Guided Transformer Reasoner, a novel approach for HOI detection that leverages Transformer as the reasoner to infer feasible interactions between entities. |
Liulei Li; Jianan Wei; Wenguan Wang; Yi Yang; | nips | 2023-10-24 |
1454 | Global-correlated 3D-decoupling Transformer for Clothed Avatar Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Current methods exhibit limitations in performance, largely attributable to their dependence on insufficient 2D image features and inconsistent query methods. To address this, we present the Global-correlated 3D-decoupling Transformer for clothed Avatar reconstruction (GTA), a novel transformer-based architecture that reconstructs clothed human avatars from monocular images. |
Zechuan Zhang; Li Sun; Zongxin Yang; Ling Chen; Yi Yang; | nips | 2023-10-24 |
1455 | Event Stream GPT: A Data Pre-processing and Modeling Library for Generative, Pre-trained Transformers Over Continuous-time Sequences of Complex Events Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite their potential, the adoption of foundation models in these domains has been hampered by the lack of suitable tools for model construction and evaluation. To bridge this gap, we introduce Event Stream GPT (ESGPT), an open-source library designed to streamline the end-to-end process for building GPTs for continuous-time event sequences. |
Matthew McDermott; Bret Nestor; Peniel Argaw; Isaac S Kohane; | nips | 2023-10-24 |
1456 | Visual Instruction Tuning IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first attempt to use language-only GPT-4 to generate multimodal language-image instruction-following data. By instruction tuning on such generated data, we introduce LLaVA: Large Language and Vision Assistant, an end-to-end trained large multimodal model that connects a vision encoder and an LLM for general-purpose visual and language understanding. |
Haotian Liu; Chunyuan Li; Qingyang Wu; Yong Jae Lee; | nips | 2023-10-24 |
1457 | NAR-Former V2: Rethinking Transformer for Universal Neural Network Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, graph neural network (GNN) based approaches still dominate the field of learning representation for the entire network. In this paper, we revisit the Transformer and compare it with GNNs to analyse their different architectural characteristics. |
Yun Yi; Haokui Zhang; Rong Xiao; Nannan Wang; Xiaoyu Wang; | nips | 2023-10-24 |
1458 | Unlimiformer: Long-Range Transformers with Unlimited Length Input IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Unlimiformer: a general approach that wraps any existing pretrained encoder-decoder transformer, and offloads the cross-attention computation to a single k-nearest-neighbor (kNN) index, while the returned kNN distances are the attention dot-product scores. |
Amanda Bertsch; Uri Alon; Graham Neubig; Matthew Gormley; | nips | 2023-10-24 |
1459 | DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this work proposes a comprehensive trustworthiness evaluation for large language models with a focus on GPT-4 and GPT-3.5, considering diverse perspectives – including toxicity, stereotype bias, adversarial robustness, out-of-distribution robustness, robustness on adversarial demonstrations, privacy, machine ethics, and fairness. |
BOXIN WANG et. al. | nips | 2023-10-24 |
1460 | A Language Model with Limited Memory Capacity Captures Interference in Human Sentence Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the present work, we develop a recurrent neural language model with a single self-attention head, which more closely parallels the memory system assumed by cognitive theories. |
William Timkey; Tal Linzen; | arxiv-cs.CL | 2023-10-24 |
1461 | De Novo Drug Design Using Reinforcement Learning with Multiple GPT Agents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Although technologies such as transformer models and reinforcement learning have been applied in drug design, their potential has not been fully realized. Therefore, we propose MolRL-MGPT, a reinforcement learning algorithm with multiple GPT agents for drug molecular generation. |
Xiuyuan Hu; Hao Zhang; Yang Zhao; | nips | 2023-10-24 |
1462 | Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we explore Monarch Mixer (M2), a new architecture that uses the same sub-quadratic primitive along both sequence length and model dimension. |
DAN FU et. al. | nips | 2023-10-24 |
1463 | Evaluating Cognitive Maps in Large Language Models: No Emergent Planning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we make two major contributions. First, we propose CogEval, a Cognitive Science-Inspired protocol for Measurement and Evaluation for Large Language Models. Second, we use CogEval to systematically evaluate hypothesized latent abilities, cognitive maps and planning, across a number of LLMs (OpenAI GPT-4, GPT-3.5, and davinci-003, Anthropic Claude-1, Alpaca-7B, LLaMA-7B, and Bard) using tasks with established construct validity and absent from LLM training sets. |
IDA MOMENNEJAD et. al. | nips | 2023-10-24 |
1464 | On Efficient Training Algorithms For Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we revisit three algorithms: layer stacking, layer dropping, and selective backpropagation. |
Jean Kaddour; Oscar Key; Piotr Nawrot; Pasquale Minervini; Matt Kusner; | nips | 2023-10-24 |
1465 | Scaling Laws for Language Encoding Models in FMRI IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here we tested whether larger open-source models such as those from the OPT and LLaMA families are better at predicting brain responses recorded using fMRI. |
Richard Antonello; Aditya Vaidya; Alexander Huth; | nips | 2023-10-24 |
1466 | The Cambridge Law Corpus: A Corpus for Legal AI Research Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the Cambridge Law Corpus (CLC), a corpus for legal AI research. |
ANDREAS ÖSTLING et. al. | nips | 2023-10-24 |
1467 | Using GPT-4 to Augment Unbalanced Data for Automatic Scoring IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Machine learning-based automatic scoring can be challenging if students’ responses are unbalanced across scoring categories, as it introduces uncertainty in the machine training process. To meet this challenge, we introduce a novel text data augmentation framework using GPT-4, a generative large language model, specifically tailored for unbalanced datasets in automatic scoring. |
Luyang Fang; Gyeong-Geon Lee; Xiaoming Zhai; | arxiv-cs.CL | 2023-10-24 |
1468 | Is ChatGPT A Good Multi-Party Conversation Solver? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we delve into the potential of generative LLMs such as ChatGPT and GPT-4 within the context of MPCs. |
Chao-Hong Tan; Jia-Chen Gu; Zhen-Hua Ling; | arxiv-cs.CL | 2023-10-24 |
1469 | Self-Refine: Iterative Refinement with Self-Feedback IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement. |
AMAN MADAAN et. al. | nips | 2023-10-24 |
1470 | CAPP-130: A Dataset of Chinese Application Privacy Policy Summarization and Interpretations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, research on Chinese application privacy policy summarization is currently almost nonexistent, and there is a lack of a high-quality corpus suitable for addressing readability issues. To tackle these challenges, we introduce a fine-grained CAPP-130 corpus and a TCSI-pp framework. |
JINFEI LIU et. al. | nips | 2023-10-24 |
1471 | AI-enhanced Auto-correction of Programming Exercises: How Effective Is GPT-3.5? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Timely formative feedback is considered one of the most important drivers of effective learning. Delivering timely and individualized feedback is particularly challenging in … |
Imen Azaiz; Oliver Deckarm; Sven Strickroth; | Int. J. Eng. Pedagog. | 2023-10-24 |
1472 | Block-State Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a hybrid layer named Block-State Transformer (*BST*), that internally combines an SSM sublayer for long-range contextualization, and a Block Transformer sublayer for short-term representation of sequences. |
JONATHAN PILAULT et. al. | nips | 2023-10-24 |
1473 | Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose GRACE, a Lifelong Model Editing method, which implements spot-fixes on streaming errors of a deployed model, ensuring minimal impact on unrelated inputs. |
Thomas Hartvigsen; Swami Sankaranarayanan; Hamid Palangi; Yoon Kim; Marzyeh Ghassemi; | nips | 2023-10-24 |
1474 | PointGPT: Auto-regressively Generative Pre-training from Point Clouds IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the advancements of the GPT, we present PointGPT, a novel approach that extends the concept of GPT to point clouds, addressing the challenges associated with disorder properties, low information density, and task gaps. |
GUANGYAN CHEN et. al. | nips | 2023-10-24 |
1475 | Mathematical Capabilities of ChatGPT IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We investigate the mathematical capabilities of two iterations of ChatGPT (released 9-January-2023 and 30-January-2023) and of GPT-4 by testing them on publicly available datasets, as well as hand-crafted ones, using a novel methodology. |
SIMON FRIEDER et. al. | nips | 2023-10-24 |
1476 | Transformer-based Planning for Symbolic Regression IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these models primarily rely on supervised pretraining goals borrowed from text generation and overlook equation-specific objectives like accuracy and complexity. To address this, we propose TPSR, a Transformer-based Planning strategy for Symbolic Regression that incorporates Monte Carlo Tree Search into the transformer decoding process. |
Parshin Shojaee; Kazem Meidani; Amir Barati Farimani; Chandan Reddy; | nips | 2023-10-24 |
1477 | What Indeed Can GPT Models Do in Chemistry? A Comprehensive Benchmark on Eight Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, rather than pursuing state-of-the-art performance, we aim to evaluate the capabilities of LLMs in a wide range of tasks across the chemistry domain. |
TAICHENG GUO et. al. | nips | 2023-10-24 |
1478 | Making Scalable Meta Learning Practical Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on making scalable meta learning practical by introducing SAMA, which combines advances in both implicit differentiation algorithms and systems. |
SANG CHOE et. al. | nips | 2023-10-24 |
1479 | Understanding Code Semantics: An Evaluation of Transformer Models in Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Ultimately, our research aims to offer valuable insights into the inner workings of transformer-based LMs, enhancing their ability to understand code and contributing to more efficient software development practices and maintenance workflows. |
Debanjan Mondal; Abhilasha Lodha; Ankita Sahoo; Beena Kumari; | arxiv-cs.LG | 2023-10-24 |
1480 | H3T: Efficient Integration of Memory Optimization and Parallelism for Large-scale Transformer Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a framework to automatically find an efficient integration of memory optimization and parallelism for High-Throughput Transformer Training (named H3T), which is rarely considered by existing efforts for training big Transformer-based models. |
YUZHONG WANG et. al. | nips | 2023-10-24 |
1481 | Transformers As Statisticians: Provable In-Context Learning with In-Context Algorithm Selection IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We establish this in theory by explicit constructions, and also observe this phenomenon experimentally. In theory, we construct two general mechanisms for algorithm selection with concrete examples: (1) Pre-ICL testing, where the transformer determines the right task for the given sequence (such as choosing between regression and classification) by examining certain summary statistics of the input sequence; (2) Post-ICL validation, where the transformer selects—among multiple base ICL algorithms (such as ridge regression with multiple regularization strengths)—a near-optimal one for the given sequence using a train-validation split. |
Yu Bai; Fan Chen; Huan Wang; Caiming Xiong; Song Mei; | nips | 2023-10-24 |
1482 | Likelihood-Based Diffusion Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we take the first steps towards closing the perplexity gap between autoregressive and diffusion-based language models, with the goal of building and releasing a diffusion model which outperforms the smallest widely-adopted autoregressive model (GPT-2 124M). |
Ishaan Gulrajani; Tatsunori Hashimoto; | nips | 2023-10-24 |
1483 | RapidBERT: Pretraining BERT from Scratch for $20 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we introduce RapidBERT, a BERT-style encoder architecture and training recipe that is empirically optimized for fast pretraining. |
JACOB PORTES et. al. | nips | 2023-10-24 |
1484 | Spike-driven Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we incorporate the spike-driven paradigm into Transformer by the proposed Spike-driven Transformer with four unique properties: (1) Event-driven, no calculation is triggered when the input of Transformer is zero; (2) Binary spike communication, all matrix multiplications associated with the spike matrix can be transformed into sparse additions; (3) Self-attention with linear complexity at both token and channel dimensions; (4) The operations between spike-form Query, Key, and Value are mask and addition. |
MAN YAO et. al. | nips | 2023-10-24 |
1485 | SwiFT: Swin 4D FMRI Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The modeling of spatiotemporal brain dynamics from high-dimensional data, such as 4D functional MRI, is a formidable task in neuroscience. To address this challenge, we present SwiFT (Swin 4D fMRI Transformer), a Swin Transformer architecture that can learn brain dynamics directly from 4D functional brain MRI data in a memory and computation-efficient manner. |
PETER KIM et. al. | nips | 2023-10-24 |
1486 | Neural Data Transformer 2: Multi-context Pretraining for Neural Spiking Activity Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We thus develop Neural Data Transformer 2 (NDT2), a spatiotemporal Transformer for neural spiking activity, and demonstrate pretraining can leverage motor BCI datasets that span sessions, subjects, and experimental tasks. |
Joel Ye; Jennifer Collinger; Leila Wehbe; Robert Gaunt; | nips | 2023-10-24 |
1487 | RealTime QA: What’s The Answer Right Now? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce RealTime QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis (weekly in this version). |
JUNGO KASAI et. al. | nips | 2023-10-24 |
1488 | Geometric Transformer with Interatomic Positional Encoding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, by designing Interatomic Positional Encoding (IPE), which parameterizes atomic environments as the Transformer’s positional encodings, we propose Geoformer, a novel geometric Transformer that effectively models molecular structures for various molecular property prediction tasks. |
YUSONG WANG et. al. | nips | 2023-10-24 |
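The paper's IPE parameterization is not reproduced here; as a generic sketch of the idea of injecting interatomic geometry into attention, the code below expands pairwise distances in Gaussian radial basis functions and adds the result as a bias to simplified, projection-free attention logits. The shapes, distance range, and weights are all illustrative.

```python
import numpy as np

def rbf_expand(dist, centers):
    """Expand pairwise distances in Gaussian radial basis functions."""
    return np.exp(-(dist[..., None] - centers) ** 2)

def attention_with_pair_bias(h, pos, w_bias, scale=1.0):
    """Self-attention over atoms with an additive pairwise distance bias.

    h: (N, C) atom features; pos: (N, 3) coordinates;
    w_bias: (K,) weights mapping K RBF features to a scalar bias per pair.
    Q/K/V projections are omitted for brevity (Q = K = V = h).
    """
    dist = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)  # (N, N)
    centers = np.linspace(0.0, 5.0, w_bias.shape[0])             # Angstrom grid
    bias = rbf_expand(dist, centers) @ w_bias                    # (N, N)
    logits = (h @ h.T) * scale + bias
    attn = np.exp(logits - logits.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)
    return attn @ h

rng = np.random.default_rng(0)
out = attention_with_pair_bias(rng.normal(size=(5, 8)),
                               rng.normal(size=(5, 3)),
                               rng.normal(size=(16,)))
print(out.shape)  # (5, 8)
```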
1489 | 3M-TRANSFORMER: A Multi-Stage Multi-Stream Multimodal Transformer for Embodied Turn-Taking Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building on this research, we propose a new multimodal transformer-based architecture for predicting turn-taking in embodied, synchronized multi-perspective data. |
Mehdi Fatan; Emanuele Mincato; Dimitra Pintzou; Mariella Dimiccoli; | arxiv-cs.CV | 2023-10-23 |
1490 | Design of A Modified Transformer Architecture Based on Relative Position Coding IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
WENFENG ZHENG et. al. | International Journal of Computational Intelligence Systems | 2023-10-23 |
1491 | GPT-3-Powered Type Error Debugging: Investigating The Use of Large Language Models for Code Repair Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Type systems are responsible for assigning types to terms in programs. That way, they enforce the actions that can be taken and can, consequently, detect type errors during … |
Francisco Ribeiro; José Nuno Macedo; Kanae Tsushima; Rui Abreu; João Saraiva; | Proceedings of the 16th ACM SIGPLAN International … | 2023-10-23 |
1492 | LINC: A Neurosymbolic Approach for Logical Reasoning By Combining Language Models with First-Order Logic Provers IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While many prompting-based strategies have been proposed to enable Large Language Models (LLMs) to do such reasoning more effectively, they still appear unsatisfactory, often failing in subtle and unpredictable ways. In this work, we investigate the validity of instead reformulating such tasks as modular neurosymbolic programming, which we call LINC: Logical Inference via Neurosymbolic Computation. |
THEO X. OLAUSSON et. al. | arxiv-cs.CL | 2023-10-23 |
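A LINC-style pipeline has two halves: a language model translates premises and a conclusion into first-order logic, and an off-the-shelf prover checks entailment. In the sketch below, the LLM half is mocked by a hypothetical lookup function (`llm_to_fol`), and NLTK's resolution prover stands in for the first-order logic provers used in the paper.

```python
from nltk.sem import Expression
from nltk.inference import ResolutionProver

read_expr = Expression.fromstring

def llm_to_fol(sentence):
    """Hypothetical stand-in for the LLM translation step: in a LINC-style
    pipeline, a language model maps natural language to FOL formulas."""
    lookup = {
        "All humans are mortal.": "all x.(human(x) -> mortal(x))",
        "Socrates is a human.": "human(socrates)",
        "Socrates is mortal.": "mortal(socrates)",
    }
    return read_expr(lookup[sentence])

premises = [llm_to_fol("All humans are mortal."),
            llm_to_fol("Socrates is a human.")]
conclusion = llm_to_fol("Socrates is mortal.")

# The symbolic half: an off-the-shelf prover checks whether the
# premises entail the conclusion.
print(ResolutionProver().prove(conclusion, premises))  # True
```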
1493 | TRAMS: Training-free Memory Selection for Long-range Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we present a plug-and-play strategy, known as TRAining-free Memory Selection (TRAMS), that selects tokens participating in attention calculation based on one simple metric. |
Haofei Yu; Cunxiang Wang; Yue Zhang; Wei Bi; | arxiv-cs.CL | 2023-10-23 |
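The highlight leaves the metric unspecified, so the sketch below uses the L2 norm of each cached key as a placeholder importance score: rank the memory tokens once, keep the top-k, and attend only over the kept ones. This reproduces the general shape of training-free memory selection, not TRAMS's exact metric.

```python
import numpy as np

def select_memory(mem_keys, mem_values, k):
    """Training-free memory selection: rank cached tokens by a simple,
    query-independent metric and keep the top-k. The L2 key norm used
    here is a placeholder score, not necessarily TRAMS's metric."""
    scores = np.linalg.norm(mem_keys, axis=-1)
    keep = np.argsort(scores)[-k:]
    return mem_keys[keep], mem_values[keep]

def attend(q, keys, values):
    logits = keys @ q / np.sqrt(q.shape[0])
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ values

rng = np.random.default_rng(0)
mem_k = rng.normal(size=(1024, 64))   # long-range cached keys
mem_v = rng.normal(size=(1024, 64))
k_sel, v_sel = select_memory(mem_k, mem_v, k=128)
out = attend(rng.normal(size=64), k_sel, v_sel)
print(out.shape)  # (64,)
```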
1494 | Exploring The Boundaries of GPT-4 in Radiology IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on assessing the performance of GPT-4, the most capable LLM so far, on the text-based applications for radiology reports, comparing against state-of-the-art (SOTA) radiology-specific models. |
QIANCHU LIU et. al. | arxiv-cs.CL | 2023-10-23 |
1495 | GPT-4 As An Effective Zero-Shot Evaluator for Scientific Figure Captions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates using large language models (LLMs) as a cost-effective, reference-free method for evaluating figure captions. |
TING-YAO HSU et. al. | arxiv-cs.CL | 2023-10-23 |
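The general shape of such a reference-free, LLM-as-judge evaluation is a single scoring prompt per caption. The sketch below uses the OpenAI v1 Python client; the model name, rubric, and output format are placeholders rather than the paper's actual prompt.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def score_caption(figure_mentions, caption, model="gpt-4"):
    """Reference-free caption scoring in the LLM-as-judge style.
    The rubric and 1-5 scale are illustrative placeholders."""
    prompt = (
        "You are grading a scientific figure caption.\n"
        f"Text mentioning the figure:\n{figure_mentions}\n\n"
        f"Caption:\n{caption}\n\n"
        "Rate how helpful the caption is on a 1-5 scale. "
        "Reply with the number only."
    )
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return int(resp.choices[0].message.content.strip())

print(score_caption("Figure 3 shows accuracy rising with model size.",
                    "Accuracy vs. model size on the test set."))
```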
1496 | Evaluating The Knowledge Base Completion Potential of GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we perform a careful evaluation of GPT’s potential to complete the largest public KB: Wikidata. |
Blerta Veseli; Simon Razniewski; Jan-Christoph Kalo; Gerhard Weikum; | arxiv-cs.CL | 2023-10-23 |
1497 | Generative Pre-trained Transformer for Vietnamese Community-based COVID-19 Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel approach, conducting a comparative analysis of different Transformer models against SOTA models on a community-based COVID-19 question answering dataset. |
Tam Minh Vo; Khiem Vinh Tran; | arxiv-cs.CL | 2023-10-23 |
1498 | InstructExcel: A Benchmark for Natural Language Instruction in Excel Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To do so, we introduce a new large-scale benchmark, InstructExcel, created by leveraging the ‘Automate’ feature in Excel to automatically generate OfficeScripts from users’ actions. |
JUSTIN PAYAN et. al. | arxiv-cs.CL | 2023-10-22 |
1499 | Towards Harmful Erotic Content Detection Through Coreference-Driven Contextual Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a hybrid neural and rule-based context-aware system that leverages coreference resolution to identify harmful contextual cues in erotic content. |
Inez Okulska; Emilia Wiśnios; | arxiv-cs.CL | 2023-10-22 |
1500 | Attention-Enhancing Backdoor Attacks Against BERT-based Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we directly target the interior structure of neural networks and the backdoor mechanism. |
Weimin Lyu; Songzhu Zheng; Lu Pang; Haibin Ling; Chao Chen; | arxiv-cs.LG | 2023-10-22 |
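One plausible reading of an attention-enhancing backdoor is an auxiliary loss, applied while poisoning, that rewards attention heads for concentrating mass on trigger tokens. The PyTorch sketch below illustrates that idea only; it is not claimed to be the paper's actual objective.

```python
import torch

def trigger_attention_loss(attn, trigger_mask):
    """Auxiliary loss rewarding heads for concentrating attention on
    trigger positions. attn: (batch, heads, seq, seq) attention maps;
    trigger_mask: (batch, seq) with 1 at trigger token positions.
    An illustrative reading of 'attention-enhancing', not necessarily
    the paper's exact objective."""
    # Attention mass each query sends to trigger positions.
    mass = (attn * trigger_mask[:, None, None, :]).sum(dim=-1)  # (B, H, S)
    return -mass.mean()  # minimizing this maximizes trigger-directed attention

attn = torch.softmax(torch.randn(2, 12, 16, 16), dim=-1)
mask = torch.zeros(2, 16)
mask[:, 3] = 1.0  # token 3 is the (hypothetical) trigger
print(trigger_attention_loss(attn, mask).item())
```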