📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,344 articles · Updated every 3 hours

arXiv:2603.24721v1 Announce Type: cross Abstract: Spatial reasoning focuses on locating target objects based on spatial relations in 3D scenes, which plays a cr

ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 1mo ago

Is Geometry Enough? An Evaluation of Landmark-Based Gaze Estimation

arXiv:2603.24724v1 Announce Type: cross Abstract: Appearance-based gaze estimation frequently relies on deep Convolutional Neural Networks (CNNs). These models

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Decentralized Task Scheduling in Distributed Systems: A Deep Reinforcement Learning Approach

arXiv:2603.24738v1 Announce Type: cross Abstract: Efficient task scheduling in large-scale distributed systems presents significant challenges due to dynamic wo

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Grokking as a Falsifiable Finite-Size Transition

arXiv:2603.24746v1 Announce Type: cross Abstract: Grokking -- the delayed onset of generalization after early memorization -- is often described with phase-tran

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1mo ago

Pseudo Label NCF for Sparse OHC Recommendation: Dual Representation Learning and the Separability Accuracy Trade off

arXiv:2603.24750v1 Announce Type: cross Abstract: Online Health Communities connect patients for peer support, but users face a discovery challenge when they ha

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1mo ago

SlopCodeBench: Benchmarking How Coding Agents Degrade Over Long-Horizon Iterative Tasks

arXiv:2603.24755v1 Announce Type: cross Abstract: Software development is iterative, yet agentic coding benchmarks overwhelmingly evaluate single-shot solutions

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Evaluating Fine-Tuned LLM Model For Medical Transcription With Small Low-Resource Languages Validated Dataset

arXiv:2603.24772v1 Announce Type: cross Abstract: Clinical documentation is a critical factor for patient safety, diagnosis, and continuity of care. The adminis

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

From Untestable to Testable: Metamorphic Testing in the Age of LLMs

arXiv:2603.24774v1 Announce Type: cross Abstract: This article discusses the challenges of testing software systems with increasingly integrated AI and LLM func

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1mo ago

AIP: Agent Identity Protocol for Verifiable Delegation Across MCP and A2A

arXiv:2603.24775v1 Announce Type: cross Abstract: AI agents increasingly call tools via the Model Context Protocol (MCP) and delegate to other agents via Agent-

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Dissecting Model Failures in Abdominal Aortic Aneurysm Segmentation through Explainability-Driven Analysis

arXiv:2603.24801v1 Announce Type: cross Abstract: Computed tomography image segmentation of complex abdominal aortic aneurysms (AAA) often fails because the mod

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

GoldiCLIP: The Goldilocks Approach for Balancing Explicit Supervision for Language-Image Pretraining

arXiv:2603.24804v1 Announce Type: cross Abstract: Until recently, the success of large-scale vision-language models (VLMs) has primarily relied on billion-sampl

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

FODMP: Fast One-Step Diffusion of Movement Primitives Generation for Time-Dependent Robot Actions

arXiv:2603.24806v1 Announce Type: cross Abstract: Diffusion models are increasingly used for robot learning, but current designs face a clear trade-off. Action-

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Generative Adversarial Perturbations with Cross-paradigm Transferability on Localized Crowd Counting

arXiv:2603.24821v1 Announce Type: cross Abstract: State-of-the-art crowd counting and localization are primarily modeled using two paradigms: density maps and p

ArXiv cs.AI 🏭 MLOps & LLMOps 📄 Paper ⚡ AI Lesson 1mo ago

Learning From Developers: Towards Reliable Patch Validation at Scale for Linux

arXiv:2603.24825v1 Announce Type: cross Abstract: Patch reviewing is critical for software development, especially in distributed open-source development, which

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1mo ago

A Practical Guide Towards Interpreting Time-Series Deep Clinical Predictive Models: A Reproducibility Study

arXiv:2603.24828v1 Announce Type: cross Abstract: Clinical decisions are high-stakes and require explicit justification, making model interpretability essential

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Reaching Beyond the Mode: RL for Distributional Reasoning in Language Models

arXiv:2603.24844v1 Announce Type: cross Abstract: Given a question, a language model (LM) implicitly encodes a distribution over possible answers. In practice,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

NeuroVLM-Bench: Evaluation of Vision-Enabled Large Language Models for Clinical Reasoning in Neurological Disorders

arXiv:2603.24846v1 Announce Type: cross Abstract: Recent advances in multimodal large language models enable new possibilities for image-based decision support.

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1mo ago

Gaze patterns predict preference and confidence in pairwise AI image evaluation

arXiv:2603.24849v1 Announce Type: cross Abstract: Preference learning methods, such as Reinforcement Learning from Human Feedback (RLHF) and Direct Preference O

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

AI Security in the Foundation Model Era: A Comprehensive Survey from a Unified Perspective

arXiv:2603.24857v1 Announce Type: cross Abstract: As machine learning (ML) systems expand in both scale and functionality, the security landscape has become inc

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

More Than "Means to an End": Supporting Reasoning with Transparently Designed AI Data Science Processes

arXiv:2603.24877v1 Announce Type: cross Abstract: Generative artificial intelligence (AI) tools can now help people perform complex data science tasks regardles

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Surrogates, Spikes, and Sparsity: Performance Analysis and Characterization of SNN Hyperparameters on Hardware

arXiv:2603.24891v1 Announce Type: cross Abstract: Spiking Neural Networks (SNNs) offer inherent advantages for low-power inference through sparse, event-driven

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1mo ago

LogSigma at SemEval-2026 Task 3: Uncertainty-Weighted Multitask Learning for Dimensional Aspect-Based Sentiment Analysis

arXiv:2603.24896v1 Announce Type: cross Abstract: This paper describes LogSigma, our system for SemEval-2026 Task 3: Dimensional Aspect-Based Sentiment Analysis

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Sovereign AI at the Front Door of Care: A Physically Unidirectional Architecture for Secure Clinical Intelligence

arXiv:2603.24898v1 Announce Type: cross Abstract: We present a Sovereign AI architecture for clinical triage in which all inference is performed on-device and i

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1mo ago

Integrated Multi-Drone Task Allocation, Sequencing, and Optimal Trajectory Generation in Obstacle-Rich 3D Environments

arXiv:2603.24908v1 Announce Type: cross Abstract: Coordinating teams of aerial robots in cluttered three-dimensional (3D) environments requires a principled int