📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,169 articles · Updated every 3 hours

arXiv:2512.02413v3 Announce Type: replace-cross Abstract: Automatic 3D reconstruction of indoor spaces from 2D floor plans necessitates high-precision semantic

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Lumos: Let there be Language Model System Certification

arXiv:2512.02966v2 Announce Type: replace-cross Abstract: We introduce the first principled framework, Lumos, for specifying and formally certifying Language Mo

ArXiv cs.AI 💻 AI-Assisted Coding 📄 Paper ⚡ AI Lesson 1w ago

Geometric-Photometric Event-based 3D Gaussian Ray Tracing

arXiv:2512.18640v2 Announce Type: replace-cross Abstract: Event cameras offer a high temporal resolution over traditional frame-based cameras, which makes them

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Bypassing Prompt Injection Detectors through Evasive Injections

arXiv:2602.00750v2 Announce Type: replace-cross Abstract: Large language models (LLMs) are increasingly used in interactive and retrieval-augmented systems, but

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

On the Non-Identifiability of Steering Vectors in Large Language Models

arXiv:2602.06801v4 Announce Type: replace-cross Abstract: Activation steering methods are widely used to control large language model (LLM) behavior and are oft

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

FIRE: Frobenius-Isometry Reinitialization for Balancing the Stability-Plasticity Tradeoff

arXiv:2602.08040v3 Announce Type: replace-cross Abstract: Deep neural networks trained on nonstationary data must balance stability (i.e., retaining prior knowl

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Evaluating LLM-Generated ACSL Annotations for Formal Verification

arXiv:2602.13851v2 Announce Type: replace-cross Abstract: Formal specifications are crucial for building verifiable and dependable software systems, yet generat

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

CoCoDiff: Correspondence-Consistent Diffusion Model for Fine-grained Style Transfer

arXiv:2602.14464v2 Announce Type: replace-cross Abstract: Transferring visual style between images while preserving semantic correspondence between similar obje

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Chat-Based Support Alone May Not Be Enough: Comparing Conversational and Embedded LLM Feedback for Mathematical Proof Learning

arXiv:2602.18807v2 Announce Type: replace-cross Abstract: We evaluate GPTutor, an LLM-powered tutoring system for an undergraduate discrete mathematics course.

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

TaCarla: A comprehensive benchmarking dataset for end-to-end autonomous driving

arXiv:2602.23499v2 Announce Type: replace-cross Abstract: Collecting a high-quality dataset is a critical task that demands meticulous attention to detail, as o

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration

arXiv:2603.03823v4 Announce Type: replace-cross Abstract: Large language model (LLM)-powered agents have demonstrated strong capabilities in automating software

ArXiv cs.AI 📐 ML Fundamentals 📄 Paper ⚡ AI Lesson 1w ago

Mousse: Rectifying the Geometry of Muon with Curvature-Aware Preconditioning

arXiv:2603.09697v2 Announce Type: replace-cross Abstract: Recent advances in spectral optimization, notably Muon, have demonstrated that constraining update ste

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

RoboClaw: An Agentic Framework for Scalable Long-Horizon Robotic Tasks

arXiv:2603.11558v3 Announce Type: replace-cross Abstract: Vision-Language-Action (VLA) systems have shown strong potential for language-driven robotic manipulat

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

CHIMERA-Bench: A Benchmark Dataset for Epitope-Specific Antibody Design

arXiv:2603.13431v2 Announce Type: replace-cross Abstract: Computational antibody design has seen rapid methodological progress, with dozens of deep generative m

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

OPERA: Online Data Pruning for Efficient Retrieval Model Adaptation

arXiv:2603.17205v2 Announce Type: replace-cross Abstract: Domain-specific finetuning is essential for dense retrievers, yet not all training pairs contribute eq

ArXiv cs.AI 📐 ML Fundamentals 📄 Paper ⚡ AI Lesson 1w ago

SA-CycleGAN-2.5D: Self-Attention CycleGAN with Tri-Planar Context for Multi-Site MRI Harmonization

arXiv:2603.17219v2 Announce Type: replace-cross Abstract: Multi-site neuroimaging analysis is fundamentally confounded by scanner-induced covariate shifts, wher

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 1w ago

The data heat island effect: quantifying the impact of AI data centers in a warming world

arXiv:2603.20897v2 Announce Type: replace-cross Abstract: The strong and continuous increase of AI-based services leads to the steady proliferation of AI data c

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

ChartDiff: A Large-Scale Benchmark for Comprehending Pairs of Charts

arXiv:2603.28902v1 Announce Type: new Abstract: Charts are central to analytical reasoning, yet existing benchmarks for chart understanding focus almost exclusi

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

Working Paper: Towards a Category-theoretic Comparative Framework for Artificial General Intelligence

arXiv:2603.28906v1 Announce Type: new Abstract: AGI has become the Holy Grail of AI with the promise of level intelligence and the major Tech companies around

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

Towards Computational Social Dynamics of Semi-Autonomous AI Agents

arXiv:2603.28928v1 Announce Type: new Abstract: We present the first comprehensive study of emergent social organization among AI agents in hierarchical multi-a

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Enhancing Policy Learning with World-Action Model

arXiv:2603.28955v1 Announce Type: new Abstract: This paper presents the World-Action Model (WAM), an action-regularized world model that jointly reasons over fu

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

Mimosa Framework: Toward Evolving Multi-Agent Systems for Scientific Research

arXiv:2603.28986v1 Announce Type: new Abstract: Current Autonomous Scientific Research (ASR) systems, despite leveraging large language models (LLMs) and agenti

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Drop the Hierarchy and Roles: How Self-Organizing LLM Agents Outperform Designed Structures

arXiv:2603.28990v1 Announce Type: new Abstract: How much autonomy can multi-agent LLM systems sustain -- and what enables it? We present a 25,000-task computati

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

Emergence WebVoyager: Toward Consistent and Transparent Evaluation of (Web) Agents in The Wild

arXiv:2603.29020v1 Announce Type: new Abstract: Reliable evaluation of AI agents operating in complex, real-world environments requires methodologies that are r