📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,539 articles · Updated every 3 hours

arXiv:2604.08601v1 Announce Type: new Abstract: The rise of autonomous AI agents exposes a fundamental flaw in API-centric architectures: probabilistic systems

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 16h ago

From Business Events to Auditable Decisions: Ontology-Governed Graph Simulation for Enterprise AI

arXiv:2604.08603v1 Announce Type: new Abstract: Existing LLM-based agent systems share a common architectural failure: they answer from the unrestricted knowled

ArXiv cs.AI 📣 Digital Marketing & Growth 📄 Paper ⚡ AI Lesson 16h ago

Sustained Impact of Agentic Personalisation in Marketing: A Longitudinal Case Study

arXiv:2604.08621v1 Announce Type: new Abstract: In consumer applications, Customer Relationship Management (CRM) has traditionally relied on the manual optimisa

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 16h ago

RAMP: Hybrid DRL for Online Learning of Numeric Action Models

arXiv:2604.08685v1 Announce Type: new Abstract: Automated planning algorithms require an action model specifying the preconditions and effects of each action, b

ArXiv cs.AI 📐 ML Fundamentals 📄 Paper ⚡ AI Lesson 16h ago

Parameterized Complexity Of Representing Models Of MSO Formulas

arXiv:2604.08707v1 Announce Type: new Abstract: Monadic second order logic (MSO2) plays an important role in parameterized complexity due to the Courcelle's the

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 16h ago

Model Space Reasoning as Search in Feedback Space for Planning Domain Generation

arXiv:2604.08712v1 Announce Type: new Abstract: The generation of planning domains from natural language descriptions remains an open problem even with the adve

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 16h ago

Artifacts as Memory Beyond the Agent Boundary

arXiv:2604.08756v1 Announce Type: new Abstract: The situated view of cognition holds that intelligent behavior depends not only on internal memory, but on an ag

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 16h ago

Hidden in Plain Sight: Visual-to-Symbolic Analytical Solution Inference from Field Visualizations

arXiv:2604.08863v1 Announce Type: new Abstract: Recovering analytical solutions of physical fields from visual observations is a fundamental yet underexplored c

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 16h ago

SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks

arXiv:2604.08865v1 Announce Type: new Abstract: Proximal Policy Optimization (PPO) is central to aligning Large Language Models (LLMs) in reasoning tasks with v

ArXiv cs.AI 📄 Paper 16h ago

StaRPO: Stability-Augmented Reinforcement Policy Optimization

arXiv:2604.08905v1 Announce Type: new Abstract: Reinforcement learning (RL) is effective in enhancing the accuracy of large language models in complex reasoning

ArXiv cs.AI 📄 Paper 16h ago

Enhancing LLM Problem Solving via Tutor-Student Multi-Agent Interaction

arXiv:2604.08931v1 Announce Type: new Abstract: Human cognitive development is shaped not only by individual effort but by structured social interaction, where

ArXiv cs.AI 📄 Paper 16h ago

PilotBench: A Benchmark for General Aviation Agents with Safety Constraints

arXiv:2604.08987v1 Announce Type: new Abstract: As Large Language Models (LLMs) advance toward embodied AI agents operating in physical environments, a fundamen

ArXiv cs.AI 📄 Paper 16h ago

SEA-Eval: A Benchmark for Evaluating Self-Evolving Agents Beyond Episodic Assessment

arXiv:2604.08988v1 Announce Type: new Abstract: Current LLM-based agents demonstrate strong performance in episodic task execution but remain constrained by sta

ArXiv cs.AI 📄 Paper 16h ago

Hypergraph Neural Networks Accelerate MUS Enumeration

arXiv:2604.09001v1 Announce Type: new Abstract: Enumerating Minimal Unsatisfiable Subsets (MUSes) is a fundamental task in constraint satisfaction problems (CSP

ArXiv cs.AI 📄 Paper 16h ago

Advantage-Guided Diffusion for Model-Based Reinforcement Learning

arXiv:2604.09035v1 Announce Type: new Abstract: Model-based reinforcement learning (MBRL) with autoregressive world models suffers from compounding errors, wher

ArXiv cs.AI 📄 Paper 16h ago

Overhang Tower: Resource-Rational Adaptation in Sequential Physical Planning

arXiv:2604.09072v1 Announce Type: new Abstract: Humans effortlessly navigate the physical world by predicting how objects behave under gravity and contact force

ArXiv cs.AI 📄 Paper 16h ago

Camera Artist: A Multi-Agent Framework for Cinematic Language Storytelling Video Generation

arXiv:2604.09195v1 Announce Type: new Abstract: We propose Camera Artist, a multi-agent framework that models a real-world filmmaking workflow to generate narra

ArXiv cs.AI 📄 Paper 16h ago

DRBENCHER: Can Your Agent Identify the Entity, Retrieve Its Properties and Do the Math?

arXiv:2604.09251v1 Announce Type: new Abstract: Deep research agents increasingly interleave web browsing with multi-step computation, yet existing benchmarks e

ArXiv cs.AI 📄 Paper 16h ago

SAGE: A Service Agent Graph-guided Evaluation Benchmark

arXiv:2604.09285v1 Announce Type: new Abstract: The development of Large Language Models (LLMs) has catalyzed automation in customer service, yet benchmarking t

ArXiv cs.AI 📄 Paper 16h ago

Constraint-Aware Corrective Memory for Language-Based Drug Discovery Agents

arXiv:2604.09308v1 Announce Type: new Abstract: Large language models are making autonomous drug discovery agents increasingly feasible, but reliable success in

ArXiv cs.AI 📄 Paper 16h ago

Mind the Gap Between Spatial Reasoning and Acting! Step-by-Step Evaluation of Agents With Spatial-Gym

arXiv:2604.09338v1 Announce Type: new Abstract: Spatial reasoning is central to navigation and robotics, yet measuring model capabilities on these tasks remains

ArXiv cs.AI 📄 Paper 16h ago

HiL-Bench (Human-in-Loop Benchmark): Do Agents Know When to Ask for Help?

arXiv:2604.09408v1 Announce Type: new Abstract: Frontier coding agents solve complex tasks when given complete context but collapse when specifications are inco

ArXiv cs.AI 📄 Paper 16h ago

Do We Really Need to Approach the Entire Pareto Front in Many-Objective Bayesian Optimisation?

arXiv:2604.09417v1 Announce Type: new Abstract: Many-objective optimisation, a subset of multi-objective optimisation, involves optimisation problems with more

ArXiv cs.AI 📄 Paper 16h ago

E3-TIR: Enhanced Experience Exploitation for Tool-Integrated Reasoning

arXiv:2604.09455v1 Announce Type: new Abstract: While Large Language Models (LLMs) have demonstrated significant potential in Tool-Integrated Reasoning (TIR), e