📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,539 articles · Updated every 3 hours

ArXiv cs.AI 📐 ML Fundamentals 📄 Paper ⚡ AI Lesson 16h ago
Parameterized Complexity Of Representing Models Of MSO Formulas
arXiv:2604.08707v1 Announce Type: new Abstract: Monadic second order logic (MSO2) plays an important role in parameterized complexity due to the Courcelle's the
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 16h ago
Model Space Reasoning as Search in Feedback Space for Planning Domain Generation
arXiv:2604.08712v1 Announce Type: new Abstract: The generation of planning domains from natural language descriptions remains an open problem even with the adve
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 16h ago
Artifacts as Memory Beyond the Agent Boundary
arXiv:2604.08756v1 Announce Type: new Abstract: The situated view of cognition holds that intelligent behavior depends not only on internal memory, but on an ag
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 16h ago
Hidden in Plain Sight: Visual-to-Symbolic Analytical Solution Inference from Field Visualizations
arXiv:2604.08863v1 Announce Type: new Abstract: Recovering analytical solutions of physical fields from visual observations is a fundamental yet underexplored c
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 16h ago
SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks
arXiv:2604.08865v1 Announce Type: new Abstract: Proximal Policy Optimization (PPO) is central to aligning Large Language Models (LLMs) in reasoning tasks with v
ArXiv cs.AI 📄 Paper 16h ago
StaRPO: Stability-Augmented Reinforcement Policy Optimization
arXiv:2604.08905v1 Announce Type: new Abstract: Reinforcement learning (RL) is effective in enhancing the accuracy of large language models in complex reasoning
ArXiv cs.AI 📄 Paper 16h ago
Enhancing LLM Problem Solving via Tutor-Student Multi-Agent Interaction
arXiv:2604.08931v1 Announce Type: new Abstract: Human cognitive development is shaped not only by individual effort but by structured social interaction, where
ArXiv cs.AI 📄 Paper 16h ago
PilotBench: A Benchmark for General Aviation Agents with Safety Constraints
arXiv:2604.08987v1 Announce Type: new Abstract: As Large Language Models (LLMs) advance toward embodied AI agents operating in physical environments, a fundamen
ArXiv cs.AI 📄 Paper 16h ago
SEA-Eval: A Benchmark for Evaluating Self-Evolving Agents Beyond Episodic Assessment
arXiv:2604.08988v1 Announce Type: new Abstract: Current LLM-based agents demonstrate strong performance in episodic task execution but remain constrained by sta
ArXiv cs.AI 📄 Paper 16h ago
Hypergraph Neural Networks Accelerate MUS Enumeration
arXiv:2604.09001v1 Announce Type: new Abstract: Enumerating Minimal Unsatisfiable Subsets (MUSes) is a fundamental task in constraint satisfaction problems (CSP
ArXiv cs.AI 📄 Paper 16h ago
Advantage-Guided Diffusion for Model-Based Reinforcement Learning
arXiv:2604.09035v1 Announce Type: new Abstract: Model-based reinforcement learning (MBRL) with autoregressive world models suffers from compounding errors, wher
ArXiv cs.AI 📄 Paper 16h ago
Overhang Tower: Resource-Rational Adaptation in Sequential Physical Planning
arXiv:2604.09072v1 Announce Type: new Abstract: Humans effortlessly navigate the physical world by predicting how objects behave under gravity and contact force
ArXiv cs.AI 📄 Paper 16h ago
Camera Artist: A Multi-Agent Framework for Cinematic Language Storytelling Video Generation
arXiv:2604.09195v1 Announce Type: new Abstract: We propose Camera Artist, a multi-agent framework that models a real-world filmmaking workflow to generate narra
ArXiv cs.AI 📄 Paper 16h ago
DRBENCHER: Can Your Agent Identify the Entity, Retrieve Its Properties and Do the Math?
arXiv:2604.09251v1 Announce Type: new Abstract: Deep research agents increasingly interleave web browsing with multi-step computation, yet existing benchmarks e
ArXiv cs.AI 📄 Paper 16h ago
SAGE: A Service Agent Graph-guided Evaluation Benchmark
arXiv:2604.09285v1 Announce Type: new Abstract: The development of Large Language Models (LLMs) has catalyzed automation in customer service, yet benchmarking t
ArXiv cs.AI 📄 Paper 16h ago
Constraint-Aware Corrective Memory for Language-Based Drug Discovery Agents
arXiv:2604.09308v1 Announce Type: new Abstract: Large language models are making autonomous drug discovery agents increasingly feasible, but reliable success in
ArXiv cs.AI 📄 Paper 16h ago
Mind the Gap Between Spatial Reasoning and Acting! Step-by-Step Evaluation of Agents With Spatial-Gym
arXiv:2604.09338v1 Announce Type: new Abstract: Spatial reasoning is central to navigation and robotics, yet measuring model capabilities on these tasks remains
ArXiv cs.AI 📄 Paper 16h ago
HiL-Bench (Human-in-Loop Benchmark): Do Agents Know When to Ask for Help?
arXiv:2604.09408v1 Announce Type: new Abstract: Frontier coding agents solve complex tasks when given complete context but collapse when specifications are inco
ArXiv cs.AI 📄 Paper 16h ago
Do We Really Need to Approach the Entire Pareto Front in Many-Objective Bayesian Optimisation?
arXiv:2604.09417v1 Announce Type: new Abstract: Many-objective optimisation, a subset of multi-objective optimisation, involves optimisation problems with more
ArXiv cs.AI 📄 Paper 16h ago
E3-TIR: Enhanced Experience Exploitation for Tool-Integrated Reasoning
arXiv:2604.09455v1 Announce Type: new Abstract: While Large Language Models (LLMs) have demonstrated significant potential in Tool-Integrated Reasoning (TIR), e