📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,273 articles · Updated every 3 hours

arXiv:2604.02528v1 Announce Type: new Abstract: The new Specifications for the National Bridge Inventory (SNBI), in effect from 2022, emphasize the use of eleme

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Competency Questions as Executable Plans: a Controlled RAG Architecture for Cultural Heritage Storytelling

arXiv:2604.02545v1 Announce Type: new Abstract: The preservation of intangible cultural heritage is a critical challenge as collective memory fades over time. W

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Mitigating LLM biases toward spurious social contexts using direct preference optimization

arXiv:2604.02585v1 Announce Type: new Abstract: LLMs are increasingly used for high-stakes decision-making, yet their sensitivity to spurious contextual informa

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Do Audio-Visual Large Language Models Really See and Hear?

arXiv:2604.02605v1 Announce Type: new Abstract: Audio-Visual Large Language Models (AVLLMs) are emerging as unified interfaces to multimodal perception. We pres

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

AutoVerifier: An Agentic Automated Verification Framework Using Large Language Models

arXiv:2604.02617v1 Announce Type: new Abstract: Scientific and Technical Intelligence (S&TI) analysis requires verifying complex technical claims across rapidly

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

OntoKG: Ontology-Oriented Knowledge Graph Construction with Intrinsic-Relational Routing

arXiv:2604.02618v1 Announce Type: new Abstract: Organizing a large-scale knowledge graph into a typed property graph requires structural decisions -- which enti

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Let's Have a Conversation: Designing and Evaluating LLM Agents for Interactive Optimization

arXiv:2604.02666v1 Announce Type: new Abstract: Optimization is as much about modeling the right problem as solving it. Identifying the right objectives, constr

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3d ago

GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning

arXiv:2604.02721v1 Announce Type: new Abstract: Competitive programming remains one of the last few human strongholds in coding against AI. The best AI system t

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

DeltaLogic: Minimal Premise Edits Reveal Belief-Revision Failures in Logical Reasoning Models

arXiv:2604.02733v1 Announce Type: new Abstract: Reasoning benchmarks typically evaluate whether a model derives the correct answer from a fixed premise set, but

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Aligning Progress and Feasibility: A Neuro-Symbolic Dual Memory Framework for Long-Horizon LLM Agents

arXiv:2604.02734v1 Announce Type: new Abstract: Large language models (LLMs) have demonstrated strong potential in long-horizon decision-making tasks, such as e

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3d ago

Improving Role Consistency in Multi-Agent Collaboration via Quantitative Role Clarity

arXiv:2604.02770v1 Announce Type: new Abstract: In large language model (LLM)-driven multi-agent systems, disobey role specification (failure to adhere to the d

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

CharTool: Tool-Integrated Visual Reasoning for Chart Understanding

arXiv:2604.02794v1 Announce Type: new Abstract: Charts are ubiquitous in scientific and financial literature for presenting structured data. However, chart reas

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3d ago

ESL-Bench: An Event-Driven Synthetic Longitudinal Benchmark for Health Agents

arXiv:2604.02834v1 Announce Type: new Abstract: Longitudinal health agents must reason across multi-source trajectories that combine continuous device streams,

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3d ago

EMS: Multi-Agent Voting via Efficient Majority-then-Stopping

arXiv:2604.02863v1 Announce Type: new Abstract: Majority voting is the standard for aggregating multi-agent responses into a final decision. However, traditiona

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Multi-Turn Reinforcement Learning for Tool-Calling Agents with Iterative Reward Calibration

arXiv:2604.02869v1 Announce Type: new Abstract: Training tool-calling agents with reinforcement learning on multi-turn tasks remains challenging due to sparse o

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Analysis of Optimality of Large Language Models on Planning Problems

arXiv:2604.02910v1 Announce Type: new Abstract: Classic AI planning problems have been revisited in the Large Language Model (LLM) era, with a focus of recent b

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3d ago

AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents

arXiv:2604.02947v1 Announce Type: new Abstract: Computer-use agents extend language models from text generation to persistent action over tools, files, and exec

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

FoE: Forest of Errors Makes the First Solution the Best in Large Reasoning Models

arXiv:2604.02967v1 Announce Type: new Abstract: Recent Large Reasoning Models (LRMs) like DeepSeek-R1 have demonstrated remarkable success in complex reasoning

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3d ago

InfoSeeker: A Scalable Hierarchical Parallel Agent Framework for Web Information Seeking

arXiv:2604.02971v1 Announce Type: new Abstract: Recent agentic search systems have made substantial progress by emphasising deep, multi-step reasoning. However,

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3d ago

Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence?

arXiv:2604.03016v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) are evolving from passive observers into active agents, solving problem

ArXiv cs.AI 🛠️ AI Tools & Apps 📄 Paper ⚡ AI Lesson 3d ago

Automatic Textbook Formalization

arXiv:2604.03071v1 Announce Type: new Abstract: We present a case study where an automatic AI system formalizes a textbook with more than 500 pages of graduate-

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Chart-RL: Policy Optimization Reinforcement Learning for Enhanced Visual Reasoning in Chart Question Answering with Vision Language Models

arXiv:2604.03157v1 Announce Type: new Abstract: The recent advancements in Vision Language Models (VLMs) have demonstrated progress toward true intelligence req

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3d ago

Coupled Control, Structured Memory, and Verifiable Action in Agentic AI (SCRAT -- Stochastic Control with Retrieval and Auditable Trajectories): A Comparative Perspective from Squirrel Locomotion and Scatter-Hoarding

arXiv:2604.03201v1 Announce Type: new Abstract: Agentic AI is increasingly judged not by fluent output alone but by whether it can act, remember, and verify und

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Linguistic Frameworks Go Toe-to-Toe at Neuro-Symbolic Language Modeling

arXiv:2112.07874v2 Announce Type: cross Abstract: We examine the extent to which, in principle, linguistic graph representations can complement and improve neur