📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 4,506 articles · Updated every 3 hours

arXiv:2604.09609v1 Announce Type: new Abstract: Human behavior models are essential as behavior references and for simulating human agents in virtual safety ass

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1d ago

Beyond Theory of Mind in Robotics

arXiv:2604.09612v1 Announce Type: new Abstract: Theory of Mind, the capacity to explain and predict behavior by inferring hidden mental states, has become the d

ArXiv cs.AI 📐 ML Fundamentals 📄 Paper ⚡ AI Lesson 1d ago

The Geometry of Knowing: From Possibilistic Ignorance to Probabilistic Certainty -- A Measure-Theoretic Framework for Epistemic Convergence

arXiv:2604.09614v1 Announce Type: new Abstract: This paper develops a measure-theoretic framework establishing when and how a possibilistic representation of in

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1d ago

AdaQE-CG: Adaptive Query Expansion for Web-Scale Generative AI Model and Data Card Generation

arXiv:2604.09617v1 Announce Type: new Abstract: Transparent and standardized documentation is essential for building trustworthy generative AI (GAI) systems. Ho

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1d ago

Competing with AI Scientists: Agent-Driven Approach to Astrophysics Research

arXiv:2604.09621v1 Announce Type: new Abstract: We present an agent-driven approach to the construction of parameter inference pipelines for scientific data ana

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1d ago

How LLMs Might Think

arXiv:2604.09674v1 Announce Type: new Abstract: Do large language models (LLMs) think? Daniel Stoljar and Zhihe Vincent Zhang have recently developed an argumen

ArXiv cs.AI 📄 Paper 1d ago

Belief-Aware VLM Model for Human-like Reasoning

arXiv:2604.09686v1 Announce Type: new Abstract: Traditional neural network models for intent inference rely heavily on observable states and struggle to general

ArXiv cs.AI 📄 Paper 1d ago

Tipiano: Cascaded Piano Hand Motion Synthesis via Fingertip Priors

arXiv:2604.09692v1 Announce Type: new Abstract: Synthesizing realistic piano hand motions requires both precision and naturalness. Physics-based methods achieve

ArXiv cs.AI 📄 Paper 1d ago

The Myth of Expert Specialization in MoEs: Why Routing Reflects Geometry, Not Necessarily Domain Expertise

arXiv:2604.09780v1 Announce Type: new Abstract: Mixture of Experts (MoEs) are now ubiquitous in large language models, yet the mechanisms behind their "expert s

ArXiv cs.AI 📄 Paper 1d ago

Pioneer Agent: Continual Improvement of Small Language Models in Production

arXiv:2604.09791v1 Announce Type: new Abstract: Small language models are attractive for production deployment due to their low cost, fast inference, and ease o

ArXiv cs.AI 📄 Paper 1d ago

Controllable and Verifiable Tool-Use Data Synthesis for Agentic Reinforcement Learning

arXiv:2604.09813v1 Announce Type: new Abstract: Existing synthetic tool-use corpora are primarily designed for offline supervised fine-tuning, yet reinforcement

ArXiv cs.AI 📄 Paper 1d ago

EE-MCP: Self-Evolving MCP-GUI Agents via Automated Environment Generation and Experience Learning

arXiv:2604.09815v1 Announce Type: new Abstract: Computer-use agents that combine GUI interaction with structured API calls via the Model Context Protocol (MCP)

ArXiv cs.AI 📄 Paper 1d ago

COMPOSITE-Stem

arXiv:2604.09836v1 Announce Type: new Abstract: AI agents hold growing promise for accelerating scientific discovery; yet, a lack of frontier evaluations hinder

ArXiv cs.AI 📄 Paper 1d ago

Steered LLM Activations are Non-Surjective

arXiv:2604.09839v1 Announce Type: new Abstract: Activation steering is a popular white-box control technique that modifies model activations to elicit an abstra

ArXiv cs.AI 📄 Paper 1d ago

MEMENTO: Teaching LLMs to Manage Their Own Context

arXiv:2604.09852v1 Announce Type: new Abstract: Reasoning models think in long, unstructured streams with no mechanism for compressing or organizing their own i

ArXiv cs.AI 📄 Paper 1d ago

Instructing LLMs to Negotiate using Reinforcement Learning with Verifiable Rewards

arXiv:2604.09855v1 Announce Type: new Abstract: The recent advancement of Large Language Models (LLMs) has established their potential as autonomous interactive

ArXiv cs.AI 📄 Paper 1d ago

Evolutionary Token-Level Prompt Optimization for Diffusion Models

arXiv:2604.09861v1 Announce Type: new Abstract: Text-to-image diffusion models exhibit strong generative performance but remain highly sensitive to prompt formu

ArXiv cs.AI 📄 Paper 1d ago

What do your logits know? (The answer may surprise you!)

arXiv:2604.09885v1 Announce Type: new Abstract: Recent work has shown that probing model internals can reveal a wealth of information not apparent from the mode

ArXiv cs.AI 📄 Paper 1d ago

In-situ process monitoring for defect detection in wire-arc additive manufacturing: an agentic AI approach

arXiv:2604.09889v1 Announce Type: new Abstract: AI agents are being increasingly deployed across a wide range of real-world applications. In this paper, we prop

ArXiv cs.AI 📄 Paper 1d ago

GLEaN: A Text-to-image Bias Detection Approach for Public Comprehension

arXiv:2604.09923v1 Announce Type: new Abstract: Text-to-image (T2I) models, and their encoded biases, increasingly shape the visual media the public encounters.

ArXiv cs.AI 📄 Paper 1d ago

HealthAdminBench: Evaluating Computer-Use Agents on Healthcare Administration Tasks

arXiv:2604.09937v1 Announce Type: new Abstract: Healthcare administration accounts for over $1 trillion in annual spending, making it a promising target for LLM

ArXiv cs.AI 📄 Paper 1d ago

New Hybrid Fine-Tuning Paradigm for LLMs: Algorithm Design and Convergence Analysis Framework

arXiv:2604.09940v1 Announce Type: new Abstract: Fine-tuning Large Language Models (LLMs) typically involves either full fine-tuning, which updates all model par

ArXiv cs.AI 📄 Paper 1d ago

FinTrace: Holistic Trajectory-Level Evaluation of LLM Tool Calling for Long-Horizon Financial Tasks

arXiv:2604.10015v1 Announce Type: new Abstract: Recent studies demonstrate that tool-calling capability enables large language models (LLMs) to interact with ex

ArXiv cs.AI 📄 Paper 1d ago

AI Achieves a Perfect LSAT Score

arXiv:2604.10034v1 Announce Type: new Abstract: This paper reports the first documented instance of a language model achieving a perfect score on an officially