📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 6,347 articles · Updated every 3 hours

MAny: Merge Anything for Multimodal Continual Instruction Tuning

arXiv:2604.14016v1 Announce Type: cross Abstract: Multimodal Continual Instruction Tuning (MCIT) is essential for sequential task adaptation of Multimodal Large

ArXiv cs.AI 📄 Paper 6d ago

Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective

arXiv:2604.14025v1 Announce Type: cross Abstract: Reconstructing 3D representations from 2D inputs is a fundamental task in computer vision and graphics, servin

ArXiv cs.AI 📄 Paper 6d ago

Large Language Models to Enhance Business Process Modeling: Past, Present, and Future Trends

arXiv:2604.14034v1 Announce Type: cross Abstract: Recent advances in Generative Artificial Intelligence, particularly Large Language Models (LLMs), have stimula

ArXiv cs.AI 📄 Paper 6d ago

First-See-Then-Design: A Multi-Stakeholder View for Optimal Performance-Fairness Trade-Offs

arXiv:2604.14035v1 Announce Type: cross Abstract: Fairness in algorithmic decision-making is often defined in the predictive space, where predictive performance

ArXiv cs.AI 📄 Paper 6d ago

TIP: Token Importance in On-Policy Distillation

arXiv:2604.14084v1 Announce Type: cross Abstract: On-policy knowledge distillation (OPD) trains a student on its own rollouts under token-level supervision from

ArXiv cs.AI 📄 Paper 6d ago

UMI-3D: Extending Universal Manipulation Interface from Vision-Limited to 3D Spatial Perception

arXiv:2604.14089v1 Announce Type: cross Abstract: We present UMI-3D, a multimodal extension of the Universal Manipulation Interface (UMI) for robust and scalabl

ArXiv cs.AI 📄 Paper 6d ago

UI-Zoomer: Uncertainty-Driven Adaptive Zoom-In for GUI Grounding

arXiv:2604.14113v1 Announce Type: cross Abstract: GUI grounding, which localizes interface elements from screenshots given natural language queries, remains cha

ArXiv cs.AI 📄 Paper 6d ago

HiVLA: A Visual-Grounded-Centric Hierarchical Embodied Manipulation System

arXiv:2604.14125v1 Announce Type: cross Abstract: While end-to-end Vision-Language-Action (VLA) models offer a promising paradigm for robotic manipulation, fine

ArXiv cs.AI 📄 Paper 6d ago

Rhetorical Questions in LLM Representations: A Linear Probing Study

arXiv:2604.14128v1 Announce Type: cross Abstract: Rhetorical questions are asked not to seek information but to persuade or signal stance. How large language mo

ArXiv cs.AI 📄 Paper 6d ago

From Feelings to Metrics: Understanding and Formalizing How Users Vibe-Test LLMs

arXiv:2604.14137v1 Announce Type: cross Abstract: Evaluating LLMs is challenging, as benchmark scores often fail to capture models' real-world usefulness. Inste

ArXiv cs.AI 📄 Paper 6d ago

LongCoT: Benchmarking Long-Horizon Chain-of-Thought Reasoning

arXiv:2604.14140v1 Announce Type: cross Abstract: As language models are increasingly deployed for complex autonomous tasks, their ability to reason accurately

ArXiv cs.AI 📄 Paper 6d ago

From $P(y|x)$ to $P(y)$: Investigating Reinforcement Learning in Pre-train Space

arXiv:2604.14142v1 Announce Type: cross Abstract: While reinforcement learning with verifiable rewards (RLVR) significantly enhances LLM reasoning by optimizing

ArXiv cs.AI 📄 Paper 6d ago

Agentic AI Optimisation (AAIO): what it is, how it works, why it matters, and how to deal with it

arXiv:2504.12482v2 Announce Type: replace Abstract: The emergence of Agentic Artificial Intelligence (AAI) systems capable of independently initiating digital i

ArXiv cs.AI 📄 Paper 6d ago

FieldWorkArena: Agentic AI Benchmark for Real Field Work Tasks

arXiv:2505.19662v3 Announce Type: replace Abstract: This paper introduces FieldWorkArena, a benchmark for agentic AI targeting real-world field work. With the r

ArXiv cs.AI 📄 Paper 6d ago

Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games

arXiv:2506.03610v3 Announce Type: replace Abstract: Large Language Model (LLM) agents are reshaping the game industry, by enabling more intelligent and human-pr

ArXiv cs.AI 📄 Paper 6d ago

RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning with Hybrid-policy Optimization

arXiv:2508.00222v5 Announce Type: replace Abstract: Reinforcement Learning with Verifiable Reward (RLVR) has significantly advanced the complex reasoning abilit

ArXiv cs.AI 📄 Paper 6d ago

Pruning Long Chain-of-Thought of Large Reasoning Models via Small-Scale Preference Optimization

arXiv:2508.10164v2 Announce Type: replace Abstract: Recent advances in Large Reasoning Models (LRMs) have demonstrated strong performance on complex tasks throu

ArXiv cs.AI 📄 Paper 6d ago

MAS-Bench: A Unified Benchmark for Shortcut-Augmented Hybrid Mobile GUI Agents

arXiv:2509.06477v2 Announce Type: replace Abstract: Shortcuts such as APIs and deep-links have emerged as efficient complements to flexible GUI operations, fost

ArXiv cs.AI 📄 Paper 6d ago

ProRe: A Proactive Reward System for GUI Agents via Reasoner-Actor Collaboration

arXiv:2509.21823v2 Announce Type: replace Abstract: Reward is critical to the evaluation and training of large language models (LLMs). However, existing rule-ba

ArXiv cs.AI 📄 Paper 6d ago

Formalizing the Safety, Security, and Functional Properties of Agentic AI Systems

arXiv:2510.14133v2 Announce Type: replace Abstract: Agentic AI systems, which leverage multiple autonomous agents and large language models (LLMs), are increasi

ArXiv cs.AI 📄 Paper 6d ago

Saber: An Efficient Sampling with Adaptive Acceleration and Backtracking Enhanced Remasking for Diffusion Language Model

arXiv:2510.18165v2 Announce Type: replace Abstract: Diffusion language models (DLMs) are emerging as a powerful and promising alternative to the dominant autore

ArXiv cs.AI 📄 Paper 6d ago

Empowerment Gain and Causal Model Construction: Children and adults are sensitive to controllability and variability in their causal interventions

arXiv:2512.08230v2 Announce Type: replace Abstract: Learning about the causal structure of the world is a fundamental problem for human cognition. Causal models

ArXiv cs.AI 📄 Paper 6d ago

Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution

arXiv:2512.10696v2 Announce Type: replace Abstract: Procedural memory enables large language model (LLM) agents to internalize "how-to" knowledge, theoretically

ArXiv cs.AI 📄 Paper 6d ago

Logical Phase Transitions: Understanding Collapse in LLM Logical Reasoning

arXiv:2601.02902v2 Announce Type: replace Abstract: Symbolic logical reasoning is a critical yet underexplored capability of large language models (LLMs), provi