📰 ArXiv cs.AI
Articles from ArXiv cs.AI · 6,347 articles · Updated every 3 hours
ArXiv cs.AI
📄 Paper
6d ago
MAny: Merge Anything for Multimodal Continual Instruction Tuning
arXiv:2604.14016v1 Announce Type: cross Abstract: Multimodal Continual Instruction Tuning (MCIT) is essential for sequential task adaptation of Multimodal Large
ArXiv cs.AI
📄 Paper
6d ago
Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective
arXiv:2604.14025v1 Announce Type: cross Abstract: Reconstructing 3D representations from 2D inputs is a fundamental task in computer vision and graphics, servin
ArXiv cs.AI
📄 Paper
6d ago
Large Language Models to Enhance Business Process Modeling: Past, Present, and Future Trends
arXiv:2604.14034v1 Announce Type: cross Abstract: Recent advances in Generative Artificial Intelligence, particularly Large Language Models (LLMs), have stimula
ArXiv cs.AI
📄 Paper
6d ago
First-See-Then-Design: A Multi-Stakeholder View for Optimal Performance-Fairness Trade-Offs
arXiv:2604.14035v1 Announce Type: cross Abstract: Fairness in algorithmic decision-making is often defined in the predictive space, where predictive performance
ArXiv cs.AI
📄 Paper
6d ago
TIP: Token Importance in On-Policy Distillation
arXiv:2604.14084v1 Announce Type: cross Abstract: On-policy knowledge distillation (OPD) trains a student on its own rollouts under token-level supervision from
ArXiv cs.AI
📄 Paper
6d ago
UMI-3D: Extending Universal Manipulation Interface from Vision-Limited to 3D Spatial Perception
arXiv:2604.14089v1 Announce Type: cross Abstract: We present UMI-3D, a multimodal extension of the Universal Manipulation Interface (UMI) for robust and scalabl
ArXiv cs.AI
📄 Paper
6d ago
UI-Zoomer: Uncertainty-Driven Adaptive Zoom-In for GUI Grounding
arXiv:2604.14113v1 Announce Type: cross Abstract: GUI grounding, which localizes interface elements from screenshots given natural language queries, remains cha
ArXiv cs.AI
📄 Paper
6d ago
HiVLA: A Visual-Grounded-Centric Hierarchical Embodied Manipulation System
arXiv:2604.14125v1 Announce Type: cross Abstract: While end-to-end Vision-Language-Action (VLA) models offer a promising paradigm for robotic manipulation, fine
ArXiv cs.AI
📄 Paper
6d ago
Rhetorical Questions in LLM Representations: A Linear Probing Study
arXiv:2604.14128v1 Announce Type: cross Abstract: Rhetorical questions are asked not to seek information but to persuade or signal stance. How large language mo
ArXiv cs.AI
📄 Paper
6d ago
From Feelings to Metrics: Understanding and Formalizing How Users Vibe-Test LLMs
arXiv:2604.14137v1 Announce Type: cross Abstract: Evaluating LLMs is challenging, as benchmark scores often fail to capture models' real-world usefulness. Inste
ArXiv cs.AI
📄 Paper
6d ago
LongCoT: Benchmarking Long-Horizon Chain-of-Thought Reasoning
arXiv:2604.14140v1 Announce Type: cross Abstract: As language models are increasingly deployed for complex autonomous tasks, their ability to reason accurately
ArXiv cs.AI
📄 Paper
6d ago
From $P(y|x)$ to $P(y)$: Investigating Reinforcement Learning in Pre-train Space
arXiv:2604.14142v1 Announce Type: cross Abstract: While reinforcement learning with verifiable rewards (RLVR) significantly enhances LLM reasoning by optimizing
ArXiv cs.AI
📄 Paper
6d ago
Agentic AI Optimisation (AAIO): what it is, how it works, why it matters, and how to deal with it
arXiv:2504.12482v2 Announce Type: replace Abstract: The emergence of Agentic Artificial Intelligence (AAI) systems capable of independently initiating digital i
ArXiv cs.AI
📄 Paper
6d ago
FieldWorkArena: Agentic AI Benchmark for Real Field Work Tasks
arXiv:2505.19662v3 Announce Type: replace Abstract: This paper introduces FieldWorkArena, a benchmark for agentic AI targeting real-world field work. With the r
ArXiv cs.AI
📄 Paper
6d ago
Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games
arXiv:2506.03610v3 Announce Type: replace Abstract: Large Language Model (LLM) agents are reshaping the game industry, by enabling more intelligent and human-pr
ArXiv cs.AI
📄 Paper
6d ago
RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning with Hybrid-policy Optimization
arXiv:2508.00222v5 Announce Type: replace Abstract: Reinforcement Learning with Verifiable Reward (RLVR) has significantly advanced the complex reasoning abilit
ArXiv cs.AI
📄 Paper
6d ago
Pruning Long Chain-of-Thought of Large Reasoning Models via Small-Scale Preference Optimization
arXiv:2508.10164v2 Announce Type: replace Abstract: Recent advances in Large Reasoning Models (LRMs) have demonstrated strong performance on complex tasks throu
ArXiv cs.AI
📄 Paper
6d ago
MAS-Bench: A Unified Benchmark for Shortcut-Augmented Hybrid Mobile GUI Agents
arXiv:2509.06477v2 Announce Type: replace Abstract: Shortcuts such as APIs and deep-links have emerged as efficient complements to flexible GUI operations, fost
ArXiv cs.AI
📄 Paper
6d ago
ProRe: A Proactive Reward System for GUI Agents via Reasoner-Actor Collaboration
arXiv:2509.21823v2 Announce Type: replace Abstract: Reward is critical to the evaluation and training of large language models (LLMs). However, existing rule-ba
ArXiv cs.AI
📄 Paper
6d ago
Formalizing the Safety, Security, and Functional Properties of Agentic AI Systems
arXiv:2510.14133v2 Announce Type: replace Abstract: Agentic AI systems, which leverage multiple autonomous agents and large language models (LLMs), are increasi
ArXiv cs.AI
📄 Paper
6d ago
Saber: An Efficient Sampling with Adaptive Acceleration and Backtracking Enhanced Remasking for Diffusion Language Model
arXiv:2510.18165v2 Announce Type: replace Abstract: Diffusion language models (DLMs) are emerging as a powerful and promising alternative to the dominant autore
ArXiv cs.AI
📄 Paper
6d ago
Empowerment Gain and Causal Model Construction: Children and adults are sensitive to controllability and variability in their causal interventions
arXiv:2512.08230v2 Announce Type: replace Abstract: Learning about the causal structure of the world is a fundamental problem for human cognition. Causal models
ArXiv cs.AI
📄 Paper
6d ago
Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution
arXiv:2512.10696v2 Announce Type: replace Abstract: Procedural memory enables large language model (LLM) agents to internalize "how-to" knowledge, theoretically
ArXiv cs.AI
📄 Paper
6d ago
Logical Phase Transitions: Understanding Collapse in LLM Logical Reasoning
arXiv:2601.02902v2 Announce Type: replace Abstract: Symbolic logical reasoning is a critical yet underexplored capability of large language models (LLMs), provi