📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,169 articles · Updated every 3 hours

arXiv:2603.27481v1 Announce Type: cross Abstract: Multimodal Continual Instruction Tuning aims to continually enhance Large Vision Language Models (LVLMs) by le

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Difference Feedback: Generating Multimodal Process-Level Supervision for VLM Reinforcement Learning

arXiv:2603.27482v1 Announce Type: cross Abstract: Vision-language models (VLMs) are increasingly aligned via Group Relative Policy Optimization (GRPO)-style tr

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

AgentSwing: Adaptive Parallel Context Management Routing for Long-Horizon Web Agents

arXiv:2603.27490v1 Announce Type: cross Abstract: As large language models (LLMs) evolve into autonomous agents for long-horizon information-seeking, managing f

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Copilot-Assisted Second-Thought Framework for Brain-to-Robot Hand Motion Decoding

arXiv:2603.27492v1 Announce Type: cross Abstract: Motor kinematics prediction (MKP) from electroencephalography (EEG) is an important research area for developi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Learning to Focus and Precise Cropping: A Reinforcement Learning Framework with Information Gaps and Grounding Loss for MLLMs

arXiv:2603.27494v1 Announce Type: cross Abstract: To enhance the perception and reasoning capabilities of multimodal large language models in complex visual sce

ArXiv cs.AI 💻 AI-Assisted Coding 📄 Paper ⚡ AI Lesson 1w ago

Understanding Semantic Perturbations on In-Processing Generative Image Watermarks

arXiv:2603.27513v1 Announce Type: cross Abstract: The widespread deployment of high-fidelity generative models has intensified the need for reliable mechanisms

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

A Systematic Taxonomy of Security Vulnerabilities in the OpenClaw AI Agent Framework

arXiv:2603.27517v1 Announce Type: cross Abstract: AI agent frameworks connecting large language model (LLM) reasoning to host execution surfaces--shell, filesys

ArXiv cs.AI 💻 AI-Assisted Coding 📄 Paper ⚡ AI Lesson 1w ago

Safer Builders, Risky Maintainers: A Comparative Study of Breaking Changes in Human vs Agentic PRs

arXiv:2603.27524v1 Announce Type: cross Abstract: AI coding agents are increasingly integrated into modern software engineering workflows, actively collaboratin

ArXiv cs.AI 📐 ML Fundamentals 📄 Paper ⚡ AI Lesson 1w ago

Cross-attentive Cohesive Subgraph Embedding to Mitigate Oversquashing in GNNs

arXiv:2603.27529v1 Announce Type: cross Abstract: Graph neural networks (GNNs) have achieved strong performance across various real-world domains. Nevertheless,

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

Demo-Pose: Depth-Monocular Modality Fusion For Object Pose Estimation

arXiv:2603.27533v1 Announce Type: cross Abstract: Object pose estimation is a fundamental task in 3D vision with applications in robotics, AR/VR, and scene unde

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Toward Reliable Evaluation of LLM-Based Financial Multi-Agent Systems: Taxonomy, Coordination Primacy, and Cost Awareness

arXiv:2603.27539v1 Announce Type: cross Abstract: Multi-agent systems based on large language models (LLMs) for financial trading have grown rapidly since 2023,

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

A Novel Immune Algorithm for Multiparty Multiobjective Optimization

arXiv:2603.27541v1 Announce Type: cross Abstract: Traditional multiobjective optimization problems (MOPs) are insufficiently equipped for scenarios involving mu

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

Drag or Traction: Understanding How Designers Appropriate Friction in AI Ideation Outputs

arXiv:2603.27550v1 Announce Type: cross Abstract: Seamless AI presents output as a finished, polished product that users consume rather than shape. This risks d

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

A General Model for Deepfake Speech Detection: Diverse Bonafide Resources or Diverse AI-Based Generators

arXiv:2603.27557v1 Announce Type: cross Abstract: In this paper, we analyze two main factors of Bonafide Resource (BR) or AI-based Generator (AG) which affect t

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

InnerPond: Fostering Inter-Self Dialogue with a Multi-Agent Approach for Introspection

arXiv:2603.27563v1 Announce Type: cross Abstract: Introspection is central to identity construction and future planning, yet most digital tools approach the sel

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

STRIDE: When to Speak Meets Sequence Denoising for Streaming Video Understanding

arXiv:2603.27593v1 Announce Type: cross Abstract: Recent progress in video large language models (Video-LLMs) has enabled strong offline reasoning over long and

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Expert Streaming: Accelerating Low-Batch MoE Inference via Multi-chiplet Architecture and Dynamic Expert Trajectory Scheduling

arXiv:2603.27624v1 Announce Type: cross Abstract: Mixture-of-Experts is a promising approach for edge AI with low-batch inference. Yet, on-device deployments of

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Umwelt Engineering: Designing the Cognitive Worlds of Linguistic Agents

arXiv:2603.27626v1 Announce Type: cross Abstract: I propose Umwelt engineering -- the deliberate design of the linguistic cognitive environment -- as a third la

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

ContraMap: Contrastive Uncertainty Mapping for Robot Environment Representation

arXiv:2603.27632v1 Announce Type: cross Abstract: Reliable robot perception requires not only predicting scene structure, but also identifying where predictions

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

EvA: An Evidence-First Audio Understanding Paradigm for LALMs

arXiv:2603.27667v1 Announce Type: cross Abstract: Large Audio Language Models (LALMs) still struggle in complex acoustic scenes because they often fail to prese

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

ProgressVLA: Progress-Guided Diffusion Policy for Vision-Language Robotic Manipulation

arXiv:2603.27670v1 Announce Type: cross Abstract: Most existing vision-language-action (VLA) models for robotic manipulation lack progress awareness, typically

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

LVRPO: Language-Visual Alignment with GRPO for Multimodal Understanding and Generation

arXiv:2603.27693v1 Announce Type: cross Abstract: Unified multimodal pretraining has emerged as a promising paradigm for jointly modeling language and vision wi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

RAP: Retrieve, Adapt, and Prompt-Fit for Training-Free Few-Shot Medical Image Segmentation

arXiv:2603.27705v1 Announce Type: cross Abstract: Few-shot medical image segmentation (FSMIS) has achieved notable progress, yet most existing methods mainly re

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

The role of neuromorphic principles in the future of biomedicine and healthcare

arXiv:2603.27716v1 Announce Type: cross Abstract: Neuromorphic engineering has matured over the past four decades and is currently experiencing explosive growth