📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 6,601 articles · Updated every 3 hours

RAG-KT: Cross-platform Explainable Knowledge Tracing with Multi-view Fusion Retrieval Generation

arXiv:2604.10960v1 Announce Type: new Abstract: Knowledge Tracing (KT) infers a student's knowledge state from past interactions to predict future performance.

ArXiv cs.AI 📄 Paper 1w ago

Delving Aleatoric Uncertainty in Medical Image Segmentation via Vision Foundation Models

arXiv:2604.10963v1 Announce Type: new Abstract: Medical image segmentation supports clinical workflows by precisely delineating anatomical structures and lesion

ArXiv cs.AI 📄 Paper 1w ago

CFMS: A Coarse-to-Fine Multimodal Synthesis Framework for Enhanced Tabular Reasoning

arXiv:2604.10973v1 Announce Type: new Abstract: Reasoning over tabular data is a crucial capability for tasks like question answering and fact verification, as

ArXiv cs.AI 📄 Paper 1w ago

ATANT v1.1: Positioning Continuity Evaluation Against Memory, Long-Context, and Agentic-Memory Benchmarks

arXiv:2604.10981v1 Announce Type: new Abstract: ATANT v1.0 (arXiv:2604.06710) defined continuity as a system property with 7 required properties and introduced

ArXiv cs.AI 📄 Paper 1w ago

Back to the Barn with LLAMAs: Evolving Pretrained LLM Backbones in Finetuning Vision Language Models

arXiv:2604.10985v1 Announce Type: new Abstract: Vision-Language Models (VLMs) have rapidly advanced by leveraging powerful pre-trained Large Language Models (LL

ArXiv cs.AI 📄 Paper 1w ago

WebForge: Breaking the Realism-Reproducibility-Scalability Trilemma in Browser Agent Benchmark

arXiv:2604.10988v1 Announce Type: new Abstract: Existing browser agent benchmarks face a fundamental trilemma: real-website benchmarks lack reproducibility due

ArXiv cs.AI 📄 Paper 1w ago

MAFIG: Multi-agent Driven Formal Instruction Generation Framework

arXiv:2604.10989v1 Announce Type: new Abstract: Emergency situations in scheduling systems often trigger local functional failures that undermine system stabili

ArXiv cs.AI 📄 Paper 1w ago

Sanity Checks for Agentic Data Science

arXiv:2604.11003v1 Announce Type: new Abstract: Agentic data science (ADS) pipelines have grown rapidly in both capability and adoption, with systems such as Op

ArXiv cs.AI 📄 Paper 1w ago

Diffusion-CAM: Faithful Visual Explanations for dMLLMs

arXiv:2604.11005v1 Announce Type: new Abstract: While diffusion Multimodal Large Language Models (dMLLMs) have recently achieved remarkable strides in multimoda

ArXiv cs.AI 📄 Paper 1w ago

Min-$k$ Sampling: Decoupling Truncation from Temperature Scaling via Relative Logit Dynamics

arXiv:2604.11012v1 Announce Type: new Abstract: The quality of text generated by large language models depends critically on the decoding sampling strategy. Whi

ArXiv cs.AI 📄 Paper 1w ago

Introspective Diffusion Language Models

arXiv:2604.11035v1 Announce Type: new Abstract: Diffusion language models promise parallel generation, yet still lag behind autoregressive (AR) models in qualit

ArXiv cs.AI 📄 Paper 1w ago

Intelligent Approval of Access Control Flow in Office Automation Systems via Relational Modeling

arXiv:2604.11040v1 Announce Type: new Abstract: Office automation (OA) systems play a crucial role in enterprise operations and management, with access control

ArXiv cs.AI 📄 Paper 1w ago

From Topology to Trajectory: LLM-Driven World Models For Supply Chain Resilience

arXiv:2604.11041v1 Announce Type: new Abstract: Semiconductor supply chains face unprecedented resilience challenges amidst global geopolitical turbulence. Conv

ArXiv cs.AI 📄 Paper 1w ago

EmergentBridge: Improving Zero-Shot Cross-Modal Transfer in Unified Multimodal Embedding Models

arXiv:2604.11043v1 Announce Type: new Abstract: Unified multimodal embedding spaces underpin practical applications such as cross-modal retrieval and zero-shot

ArXiv cs.AI 📄 Paper 1w ago

AI Integrity: A New Paradigm for Verifiable AI Governance

arXiv:2604.11065v1 Announce Type: new Abstract: AI systems increasingly shape high-stakes decisions in healthcare, law, defense, and education, yet existing gov

ArXiv cs.AI 📄 Paper 1w ago

PRISM Risk Signal Framework: Hierarchy-Based Red Lines for AI Behavioral Risk

arXiv:2604.11070v1 Announce Type: new Abstract: Current approaches to AI safety define red lines at the case level: specific prompts, specific outputs, specific

ArXiv cs.AI 📄 Paper 1w ago

Hodoscope: Unsupervised Monitoring for AI Misbehaviors

arXiv:2604.11072v1 Announce Type: new Abstract: Existing approaches to monitoring AI agents rely on supervised evaluation: human-written rules or LLM-based judg

ArXiv cs.AI 📄 Paper 1w ago

Towards Proactive Information Probing: Customer Service Chatbots Harvesting Value from Conversation

arXiv:2604.11077v1 Announce Type: new Abstract: Customer service chatbots are increasingly expected to serve not merely as reactive support tools for users, but

ArXiv cs.AI 📄 Paper 1w ago

Do Agent Rules Shape or Distort? Guardrails Beat Guidance in Coding Agents

arXiv:2604.11088v1 Announce Type: new Abstract: Developers increasingly guide AI coding agents through natural language instruction files (e.g., CLAUDE.md, .cur

ArXiv cs.AI 📄 Paper 1w ago

Frugal Knowledge Graph Construction with Local LLMs: A Zero-Shot Pipeline, Self-Consistency and Wisdom of Artificial Crowds

arXiv:2604.11104v1 Announce Type: new Abstract: This paper presents an empirical study of a multi-model zero-shot pipeline for knowledge graph construction and

ArXiv cs.AI 📄 Paper 1w ago

Persona Non Grata: Single-Method Safety Evaluation Is Incomplete for Persona-Imbued LLMs

arXiv:2604.11120v1 Announce Type: new Abstract: Personality imbuing customizes LLM behavior, but safety evaluations almost always study prompt-based personas al

ArXiv cs.AI 📄 Paper 1w ago

A Proposed Biomedical Data Policy Framework to Reduce Fragmentation, Improve Quality, and Incentivize Sharing in Indian Healthcare in the era of Artificial Intelligence and Digital Health

arXiv:2604.11125v1 Announce Type: new Abstract: India generates vast biomedical data through postgraduate research, government hospital services and audits, gov

ArXiv cs.AI 📄 Paper 1w ago

MADQRL: Distributed Quantum Reinforcement Learning Framework for Multi-Agent Environments

arXiv:2604.11131v1 Announce Type: new Abstract: Reinforcement learning (RL) is one of the most practical ways to learn from real-life use-cases. Motivated from

ArXiv cs.AI 📄 Paper 1w ago

From Answers to Arguments: Toward Trustworthy Clinical Diagnostic Reasoning with Toulmin-Guided Curriculum Goal-Conditioned Learning

arXiv:2604.11137v1 Announce Type: new Abstract: The integration of Large Language Models (LLMs) into clinical decision support is critically obstructed by their