📰 ArXiv cs.AI
Articles from ArXiv cs.AI · 3,273 articles · Updated every 3 hours
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
3d ago
Can VLMs Truly Forget? Benchmarking Training-Free Visual Concept Unlearning
arXiv:2604.03114v1 Announce Type: cross Abstract: VLMs trained on web-scale data retain sensitive and copyrighted visual concepts that deployment may require re
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
3d ago
An Independent Safety Evaluation of Kimi K2.5
arXiv:2604.03121v1 Announce Type: cross Abstract: Kimi K2.5 is an open-weight LLM that rivals closed models across coding, multimodal, and agentic benchmarks, b
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
3d ago
Domain-Adapted Retrieval for In-Context Annotation of Pedagogical Dialogue Acts
arXiv:2604.03127v1 Announce Type: cross Abstract: Automated annotation of pedagogical dialogue is a high-stakes task where LLMs often fail without sufficient do
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
3d ago
A Systematic Security Evaluation of OpenClaw and Its Variants
arXiv:2604.03131v1 Announce Type: cross Abstract: Tool-augmented AI agents substantially extend the practical capabilities of large language models, but they al
ArXiv cs.AI
💻 AI-Assisted Coding
📄 Paper
3d ago
AI-Assisted Unit Test Writing and Test-Driven Code Refactoring: A Case Study
arXiv:2604.03135v1 Announce Type: cross Abstract: Many software systems originate as prototypes or minimum viable products (MVPs), developed with an emphasis on
ArXiv cs.AI
💻 AI-Assisted Coding
📄 Paper
3d ago
InCoder-32B-Thinking: Industrial Code World Model for Thinking
arXiv:2604.03144v1 Announce Type: cross Abstract: Industrial software development across chip design, GPU optimization, and embedded systems lacks expert reason
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
3d ago
Valence-Arousal Subspace in LLMs: Circular Emotion Geometry and Multi-Behavioral Control
arXiv:2604.03147v1 Announce Type: cross Abstract: We present a method to identify a valence-arousal (VA) subspace within large language model representations. F
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
3d ago
Beyond the Parameters: A Technical Survey of Contextual Enrichment in Large Language Models: From In-Context Prompting to Causal Retrieval-Augmented Generation
arXiv:2604.03174v1 Announce Type: cross Abstract: Large language models (LLMs) encode vast world knowledge in their parameters, yet they remain fundamentally li
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
3d ago
Understanding the Role of Hallucination in Reinforcement Post-Training of Multimodal Reasoning Models
arXiv:2604.03179v1 Announce Type: cross Abstract: The recent success of reinforcement learning (RL) in large reasoning models has inspired the growing adoption
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
3d ago
Reflective Context Learning: Studying the Optimization Primitives of Context Space
arXiv:2604.03189v1 Announce Type: cross Abstract: Generally capable agents must learn from experience in ways that generalize across tasks and environments. The
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
3d ago
Gradient Boosting within a Single Attention Layer
arXiv:2604.03190v1 Announce Type: cross Abstract: Transformer attention computes a single softmax-weighted average over values -- a one-pass estimate that canno
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
3d ago
Reliability Gated Multi-Teacher Distillation for Low Resource Abstractive Summarization
arXiv:2604.03192v1 Announce Type: cross Abstract: We study multiteacher knowledge distillation for low resource abstractive summarization from a reliability awa
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
3d ago
PR3DICTR: A modular AI framework for medical 3D image-based detection and outcome prediction
arXiv:2604.03203v1 Announce Type: cross Abstract: Three-dimensional medical image data and computer-aided decision making, particularly using deep learning, are
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
3d ago
Enhancing Robustness of Federated Learning via Server Learning
arXiv:2604.03226v1 Announce Type: cross Abstract: This paper explores the use of server learning for enhancing the robustness of federated learning against mali
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
3d ago
WiseMind: a knowledge-guided multi-agent framework for accurate and empathetic psychiatric diagnosis
arXiv:2502.20689v3 Announce Type: replace Abstract: Large Language Models (LLMs) offer promising opportunities to support mental healthcare workflows, yet they
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
3d ago
Learn to Relax with Large Language Models: Solving Constraint Optimization Problems via Bidirectional Coevolution
arXiv:2509.12643v4 Announce Type: replace Abstract: Large Language Model (LLM)-based optimization has recently shown promise for autonomous problem solving, yet
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
3d ago
Glia: A Human-Inspired AI for Automated Systems Design and Optimization
arXiv:2510.27176v5 Announce Type: replace Abstract: Can AI autonomously design mechanisms for computer systems on par with the creativity and reasoning of human
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
3d ago
CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents
arXiv:2511.02734v2 Announce Type: replace Abstract: Current evaluations of Large Language Model (LLM) agents primarily emphasize task completion, often overlook
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
3d ago
Code-in-the-Loop Forensics: Agentic Tool Use for Image Forgery Detection
arXiv:2512.16300v2 Announce Type: replace Abstract: Existing image forgery detection (IFD) methods either exploit low-level, semantics-agnostic artifacts or rel
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
3d ago
ClinicalReTrial: Clinical Trial Redesign with Self-Evolving Agents
arXiv:2601.00290v2 Announce Type: replace Abstract: Clinical trials constitute a critical yet exceptionally challenging and costly stage of drug development (\$
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
3d ago
AgenticRed: Evolving Agentic Systems for Red-Teaming
arXiv:2601.13518v3 Announce Type: replace Abstract: While recent automated red-teaming methods show promise for systematically exposing model vulnerabilities, m
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
3d ago
From Abstract to Contextual: What LLMs Still Cannot Do in Mathematics
arXiv:2601.23048v3 Announce Type: replace Abstract: Large language models now solve many benchmark math problems at near-expert levels, yet this progress has no
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
3d ago
From Virtual Environments to Real-World Trials: Emerging Trends in Autonomous Driving
arXiv:2603.17714v2 Announce Type: replace Abstract: Autonomous driving technologies have achieved significant advances in recent years, yet their real-world dep
ArXiv cs.AI
🛡️ AI Safety & Ethics
📄 Paper
3d ago
When AI Gets it Wrong: Reliability and Risk in AI-Assisted Medication Decision Systems
arXiv:2604.01449v2 Announce Type: replace Abstract: Artificial intelligence (AI) systems are increasingly integrated into healthcare and pharmacy workflows, sup
DeepCamp AI