📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,169 articles · Updated every 3 hours

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 4d ago
Beyond Isolated Tasks: A Framework for Evaluating Coding Agents on Sequential Software Evolution
arXiv:2604.03035v1 Announce Type: cross Abstract: Existing datasets for coding agents evaluate performance on isolated, single pull request (PR) tasks in a stat
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 4d ago
ARM: Advantage Reward Modeling for Long-Horizon Manipulation
arXiv:2604.03037v1 Announce Type: cross Abstract: Long-horizon robotic manipulation remains challenging for reinforcement learning (RL) because sparse rewards p
ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 4d ago
Analyzing Healthcare Interoperability Vulnerabilities: Formal Modeling and Graph-Theoretic Approach
arXiv:2604.03043v1 Announce Type: cross Abstract: In a healthcare environment, the healthcare interoperability platforms based on HL7 FHIR allow concurrent, asy
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4d ago
JoyAI-LLM Flash: Advancing Mid-Scale LLMs with Token Efficiency
arXiv:2604.03044v1 Announce Type: cross Abstract: We introduce JoyAI-LLM Flash, an efficient Mixture-of-Experts (MoE) language model designed to redefine the tr
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4d ago
MECO: A Multimodal Dataset for Emotion and Cognitive Understanding in Older Adults
arXiv:2604.03050v1 Announce Type: cross Abstract: While affective computing has advanced considerably, multimodal emotion prediction in aging populations remain
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4d ago
Verbalizing LLMs' assumptions to explain and control sycophancy
arXiv:2604.03058v1 Announce Type: cross Abstract: LLMs can be socially sycophantic, affirming users when they ask questions like "am I in the wrong?" rather tha
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4d ago
Credential Leakage in LLM Agent Skills: A Large-Scale Empirical Study
arXiv:2604.03070v1 Announce Type: cross Abstract: Third-party skills extend LLM agents with powerful capabilities but often handle sensitive credentials in priv
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4d ago
Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems
arXiv:2604.03081v1 Announce Type: cross Abstract: LLM-based coding agents extend their capabilities via third-party agent skills distributed through open market
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 4d ago
A Data-Centric Vision Transformer Baseline for SAR Sea Ice Classification
arXiv:2604.03094v1 Announce Type: cross Abstract: Accurate and automated sea ice classification is important for climate monitoring and maritime safety in the A
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4d ago
Co-Evolution of Policy and Internal Reward for Language Agents
arXiv:2604.03098v1 Announce Type: cross Abstract: Large language model (LLM) agents learn by interacting with environments, but long-horizon training remains fu
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 4d ago
AlertStar: Path-Aware Alert Prediction on Hyper-Relational Knowledge Graphs
arXiv:2604.03104v1 Announce Type: cross Abstract: Cyber-attacks continue to grow in scale and sophistication, yet existing network intrusion detection approache
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4d ago
Can VLMs Truly Forget? Benchmarking Training-Free Visual Concept Unlearning
arXiv:2604.03114v1 Announce Type: cross Abstract: VLMs trained on web-scale data retain sensitive and copyrighted visual concepts that deployment may require re
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4d ago
An Independent Safety Evaluation of Kimi K2.5
arXiv:2604.03121v1 Announce Type: cross Abstract: Kimi K2.5 is an open-weight LLM that rivals closed models across coding, multimodal, and agentic benchmarks, b
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4d ago
Domain-Adapted Retrieval for In-Context Annotation of Pedagogical Dialogue Acts
arXiv:2604.03127v1 Announce Type: cross Abstract: Automated annotation of pedagogical dialogue is a high-stakes task where LLMs often fail without sufficient do
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 4d ago
A Systematic Security Evaluation of OpenClaw and Its Variants
arXiv:2604.03131v1 Announce Type: cross Abstract: Tool-augmented AI agents substantially extend the practical capabilities of large language models, but they al
ArXiv cs.AI 💻 AI-Assisted Coding 📄 Paper ⚡ AI Lesson 4d ago
AI-Assisted Unit Test Writing and Test-Driven Code Refactoring: A Case Study
arXiv:2604.03135v1 Announce Type: cross Abstract: Many software systems originate as prototypes or minimum viable products (MVPs), developed with an emphasis on
ArXiv cs.AI 💻 AI-Assisted Coding 📄 Paper ⚡ AI Lesson 4d ago
InCoder-32B-Thinking: Industrial Code World Model for Thinking
arXiv:2604.03144v1 Announce Type: cross Abstract: Industrial software development across chip design, GPU optimization, and embedded systems lacks expert reason
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4d ago
Valence-Arousal Subspace in LLMs: Circular Emotion Geometry and Multi-Behavioral Control
arXiv:2604.03147v1 Announce Type: cross Abstract: We present a method to identify a valence-arousal (VA) subspace within large language model representations. F
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4d ago
Beyond the Parameters: A Technical Survey of Contextual Enrichment in Large Language Models: From In-Context Prompting to Causal Retrieval-Augmented Generation
arXiv:2604.03174v1 Announce Type: cross Abstract: Large language models (LLMs) encode vast world knowledge in their parameters, yet they remain fundamentally li
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4d ago
Understanding the Role of Hallucination in Reinforcement Post-Training of Multimodal Reasoning Models
arXiv:2604.03179v1 Announce Type: cross Abstract: The recent success of reinforcement learning (RL) in large reasoning models has inspired the growing adoption