📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,273 articles · Updated every 3 hours

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago
KLong: Training LLM Agent for Extremely Long-horizon Tasks
arXiv:2602.17547v2 Announce Type: replace Abstract: This paper introduces KLong, an open-source LLM agent trained to solve extremely long-horizon tasks. The pri
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago
AI Runtime Infrastructure
arXiv:2603.00495v2 Announce Type: replace Abstract: We introduce AI Runtime Infrastructure, a distinct execution-time layer that operates above the model and be
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago
DeepFact: Co-Evolving Benchmarks and Agents for Deep Research Factuality
arXiv:2603.05912v2 Announce Type: replace Abstract: Search-augmented LLM agents can produce deep research reports (DRRs), but verifying claim-level factuality r
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago
A Hierarchical Error-Corrective Graph Framework for Autonomous Agents with LLM-Based Action Generation
arXiv:2603.08388v4 Announce Type: replace Abstract: We propose a Hierarchical Error-Corrective Graph Framework for Autonomous Agents with LLM-Based Action Generation (H
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago
Collective AI can amplify tiny perturbations into divergent decisions
arXiv:2603.09127v2 Announce Type: replace Abstract: Large language models are increasingly deployed not as single assistants but as committees whose members del
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago
A Self-Evolving Defect Detection Framework for Industrial Photovoltaic Systems
arXiv:2603.14869v2 Announce Type: replace Abstract: Reliable photovoltaic (PV) power generation requires timely detection of module defects that may reduce ener
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago
Adaptive Domain Models: Bayesian Evolution, Warm Rotation, and Principled Training for Geometric and Neuromorphic AI
arXiv:2603.18104v2 Announce Type: replace Abstract: Prevailing AI training infrastructure assumes reverse-mode automatic differentiation over IEEE-754 arithmeti
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago
An Onto-Relational-Sophic Framework for Governing Synthetic Minds
arXiv:2603.18633v2 Announce Type: replace Abstract: The rapid evolution of artificial intelligence, from task-specific systems to foundation models exhibiting b
ArXiv cs.AI 🧠 Large Language Models 📄 Paper 2d ago
ClawSafety: "Safe" LLMs, Unsafe Agents
arXiv:2604.01438v2 Announce Type: replace Abstract: Personal AI agents like OpenClaw run with elevated privileges on users' local machines, where a single succe
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago
AgentSocialBench: Evaluating Privacy Risks in Human-Centered Agentic Social Networks
arXiv:2604.01487v2 Announce Type: replace Abstract: With the rise of personalized, persistent LLM agent frameworks such as OpenClaw, human-centered agentic soci
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago
Domain-constrained knowledge representation: A modal framework
arXiv:2604.01770v2 Announce Type: replace Abstract: Knowledge graphs store large numbers of relations efficiently, but they remain weak at representing a quiete
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago
Mitigating Value Hallucination in Dyna Planning via Multistep Predecessor Models
arXiv:2006.04363v2 Announce Type: replace-cross Abstract: Dyna-style reinforcement learning (RL) agents improve sample efficiency over model-free RL agents by u
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago
VLBiasBench: A Comprehensive Benchmark for Evaluating Bias in Large Vision-Language Model
arXiv:2406.14194v3 Announce Type: replace-cross Abstract: The emergence of Large Vision-Language Models (LVLMs) marks significant strides towards achieving gene
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago
Detecting and Characterising Mobile App Metamorphosis in Google Play Store
arXiv:2407.14565v2 Announce Type: replace-cross Abstract: App markets have evolved into highly competitive and dynamic environments for developers. While the tr
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago
MegaFake: A Theory-Driven Dataset of Fake News Generated by Large Language Models
arXiv:2408.11871v3 Announce Type: replace-cross Abstract: Fake news significantly influences decision-making processes by misleading individuals, organizations,
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago
SPRIG: Improving Large Language Model Performance by System Prompt Optimization
arXiv:2410.14826v3 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have shown impressive capabilities in many scenarios, but their performan
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago
Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction
arXiv:2410.21169v5 Announce Type: replace-cross Abstract: Document parsing (DP) transforms unstructured or semi-structured documents into structured, machine-re
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago
Certified Training with Branch-and-Bound for Lyapunov-stable Neural Control
arXiv:2411.18235v3 Announce Type: replace-cross Abstract: We study the problem of learning verifiably Lyapunov-stable neural controllers that provably satisfy t
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago
Talk to Right Specialists: Iterative Routing in Multi-agent Systems for Question Answering
arXiv:2501.07813v2 Announce Type: replace-cross Abstract: Retrieval-augmented generation (RAG) agents are increasingly deployed to answer questions over local k
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago
Human-AI Collaborative Game Testing with Vision Language Models
arXiv:2501.11782v2 Announce Type: replace-cross Abstract: As modern video games become increasingly complex, traditional manual testing methods are proving cost