📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,169 articles · Updated every 3 hours

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
REFINE: Real-world Exploration of Interactive Feedback and Student Behaviour
arXiv:2603.29142v1 Announce Type: new Abstract: Formative feedback is central to effective learning, yet providing timely, individualised feedback at scale rema
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Knowledge database development by large language models for countermeasures against viruses and marine toxins
arXiv:2603.29149v1 Announce Type: new Abstract: Access to the most up-to-date information on medical countermeasures is important for the research and developme
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago
SimMOF: AI agent for Automated MOF Simulations
arXiv:2603.29152v1 Announce Type: new Abstract: Metal-organic frameworks (MOFs) offer a vast design space, and as such, computational simulations play a critica
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Webscraper: Leverage Multimodal Large Language Models for Index-Content Web Scraping
arXiv:2603.29161v1 Announce Type: new Abstract: Modern web scraping struggles with dynamic, interactive websites that require more than static HTML parsing. Cur
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago
AEC-Bench: A Multimodal Benchmark for Agentic Systems in Architecture, Engineering, and Construction
arXiv:2603.29199v1 Announce Type: new Abstract: The AEC-Bench is a multimodal benchmark for evaluating agentic systems on real-world tasks in the Architecture,
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Route-Induced Density and Stability (RIDE): Controlled Intervention and Mechanism Analysis of Routing-Style Meta Prompts on LLM Internal States
arXiv:2603.29206v1 Announce Type: new Abstract: Routing is widely used to scale large language models, from Mixture-of-Experts gating to multi-model/tool select
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Xuanwu: Evolving General Multimodal Models into an Industrial-Grade Foundation for Content Ecosystems
arXiv:2603.29211v1 Announce Type: new Abstract: In recent years, multimodal large models have continued to improve on general benchmarks. However, in real-world
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Beyond pass@1: A Reliability Science Framework for Long-Horizon LLM Agents
arXiv:2603.29231v1 Announce Type: new Abstract: Existing benchmarks measure capability -- whether a model succeeds on a single attempt -- but production deploym
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Grokking From Abstraction to Intelligence
arXiv:2603.29262v1 Announce Type: new Abstract: Grokking in modular arithmetic has established itself as the quintessential fruit fly experiment, serving as a c
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago
PSPA-Bench: A Personalized Benchmark for Smartphone GUI Agent
arXiv:2603.29318v1 Announce Type: new Abstract: Smartphone GUI agents execute tasks by operating directly on app interfaces, offering a path to broad capability
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago
Nomad: Autonomous Exploration and Discovery
arXiv:2603.29353v1 Announce Type: new Abstract: We introduce Nomad, a system for autonomous data exploration and insight discovery. Given a corpus of documents,
ArXiv cs.AI 📐 ML Fundamentals 📄 Paper ⚡ AI Lesson 1w ago
BenchScope: How Many Independent Signals Does Your Benchmark Provide?
arXiv:2603.29357v1 Announce Type: new Abstract: AI evaluation suites often report many scores without checking whether those scores carry independent informatio
ArXiv cs.AI 📐 ML Fundamentals 📄 Paper ⚡ AI Lesson 1w ago
Rigorous Explanations for Tree Ensembles
arXiv:2603.29361v1 Announce Type: new Abstract: Tree ensembles (TEs) find a multitude of practical applications. They represent one of the most general and accu
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
AI-Generated Prior Authorization Letters: Strong Clinical Content, Weak Administrative Scaffolding
arXiv:2603.29366v1 Announce Type: new Abstract: Prior authorization remains one of the most burdensome administrative processes in U.S. healthcare, consuming bi
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago
ELT-Bench-Verified: Benchmark Quality Issues Underestimate AI Agent Capabilities
arXiv:2603.29399v1 Announce Type: new Abstract: Constructing Extract-Load-Transform (ELT) pipelines is a labor-intensive data engineering task and a high-impact
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Structural Compactness as a Complementary Criterion for Explanation Quality
arXiv:2603.29491v1 Announce Type: new Abstract: In the evaluation of attribution quality, the quantitative assessment of explanation legibility is particularly
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Learning to Generate Formally Verifiable Step-by-Step Logic Reasoning via Structured Formal Intermediaries
arXiv:2603.29500v1 Announce Type: new Abstract: Large language models (LLMs) have recently demonstrated impressive performance on complex, multi-step reasoning
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
FlowPIE: Test-Time Scientific Idea Evolution with Flow-Guided Literature Exploration
arXiv:2603.29557v1 Announce Type: new Abstract: Scientific idea generation (SIG) is critical to AI-driven autonomous research, yet existing approaches are often
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago
ASI-Evolve: AI Accelerates AI
arXiv:2603.29640v1 Announce Type: new Abstract: Can AI accelerate the development of AI itself? While recent agentic systems have shown strong performance on we
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago
Optimizing Donor Outreach for Blood Collection Sessions: A Scalable Decision Support Framework
arXiv:2603.29643v1 Announce Type: new Abstract: Blood donation centers face challenges in matching supply with demand while managing donor availability. Althoug