3,169 articles

📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,169 articles · Updated every 3 hours · View all news

All ⚡ AI Lessons (8687) ArXiv cs.AIForbes InnovationOpenAI NewsDev.to AIHugging Face BlogHackernoon
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago
Collaborative AI Agents and Critics for Fault Detection and Cause Analysis in Network Telemetry
arXiv:2604.00319v1 Announce Type: new Abstract: We develop algorithms for collaborative control of AI agents and critics in a multi-actor, multi-critic federate
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago
Signals: Trajectory Sampling and Triage for Agentic Interactions
arXiv:2604.00356v1 Announce Type: new Abstract: Agentic applications based on large language models increasingly rely on multi-step interaction loops involving
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
In harmony with gpt-oss
arXiv:2604.00362v1 Announce Type: new Abstract: No one has independently reproduced OpenAI's published scores for gpt-oss-20b with tools, because the original p
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Decision-Centric Design for LLM Systems
arXiv:2604.00414v1 Announce Type: new Abstract: LLM systems must make control decisions in addition to generating outputs: whether to answer, clarify, retrieve,
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Self-Routing: Parameter-Free Expert Routing from Hidden States
arXiv:2604.00421v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) layers increase model capacity by activating only a small subset of experts per token,
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Execution-Verified Reinforcement Learning for Optimization Modeling
arXiv:2604.00442v1 Announce Type: new Abstract: Automating optimization modeling with LLMs is a promising path toward scalable decision intelligence, but existi
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Towards Reliable Truth-Aligned Uncertainty Estimation in Large Language Models
arXiv:2604.00445v1 Announce Type: new Abstract: Uncertainty estimation (UE) aims to detect hallucinated outputs of large language models (LLMs) to improve their
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Logarithmic Scores, Power-Law Discoveries: Disentangling Measurement from Coverage in Agent-Based Evaluation
arXiv:2604.00477v1 Announce Type: new Abstract: LLM-based agent judges are an emerging approach to evaluating conversational AI, yet a fundamental uncertainty r
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
The Silicon Mirror: Dynamic Behavioral Gating for Anti-Sycophancy in LLM Agents
arXiv:2604.00478v1 Announce Type: new Abstract: Large Language Models (LLMs) increasingly prioritize user validation over epistemic accuracy-a phenomenon known
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Adaptive Parallel Monte Carlo Tree Search for Efficient Test-time Compute Scaling
arXiv:2604.00510v1 Announce Type: new Abstract: Monte Carlo Tree Search (MCTS) is an effective test-time compute scaling (TTCS) method for improving the reasoni
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Does Unification Come at a Cost? Uni-SafeBench: A Safety Benchmark for Unified Multimodal Large Models
arXiv:2604.00547v1 Announce Type: new Abstract: Unified Multimodal Large Models (UMLMs) integrate understanding and generation capabilities within a single arch
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
BloClaw: An Omniscient, Multi-Modal Agentic Workspace for Next-Generation Scientific Discovery
arXiv:2604.00550v1 Announce Type: new Abstract: The integration of Large Language Models (LLMs) into life sciences has catalyzed the development of "AI Scientis
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Ontology-Constrained Neural Reasoning in Enterprise Agentic Systems: A Neurosymbolic Architecture for Domain-Grounded AI Agents
arXiv:2604.00555v1 Announce Type: new Abstract: Enterprise adoption of Large Language Models (LLMs) is constrained by hallucination, domain drift, and the inabi
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Agent psychometrics: Task-level performance prediction in agentic coding benchmarks
arXiv:2604.00594v1 Announce Type: new Abstract: As the focus in LLM-based coding shifts from static single-step code generation to multi-step agentic interactio
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
CircuitProbe: Predicting Reasoning Circuits in Transformers via Stability Zone Detection
arXiv:2604.00716v1 Announce Type: new Abstract: Transformer language models contain localized reasoning circuits, contiguous layer blocks that improve reasoning
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago
UK AISI Alignment Evaluation Case-Study
arXiv:2604.00788v1 Announce Type: new Abstract: This technical report presents methods developed by the UK AI Security Institute for assessing whether advanced
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
RefineRL: Advancing Competitive Programming with Self-Refinement Reinforcement Learning
arXiv:2604.00790v1 Announce Type: new Abstract: While large language models (LLMs) have demonstrated strong performance on complex reasoning tasks such as compe
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago
Preference Guided Iterated Pareto Referent Optimisation for Accessible Route Planning
arXiv:2604.00795v1 Announce Type: new Abstract: We propose the Preference Guided Iterated Pareto Referent Optimisation (PG-IPRO) for urban route planning for pe
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago
Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants
arXiv:2604.00842v1 Announce Type: new Abstract: Proactive agents that anticipate user needs and autonomously execute tasks hold great promise as digital assista
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Beyond Symbolic Solving: Multi Chain-of-Thought Voting for Geometric Reasoning in Large Language Models
arXiv:2604.00890v1 Announce Type: new Abstract: Geometric Problem Solving (GPS) remains at the heart of enhancing mathematical reasoning in large language model