3,273 articles

📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,273 articles · Updated every 3 hours · View all news

All ⚡ AI Lessons (8687) ArXiv cs.AIForbes InnovationOpenAI NewsDev.to AIHugging Face BlogHackernoon
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Online Reasoning Calibration: Test-Time Training Enables Generalizable Conformal LLM Reasoning
arXiv:2604.01170v1 Announce Type: cross Abstract: While test-time scaling has enabled large language models to solve highly difficult tasks, state-of-the-art re
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Screening Is Enough
arXiv:2604.01178v1 Announce Type: cross Abstract: A core limitation of standard softmax attention is that it does not define a notion of absolute query--key rel
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago
A ROS 2 Wrapper for Florence-2: Multi-Mode Local Vision-Language Inference for Robotic Systems
arXiv:2604.01179v1 Announce Type: cross Abstract: Foundation vision-language models are becoming increasingly relevant to robotics because they can provide rich
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
ORBIT: Scalable and Verifiable Data Generation for Search Agents on a Tight Budget
arXiv:2604.01195v1 Announce Type: cross Abstract: Search agents, which integrate language models (LMs) with web search, are becoming crucial for answering compl
ArXiv cs.AI 💻 AI-Assisted Coding 📄 Paper ⚡ AI Lesson 1w ago
Neural Harmonic Textures for High-Quality Primitive Based Neural Reconstruction
arXiv:2604.01204v1 Announce Type: cross Abstract: Primitive-based methods such as 3D Gaussian Splatting have recently become the state-of-the-art for novel-view
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago
CliffSearch: Structured Agentic Co-Evolution over Theory and Code for Scientific Algorithm Discovery
arXiv:2604.01210v1 Announce Type: cross Abstract: Scientific algorithm discovery is iterative: hypotheses are proposed, implemented, stress-tested, and revised.
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago
$\texttt{YC-Bench}$: Benchmarking AI Agents for Long-Term Planning and Consistent Execution
arXiv:2604.01212v1 Announce Type: cross Abstract: As LLM agents tackle increasingly complex tasks, a critical question is whether they can maintain strategic co
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago
The Recipe Matters More Than the Kitchen:Mathematical Foundations of the AI Weather Prediction Pipeline
arXiv:2604.01215v1 Announce Type: cross Abstract: AI weather prediction has advanced rapidly, yet no unified mathematical framework explains what determines for
ArXiv cs.AI 📐 ML Fundamentals 📄 Paper ⚡ AI Lesson 1w ago
LAtent Phase Inference from Short time sequences using SHallow REcurrent Decoders (LAPIS-SHRED)
arXiv:2604.01216v1 Announce Type: cross Abstract: Reconstructing full spatio-temporal dynamics from sparse observations in both space and time remains a central
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Code Comprehension then Auditing for Unsupervised LLM Evaluation
arXiv:2410.03131v4 Announce Type: replace Abstract: Large Language Models (LLMs) for unsupervised code correctness evaluation have recently gained attention bec
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG
arXiv:2501.09136v4 Announce Type: replace Abstract: Large Language Models (LLMs) have advanced artificial intelligence by enabling human-like text generation an
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Teaching AI to Handle Exceptions: Supervised Fine-Tuning with Human-Aligned Judgment
arXiv:2503.02976v3 Announce Type: replace Abstract: Large language models (LLMs), initially developed for generative AI, are now evolving into agentic AI system
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Mitigating Content Effects on Reasoning in Language Models through Fine-Grained Activation Steering
arXiv:2505.12189v3 Announce Type: replace Abstract: Large language models (LLMs) exhibit reasoning biases, often conflating content plausibility with formal log
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
LocationReasoner: Evaluating LLMs on Real-World Site Selection Reasoning
arXiv:2506.13841v3 Announce Type: replace Abstract: Recent advances in large language models (LLMs), particularly those enhanced through reinforced post-trainin
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
HiMA-Ecom: Enabling Joint Training of Hierarchical Multi-Agent E-commerce Assistants
arXiv:2506.19846v2 Announce Type: replace Abstract: Hierarchical multi-agent systems based on large language models (LLMs) have become a common paradigm for bui
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Auto-Formulating Dynamic Programming Problems with Large Language Models
arXiv:2507.11737v2 Announce Type: replace Abstract: Dynamic programming (DP) is a fundamental method in operations research, but formulating DP models has tradi
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Retrieval-of-Thought: Efficient Reasoning via Reusing Thoughts
arXiv:2509.21743v2 Announce Type: replace Abstract: Large reasoning models improve accuracy by producing long reasoning traces, but this inflates latency and co
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Dive into the Agent Matrix: A Realistic Evaluation of Self-Replication Risk in LLM Agents
arXiv:2509.25302v2 Announce Type: replace Abstract: The prevalent deployment of Large Language Model agents such as OpenClaw unlocks potential in real-world app
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
Genesis: Evolving Attack Strategies for LLM Web Agent Red-Teaming
arXiv:2510.18314v2 Announce Type: replace Abstract: As large language model (LLM) agents increasingly automate complex web tasks, they boost productivity while
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago
EHRStruct: A Comprehensive Benchmark Framework for Evaluating Large Language Models on Structured Electronic Health Record Tasks
arXiv:2511.08206v4 Announce Type: replace Abstract: Structured Electronic Health Record (EHR) data stores patient information in relational tables and plays a c