📰 ArXiv cs.AI
Articles from ArXiv cs.AI · 3,169 articles · Updated every 3 hours
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
⚡ AI Lesson
1w ago
The Recipe Matters More Than the Kitchen: Mathematical Foundations of the AI Weather Prediction Pipeline
arXiv:2604.01215v1 Announce Type: cross Abstract: AI weather prediction has advanced rapidly, yet no unified mathematical framework explains what determines for
ArXiv cs.AI
📐 ML Fundamentals
📄 Paper
⚡ AI Lesson
1w ago
LAtent Phase Inference from Short time sequences using SHallow REcurrent Decoders (LAPIS-SHRED)
arXiv:2604.01216v1 Announce Type: cross Abstract: Reconstructing full spatio-temporal dynamics from sparse observations in both space and time remains a central
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
Code Comprehension then Auditing for Unsupervised LLM Evaluation
arXiv:2410.03131v4 Announce Type: replace Abstract: Large Language Models (LLMs) for unsupervised code correctness evaluation have recently gained attention bec
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG
arXiv:2501.09136v4 Announce Type: replace Abstract: Large Language Models (LLMs) have advanced artificial intelligence by enabling human-like text generation an
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
Teaching AI to Handle Exceptions: Supervised Fine-Tuning with Human-Aligned Judgment
arXiv:2503.02976v3 Announce Type: replace Abstract: Large language models (LLMs), initially developed for generative AI, are now evolving into agentic AI system
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
Mitigating Content Effects on Reasoning in Language Models through Fine-Grained Activation Steering
arXiv:2505.12189v3 Announce Type: replace Abstract: Large language models (LLMs) exhibit reasoning biases, often conflating content plausibility with formal log
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
LocationReasoner: Evaluating LLMs on Real-World Site Selection Reasoning
arXiv:2506.13841v3 Announce Type: replace Abstract: Recent advances in large language models (LLMs), particularly those enhanced through reinforced post-trainin
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
HiMA-Ecom: Enabling Joint Training of Hierarchical Multi-Agent E-commerce Assistants
arXiv:2506.19846v2 Announce Type: replace Abstract: Hierarchical multi-agent systems based on large language models (LLMs) have become a common paradigm for bui
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
Auto-Formulating Dynamic Programming Problems with Large Language Models
arXiv:2507.11737v2 Announce Type: replace Abstract: Dynamic programming (DP) is a fundamental method in operations research, but formulating DP models has tradi
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
Retrieval-of-Thought: Efficient Reasoning via Reusing Thoughts
arXiv:2509.21743v2 Announce Type: replace Abstract: Large reasoning models improve accuracy by producing long reasoning traces, but this inflates latency and co
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
Dive into the Agent Matrix: A Realistic Evaluation of Self-Replication Risk in LLM Agents
arXiv:2509.25302v2 Announce Type: replace Abstract: The prevalent deployment of Large Language Model agents such as OpenClaw unlocks potential in real-world app
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
Genesis: Evolving Attack Strategies for LLM Web Agent Red-Teaming
arXiv:2510.18314v2 Announce Type: replace Abstract: As large language model (LLM) agents increasingly automate complex web tasks, they boost productivity while
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
EHRStruct: A Comprehensive Benchmark Framework for Evaluating Large Language Models on Structured Electronic Health Record Tasks
arXiv:2511.08206v4 Announce Type: replace Abstract: Structured Electronic Health Record (EHR) data stores patient information in relational tables and plays a c
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
Distilling the Thought, Watermarking the Answer: A Principle Semantic Guided Watermark for Large Reasoning Models
arXiv:2601.05144v2 Announce Type: replace Abstract: Reasoning Large Language Models (RLLMs) excelling in complex tasks present unique challenges for digital wat
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
Finite-State Controllers for (Hidden-Model) POMDPs using Deep Reinforcement Learning
arXiv:2602.08734v2 Announce Type: replace Abstract: Solving partially observable Markov decision processes (POMDPs) requires computing policies under imperfect
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
Meta-Learning and Meta-Reinforcement Learning -- Tracing the Path towards DeepMind's Adaptive Agent
arXiv:2602.19837v2 Announce Type: replace Abstract: Humans are highly effective at utilizing prior knowledge to adapt to novel tasks, a capability that standard
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
Epistemic Filtering and Collective Hallucination: A Jury Theorem for Confidence-Calibrated Agents
arXiv:2602.22413v2 Announce Type: replace Abstract: We investigate the collective accuracy of heterogeneous agents who learn to estimate their own reliability o
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
When Agents Persuade: Rhetoric Generation and Mitigation in LLMs
arXiv:2603.04636v2 Announce Type: replace Abstract: Despite their wide-ranging benefits, LLM-based agents deployed in open environments can be exploited to prod
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
⚡ AI Lesson
1w ago
Semi-Autonomous Formalization of the Vlasov-Maxwell-Landau Equilibrium
arXiv:2603.15929v2 Announce Type: replace Abstract: We present a complete Lean 4 formalization of the equilibrium characterization in the Vlasov-Maxwell-Landau
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
⚡ AI Lesson
1w ago
Ego-Foresight: Self-supervised Learning of Agent-Aware Representations for Improved RL
arXiv:2407.01570v4 Announce Type: replace-cross Abstract: Despite the significant advances in Deep Reinforcement Learning (RL) observed in the last decade, the
ArXiv cs.AI
🛡️ AI Safety & Ethics
📄 Paper
⚡ AI Lesson
1w ago
A Divide-and-Conquer Strategy for Hard-Label Extraction of Deep Neural Networks via Side-Channel Attacks
arXiv:2411.10174v2 Announce Type: replace-cross Abstract: During the past decade, Deep Neural Networks (DNNs) proved their value on a large variety of subjects.
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1w ago
Cross-Camera Distracted Driver Classification through Feature Disentanglement and Contrastive Learning
arXiv:2411.13181v3 Announce Type: replace-cross Abstract: The classification of distracted drivers is pivotal for ensuring safe driving. Previous studies demons
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
⚡ AI Lesson
1w ago
Enhancing Team Diversity with Generative AI: A Novel Project Management Framework
arXiv:2502.05181v2 Announce Type: replace-cross Abstract: This research-in-progress paper presents a new project management framework that utilises GenAI techno
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
1w ago
How Blind and Low-Vision Individuals Prefer Large Vision-Language Model-Generated Scene Descriptions
arXiv:2502.14883v3 Announce Type: replace-cross Abstract: For individuals with blindness or low vision (BLV), navigating complex environments can pose serious r
DeepCamp AI