📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,169 articles · Updated every 3 hours

arXiv:2604.01215v1 Announce Type: cross Abstract: AI weather prediction has advanced rapidly, yet no unified mathematical framework explains what determines for

ArXiv cs.AI 📐 ML Fundamentals 📄 Paper ⚡ AI Lesson 1w ago

LAtent Phase Inference from Short time sequences using SHallow REcurrent Decoders (LAPIS-SHRED)

arXiv:2604.01216v1 Announce Type: cross Abstract: Reconstructing full spatio-temporal dynamics from sparse observations in both space and time remains a central

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Code Comprehension then Auditing for Unsupervised LLM Evaluation

arXiv:2410.03131v4 Announce Type: replace Abstract: Large Language Models (LLMs) for unsupervised code correctness evaluation have recently gained attention bec

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG

arXiv:2501.09136v4 Announce Type: replace Abstract: Large Language Models (LLMs) have advanced artificial intelligence by enabling human-like text generation an

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Teaching AI to Handle Exceptions: Supervised Fine-Tuning with Human-Aligned Judgment

arXiv:2503.02976v3 Announce Type: replace Abstract: Large language models (LLMs), initially developed for generative AI, are now evolving into agentic AI system

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Mitigating Content Effects on Reasoning in Language Models through Fine-Grained Activation Steering

arXiv:2505.12189v3 Announce Type: replace Abstract: Large language models (LLMs) exhibit reasoning biases, often conflating content plausibility with formal log

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

LocationReasoner: Evaluating LLMs on Real-World Site Selection Reasoning

arXiv:2506.13841v3 Announce Type: replace Abstract: Recent advances in large language models (LLMs), particularly those enhanced through reinforced post-trainin

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

HiMA-Ecom: Enabling Joint Training of Hierarchical Multi-Agent E-commerce Assistants

arXiv:2506.19846v2 Announce Type: replace Abstract: Hierarchical multi-agent systems based on large language models (LLMs) have become a common paradigm for bui

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Auto-Formulating Dynamic Programming Problems with Large Language Models

arXiv:2507.11737v2 Announce Type: replace Abstract: Dynamic programming (DP) is a fundamental method in operations research, but formulating DP models has tradi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Retrieval-of-Thought: Efficient Reasoning via Reusing Thoughts

arXiv:2509.21743v2 Announce Type: replace Abstract: Large reasoning models improve accuracy by producing long reasoning traces, but this inflates latency and co

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Dive into the Agent Matrix: A Realistic Evaluation of Self-Replication Risk in LLM Agents

arXiv:2509.25302v2 Announce Type: replace Abstract: The prevalent deployment of Large Language Model agents such as OpenClaw unlocks potential in real-world app

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Genesis: Evolving Attack Strategies for LLM Web Agent Red-Teaming

arXiv:2510.18314v2 Announce Type: replace Abstract: As large language model (LLM) agents increasingly automate complex web tasks, they boost productivity while

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

EHRStruct: A Comprehensive Benchmark Framework for Evaluating Large Language Models on Structured Electronic Health Record Tasks

arXiv:2511.08206v4 Announce Type: replace Abstract: Structured Electronic Health Record (EHR) data stores patient information in relational tables and plays a c

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Distilling the Thought, Watermarking the Answer: A Principle Semantic Guided Watermark for Large Reasoning Models

arXiv:2601.05144v2 Announce Type: replace Abstract: Reasoning Large Language Models (RLLMs) excelling in complex tasks present unique challenges for digital wat

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Finite-State Controllers for (Hidden-Model) POMDPs using Deep Reinforcement Learning

arXiv:2602.08734v2 Announce Type: replace Abstract: Solving partially observable Markov decision processes (POMDPs) requires computing policies under imperfect

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Meta-Learning and Meta-Reinforcement Learning -- Tracing the Path towards DeepMind's Adaptive Agent

arXiv:2602.19837v2 Announce Type: replace Abstract: Humans are highly effective at utilizing prior knowledge to adapt to novel tasks, a capability that standard

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Epistemic Filtering and Collective Hallucination: A Jury Theorem for Confidence-Calibrated Agents

arXiv:2602.22413v2 Announce Type: replace Abstract: We investigate the collective accuracy of heterogeneous agents who learn to estimate their own reliability o

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

When Agents Persuade: Rhetoric Generation and Mitigation in LLMs

arXiv:2603.04636v2 Announce Type: replace Abstract: Despite their wide-ranging benefits, LLM-based agents deployed in open environments can be exploited to prod

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

Semi-Autonomous Formalization of the Vlasov-Maxwell-Landau Equilibrium

arXiv:2603.15929v2 Announce Type: replace Abstract: We present a complete Lean 4 formalization of the equilibrium characterization in the Vlasov-Maxwell-Landau

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

Ego-Foresight: Self-supervised Learning of Agent-Aware Representations for Improved RL

arXiv:2407.01570v4 Announce Type: replace-cross Abstract: Despite the significant advances in Deep Reinforcement Learning (RL) observed in the last decade, the

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 1w ago

A Divide-and-Conquer Strategy for Hard-Label Extraction of Deep Neural Networks via Side-Channel Attacks

arXiv:2411.10174v2 Announce Type: replace-cross Abstract: During the past decade, Deep Neural Networks (DNNs) proved their value on a large variety of subjects.

ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 1w ago

Cross-Camera Distracted Driver Classification through Feature Disentanglement and Contrastive Learning

arXiv:2411.13181v3 Announce Type: replace-cross Abstract: The classification of distracted drivers is pivotal for ensuring safe driving. Previous studies demons

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

Enhancing Team Diversity with Generative AI: A Novel Project Management Framework

arXiv:2502.05181v2 Announce Type: replace-cross Abstract: This research-in-progress paper presents a new project management framework that utilises GenAI techno

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

How Blind and Low-Vision Individuals Prefer Large Vision-Language Model-Generated Scene Descriptions

arXiv:2502.14883v3 Announce Type: replace-cross Abstract: For individuals with blindness or low vision (BLV), navigating complex environments can pose serious r