📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,539 articles · Updated every 3 hours

SafeMind: A Risk-Aware Differentiable Control Framework for Adaptive and Safe Quadruped Locomotion

arXiv:2604.09474v1 Announce Type: cross Abstract: Learning-based quadruped controllers achieve impressive agility but typically lack formal safety guarantees un

ArXiv cs.AI 📄 Paper 19h ago

XFED: Non-Collusive Model Poisoning Attack Against Byzantine-Robust Federated Classifiers

arXiv:2604.09489v1 Announce Type: cross Abstract: Model poisoning attacks pose a significant security threat to Federated Learning (FL). Most existing model poi

ArXiv cs.AI 📄 Paper 19h ago

RecaLLM: Addressing the Lost-in-Thought Phenomenon with Explicit In-Context Retrieval

arXiv:2604.09494v1 Announce Type: cross Abstract: We propose RecaLLM, a set of reasoning language models post-trained to make effective use of long-context info

ArXiv cs.AI 📄 Paper 19h ago

BERT-as-a-Judge: A Robust Alternative to Lexical Methods for Efficient Reference-Based LLM Evaluation

arXiv:2604.09497v1 Announce Type: cross Abstract: Accurate evaluation is central to the large language model (LLM) ecosystem, guiding model selection and downst

ArXiv cs.AI 📄 Paper 19h ago

VISOR: Agentic Visual Retrieval-Augmented Generation via Iterative Search and Over-horizon Reasoning

arXiv:2604.09508v1 Announce Type: cross Abstract: Visual Retrieval-Augmented Generation (VRAG) empowers Vision-Language Models to retrieve and reason over visua

ArXiv cs.AI 📄 Paper 19h ago

Semantic Rate-Distortion for Bounded Multi-Agent Communication: Capacity-Derived Semantic Spaces and the Communication Cost of Alignment

arXiv:2604.09521v1 Announce Type: cross Abstract: When two agents of different computational capacities interact with the same environment, they need not compre

ArXiv cs.AI 📄 Paper 19h ago

Envisioning the Future, One Step at a Time

arXiv:2604.09527v1 Announce Type: cross Abstract: Accurately anticipating how complex, diverse scenes will evolve requires models that represent uncertainty, si

ArXiv cs.AI 📄 Paper 19h ago

VL-Calibration: Decoupled Confidence Calibration for Large Vision-Language Models Reasoning

arXiv:2604.09529v1 Announce Type: cross Abstract: Large Vision Language Models (LVLMs) achieve strong multimodal reasoning but frequently exhibit hallucinations

ArXiv cs.AI 📄 Paper 19h ago

VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images

arXiv:2604.09531v1 Announce Type: cross Abstract: Vision-language models (VLMs) still struggle with visual perception tasks such as spatial understanding and vi

ArXiv cs.AI 📄 Paper 19h ago

Seeing is Believing: Robust Vision-Guided Cross-Modal Prompt Learning under Label Noise

arXiv:2604.09532v1 Announce Type: cross Abstract: Prompt learning is a parameter-efficient approach for vision-language models, yet its robustness under label n

ArXiv cs.AI 📄 Paper 19h ago

Case-Grounded Evidence Verification: A Framework for Constructing Evidence-Sensitive Supervision

arXiv:2604.09537v1 Announce Type: cross Abstract: Evidence-grounded reasoning requires more than attaching retrieved text to a prediction: a model should make d

ArXiv cs.AI 📄 Paper 19h ago

Large Language Models Generate Harmful Content Using a Distinct, Unified Mechanism

arXiv:2604.09544v1 Announce Type: cross Abstract: Large language models (LLMs) undergo alignment training to avoid harmful behaviors, yet the resulting safeguar

ArXiv cs.AI 📄 Paper 19h ago

Bayesian Social Deduction with Graph-Informed Language Models

arXiv:2506.17788v2 Announce Type: replace Abstract: Social reasoning - inferring unobservable beliefs and intentions from partial observations of other agents -

ArXiv cs.AI 📄 Paper 19h ago

ChipSeek: Optimizing Verilog Generation via EDA-Integrated Reinforcement Learning

arXiv:2507.04736v2 Announce Type: replace Abstract: Large Language Models have emerged as powerful tools for automating Register-Transfer Level (RTL) code gener

ArXiv cs.AI 📄 Paper 19h ago

Rethinking Prospect Theory for LLMs: Revealing the Instability of Decision-Making under Epistemic Uncertainty

arXiv:2508.08992v3 Announce Type: replace Abstract: Prospect Theory (PT) models human decision-making behaviour under uncertainty, among which linguistic uncert

ArXiv cs.AI 📄 Paper 19h ago

Interactive Program Synthesis for Modeling Collaborative Physical Activities from Narrated Demonstrations

arXiv:2509.24250v3 Announce Type: replace Abstract: Teaching systems physical tasks is a long-standing goal in HCI, yet most prior work has focused on non colla

ArXiv cs.AI 📄 Paper 19h ago

Chain-in-Tree: Back to Sequential Reasoning in LLM Tree Search

arXiv:2509.25835v4 Announce Type: replace Abstract: Test-time scaling improves large language models (LLMs) on long-horizon reasoning tasks by allocating more c

ArXiv cs.AI 📄 Paper 19h ago

When Identity Skews Debate: Anonymization for Bias-Reduced Multi-Agent Reasoning

arXiv:2510.07517v5 Announce Type: replace Abstract: Multi-agent debate (MAD) aims to improve large language model (LLM) reasoning by letting multiple agents exc

ArXiv cs.AI 📄 Paper 19h ago

Thermally Activated Dual-Modal Adversarial Clothing against AI Surveillance Systems

arXiv:2511.09829v3 Announce Type: replace Abstract: Adversarial patches have emerged as a popular privacy-preserving approach for resisting AI-driven surveillan

ArXiv cs.AI 📄 Paper 19h ago

Sample-Efficient Neurosymbolic Deep Reinforcement Learning

arXiv:2601.02850v2 Announce Type: replace Abstract: Reinforcement Learning (RL) is a well-established framework for sequential decision-making in complex enviro

ArXiv cs.AI 📄 Paper 19h ago

Precomputing Multi-Agent Path Replanning using Temporal Flexibility

arXiv:2601.04884v2 Announce Type: replace Abstract: Executing a multi-agent plan can be challenging when an agent is delayed, because this typically creates con

ArXiv cs.AI 📄 Paper 19h ago

Reasoning Models Will Sometimes Lie About Their Reasoning

arXiv:2601.07663v3 Announce Type: replace Abstract: Hint-based faithfulness evaluations have established that Large Reasoning Models (LRMs) may not say what the

ArXiv cs.AI 📄 Paper 19h ago

The Hot Mess of AI: How Does Misalignment Scale With Model Intelligence and Task Complexity?

arXiv:2601.23045v2 Announce Type: replace Abstract: As AI becomes more capable, we entrust it with more general and consequential tasks. The risks from failure

ArXiv cs.AI 📄 Paper 19h ago

Reasoning in a Combinatorial and Constrained World: Benchmarking LLMs on Natural-Language Combinatorial Optimization

arXiv:2602.02188v2 Announce Type: replace Abstract: While large language models (LLMs) have shown strong performance in math and logic reasoning, their ability