📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 6,601 articles · Updated every 3 hours

Does RLVR Extend Reasoning Boundaries? Investigating Capability Expansion in Vision-Language Models

arXiv:2511.00710v4 Announce Type: replace Abstract: Recent studies posit that Reinforcement Learning with Verifiable Rewards (RLVR) primarily amplifies behavior

ArXiv cs.AI 📄 Paper 1w ago

DecompSR: A dataset for decomposed analyses of compositional multihop spatial reasoning

arXiv:2511.02627v2 Announce Type: replace Abstract: We introduce DecompSR, decomposed spatial reasoning, a large benchmark dataset (over 5m datapoints) and gene

ArXiv cs.AI 📄 Paper 1w ago

Dataset Safety in Autonomous Driving: Requirements, Risks, and Assurance

arXiv:2511.08439v2 Announce Type: replace Abstract: Dataset integrity is fundamental to the safety and reliability of AI systems, especially in autonomous drivi

ArXiv cs.AI 📄 Paper 1w ago

Learning the Value of Value Learning

arXiv:2511.17714v5 Announce Type: replace Abstract: Standard decision frameworks address uncertainty about facts but assume fixed options and values. We extend

ArXiv cs.AI 📄 Paper 1w ago

A Benchmark for Evaluating Outcome-Driven Constraint Violations in Autonomous AI Agents

arXiv:2512.20798v4 Announce Type: replace Abstract: As autonomous AI agents are deployed in high-stakes environments, ensuring their safety has become a paramou

ArXiv cs.AI 📄 Paper 1w ago

No More Stale Feedback: Co-Evolving Critics for Open-World Agent Learning

arXiv:2601.06794v2 Announce Type: replace Abstract: Critique-guided reinforcement learning (RL) has emerged as a powerful paradigm for training LLM agents by au

ArXiv cs.AI 📄 Paper 1w ago

PrivacyReasoner: Can LLM Emulate a Human-like Privacy Mind?

arXiv:2601.09152v2 Announce Type: replace Abstract: Prior work on LLM-based privacy focuses on norm judgment over synthetic vignettes, rather than how people th

ArXiv cs.AI 📄 Paper 1w ago

LatentRefusal: Latent-Signal Refusal for Unanswerable Text-to-SQL Queries

arXiv:2601.10398v3 Announce Type: replace Abstract: In LLM-based text-to-SQL systems, unanswerable and underspecified user queries may generate not only incorre

ArXiv cs.AI 📄 Paper 1w ago

WebFactory: Automated Compression of Foundational Language Intelligence into Grounded Web Agents

arXiv:2603.05044v2 Announce Type: replace Abstract: Current paradigms for training GUI agents are fundamentally limited by a reliance on either unsafe, non-repr

ArXiv cs.AI 📄 Paper 1w ago

WebChain: A Large-Scale Human-Annotated Dataset of Real-World Web Interaction Traces

arXiv:2603.05295v3 Announce Type: replace Abstract: We introduce WebChain, the largest open-source dataset of human-annotated trajectories on real-world website

ArXiv cs.AI 📄 Paper 1w ago

A Survey of Multimodal Mathematical Reasoning: From Perception, Alignment to Reasoning

arXiv:2603.08291v3 Announce Type: replace Abstract: Multimodal Mathematical Reasoning (MMR) has recently attracted increasing attention for its capability to so

ArXiv cs.AI 📄 Paper 1w ago

Reasoning Graphs: Self-Improving, Deterministic RAG through Evidence-Centric Feedback

arXiv:2604.07595v2 Announce Type: replace Abstract: Language model agents reason from scratch on every query, discarding their chain of thought after each run.

ArXiv cs.AI 📄 Paper 1w ago

Pictorial and apictorial polygonal jigsaw puzzles from arbitrary number of crossing cuts

arXiv:2008.07644v3 Announce Type: replace-cross Abstract: Jigsaw puzzle solving, the problem of constructing a coherent whole from a set of non-overlapping unor

ArXiv cs.AI 📄 Paper 1w ago

Prompt Evolution for Generative AI: A Classifier-Guided Approach

arXiv:2305.16347v2 Announce Type: replace-cross Abstract: Synthesis of digital artifacts conditioned on user prompts has become an important paradigm facilitati

ArXiv cs.AI 📄 Paper 1w ago

A2-DIDM: Privacy-preserving Accumulator-enabled Auditing for Distributed Identity of DNN Model

arXiv:2405.04108v2 Announce Type: replace-cross Abstract: Recent booming development of Generative Artificial Intelligence (GenAI) has facilitated model commerc

ArXiv cs.AI 📄 Paper 1w ago

OmniHands: Towards Robust 4D Hand Mesh Recovery via A Versatile Transformer

arXiv:2405.20330v4 Announce Type: replace-cross Abstract: In this paper, we introduce OmniHands, a universal approach to recovering interactive hand meshes and

ArXiv cs.AI 📄 Paper 1w ago

animal2vec and MeerKAT: A self-supervised transformer for rare-event raw audio input and a large-scale reference dataset for bioacoustics

arXiv:2406.01253v3 Announce Type: replace-cross Abstract: Bioacoustic research, vital for understanding animal behavior, conservation, and ecology, faces a monu

ArXiv cs.AI 📄 Paper 1w ago

AdaMCoT: Rethinking Cross-Lingual Factual Reasoning through Adaptive Multilingual Chain-of-Thought

arXiv:2501.16154v4 Announce Type: replace-cross Abstract: Large language models (LLMs) have shown impressive multilingual capabilities through pretraining on di

ArXiv cs.AI 📄 Paper 1w ago

RegD: Hierarchical Embeddings via Dissimilarity between Arbitrary Euclidean Regions

arXiv:2501.17518v3 Announce Type: replace-cross Abstract: Hierarchical data is common in many domains like life sciences and e-commerce, and its embeddings ofte

ArXiv cs.AI 📄 Paper 1w ago

Large Language Models are Powerful Electronic Health Record Encoders

arXiv:2502.17403v5 Announce Type: replace-cross Abstract: Electronic Health Records (EHRs) offer considerable potential for clinical prediction, but their compl

ArXiv cs.AI 📄 Paper 1w ago

Siamese Foundation Models for Crystal Structure Prediction

arXiv:2503.10471v2 Announce Type: replace-cross Abstract: Predicting crystal structures from chemical compositions is a fundamental challenge in materials disco

ArXiv cs.AI 📄 Paper 1w ago

Fine-Tuning LLMs for Report Summarization: Analysis on Supervised and Unsupervised Data

arXiv:2503.10676v2 Announce Type: replace-cross Abstract: We study the efficacy of fine-tuning Large Language Models (LLMs) for the specific task of report (gov

ArXiv cs.AI 📄 Paper 1w ago

Characterizing higher-order representations through generative diffusion models explains human decoded neurofeedback performance

arXiv:2503.14333v4 Announce Type: replace-cross Abstract: Brains construct not only "first-order" representations of the environment but also "higher-order" rep

ArXiv cs.AI 📄 Paper 1w ago

On the Mathematical Relationship Between Layer Normalization and Dynamic Activation Functions

arXiv:2503.21708v4 Announce Type: replace-cross Abstract: Layer normalization (LN) is an essential component of modern neural networks. While many alternative t