📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,273 articles · Updated every 3 hours

arXiv:2603.28015v1 Announce Type: new Abstract: Deep learning models for drug-like molecules and proteins overwhelmingly reuse transformer architectures designe

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

When Choices Become Priors: Contrastive Decoding for Scientific Figure Multiple-Choice QA

arXiv:2603.28026v1 Announce Type: new Abstract: Scientific figure multiple-choice question answering (MCQA) requires models to reason over diverse visual eviden

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Beyond the Answer: Decoding the Behavior of LLMs as Scientific Reasoners

arXiv:2603.28038v1 Announce Type: new Abstract: As Large Language Models (LLMs) achieve increasingly sophisticated performance on complex reasoning tasks, curre

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

Dogfight Search: A Swarm-Based Optimization Algorithm for Complex Engineering Optimization and Mountainous Terrain Path Planning

arXiv:2603.28046v1 Announce Type: new Abstract: Dogfight is a tactical behavior of cooperation between fighters. Inspired by this, this paper proposes a novel m

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Meta-Harness: End-to-End Optimization of Model Harnesses

arXiv:2603.28052v1 Announce Type: new Abstract: The performance of large language model (LLM) systems depends not only on model weights, but also on their harne

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

SLOW: Strategic Logical-inference Open Workspace for Cognitive Adaptation in AI Tutoring

arXiv:2603.28062v1 Announce Type: new Abstract: While Large Language Models (LLMs) have demonstrated remarkable fluency in educational dialogues, most generativ

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

Reward Hacking as Equilibrium under Finite Evaluation

arXiv:2603.28063v1 Announce Type: new Abstract: We prove that under five minimal axioms -- multi-dimensional quality, finite evaluation, effective optimization,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

CoT2-Meta: Budgeted Metacognitive Control for Test-Time Reasoning

arXiv:2603.28135v1 Announce Type: new Abstract: Recent test-time reasoning methods improve performance by generating more candidate chains or searching over lar

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

PReD: An LLM-based Foundation Multimodal Model for Electromagnetic Perception, Recognition, and Decision

arXiv:2603.28183v1 Announce Type: new Abstract: Multimodal Large Language Models have demonstrated powerful cross-modal understanding and reasoning capabilities

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

EpiPersona: Persona Projection and Episode Coupling for Pluralistic Preference Modeling

arXiv:2603.28197v1 Announce Type: new Abstract: Pluralistic alignment is essential for adapting large language models (LLMs) to the diverse preferences of indiv

ArXiv cs.AI 🛠️ AI Tools & Apps 📄 Paper ⚡ AI Lesson 1w ago

Differentiable Power-Flow Optimization

arXiv:2603.28203v1 Announce Type: new Abstract: With the rise of renewable energy sources and their high variability in generation, the management of power grid

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Reasoning as Energy Minimization over Structured Latent Trajectories

arXiv:2603.28248v1 Announce Type: new Abstract: Single-shot neural decoders commit to answers without iterative refinement, while chain-of-thought methods intro

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Evaluating LLMs for Answering Student Questions in Introductory Programming Courses

arXiv:2603.28295v1 Announce Type: new Abstract: The rapid emergence of Large Language Models (LLMs) presents both opportunities and challenges for programming e

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

A Multi-Agent Rhizomatic Pipeline for Non-Linear Literature Analysis

arXiv:2603.28336v1 Announce Type: new Abstract: Systematic literature reviews in the social sciences overwhelmingly follow arborescent logics -- hierarchical ke

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

CoE: Collaborative Entropy for Uncertainty Quantification in Agentic Multi-LLM Systems

arXiv:2603.28360v1 Announce Type: new Abstract: Uncertainty estimation in multi-LLM systems remains largely single-model-centric: existing methods quantify unce

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Deep Research of Deep Research: From Transformer to Agent, From AI to AI for Science

arXiv:2603.28361v1 Announce Type: new Abstract: With the advancement of large language models (LLMs) in their knowledge base and reasoning capabilities, their i

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

COvolve: Adversarial Co-Evolution of Large-Language-Model-Generated Policies and Environments via Two-Player Zero-Sum Game

arXiv:2603.28386v1 Announce Type: new Abstract: A central challenge in building continually improving agents is that training environments are typically static

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

The Scaffold Effect: How Prompt Framing Drives Apparent Multimodal Gains in Clinical VLM Evaluation

arXiv:2603.28387v1 Announce Type: new Abstract: Trustworthy clinical AI requires that performance gains reflect genuine evidence integration rather than surface

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome

arXiv:2603.28407v1 Announce Type: new Abstract: Recent progress in deep research systems has been impressive, but evaluation still lags behind real user needs.

ArXiv cs.AI 🔍 RAG & Vector Search 📄 Paper ⚡ AI Lesson 1w ago

Entropic Claim Resolution: Uncertainty-Driven Evidence Selection for RAG

arXiv:2603.28444v1 Announce Type: new Abstract: Current Retrieval-Augmented Generation (RAG) systems predominantly rely on relevance-based dense retrieval, sequ

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

T-Norm Operators for EU AI Act Compliance Classification: An Empirical Comparison of Lukasiewicz, Product, and Gödel Semantics in a Neuro-Symbolic Reasoning System

arXiv:2603.28558v1 Announce Type: new Abstract: We present a first comparative pilot study of three t-norm operators -- Lukasiewicz (T_L), Product (T_P), and G\

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1w ago

Towards a Medical AI Scientist

arXiv:2603.28589v1 Announce Type: new Abstract: Autonomous systems that generate scientific hypotheses, conduct experiments, and draft manuscripts have recently

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

MonitorBench: A Comprehensive Benchmark for Chain-of-Thought Monitorability in Large Language Models

arXiv:2603.28590v1 Announce Type: new Abstract: Large language models (LLMs) can generate chains of thought (CoTs) that are not always causally responsible for

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Seeing with You: Perception-Reasoning Coevolution for Multimodal Reasoning

arXiv:2603.28618v1 Announce Type: new Abstract: Reinforcement learning with verifiable rewards (RLVR) has substantially enhanced the reasoning capabilities of m