📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 1,234 articles · Updated every 3 hours

arXiv:2604.03232v1 Announce Type: new Abstract: IC3, also known as property-directed reachability (PDR), is a commonly-used algorithm for hardware safety model

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 9h ago

Structural Segmentation of the Minimum Set Cover Problem: Exploiting Universe Decomposability for Metaheuristic Optimization

arXiv:2604.03234v1 Announce Type: new Abstract: The Minimum Set Cover Problem (MSCP) is a classical NP-hard combinatorial optimization problem with numerous app

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 9h ago

To Throw a Stone with Six Birds: On Agents and Agenthood

arXiv:2604.03239v1 Announce Type: new Abstract: Six Birds Theory (SBT) treats macroscopic objects as induced closures rather than primitives. Empirical discussi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 9h ago

Position: Science of AI Evaluation Requires Item-level Benchmark Data

arXiv:2604.03244v1 Announce Type: new Abstract: AI evaluations have become the primary evidence for deploying generative AI systems across high-stakes domains.

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 9h ago

Toward Full Autonomous Laboratory Instrumentation Control with Large Language Models

arXiv:2604.03286v1 Announce Type: new Abstract: The control of complex laboratory instrumentation often requires significant programming expertise, creating a b

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 9h ago

Evaluating Artificial Intelligence Through a Christian Understanding of Human Flourishing

arXiv:2604.03356v1 Announce Type: new Abstract: Artificial intelligence (AI) alignment is fundamentally a formation problem, not only a safety problem. As Large

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 9h ago

VERT: Reliable LLM Judges for Radiology Report Evaluation

arXiv:2604.03376v1 Announce Type: new Abstract: Current literature on radiology report evaluation has focused primarily on designing LLM-based metrics and fine-

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 9h ago

Hume's Representational Conditions for Causal Judgment: What Bayesian Formalization Abstracted Away

arXiv:2604.03387v1 Announce Type: new Abstract: Hume's account of causal judgment presupposes three representational conditions: experiential grounding (ideas m

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 9h ago

TABQAWORLD: Optimizing Multimodal Reasoning for Multi-Turn Table Question Answering

arXiv:2604.03393v1 Announce Type: new Abstract: Multimodal reasoning has emerged as a powerful framework for enhancing reasoning capabilities of reasoning model

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 9h ago

Contextual Control without Memory Growth in a Context-Switching Task

arXiv:2604.03479v1 Announce Type: new Abstract: Context-dependent sequential decision making is commonly addressed either by providing context explicitly as an

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1d ago

Holos: A Web-Scale LLM-Based Multi-Agent System for the Agentic Web

arXiv:2604.02334v1 Announce Type: new Abstract: As large language model (LLM)-driven agents transition from isolated task solvers to persistent digital entitie

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1d ago

Xpertbench: Expert Level Tasks with Rubrics-Based Evaluation

arXiv:2604.02368v1 Announce Type: new Abstract: As Large Language Models (LLMs) exhibit plateauing performance on conventional benchmarks, a pivotal challenge p

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1d ago

Compositional Neuro-Symbolic Reasoning

arXiv:2604.02434v1 Announce Type: new Abstract: We study structured abstraction-based reasoning for the Abstraction and Reasoning Corpus (ARC) and compare its g

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1d ago

Understanding the Nature of Generative AI as Threshold Logic in High-Dimensional Space

arXiv:2604.02476v1 Announce Type: new Abstract: This paper examines the role of threshold logic in understanding generative artificial intelligence. Threshold f

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1d ago

AIVV: Neuro-Symbolic LLM Agent-Integrated Verification and Validation for Trustworthy Autonomous Systems

arXiv:2604.02478v1 Announce Type: new Abstract: Deep learning models excel at detecting anomaly patterns in normal data. However, they do not provide a direct s

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1d ago

I must delete the evidence: AI Agents Explicitly Cover up Fraud and Violent Crime

arXiv:2604.02500v1 Announce Type: new Abstract: As ongoing research explores the ability of AI agents to be insider threats and act against company interests, w

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1d ago

Interpretable Deep Reinforcement Learning for Element-level Bridge Life-cycle Optimization

arXiv:2604.02528v1 Announce Type: new Abstract: The new Specifications for the National Bridge Inventory (SNBI), in effect from 2022, emphasize the use of eleme

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1d ago

Competency Questions as Executable Plans: a Controlled RAG Architecture for Cultural Heritage Storytelling

arXiv:2604.02545v1 Announce Type: new Abstract: The preservation of intangible cultural heritage is a critical challenge as collective memory fades over time. W

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1d ago

Mitigating LLM biases toward spurious social contexts using direct preference optimization

arXiv:2604.02585v1 Announce Type: new Abstract: LLMs are increasingly used for high-stakes decision-making, yet their sensitivity to spurious contextual informa

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1d ago

Do Audio-Visual Large Language Models Really See and Hear?

arXiv:2604.02605v1 Announce Type: new Abstract: Audio-Visual Large Language Models (AVLLMs) are emerging as unified interfaces to multimodal perception. We pres

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1d ago

AutoVerifier: An Agentic Automated Verification Framework Using Large Language Models

arXiv:2604.02617v1 Announce Type: new Abstract: Scientific and Technical Intelligence (S&TI) analysis requires verifying complex technical claims across rapidly

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

BeSafe-Bench: Unveiling Behavioral Safety Risks of Situated Agents in Functional Environments

arXiv:2603.25747v1 Announce Type: new Abstract: The rapid evolution of Large Multimodal Models (LMMs) has enabled agents to perform complex digital and physical

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

AutoB2G: A Large Language Model-Driven Agentic Framework For Automated Building-Grid Co-Simulation

arXiv:2603.26005v1 Announce Type: new Abstract: The growing availability of building operational data motivates the use of reinforcement learning (RL), which ca

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1w ago

Semi-Automated Knowledge Engineering and Process Mapping for Total Airport Management

arXiv:2603.26076v1 Announce Type: new Abstract: Documentation of airport operations is inherently complex due to extensive technical terminology, rigorous regul