📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,169 articles · Updated every 3 hours

arXiv:2311.00855v3 Announce Type: replace Abstract: Human immunodeficiency virus (HIV) is a major public health concern in the United States (U.S.), with about

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

Barriers to Complexity-Theoretic Proofs that "AGI" Using Machine Learning is Impossible

arXiv:2411.06498v2 Announce Type: replace Abstract: A recent paper (van Rooij et al. 2024) claims to have proved that achieving human-like intelligence using le

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 2d ago

Representation learning to advance multi-institutional studies with electronic health record data from US and France

arXiv:2502.08547v2 Announce Type: replace Abstract: The widespread adoption of electronic health records has created new opportunities for translational clinica

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

Reflection of Episodes: Learning to Play Game from Expert and Self Experiences

arXiv:2502.13388v3 Announce Type: replace Abstract: StarCraft II is a complex and dynamic real-time strategy (RTS) game environment, which is very suitable for

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

Cite Pretrain: Retrieval-Free Knowledge Attribution for Large Language Models

arXiv:2506.17585v3 Announce Type: replace Abstract: Trustworthy language models should provide both correct and verifiable answers. However, citations generated

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago

Seemingly Simple Planning Problems are Computationally Challenging: The Countdown Game

arXiv:2508.02900v2 Announce Type: replace Abstract: There is a broad consensus that the inability to form long-term plans is one of the key limitations of curre

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

Similarity Field Theory: A Mathematical Framework for Intelligence

arXiv:2509.18218v5 Announce Type: replace Abstract: We posit that transforming similarity relations form the structural basis of comprehensible dynamic systems.

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

Autonomous Agents for Scientific Discovery: Orchestrating Scientists, Language, Code, and Physics

arXiv:2510.09901v2 Announce Type: replace Abstract: Computing has long served as a cornerstone of scientific discovery. Recently, a paradigm shift has emerged w

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago

Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges

arXiv:2510.23883v3 Announce Type: replace Abstract: Agentic AI systems powered by large language models (LLMs) and endowed with planning, tool use, memory, and

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

PRISM: Prompt-Refined In-Context System Modelling for Financial Retrieval

arXiv:2511.14130v2 Announce Type: replace Abstract: With the rapid progress of large language models (LLMs), financial information retrieval has become a critic

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago

An Agent-Based Framework for the Automatic Validation of Mathematical Optimization Models

arXiv:2511.16383v2 Announce Type: replace Abstract: Recently, using Large Language Models (LLMs) to generate optimization models from natural language descripti

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago

Agile Deliberation: Concept Deliberation for Subjective Visual Classification

arXiv:2512.10821v2 Announce Type: replace Abstract: From content moderation to content curation, applications requiring vision classifiers for visual concepts a

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago

Finch: Benchmarking Finance & Accounting across Spreadsheet-Centric Enterprise Workflows

arXiv:2512.13168v4 Announce Type: replace Abstract: We introduce FinWorkBench (a.k.a. Finch), a benchmark for evaluating agents on real-world, enterprise-grade

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

The Drill-Down and Fabricate Test (DDFT): A Protocol for Measuring Epistemic Robustness in Language Models

arXiv:2512.23850v2 Announce Type: replace Abstract: Current language model evaluations measure what models know under ideal conditions but not how robustly they

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago

HAG: Hierarchical Demographic Tree-based Agent Generation for Topic-Adaptive Simulation

arXiv:2601.05656v3 Announce Type: replace Abstract: High-fidelity agent initialization is crucial for credible Agent-Based Modeling across diverse domains. A ro

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

Circuit Mechanisms for Spatial Relation Generation in Diffusion Transformers

arXiv:2601.06338v2 Announce Type: replace Abstract: Diffusion Transformers (DiTs) have greatly advanced text-to-image generation, but models still struggle to g

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

ConvoLearn: A Dataset for Fine-Tuning Dialogic AI Tutors

arXiv:2601.08950v2 Announce Type: replace Abstract: Despite their growing adoption in education, LLMs remain misaligned with the core principle of effective tut

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

The Paradox of Robustness: Decoupling Rule-Based Logic from Affective Noise in High-Stakes Decision-Making

arXiv:2601.21439v2 Announce Type: replace Abstract: While Large Language Models (LLMs) are widely documented to be sensitive to minor prompt perturbations and p

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

TSPO: Breaking the Double Homogenization Dilemma in Multi-turn Search Policy Optimization

arXiv:2601.22776v2 Announce Type: replace Abstract: Multi-turn tool-integrated reasoning enables Large Language Models (LLMs) to solve complex tasks through ite

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

Enhancing Foundation VLM Robustness to Missing Modality: Scalable Diffusion for Bi-directional Feature Restoration

arXiv:2602.03151v2 Announce Type: replace Abstract: Vision Language Models (VLMs) typically assume complete modality input during inference. However, their effect

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

IV Co-Scientist: Multi-Agent LLM Framework for Causal Instrumental Variable Discovery

arXiv:2602.07943v2 Announce Type: replace Abstract: In the presence of confounding between an endogenous variable and the outcome, instrumental variables (IVs)

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago

Voxtral Realtime

arXiv:2602.11298v3 Announce Type: replace Abstract: We introduce Voxtral Realtime, a natively streaming automatic speech recognition model that matches offline

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

Scaling the Scaling Logic: Agentic Meta-Synthesis of Logic Reasoning

arXiv:2602.13218v2 Announce Type: replace Abstract: Reinforcement Learning from Verifiable Rewards (RLVR) is bottlenecked by data: existing synthesis pipelines

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

KLong: Training LLM Agent for Extremely Long-horizon Tasks

arXiv:2602.17547v2 Announce Type: replace Abstract: This paper introduces KLong, an open-source LLM agent trained to solve extremely long-horizon tasks. The pri