📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 7,966 articles · Updated every 3 hours

ArXiv cs.AI 📄 Paper 2w ago

Physics-Informed State Space Models for Reliable Solar Irradiance Forecasting in Off-Grid Systems

arXiv:2604.11807v1 Announce Type: cross Abstract: The stable operation of autonomous off-grid photovoltaic systems dictates reliance on solar forecasting algori

ArXiv cs.AI 📄 Paper 2w ago

Can Large Language Models Infer Causal Relationships from Real-World Text?

arXiv:2505.18931v4 Announce Type: replace Abstract: Understanding and inferring causal relationships from texts is a core aspect of human cognition and is essen

ArXiv cs.AI 📄 Paper 2w ago

VS-Bench: Evaluating VLMs for Strategic Abilities in Multi-Agent Environments

arXiv:2506.02387v3 Announce Type: replace Abstract: Recent advancements in Vision Language Models (VLMs) have expanded their capabilities to interactive agent t

ArXiv cs.AI 📄 Paper 2w ago

Disambiguation-Centric Finetuning Makes Enterprise Tool-Calling LLMs More Realistic and Less Risky

arXiv:2507.03336v4 Announce Type: replace Abstract: Large language models (LLMs) are increasingly tasked with invoking enterprise APIs, yet they routinely falte

ArXiv cs.AI 📄 Paper 2w ago

PosterGen: Aesthetic-Aware Multi-Modal Paper-to-Poster Generation via Multi-Agent LLMs

arXiv:2508.17188v2 Announce Type: replace Abstract: Multi-agent systems built upon large language models (LLMs) have demonstrated remarkable capabilities in tac

ArXiv cs.AI 📄 Paper 2w ago

ChatCLIDS: Simulating Persuasive AI Dialogues to Promote Closed-Loop Insulin Adoption in Type 1 Diabetes Care

arXiv:2509.00891v3 Announce Type: replace Abstract: Real-world adoption of closed-loop insulin delivery systems (CLIDS) in type 1 diabetes remains low, driven n

ArXiv cs.AI 📄 Paper 2w ago

RISK: A Framework for GUI Agents in E-commerce Risk Management

arXiv:2509.21982v2 Announce Type: replace Abstract: E-commerce risk management requires aggregating diverse, deeply embedded web data through multi-step, statef

ArXiv cs.AI 📄 Paper 2w ago

Interactive Learning for LLM Reasoning

arXiv:2509.26306v4 Announce Type: replace Abstract: Existing multi-agent learning approaches have developed interactive training environments to explicitly prom

ArXiv cs.AI 📄 Paper 2w ago

TimeRewarder: Learning Dense Reward from Passive Videos via Frame-wise Temporal Distance

arXiv:2509.26627v2 Announce Type: replace Abstract: Designing dense rewards is crucial for reinforcement learning (RL), yet in robotics it often demands extensi

ArXiv cs.AI 📄 Paper 2w ago

Advancing Reasoning in Diffusion Language Models with Denoising Process Rewards

arXiv:2510.01544v2 Announce Type: replace Abstract: Diffusion-based large language models offer a non-autoregressive alternative for text generation, but enabli

ArXiv cs.AI 📄 Paper 2w ago

Plug-and-Play Dramaturge: A Divide-and-Conquer Approach for Iterative Narrative Script Refinement via Collaborative LLM Agents

arXiv:2510.05188v2 Announce Type: replace Abstract: Although LLMs have been widely adopted for creative content generation, a single-pass process often struggle

ArXiv cs.AI 📄 Paper 2w ago

SHE: Stepwise Hybrid Examination Reinforcement Learning Framework for E-commerce Search Relevance

arXiv:2510.07972v3 Announce Type: replace Abstract: Query-product relevance prediction is vital for AI-driven e-commerce, yet current LLM-based approaches face

ArXiv cs.AI 📄 Paper 2w ago

Graph-Coarsening Approach for the Capacitated Vehicle Routing Problem with Time Windows

arXiv:2510.22329v2 Announce Type: replace Abstract: The Capacitated Vehicle Routing Problem with Time Windows (CVRPTW) is a fundamental NP-hard optimization pro

ArXiv cs.AI 📄 Paper 2w ago

MGA: Memory-Driven GUI Agent for Observation-Centric Interaction

arXiv:2510.24168v2 Announce Type: replace Abstract: Multimodal Large Language Models (MLLMs) have significantly advanced GUI agents, yet long-horizon automation

ArXiv cs.AI 📄 Paper 2w ago

Scalable Stewardship of an LLM-Assisted Clinical Benchmark with Physician Oversight

arXiv:2512.19691v3 Announce Type: replace Abstract: Reference labels for machine-learning benchmarks are increasingly synthesized with LLM assistance, but their

ArXiv cs.AI 📄 Paper 2w ago

Consolidation or Adaptation? PRISM: Disentangling SFT and RL Data via Gradient Concentration

arXiv:2601.07224v2 Announce Type: replace Abstract: While Hybrid Supervised Fine-Tuning (SFT) followed by Reinforcement Learning (RL) has become the standard pa

ArXiv cs.AI 📄 Paper 2w ago

AgencyBench: Benchmarking the Frontiers of Autonomous Agents in 1M-Token Real-World Contexts

arXiv:2601.11044v3 Announce Type: replace Abstract: Large Language Models (LLMs) based autonomous agents demonstrate multifaceted capabilities to contribute sub

ArXiv cs.AI 📄 Paper 2w ago

Subargument Argumentation Frameworks: Separating Direct Conflict from Structural Dependency

arXiv:2601.12038v3 Announce Type: replace Abstract: Dung's abstract argumentation frameworks model acceptability solely in terms of an attack relation, thereby

ArXiv cs.AI 📄 Paper 2w ago

Risk Awareness Injection: Calibrating Vision-Language Models for Safety without Compromising Utility

arXiv:2602.03402v3 Announce Type: replace Abstract: Vision language models (VLMs) extend the reasoning capabilities of large language models (LLMs) to cross-mod

ArXiv cs.AI 📄 Paper 2w ago

ANCHOR: Branch-Point Data Generation for GUI Agents

arXiv:2602.07153v2 Announce Type: replace Abstract: End-to-end GUI agents for real desktop environments require large amounts of high-quality interaction data,

ArXiv cs.AI 📄 Paper 2w ago

X-SYS: A Reference Architecture for Interactive Explanation Systems

arXiv:2602.12748v3 Announce Type: replace Abstract: The explainable AI (XAI) research community has proposed numerous technical methods, yet deploying explainab

ArXiv cs.AI 📄 Paper 2w ago

Constrained Assumption-Based Argumentation Frameworks

arXiv:2602.13135v2 Announce Type: replace Abstract: Assumption-based Argumentation (ABA) is a well-established form of structured argumentation. ABA frameworks

ArXiv cs.AI 📄 Paper 2w ago

Hunt Globally: Wide Search AI Agents for Drug Asset Scouting in Investing, Business Development, and Competitive Intelligence

arXiv:2602.15019v3 Announce Type: replace Abstract: Bio-pharmaceutical innovation has shifted: many new drug assets now originate outside the United States and

ArXiv cs.AI 📄 Paper 2w ago

FlexMS is a flexible framework for benchmarking deep learning-based mass spectrum prediction tools in metabolomics

arXiv:2602.22822v2 Announce Type: replace Abstract: The identification and property prediction of chemical molecules is of central importance in the advancement