📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,273 articles · Updated every 3 hours

arXiv:2508.16703v2 Announce Type: replace-cross Abstract: On-device running Large Language Models (LLMs) is nowadays a critical enabler towards preserving user

ArXiv cs.AI 📐 ML Fundamentals 📄 Paper ⚡ AI Lesson 2d ago

Challenges in Deep Learning-Based Small Organ Segmentation: A Benchmarking Perspective for Medical Research with Limited Datasets

arXiv:2509.05892v2 Announce Type: replace-cross Abstract: Accurate segmentation of carotid artery structures in histopathological images is vital for cardiovasc

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago

RAPTOR: A Foundation Policy for Quadrotor Control

arXiv:2509.11481v2 Announce Type: replace-cross Abstract: Humans are remarkably data-efficient when adapting to new unseen conditions, like driving a new car. I

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago

DoubleAgents: Human-Agent Alignment in a Socially Embedded Workflow

arXiv:2509.12626v3 Announce Type: replace-cross Abstract: Aligning agentic AI with user intent is critical for delegating complex, socially embedded tasks, yet

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago

Beyond Classification Accuracy: Neural-MedBench and the Need for Deeper Reasoning Benchmarks

arXiv:2509.22258v5 Announce Type: replace-cross Abstract: Recent advances in vision-language models (VLMs) have achieved remarkable performance on standard medi

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2d ago

Vid-Freeze: Protecting Images from Malicious Image-to-Video Generation via Temporal Freezing

arXiv:2509.23279v2 Announce Type: replace-cross Abstract: The rapid progress of image-to-video (I2V) generation models has introduced significant risks by enabl

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

Measuring Competency, Not Performance: Item-Aware Evaluation Across Medical Benchmarks

arXiv:2509.24186v2 Announce Type: replace-cross Abstract: Accuracy-based evaluation of Large Language Models (LLMs) measures benchmark-specific performance rath

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

ACT: Agentic Classification Tree

arXiv:2509.26433v4 Announce Type: replace-cross Abstract: When used in high-stakes settings, AI systems are expected to produce decisions that are transparent,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

Autonomy Reshapes How Personalization Affects Privacy Concerns and Trust in LLM Agents

arXiv:2510.04465v2 Announce Type: replace-cross Abstract: LLM agents require personal information for personalization in order to effectively act on users' beha

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

FURINA: A Fully Customizable Role-Playing Benchmark via Scalable Multi-Agent Collaboration Pipeline

arXiv:2510.06800v3 Announce Type: replace-cross Abstract: As large language models (LLMs) advance in role-playing (RP) tasks, existing benchmarks quickly become

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

Fewer Weights, More Problems: A Practical Attack on LLM Pruning

arXiv:2510.07985v3 Announce Type: replace-cross Abstract: Model pruning, i.e., removing a subset of model weights, has become a prominent approach to reducing t

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago

Clear Roads, Clear Vision: Advancements in Multi-Weather Restoration for Smart Transportation

arXiv:2510.09228v2 Announce Type: replace-cross Abstract: Adverse weather conditions such as haze, rain, and snow significantly degrade the quality of images an

ArXiv cs.AI 🛠️ AI Tools & Apps 📄 Paper ⚡ AI Lesson 2d ago

Leveraging Wireless Sensor Networks for Real-Time Monitoring and Control of Industrial Environments

arXiv:2510.13820v2 Announce Type: replace-cross Abstract: This research proposes an extensive technique for monitoring and controlling the industrial parameters

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

A Linguistics-Aware LLM Watermarking via Syntactic Predictability

arXiv:2510.13829v2 Announce Type: replace-cross Abstract: As large language models (LLMs) continue to advance rapidly, reliable governance tools have become cri

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

XModBench: Benchmarking Cross-Modal Capabilities and Consistency in Omni-Language Models

arXiv:2510.15148v2 Announce Type: replace-cross Abstract: Omni-modal large language models (OLLMs) aim to unify audio, vision, and text understanding within a s

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

LLMs Judge Themselves: A Game-Theoretic Framework for Human-Aligned Evaluation

arXiv:2510.15746v2 Announce Type: replace-cross Abstract: Ideal or real - that is the question. In this work, we explore whether principles from game theory can

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago

AI-BAAM: AI-Driven Bank Statement Analytics as Alternative Data for Malaysian MSME Credit Scoring

arXiv:2510.16066v4 Announce Type: replace-cross Abstract: Despite accounting for 96.1% of all businesses in Malaysia, access to financing remains one of the mos

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

A Model Can Help Itself: Reward-Free Self-Training for LLM Reasoning

arXiv:2510.18814v2 Announce Type: replace-cross Abstract: Can language models improve their reasoning performance without external rewards, using only their own

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2d ago

Co-Designing Quantum Codes with Transversal Diagonal Gates via Multi-Agent Systems

arXiv:2510.20728v3 Announce Type: replace-cross Abstract: Exact scientific discovery requires more than heuristic search: candidate constructions must be turned

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

ATLAS: A Layered Constraint-Guided Framework for Structured Artifact Generation in LLM-Assisted MDE

arXiv:2510.25890v3 Announce Type: replace-cross Abstract: ATLAS is a constraint-guided generation framework for structured engineering artifacts whose outputs m

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

HatePrototypes: Interpretable and Transferable Representations for Implicit and Explicit Hate Speech Detection

arXiv:2511.06391v3 Announce Type: replace-cross Abstract: Optimization of offensive content moderation models for different types of hateful messages is typical

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

When AI Agents Collude Online: Financial Fraud Risks by Collaborative LLM Agents on Social Platforms

arXiv:2511.06448v2 Announce Type: replace-cross Abstract: In this work, we study the risks of collective financial fraud in large-scale multi-agent systems powe

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

FAST-CAD: A Fairness-Aware Framework for Non-Contact Stroke Diagnosis

arXiv:2511.08887v4 Announce Type: replace-cross Abstract: Stroke is an acute cerebrovascular disease, and timely diagnosis significantly improves patient surviv

ArXiv cs.AI 💻 AI-Assisted Coding 📄 Paper ⚡ AI Lesson 2d ago

SPHINX: A Synthetic Environment for Visual Perception and Reasoning

arXiv:2511.20814v2 Announce Type: replace-cross Abstract: We present Sphinx, a synthetic environment for visual perception and reasoning that targets core cogni