📰 ArXiv cs.AI
Articles from ArXiv cs.AI · 8,253 articles · Updated every 3 hours
🤖 AI Agents & Automation
📄 Paper
3w ago
VERDI: VLM-Embedded Reasoning for Autonomous Driving
arXiv:2505.15925v4 Announce Type: replace-cross Abstract: While autonomous driving (AD) stacks struggle with decision making under partial observability and rea
🤖 AI Agents & Automation
📄 Paper
3w ago
Informatics for Food Processing
arXiv:2505.17087v2 Announce Type: replace-cross Abstract: This chapter explores the evolution, classification, and health implications of food processing, while
🛡️ AI Safety & Ethics
📄 Paper
3w ago
SoSBench: Benchmarking Safety Alignment on Six Scientific Domains
arXiv:2505.21605v3 Announce Type: replace-cross Abstract: Large language models (LLMs) exhibit advancing capabilities in complex tasks, such as reasoning and gr
🧠 Large Language Models
📄 Paper
3w ago
LLMs Judging LLMs: A Simplex Perspective
arXiv:2505.21972v3 Announce Type: replace-cross Abstract: Given the challenge of automatically evaluating free-form outputs from large language models (LLMs), a
🧠 Large Language Models
📄 Paper
3w ago
Beyond Linear Steering: Unified Multi-Attribute Control for Language Models
arXiv:2505.24535v3 Announce Type: replace-cross Abstract: Controlling multiple behavioral attributes in large language models (LLMs) at inference time is a chal
💻 AI-Assisted Coding
📄 Paper
3w ago
PhysGaia: A Physics-Aware Benchmark with Multi-Body Interactions for Dynamic Novel View Synthesis
arXiv:2506.02794v3 Announce Type: replace-cross Abstract: We introduce PhysGaia, a novel physics-aware benchmark for Dynamic Novel View Synthesis (DyNVS) that e
🧠 Large Language Models
📄 Paper
3w ago
Large Language Models for Combinatorial Optimization of Design Structure Matrix
arXiv:2506.09749v3 Announce Type: replace-cross Abstract: In complex engineering systems, the dependencies among components or development activities are often
🧠 Large Language Models
📄 Paper
3w ago
ZINA: Multimodal Fine-grained Hallucination Detection and Editing
arXiv:2506.13130v2 Announce Type: replace-cross Abstract: Multimodal Large Language Models (MLLMs) often generate hallucinations, where the output deviates from
💻 AI-Assisted Coding
📄 Paper
3w ago
Vision Transformer-Based Time-Series Image Reconstruction for Cloud-Filling Applications
arXiv:2506.19591v2 Announce Type: replace-cross Abstract: Cloud cover in multispectral imagery (MSI) poses significant challenges for early season crop mapping,
📐 ML Fundamentals
📄 Paper
3w ago
PRISM: Lightweight Multivariate Time-Series Classification through Symmetric Multi-Resolution Convolutional Layers
arXiv:2508.04503v3 Announce Type: replace-cross Abstract: Multivariate time series classification supports applications from wearable sensing to biomedical moni
🧠 Large Language Models
📄 Paper
3w ago
Making Prompts First-Class Citizens for Adaptive LLM Pipelines
arXiv:2508.05012v2 Announce Type: replace-cross Abstract: Modern LLM pipelines increasingly resemble complex data-centric applications: they retrieve data, corr
📄 Paper
3w ago
CATNet: A geometric deep learning approach for CAT bond spread prediction in the primary market
arXiv:2508.10208v2 Announce Type: replace-cross Abstract: Traditional models for pricing catastrophe (CAT) bonds struggle to capture the complex, relational dat
🤖 AI Agents & Automation
📄 Paper
3w ago
Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation
arXiv:2508.13998v2 Announce Type: replace-cross Abstract: Generalization in embodied AI is hindered by the "seeing-to-doing gap," which stems from data scarcity
🧠 Large Language Models
📄 Paper
3w ago
ShadowNPU: System and Algorithm Co-design for NPU-Centric On-Device LLM Inference
arXiv:2508.16703v2 Announce Type: replace-cross Abstract: On-device running Large Language Models (LLMs) is nowadays a critical enabler towards preserving user
📐 ML Fundamentals
📄 Paper
3w ago
Challenges in Deep Learning-Based Small Organ Segmentation: A Benchmarking Perspective for Medical Research with Limited Datasets
arXiv:2509.05892v2 Announce Type: replace-cross Abstract: Accurate segmentation of carotid artery structures in histopathological images is vital for cardiovasc
🤖 AI Agents & Automation
📄 Paper
3w ago
RAPTOR: A Foundation Policy for Quadrotor Control
arXiv:2509.11481v2 Announce Type: replace-cross Abstract: Humans are remarkably data-efficient when adapting to new unseen conditions, like driving a new car. I
🤖 AI Agents & Automation
📄 Paper
3w ago
DoubleAgents: Human-Agent Alignment in a Socially Embedded Workflow
arXiv:2509.12626v3 Announce Type: replace-cross Abstract: Aligning agentic AI with user intent is critical for delegating complex, socially embedded tasks, yet
🤖 AI Agents & Automation
📄 Paper
3w ago
Beyond Classification Accuracy: Neural-MedBench and the Need for Deeper Reasoning Benchmarks
arXiv:2509.22258v5 Announce Type: replace-cross Abstract: Recent advances in vision-language models (VLMs) have achieved remarkable performance on standard medi
🛡️ AI Safety & Ethics
📄 Paper
3w ago
Vid-Freeze: Protecting Images from Malicious Image-to-Video Generation via Temporal Freezing
arXiv:2509.23279v2 Announce Type: replace-cross Abstract: The rapid progress of image-to-video (I2V) generation models has introduced significant risks by enabl
🧠 Large Language Models
📄 Paper
3w ago
Measuring Competency, Not Performance: Item-Aware Evaluation Across Medical Benchmarks
arXiv:2509.24186v2 Announce Type: replace-cross Abstract: Accuracy-based evaluation of Large Language Models (LLMs) measures benchmark-specific performance rath
🧠 Large Language Models
📄 Paper
3w ago
ACT: Agentic Classification Tree
arXiv:2509.26433v4 Announce Type: replace-cross Abstract: When used in high-stakes settings, AI systems are expected to produce decisions that are transparent,
🧠 Large Language Models
📄 Paper
3w ago
Autonomy Reshapes How Personalization Affects Privacy Concerns and Trust in LLM Agents
arXiv:2510.04465v2 Announce Type: replace-cross Abstract: LLM agents require personal information for personalization in order to effectively act on users' beha
🧠 Large Language Models
📄 Paper
3w ago
FURINA: A Fully Customizable Role-Playing Benchmark via Scalable Multi-Agent Collaboration Pipeline
arXiv:2510.06800v3 Announce Type: replace-cross Abstract: As large language models (LLMs) advance in role-playing (RP) tasks, existing benchmarks quickly become
🧠 Large Language Models
📄 Paper
3w ago
Fewer Weights, More Problems: A Practical Attack on LLM Pruning
arXiv:2510.07985v3 Announce Type: replace-cross Abstract: Model pruning, i.e., removing a subset of model weights, has become a prominent approach to reducing t
DeepCamp AI