📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 2,972 articles · Updated every 3 hours

XAttnRes: Cross-Stage Attention Residuals for Medical Image Segmentation

arXiv:2604.03297v1 Announce Type: cross Abstract: In the field of Large Language Models (LLMs), Attention Residuals have recently demonstrated that learned, sel

ArXiv cs.AI 📄 Paper 17h ago

MoViD: View-Invariant 3D Human Pose Estimation via Motion-View Disentanglement

arXiv:2604.03299v1 Announce Type: cross Abstract: 3D human pose estimation is a key enabling technology for applications such as healthcare monitoring, human-ro

ArXiv cs.AI 📄 Paper 17h ago

AIFS-COMPO: A Global Data-Driven Atmospheric Composition Forecasting System

arXiv:2604.03300v1 Announce Type: cross Abstract: We introduce AIFS-COMPO, a skilful medium-range data-driven global forecasting system for aerosols and reactiv

ArXiv cs.AI 📄 Paper 17h ago

Embedding-Only Uplink for Onboard Retrieval Under Shift in Remote Sensing

arXiv:2604.03301v1 Announce Type: cross Abstract: Downlink bottlenecks motivate onboard systems that prioritize hazards without transmitting raw pixels. We stud

ArXiv cs.AI 📄 Paper 17h ago

Beyond Static Vision: Scene Dynamic Field Unlocks Intuitive Physics Understanding in Multi-modal Large Language Models

arXiv:2604.03302v1 Announce Type: cross Abstract: While Multimodal Large Language Models (MLLMs) have demonstrated impressive capabilities in image and video un

ArXiv cs.AI 📄 Paper 17h ago

Downscaling weather forecasts from Low- to High-Resolution with Diffusion Models

arXiv:2604.03303v1 Announce Type: cross Abstract: We introduce a probabilistic diffusion-based method for global atmospheric downscaling implemented within the

ArXiv cs.AI 📄 Paper 17h ago

Generative Chemical Language Models for Energetic Materials Discovery

arXiv:2604.03304v1 Announce Type: cross Abstract: The discovery of new energetic materials remains a pressing challenge hindered by limited availability of high

ArXiv cs.AI 📄 Paper 17h ago

V-Reflection: Transforming MLLMs from Passive Observers to Active Interrogators

arXiv:2604.03307v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) have achieved remarkable success, yet they remain prone to perception

ArXiv cs.AI 📄 Paper 17h ago

TreeGaussian: Tree-Guided Cascaded Contrastive Learning for Hierarchical Consistent 3D Gaussian Scene Segmentation and Understanding

arXiv:2604.03309v1 Announce Type: cross Abstract: 3D Gaussian Splatting (3DGS) has emerged as a real-time, differentiable representation for neural scene unders

ArXiv cs.AI 📄 Paper 17h ago

StoryBlender: Inter-Shot Consistent and Editable 3D Storyboard with Spatial-temporal Dynamics

arXiv:2604.03315v1 Announce Type: cross Abstract: Storyboarding is a core skill in visual storytelling for film, animation, and games. However, automating this

ArXiv cs.AI 📄 Paper 17h ago

General Explicit Network (GEN): A novel deep learning architecture for solving partial differential equations

arXiv:2604.03321v1 Announce Type: cross Abstract: Machine learning, especially physics-informed neural networks (PINNs) and their neural network variants, has b

ArXiv cs.AI 📄 Paper 17h ago

VitaTouch: Property-Aware Vision-Tactile-Language Model for Robotic Quality Inspection in Manufacturing

arXiv:2604.03322v1 Announce Type: cross Abstract: Quality inspection in smart manufacturing requires identifying intrinsic material and surface properties beyon

ArXiv cs.AI 📄 Paper 17h ago

Safety-Aligned 3D Object Detection: Single-Vehicle, Cooperative, and End-to-End Perspectives

arXiv:2604.03325v1 Announce Type: cross Abstract: Perception plays a central role in connected and autonomous vehicles (CAVs), underpinning not only conventiona

ArXiv cs.AI 📄 Paper 17h ago

CoLoRSMamba: Conditional LoRA-Steered Mamba for Supervised Multimodal Violence Detection

arXiv:2604.03329v1 Announce Type: cross Abstract: Violence detection benefits from audio, but real-world soundscapes can be noisy or weakly related to the visib

ArXiv cs.AI 📄 Paper 17h ago

AICCE: AI Driven Compliance Checker Engine

arXiv:2604.03330v1 Announce Type: cross Abstract: For digital infrastructure to be safe, compatible, and standards-aligned, automated communication protocol com

ArXiv cs.AI 📄 Paper 17h ago

Composer Vector: Style-steering Symbolic Music Generation in a Latent Space

arXiv:2604.03333v1 Announce Type: cross Abstract: Symbolic music generation has made significant progress, yet achieving fine-grained and flexible control over

ArXiv cs.AI 📄 Paper 17h ago

The Ideation Bottleneck: Decomposing the Quality Gap Between AI-Generated and Human Economics Research

arXiv:2604.03338v1 Announce Type: cross Abstract: Autonomous AI systems can now generate complete economics research papers, but they substantially underperform

ArXiv cs.AI 📄 Paper 17h ago

Learning Additively Compositional Latent Actions for Embodied AI

arXiv:2604.03340v1 Announce Type: cross Abstract: Latent action learning infers pseudo-action labels from visual transitions, providing an approach to leverage

ArXiv cs.AI 📄 Paper 17h ago

Towards Intelligent Energy Security: A Unified Spatio-Temporal and Graph Learning Framework for Scalable Electricity Theft Detection in Smart Grids

arXiv:2604.03344v1 Announce Type: cross Abstract: Electricity theft and non-technical losses (NTLs) remain critical challenges in modern smart grids, causing si

ArXiv cs.AI 📄 Paper 17h ago

From Model-Based Screening to Data-Driven Surrogates: A Multi-Stage Workflow for Exploring Stochastic Agent-Based Models

arXiv:2604.03350v1 Announce Type: cross Abstract: Systematic exploration of Agent-Based Models (ABMs) is challenged by the curse of dimensionality and their inh

ArXiv cs.AI 📄 Paper 17h ago

CresOWLve: Benchmarking Creative Problem-Solving Over Real-World Knowledge

arXiv:2604.03374v1 Announce Type: cross Abstract: Creative problem-solving requires combining multiple cognitive abilities, including logical reasoning, lateral

ArXiv cs.AI 📄 Paper 17h ago

Banana100: Breaking NR-IQA Metrics by 100 Iterative Image Replications with Nano Banana Pro

arXiv:2604.03400v1 Announce Type: cross Abstract: The multi-step, iterative image editing capabilities of multi-modal agentic systems have transformed digital c

ArXiv cs.AI 📄 Paper 17h ago

Can LLMs Reason About Attention? Towards Zero-Shot Analysis of Multimodal Classroom Behavior

arXiv:2604.03401v1 Announce Type: cross Abstract: Understanding student engagement usually requires time-consuming manual observation or invasive recording that

ArXiv cs.AI 📄 Paper 17h ago

Generative AI for material design: A mechanics perspective from burgers to matter

arXiv:2604.03409v1 Announce Type: cross Abstract: Generative artificial intelligence offers a new paradigm to design matter in high-dimensional spaces. However,