📰 ArXiv cs.AI
85 articles · Updated every 3 hours
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1w ago
ARGUS: Stacked Multi-View Identity Mosaic Injection for Subject-Preserving Video Generation
arXiv:2606.11670v1 Announce Type: cross Abstract: Subject-preserving video generation is not solved by frontal-face similarity alone: a generated person must re
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1w ago
Reason, Then Re-reason: Cross-view Revisiting Improves Spatial Reasoning
arXiv:2606.11683v1 Announce Type: cross Abstract: Spatial reasoning from egocentric videos is inherently challenging because the observable evidence is constrai
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1w ago
Multi-View In-Cabin Monitoring System for Public Transport Vehicles
arXiv:2606.11739v1 Announce Type: cross Abstract: We introduce a multi-view in-cabin monitoring dataset for public transportation with synchronized RGB and dept
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1w ago
AnchorEdit: Maintaining Temporal Consistency in Multi-turn Image Editing via Causal Memory
arXiv:2606.11751v1 Announce Type: cross Abstract: Multi-turn image editing is essential for iterative design, yet current models often struggle with identity dr
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1w ago
TextHOI-3D: Text-to-3D Hand-Object Interaction via Discrete Multi-View Generation and Joint Mesh Optimization
arXiv:2606.11805v1 Announce Type: cross Abstract: Text-conditioned 3D generation has progressed rapidly for images and isolated objects, but producing a hand-ob
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
3w ago
BlazeEdit: Generalist Image Editing on Mobile Devices with Image-to-Image Diffusion Models
arXiv:2605.28067v1 Announce Type: new Abstract: The remarkable generation quality of modern diffusion models often comes at the cost of massive parameter counts
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
4w ago
A Camera-Cooperative ISAC Framework for Multimodal Non-Cooperative UAVs Sensing
arXiv:2605.22090v1 Announce Type: new Abstract: The detection of non-cooperative unmanned aerial vehicles (UAVs) presents significant challenges for Integrated
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
Intelligent CCTV for Urban Design: AI-Based Analysis of Soft Infrastructure at Intersections
arXiv:2605.05402v1 Announce Type: new Abstract: Artificial intelligence (AI) and computer vision are transforming transportation data collection. This study int
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
2mo ago
ReflectCAP: Detailed Image Captioning with Reflective Memory
arXiv:2604.12357v1 Announce Type: new Abstract: Detailed image captioning demands both factual grounding and fine-grained coverage, yet existing methods have st
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
2mo ago
Intelligent ROI-Based Vehicle Counting Framework for Automated Traffic Monitoring
arXiv:2604.12470v1 Announce Type: new Abstract: Accurate vehicle counting through video surveillance is crucial for efficient traffic management. However, achie
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
2mo ago
ART-VITON: Measurement-Guided Latent Diffusion for Artifact-Free Virtual Try-On
arXiv:2509.25749v2 Announce Type: cross Abstract: Virtual try-on (VITON) aims to generate realistic images of a person wearing a target garment, requiring preci
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
2mo ago
Part-Level 3D Gaussian Vehicle Generation with Joint and Hinge Axis Estimation
arXiv:2604.05070v1 Announce Type: new Abstract: Simulation is essential for autonomous driving, yet current frameworks often model vehicles as rigid assets and
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
2mo ago
CRFT: Consistent-Recurrent Feature Flow Transformer for Cross-Modal Image Registration
arXiv:2604.05689v1 Announce Type: cross Abstract: We present Consistent-Recurrent Feature Flow Transformer (CRFT), a unified coarse-to-fine framework based on f
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
2mo ago
A reconfigurable smart camera implementation for jet flames characterization based on an optimized segmentation model
arXiv:2604.03267v1 Announce Type: cross Abstract: In this work we present a novel framework for fire safety management in industrial settings through the implem
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
2mo ago
InCaRPose: In-Cabin Relative Camera Pose Estimation Model and Dataset
arXiv:2604.03814v1 Announce Type: cross Abstract: Camera extrinsic calibration is a fundamental task in computer vision. However, precise relative pose estimati
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
2mo ago
HOIGS: Human-Object Interaction Gaussian Splatting
arXiv:2604.04016v1 Announce Type: cross Abstract: Reconstructing dynamic scenes with complex human-object interactions is a fundamental challenge in computer vi
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
2mo ago
Pickalo: Leveraging 6D Pose Estimation for Low-Cost Industrial Bin Picking
arXiv:2604.04690v1 Announce Type: cross Abstract: Bin picking in real industrial environments remains challenging due to severe clutter, occlusions, and the hig
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
2mo ago
ContextDrag: Precise Drag-Based Image Editing via Context-Preserving Token Injection and Position-Aligned Attention
arXiv:2512.08477v2 Announce Type: replace-cross Abstract: Drag-based image editing enables intuitive visual manipulation through point-based drag operations. Ex
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
2mo ago
PaveBench: A Versatile Benchmark for Pavement Distress Perception and Interactive Vision-Language Analysis
arXiv:2604.02804v1 Announce Type: cross Abstract: Pavement condition assessment is essential for road safety and maintenance. Existing research has made signifi
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
2mo ago
NavCrafter: Exploring 3D Scenes from a Single Image
arXiv:2604.02828v1 Announce Type: cross Abstract: Creating flexible 3D scenes from a single image is vital when direct 3D data acquisition is costly or impracti
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
2mo ago
DePT3R: Joint Dense Point Tracking and 3D Reconstruction of Dynamic Scenes in a Single Forward Pass
arXiv:2512.13122v2 Announce Type: replace-cross Abstract: Current methods for dense 3D point tracking in dynamic scenes typically rely on pairwise processing, r
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
2mo ago
Cross-Camera Distracted Driver Classification through Feature Disentanglement and Contrastive Learning
arXiv:2411.13181v3 Announce Type: replace-cross Abstract: The classification of distracted drivers is pivotal for ensuring safe driving. Previous studies demons
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
2mo ago
From Skeletons to Semantics: Design and Deployment of a Hybrid Edge-Based Action Detection System for Public Safety
arXiv:2603.29777v1 Announce Type: cross Abstract: Public spaces such as transport hubs, city centres, and event venues require timely and reliable detection of
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
2mo ago
End-to-End Image Compression with Segmentation Guided Dual Coding for Wind Turbines
arXiv:2603.29927v1 Announce Type: cross Abstract: Transferring large volumes of high-resolution images during wind turbine inspections introduces a bottleneck i
DeepCamp AI