📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 6,601 articles · Updated every 3 hours

INDOTABVQA: A Benchmark for Cross-Lingual Table Understanding in Bahasa Indonesia Documents

arXiv:2604.11970v1 Announce Type: cross Abstract: We introduce INDOTABVQA, a benchmark for evaluating cross-lingual Table Visual Question Answering (VQA) on rea

ArXiv cs.AI 📄 Paper 1w ago

Filtered Reasoning Score: Evaluating Reasoning Quality on a Model's Most-Confident Traces

arXiv:2604.11996v1 Announce Type: cross Abstract: Should we trust Large Language Models (LLMs) with high accuracy? LLMs achieve high accuracy on reasoning bench

ArXiv cs.AI 📄 Paper 1w ago

The Second Challenge on Cross-Domain Few-Shot Object Detection at NTIRE 2026: Methods and Results

arXiv:2604.11998v1 Announce Type: cross Abstract: Cross-domain few-shot object detection (CD-FSOD) remains a challenging problem for existing object detectors a

ArXiv cs.AI 📄 Paper 1w ago

BayMOTH: Bayesian optiMizatiOn with meTa-lookahead -- a simple approacH

arXiv:2604.12005v1 Announce Type: cross Abstract: Bayesian optimization (BO) has for sequential optimization of expensive black-box functions demonstrated pract

ArXiv cs.AI 📄 Paper 1w ago

LLMs Struggle with Abstract Meaning Comprehension More Than Expected

arXiv:2604.12018v1 Announce Type: cross Abstract: Understanding abstract meanings is crucial for advanced language comprehension. Despite extensive research, ab

ArXiv cs.AI 📄 Paper 1w ago

Curvelet-Based Frequency-Aware Feature Enhancement for Deepfake Detection

arXiv:2604.12028v1 Announce Type: cross Abstract: The proliferation of sophisticated generative models has significantly advanced the realism of synthetic facia

ArXiv cs.AI 📄 Paper 1w ago

Benchmarking Deflection and Hallucination in Large Vision-Language Models

arXiv:2604.12033v1 Announce Type: cross Abstract: Large Vision-Language Models (LVLMs) increasingly rely on retrieval to answer knowledge-intensive multimodal q

ArXiv cs.AI 📄 Paper 1w ago

SIR-Bench: Evaluating Investigation Depth in Security Incident Response Agents

arXiv:2604.12040v1 Announce Type: cross Abstract: We present SIR-Bench, a benchmark of 794 test cases for evaluating autonomous security incident response agent

ArXiv cs.AI 📄 Paper 1w ago

VISTA: Validation-Informed Trajectory Adaptation via Self-Distillation

arXiv:2604.12044v1 Announce Type: cross Abstract: Deep learning models may converge to suboptimal solutions despite strong validation accuracy, masking an optim

ArXiv cs.AI 📄 Paper 1w ago

Leveraging Weighted Syntactic and Semantic Context Assessment Summary (wSSAS) Towards Text Categorization Using LLMs

arXiv:2604.12049v1 Announce Type: cross Abstract: The use of Large Language Models (LLMs) for reliable, enterprise-grade analytics such as text categorization i

ArXiv cs.AI 📄 Paper 1w ago

Interpretable DNA Sequence Classification via Dynamic Feature Generation in Decision Trees

arXiv:2604.12060v1 Announce Type: cross Abstract: The analysis of DNA sequences has become critical in numerous fields, from evolutionary biology to understandi

ArXiv cs.AI 📄 Paper 1w ago

Robust Explanations for User Trust in Enterprise NLP Systems

arXiv:2604.12069v1 Announce Type: cross Abstract: Robust explanations are increasingly required for user trust in enterprise NLP, yet pre-deployment validation

ArXiv cs.AI 📄 Paper 1w ago

OpenTME: An Open Dataset of AI-powered H&E Tumor Microenvironment Profiles from TCGA

arXiv:2604.12075v1 Announce Type: cross Abstract: The tumor microenvironment (TME) plays a central role in cancer progression, treatment response, and patient o

ArXiv cs.AI 📄 Paper 1w ago

Narrative over Numbers: The Identifiable Victim Effect and its Amplification Under Alignment and Reasoning in Large Language Models

arXiv:2604.12076v1 Announce Type: cross Abstract: The Identifiable Victim Effect (IVE), the tendency to allocate greater resources to a specific, narratively

ArXiv cs.AI 📄 Paper 1w ago

LLM-Based Automated Diagnosis Of Integration Test Failures At Google

arXiv:2604.12108v1 Announce Type: cross Abstract: Integration testing is critical for the quality and reliability of complex software systems. However, diagnosi

ArXiv cs.AI 📄 Paper 1w ago

PR-MaGIC: Prompt Refinement Via Mask Decoder Gradient Flow For In-Context Segmentation

arXiv:2604.12113v1 Announce Type: cross Abstract: Visual Foundation Models (VFMs) such as the Segment Anything Model (SAM) have significantly advanced broad use

ArXiv cs.AI 📄 Paper 1w ago

Observing the unobserved confounding through its effects: toward randomized trial-like estimates from real-world survival data

arXiv:2604.12137v1 Announce Type: cross Abstract: Background: Randomized controlled trials (RCTs) are costly, time-consuming, and often infeasible, while treatm

ArXiv cs.AI 📄 Paper 1w ago

From Plan to Action: How Well Do Agents Follow the Plan?

arXiv:2604.12147v1 Announce Type: cross Abstract: Agents aspire to eliminate the need for task-specific prompt crafting through autonomous reason-act-observe lo

ArXiv cs.AI 📄 Paper 1w ago

Domain-Specific Latent Representations Improve the Fidelity of Diffusion-Based Medical Image Super-Resolution

arXiv:2604.12152v1 Announce Type: cross Abstract: Latent diffusion models for medical image super-resolution universally inherit variational autoencoders design

ArXiv cs.AI 📄 Paper 1w ago

Fully Homomorphic Encryption on Llama 3 model for privacy preserving LLM inference

arXiv:2604.12168v1 Announce Type: cross Abstract: The applications of Generative Artificial Intelligence (GenAI) and their intersections with data-driven fields

ArXiv cs.AI 📄 Paper 1w ago

CycloneMAE: A Scalable Multi-Task Learning Model for Global Tropical Cyclone Probabilistic Forecasting

arXiv:2604.12180v1 Announce Type: cross Abstract: Tropical cyclones (TCs) rank among the most destructive natural hazards, yet their forecasting faces fundament

ArXiv cs.AI 📄 Paper 1w ago

Clustering-Enhanced Domain Adaptation for Cross-Domain Intrusion Detection in Industrial Control Systems

arXiv:2604.12183v1 Announce Type: cross Abstract: Industrial control systems operate in dynamic environments where traffic distributions vary across scenarios,

ArXiv cs.AI 📄 Paper 1w ago

Characterizing Resource Sharing Practices on Underground Internet Forum Synthetic Non-Consensual Intimate Image Content Creation Communities

arXiv:2604.12190v1 Announce Type: cross Abstract: Many malicious actors responsible for disseminating synthetic non-consensual intimate imagery (SNCII) operate

ArXiv cs.AI 📄 Paper 1w ago

Towards grounded autonomous research: an end-to-end LLM mini research loop on published computational physics

arXiv:2604.12198v1 Announce Type: cross Abstract: Recent autonomous LLM agents have demonstrated end-to-end automation of machine-learning research. Real-world