📰 AI News
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
Xpertbench: Expert Level Tasks with Rubrics-Based Evaluation
arXiv:2604.02368v1 Announce Type: new Abstract: As Large Language Models (LLMs) exhibit plateauing performance on conventional benchmarks, a pivotal challenge p
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
Compositional Neuro-Symbolic Reasoning
arXiv:2604.02434v1 Announce Type: new Abstract: We study structured abstraction-based reasoning for the Abstraction and Reasoning Corpus (ARC) and compare its g
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
Understanding the Nature of Generative AI as Threshold Logic in High-Dimensional Space
arXiv:2604.02476v1 Announce Type: new Abstract: This paper examines the role of threshold logic in understanding generative artificial intelligence. Threshold f
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
AIVV: Neuro-Symbolic LLM Agent-Integrated Verification and Validation for Trustworthy Autonomous Systems
arXiv:2604.02478v1 Announce Type: new Abstract: Deep learning models excel at detecting anomaly patterns in normal data. However, they do not provide a direct s
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
I must delete the evidence: AI Agents Explicitly Cover up Fraud and Violent Crime
arXiv:2604.02500v1 Announce Type: new Abstract: As ongoing research explores the ability of AI agents to be insider threats and act against company interests, w
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
5d ago
A Comprehensive Framework for Long-Term Resiliency Investment Planning under Extreme Weather Uncertainty for Electric Utilities
arXiv:2604.02504v1 Announce Type: new Abstract: Electric utilities must make massive capital investments in the coming years to respond to explosive growth in d
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
Interpretable Deep Reinforcement Learning for Element-level Bridge Life-cycle Optimization
arXiv:2604.02528v1 Announce Type: new Abstract: The new Specifications for the National Bridge Inventory (SNBI), in effect from 2022, emphasize the use of eleme
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
Competency Questions as Executable Plans: a Controlled RAG Architecture for Cultural Heritage Storytelling
arXiv:2604.02545v1 Announce Type: new Abstract: The preservation of intangible cultural heritage is a critical challenge as collective memory fades over time. W
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
Mitigating LLM biases toward spurious social contexts using direct preference optimization
arXiv:2604.02585v1 Announce Type: new Abstract: LLMs are increasingly used for high-stakes decision-making, yet their sensitivity to spurious contextual informa
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
Do Audio-Visual Large Language Models Really See and Hear?
arXiv:2604.02605v1 Announce Type: new Abstract: Audio-Visual Large Language Models (AVLLMs) are emerging as unified interfaces to multimodal perception. We pres
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
AutoVerifier: An Agentic Automated Verification Framework Using Large Language Models
arXiv:2604.02617v1 Announce Type: new Abstract: Scientific and Technical Intelligence (S&TI) analysis requires verifying complex technical claims across rapidly
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
OntoKG: Ontology-Oriented Knowledge Graph Construction with Intrinsic-Relational Routing
arXiv:2604.02618v1 Announce Type: new Abstract: Organizing a large-scale knowledge graph into a typed property graph requires structural decisions -- which enti
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
Let's Have a Conversation: Designing and Evaluating LLM Agents for Interactive Optimization
arXiv:2604.02666v1 Announce Type: new Abstract: Optimization is as much about modeling the right problem as solving it. Identifying the right objectives, constr
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
5d ago
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning
arXiv:2604.02721v1 Announce Type: new Abstract: Competitive programming remains one of the last few human strongholds in coding against AI. The best AI system t
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
DeltaLogic: Minimal Premise Edits Reveal Belief-Revision Failures in Logical Reasoning Models
arXiv:2604.02733v1 Announce Type: new Abstract: Reasoning benchmarks typically evaluate whether a model derives the correct answer from a fixed premise set, but
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
Aligning Progress and Feasibility: A Neuro-Symbolic Dual Memory Framework for Long-Horizon LLM Agents
arXiv:2604.02734v1 Announce Type: new Abstract: Large language models (LLMs) have demonstrated strong potential in long-horizon decision-making tasks, such as e
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
5d ago
Improving Role Consistency in Multi-Agent Collaboration via Quantitative Role Clarity
arXiv:2604.02770v1 Announce Type: new Abstract: In large language model (LLM)-driven multi-agent systems, disobey role specification (failure to adhere to the d
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
CharTool: Tool-Integrated Visual Reasoning for Chart Understanding
arXiv:2604.02794v1 Announce Type: new Abstract: Charts are ubiquitous in scientific and financial literature for presenting structured data. However, chart reas
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
5d ago
ESL-Bench: An Event-Driven Synthetic Longitudinal Benchmark for Health Agents
arXiv:2604.02834v1 Announce Type: new Abstract: Longitudinal health agents must reason across multi-source trajectories that combine continuous device streams,
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
5d ago
EMS: Multi-Agent Voting via Efficient Majority-then-Stopping
arXiv:2604.02863v1 Announce Type: new Abstract: Majority voting is the standard for aggregating multi-agent responses into a final decision. However, traditiona
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
Multi-Turn Reinforcement Learning for Tool-Calling Agents with Iterative Reward Calibration
arXiv:2604.02869v1 Announce Type: new Abstract: Training tool-calling agents with reinforcement learning on multi-turn tasks remains challenging due to sparse o
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
Analysis of Optimality of Large Language Models on Planning Problems
arXiv:2604.02910v1 Announce Type: new Abstract: Classic AI planning problems have been revisited in the Large Language Model (LLM) era, with a focus of recent b
ArXiv cs.AI
🤖 AI Agents & Automation
📄 Paper
5d ago
AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents
arXiv:2604.02947v1 Announce Type: new Abstract: Computer-use agents extend language models from text generation to persistent action over tools, files, and exec
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
5d ago
FoE: Forest of Errors Makes the First Solution the Best in Large Reasoning Models
arXiv:2604.02967v1 Announce Type: new Abstract: Recent Large Reasoning Models (LRMs) like DeepSeek-R1 have demonstrated remarkable success in complex reasoning
DeepCamp AI