Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

24,627 lessons
Skills in this topic
LLM Foundations (beginner): Explain how transformers generate text
Prompt Craft (beginner): Write zero-shot and few-shot prompts
LLM Engineering (intermediate): Call LLM APIs with function/tool use
Fine-tuning LLMs (advanced): Prepare fine-tuning datasets
Multimodal LLMs (advanced): Use GPT-4V / Claude Vision for image understanding

Showing 5,195 reads from curated sources

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
From Oracle to Noisy Context: Mitigating Contextual Exposure Bias in Speech-LLMs
arXiv:2603.24034v1 Announce Type: cross Abstract: Contextual automatic speech recognition (ASR) with Speech-LLMs is typically trained with oracle conversation h
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Mitigating Object Hallucinations in LVLMs via Attention Imbalance Rectification
arXiv:2603.24058v1 Announce Type: cross Abstract: Object hallucination in Large Vision-Language Models (LVLMs) severely compromises their reliability in real-wo
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
When Understanding Becomes a Risk: Authenticity and Safety Risks in the Emerging Image Generation Paradigm
arXiv:2603.24079v1 Announce Type: cross Abstract: Recently, multimodal large language models (MLLMs) have emerged as a unified paradigm for language and image g
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Knowledge-Guided Manipulation Using Multi-Task Reinforcement Learning
arXiv:2603.24083v1 Announce Type: cross Abstract: This paper introduces Knowledge Graph based Massively Multi-task Model-based Policy Optimization (KG-M3PO), a
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Towards Effective Experiential Learning: Dual Guidance for Utilization and Internalization
arXiv:2603.24093v1 Announce Type: cross Abstract: Recently, reinforcement learning (RL) has become an important approach for improving the capabilities of large
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
The Alignment Tax: Response Homogenization in Aligned LLMs and Its Implications for Uncertainty Estimation
arXiv:2603.24124v1 Announce Type: cross Abstract: RLHF-aligned language models exhibit response homogenization: on TruthfulQA (n=790), 40-79% of questions produ
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
MedAidDialog: A Multilingual Multi-Turn Medical Dialogue Dataset for Accessible Healthcare
arXiv:2603.24132v1 Announce Type: cross Abstract: Conversational artificial intelligence has the potential to assist users in preliminary medical consultations,
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
A Deep Dive into Scaling RL for Code Generation with Synthetic Data and Curricula
arXiv:2603.24202v1 Announce Type: cross Abstract: Reinforcement learning (RL) has emerged as a powerful paradigm for improving large language models beyond supe
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Invisible Threats from Model Context Protocol: Generating Stealthy Injection Payload via Tree-based Adaptive Search
arXiv:2603.24203v1 Announce Type: cross Abstract: Recent advances in the Model Context Protocol (MCP) have enabled large language models (LLMs) to invoke extern
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Powerful Teachers Matter: Text-Guided Multi-view Knowledge Distillation with Visual Prior Enhancement
arXiv:2603.24208v1 Announce Type: cross Abstract: Knowledge distillation transfers knowledge from large teacher models to smaller students for efficient inferen
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Uncovering Memorization in Timeseries Imputation models: LBRM Membership Inference and its link to attribute Leakage
arXiv:2603.24213v1 Announce Type: cross Abstract: Deep learning models for time series imputation are now essential in fields such as healthcare, the Internet o
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Who Benefits from RAG? The Role of Exposure, Utility and Attribution Bias
arXiv:2603.24218v1 Announce Type: cross Abstract: Large Language Models (LLMs) enhanced with Retrieval-Augmented Generation (RAG) have achieved substantial impr
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Environment-Grounded Multi-Agent Workflow for Autonomous Penetration Testing
arXiv:2603.24221v1 Announce Type: cross Abstract: The increasing complexity and interconnectivity of digital infrastructures make scalable and reliable security
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
DVM: Real-Time Kernel Generation for Dynamic AI Models
arXiv:2603.24239v1 Announce Type: cross Abstract: Dynamism is common in AI computation, e.g., the dynamic tensor shapes and the dynamic control flows in models.
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Accelerating Diffusion-based Video Editing via Heterogeneous Caching: Beyond Full Computing at Sampled Denoising Timestep
arXiv:2603.24260v1 Announce Type: cross Abstract: Diffusion-based video editing has emerged as an important paradigm for high-quality and flexible content gener
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
The Specification Gap: Coordination Failure Under Partial Knowledge in Code Agents
arXiv:2603.24284v1 Announce Type: cross Abstract: When multiple LLM-based code agents independently implement parts of the same class, they must agree on shared
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Large Language Model Guided Incentive Aware Reward Design for Cooperative Multi-Agent Reinforcement Learning
arXiv:2603.24324v1 Announce Type: cross Abstract: Designing effective auxiliary rewards for cooperative multi-agent systems remains a precarious task; misaligne
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents
arXiv:2603.24329v1 Announce Type: cross Abstract: Multimodal LLMs are increasingly deployed as perceptual backbones for autonomous agents in 3D environments, fr
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Evidence of an Emergent "Self" in Continual Robot Learning
arXiv:2603.24350v1 Announce Type: cross Abstract: A key challenge to understanding self-awareness has been a principled way of quantifying whether an intelligen
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
MolEvolve: LLM-Guided Evolutionary Search for Interpretable Molecular Optimization
arXiv:2603.24382v1 Announce Type: cross Abstract: Despite deep learning's success in chemistry, its impact is hindered by a lack of interpretability and an inab
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
When AI Meets Early Childhood Education: Large Language Models as Assessment Teammates in Chinese Preschools
arXiv:2603.24389v1 Announce Type: cross Abstract: High-quality teacher-child interaction (TCI) is fundamental to early childhood development, yet traditional ex
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers
arXiv:2603.24414v1 Announce Type: cross Abstract: OpenClaw has rapidly established itself as a leading open-source autonomous agent runtime, offering powerful c
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
OneSearch-V2: The Latent Reasoning Enhanced Self-distillation Generative Search Framework
arXiv:2603.24422v1 Announce Type: cross Abstract: Generative Retrieval (GR) has emerged as a promising paradigm for modern search systems. Compared to multi-sta
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Enes Causal Discovery
arXiv:2603.24436v1 Announce Type: cross Abstract: Enes The proposed architecture is a mixture of experts, which allows for the model entities, such as the causa
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents
arXiv:2603.24440v1 Announce Type: cross Abstract: Computer-use agents (CUAs) hold great promise for automating complex desktop workflows, yet progress toward ge
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs
arXiv:2603.24511v1 Announce Type: cross Abstract: LLM agents like Claude Code can not only write code but also be used for autonomous AI research and engineerin
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
No Single Metric Tells the Whole Story: A Multi-Dimensional Evaluation Framework for Uncertainty Attributions
arXiv:2603.24524v1 Announce Type: cross Abstract: Research on explainable AI (XAI) has frequently focused on explaining model predictions. More recently, method
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience
arXiv:2603.24533v1 Announce Type: cross Abstract: Autonomous mobile GUI agents have attracted increasing attention along with the advancement of Multimodal Larg
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Evaluating Chunking Strategies For Retrieval-Augmented Generation in Oil and Gas Enterprise Documents
arXiv:2603.24556v1 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) has emerged as a framework to address the constraints of Large Language M
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
The Free-Market Algorithm: Self-Organizing Optimization for Open-Ended Complex Systems
arXiv:2603.24559v1 Announce Type: cross Abstract: We introduce the Free-Market Algorithm (FMA), a novel metaheuristic inspired by free-market economics. Unlike
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Anti-I2V: Safeguarding your photos from malicious image-to-video generation
arXiv:2603.24570v1 Announce Type: cross Abstract: Advances in diffusion-based video generation models, while significantly improving human animation, pose thre
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Chameleon: Episodic Memory for Long-Horizon Robotic Manipulation
arXiv:2603.24576v1 Announce Type: cross Abstract: Robotic manipulation often requires memory: occlusion and state changes can make decision-time observations pe
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
EndoVGGT: GNN-Enhanced Depth Estimation for Surgical 3D Reconstruction
arXiv:2603.24577v1 Announce Type: cross Abstract: Accurate 3D reconstruction of deformable soft tissues is essential for surgical robotic perception. However, l
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Retrieval Improvements Do Not Guarantee Better Answers: A Study of RAG for AI Policy QA
arXiv:2603.24580v1 Announce Type: cross Abstract: Retrieval-augmented generation (RAG) systems are increasingly used to analyze complex policy documents, but ac
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Learning To Guide Human Decision Makers With Vision-Language Models
arXiv:2403.16501v4 Announce Type: replace Abstract: There is growing interest in AI systems that support human decision-making in high-stakes domains (e.g., med
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
The Collaboration Paradox: Why Generative AI Requires Both Strategic Intelligence and Operational Stability in Supply Chain Management
arXiv:2508.13942v2 Announce Type: replace Abstract: The rise of autonomous, AI-driven agents in economic settings raises critical questions about their emergent
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
From Guidelines to Guarantees: A Graph-Based Evaluation Harness for Domain-Specific Evaluation of LLMs
arXiv:2508.20810v2 Announce Type: replace Abstract: Rigorous evaluation of domain-specific language models requires benchmarks that are comprehensive, contamina
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
GeoSketch: A Neural-Symbolic Approach to Geometric Multimodal Reasoning with Auxiliary Line Construction and Affine Transformation
arXiv:2509.22460v3 Announce Type: replace Abstract: Geometric Problem Solving (GPS) poses a unique challenge for Multimodal Large Language Models (MLLMs), requi
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
SAG-Agent: Enabling Long-Horizon Reasoning in Strategy Games via Dynamic Knowledge Graphs
arXiv:2510.15259v3 Announce Type: replace Abstract: Most commodity software lacks accessible Application Programming Interfaces (APIs), requiring autonomous age
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Generative Adversarial Reasoner: Enhancing LLM Reasoning with Adversarial Reinforcement Learning
arXiv:2512.16917v3 Announce Type: replace Abstract: Large language models (LLMs) with explicit reasoning capabilities excel at mathematical reasoning yet still
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Toward Ultra-Long-Horizon Agentic Science: Cognitive Accumulation for Machine Learning Engineering
arXiv:2601.10402v5 Announce Type: replace Abstract: The advancement of artificial intelligence toward agentic science is currently bottlenecked by the challenge
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Are LLMs Smarter Than Chimpanzees? An Evaluation on Perspective Taking and Knowledge State Estimation
arXiv:2601.12410v2 Announce Type: replace Abstract: Cognitive anthropology suggests that the distinction of human intelligence lies in the ability to infer othe
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
CollectiveKV: Decoupling and Sharing Collaborative Information in Sequential Recommendation
arXiv:2601.19178v2 Announce Type: replace Abstract: Sequential recommendation models are widely used in applications, yet they face stringent latency requiremen
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Agentified Assessment of Logical Reasoning Agents
arXiv:2603.02788v3 Announce Type: replace Abstract: We present a framework for evaluating and benchmarking logical reasoning agents when assessment itself must
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
TikZilla: Scaling Text-to-TikZ with High-Quality Data and Reinforcement Learning
arXiv:2603.03072v2 Announce Type: replace Abstract: Large language models (LLMs) are increasingly used to assist scientists across diverse workflows. A key chal
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
GPT4o-Receipt: A Dataset and Human Study for AI-Generated Document Forensics
arXiv:2603.11442v2 Announce Type: replace Abstract: Can humans detect AI-generated financial documents better than machines? We present GPT4o-Receipt, a benchma
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
Relationship-Aware Safety Unlearning for Multimodal LLMs
arXiv:2603.14185v3 Announce Type: replace Abstract: Generative multimodal models can exhibit safety failures that are inherently relational: two benign concepts
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4w ago
DomAgent: Leveraging Knowledge Graphs and Case-Based Reasoning for Domain-Specific Code Generation
arXiv:2603.21430v2 Announce Type: replace Abstract: Large language models (LLMs) have shown impressive capabilities in code generation. However, because most LL