Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

24,497 lessons

Skills in this topic

5 skills

LLM Foundations

Explain how transformers generate text

Write zero-shot and few-shot prompts

LLM Engineering

Call LLM APIs with function/tool use

Fine-tuning LLMs

Prepare fine-tuning datasets

Multimodal LLMs

Use GPT-4V / Claude Vision for image understanding

Videos: 19,397 | Reads: 5,100

Showing 5,100 reads from curated sources

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Learning to Play Blackjack: A Curriculum Learning Perspective

arXiv:2604.00076v1 Announce Type: cross Abstract: Reinforcement Learning (RL) agents often struggle with efficiency and performance in complex environments. We

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Hierarchical Pre-Training of Vision Encoders with Large Language Models

arXiv:2604.00086v1 Announce Type: cross Abstract: The field of computer vision has experienced significant advancements through scalable vision encoders and mul

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Oblivion: Self-Adaptive Agentic Memory Control through Decay-Driven Activation

arXiv:2604.00131v1 Announce Type: cross Abstract: Human memory adapts through selective forgetting: experiences become less accessible over time but can be reac

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Explainable AI for Blind and Low-Vision Users: Navigating Trust, Modality, and Interpretability in the Agentic Era

arXiv:2604.00187v1 Announce Type: cross Abstract: Explainable Artificial Intelligence (XAI) is critical for ensuring trust and accountability, yet its developme

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

QUEST: A robust attention formulation using query-modulated spherical attention

arXiv:2604.00199v1 Announce Type: cross Abstract: The Transformer model architecture has become one of the most widely used in deep learning and the attention m

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Diversity-Aware Reverse Kullback-Leibler Divergence for Large Language Model Distillation

arXiv:2604.00223v1 Announce Type: cross Abstract: Reverse Kullback-Leibler (RKL) divergence has recently emerged as the preferred objective for large language m

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

MAC-Attention: a Match-Amend-Complete Scheme for Fast and Accurate Attention Computation

arXiv:2604.00235v1 Announce Type: cross Abstract: Long-context decoding in LLMs is IO-bound: each token re-reads an ever-growing KV cache. Prior accelerations c

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

REM-CTX: Automated Peer Review via Reinforcement Learning with Auxiliary Context

arXiv:2604.00248v1 Announce Type: cross Abstract: Most automated peer review systems rely on textual manuscript content alone, leaving visual elements such as f

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Hierarchical Apprenticeship Learning from Imperfect Demonstrations with Evolving Rewards

arXiv:2604.00258v1 Announce Type: cross Abstract: While apprenticeship learning has shown promise for inducing effective pedagogical policies directly from stud

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

LLM Essay Scoring Under Holistic and Analytic Rubrics: Prompt Effects and Bias

arXiv:2604.00259v1 Announce Type: cross Abstract: Despite growing interest in using Large Language Models (LLMs) for educational assessment, it remains unclear

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Hybrid Energy-Based Models for Physical AI: Provably Stable Identification of Port-Hamiltonian Dynamics

arXiv:2604.00277v1 Announce Type: cross Abstract: Energy-based models (EBMs) implement inference as gradient descent on a learned Lyapunov function, yielding in

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

The Geometry of Compromise: Unlocking Generative Capabilities via Controllable Modality Alignment

arXiv:2604.00279v1 Announce Type: cross Abstract: Vision-Language Models (VLMs) such as CLIP learn a shared embedding space for images and text, yet their repre

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Asymmetric Actor-Critic for Multi-turn LLM Agents

arXiv:2604.00304v1 Announce Type: cross Abstract: Large language models (LLMs) exhibit strong reasoning and conversational abilities, but ensuring reliable beha

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Robust Multimodal Safety via Conditional Decoding

arXiv:2604.00310v1 Announce Type: cross Abstract: Multimodal large-language models (MLLMs) often experience degraded safety alignment when harmful queries explo

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Prompt-Guided Prefiltering for VLM Image Compression

arXiv:2604.00314v1 Announce Type: cross Abstract: The rapid progress of large Vision-Language Models (VLMs) has enabled a wide range of applications, such as im

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

RAGShield: Provenance-Verified Defense-in-Depth Against Knowledge Base Poisoning in Government Retrieval-Augmented Generation Systems

arXiv:2604.00387v1 Announce Type: cross Abstract: RAG systems deployed across federal agencies for citizen-facing services are vulnerable to knowledge base pois

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

EvolveTool-Bench: Evaluating the Quality of LLM-Generated Tool Libraries as Software Artifacts

arXiv:2604.00392v1 Announce Type: cross Abstract: Modern LLM agents increasingly create their own tools at runtime -- from Python functions to API clients -- ye

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

G-Drift MIA: Membership Inference via Gradient-Induced Feature Drift in LLMs

arXiv:2604.00419v1 Announce Type: cross Abstract: Large language models (LLMs) are trained on massive web-scale corpora, raising growing concerns about privacy

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Polysemanticity or Polysemy? Lexical Identity Confounds Superposition Metrics

arXiv:2604.00443v1 Announce Type: cross Abstract: If the same neuron activates for both "lender" and "riverside," standard metrics attribute the overlap to supe

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

First Logit Boosting: Visual Grounding Method to Mitigate Object Hallucination in Large Vision-Language Models

arXiv:2604.00455v1 Announce Type: cross Abstract: Recent Large Vision-Language Models (LVLMs) have demonstrated remarkable performance across various multimodal

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Executing as You Generate: Hiding Execution Latency in LLM Code Generation

arXiv:2604.00491v1 Announce Type: cross Abstract: Current LLM-based coding agents follow a serial execution paradigm: the model first generates the complete cod

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

A Reasoning-Enabled Vision-Language Foundation Model for Chest X-ray Interpretation

arXiv:2604.00493v1 Announce Type: cross Abstract: Chest X-rays (CXRs) are among the most frequently performed imaging examinations worldwide, yet rising imaging

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

MOON3.0: Reasoning-aware Multimodal Representation Learning for E-commerce Product Understanding

arXiv:2604.00513v1 Announce Type: cross Abstract: With the rapid growth of e-commerce, exploring general representations rather than task-specific ones has attr

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

MAESIL: Masked Autoencoder for Enhanced Self-supervised Medical Image Learning

arXiv:2604.00514v1 Announce Type: cross Abstract: Training deep learning models for three-dimensional (3D) medical imaging, such as Computed Tomography (CT), is

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Think, Act, Build: An Agentic Framework with Vision Language Models for Zero-Shot 3D Visual Grounding

arXiv:2604.00528v1 Announce Type: cross Abstract: 3D Visual Grounding (3D-VG) aims to localize objects in 3D scenes via natural language descriptions. While rec

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Optimsyn: Influence-Guided Rubrics Optimization for Synthetic Data Generation

arXiv:2604.00536v1 Announce Type: cross Abstract: Large language models (LLMs) achieve strong downstream performance largely due to abundant supervised fine-tun

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

HabitatAgent: An End-to-End Multi-Agent System for Housing Consultation

arXiv:2604.00556v1 Announce Type: cross Abstract: Housing selection is a high-stakes and largely irreversible decision problem. We study housing consultation as

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

UniMixer: A Unified Architecture for Scaling Laws in Recommendation Systems

arXiv:2604.00590v1 Announce Type: cross Abstract: In recent years, the scaling laws of recommendation models have attracted increasing attention, which govern t

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Streaming Model Cascades for Semantic SQL

arXiv:2604.00660v1 Announce Type: cross Abstract: Modern data warehouses extend SQL with semantic operators that invoke large language models on each qualifying

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Learning to Hint for Reinforcement Learning

arXiv:2604.00698v1 Announce Type: cross Abstract: Group Relative Policy Optimization (GRPO) is widely used for reinforcement learning with verifiable rewards, b

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

To Memorize or to Retrieve: Scaling Laws for RAG-Considerate Pretraining

arXiv:2604.00715v1 Announce Type: cross Abstract: Retrieval-augmented generation (RAG) improves language model (LM) performance by providing relevant context at

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Spectral Compact Training: Pre-Training Large Language Models via Permanent Truncated SVD and Stiefel QR Retraction

arXiv:2604.00733v1 Announce Type: cross Abstract: The memory wall remains the primary bottleneck for training large language models on consumer hardware. We int

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

BioCOMPASS: Integrating Biomarkers into Transformer-Based Immunotherapy Response Prediction

arXiv:2604.00739v1 Announce Type: cross Abstract: Datasets used in immunotherapy response prediction are typically small in size, as well as diverse in cancer t

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

IWP: Token Pruning as Implicit Weight Pruning in Large Vision Language Models

arXiv:2604.00757v1 Announce Type: cross Abstract: Large Vision Language Models show impressive performance across image and video understanding tasks, yet their

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Thinking Wrong in Silence: Backdoor Attacks on Continuous Latent Reasoning

arXiv:2604.00770v1 Announce Type: cross Abstract: A new generation of language models reasons entirely in continuous hidden states, producing no tokens and leav

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Scalable Pretraining of Large Mixture of Experts Language Models on Aurora Super Computer

arXiv:2604.00785v1 Announce Type: cross Abstract: Pretraining Large Language Models (LLMs) from scratch requires a massive amount of compute. Aurora super compute

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Routing-Free Mixture-of-Experts

arXiv:2604.00801v1 Announce Type: cross Abstract: Standard Mixture-of-Experts (MoE) models rely on centralized routing mechanisms that introduce rigid inductive

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Emotion Entanglement and Bayesian Inference for Multi-Dimensional Emotion Understanding

arXiv:2604.00819v1 Announce Type: cross Abstract: Understanding emotions in natural language is inherently a multi-dimensional reasoning problem, where multiple

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Learning to Learn-at-Test-Time: Language Agents with Learnable Adaptation Policies

arXiv:2604.00830v1 Announce Type: cross Abstract: Test-Time Learning (TTL) enables language agents to iteratively refine their performance through repeated inte

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

KUET at StanceNakba Shared Task: StanceMoE: Mixture-of-Experts Architecture for Stance Detection

arXiv:2604.00878v1 Announce Type: cross Abstract: Actor-level stance detection aims to determine an author's expressed position toward specific geopolitical actor

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

PixelPrune: Pixel-Level Adaptive Visual Token Reduction via Predictive Coding

arXiv:2604.00886v1 Announce Type: cross Abstract: Document understanding and GUI interaction are among the highest-value applications of Vision-Language Models

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

WARP: Guaranteed Inner-Layer Repair of NLP Transformers

arXiv:2604.00938v1 Announce Type: cross Abstract: Transformer-based NLP models remain vulnerable to adversarial perturbations, yet existing repair methods face

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Dual Optimal: Make Your LLM Peer-like with Dignity

arXiv:2604.00979v1 Announce Type: cross Abstract: Current aligned language models exhibit a dual failure mode we term the Evasive Servant: they sycophantically

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Query-Conditioned Evidential Keyframe Sampling for MLLM-Based Long-Form Video Understanding

arXiv:2604.01002v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) have shown strong performance on video question answering, but their

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Fast and Accurate Probing of In-Training LLMs' Downstream Performances

arXiv:2604.01025v1 Announce Type: cross Abstract: The paradigm of scaling Large Language Models (LLMs) in both parameter size and test time has pushed the bound

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Revision or Re-Solving? Decomposing Second-Pass Gains in Multi-LLM Pipelines

arXiv:2604.01029v1 Announce Type: cross Abstract: Multi-LLM revision pipelines, in which a second model reviews and improves a draft produced by a first, are wi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Automated Framework to Evaluate and Harden LLM System Instructions against Encoding Attacks

arXiv:2604.01039v1 Announce Type: cross Abstract: System Instructions in Large Language Models (LLMs) are commonly used to enforce safety policies, define agent

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

TRACE: Training-Free Partial Audio Deepfake Detection via Embedding Trajectory Analysis of Speech Foundation Models

arXiv:2604.01083v1 Announce Type: cross Abstract: Partial audio deepfakes, where synthesized segments are spliced into genuine recordings, are particularly dece