Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

24,530 lessons

Skills in this topic

5 skills

LLM Foundations

Explain how transformers generate text

Write zero-shot and few-shot prompts

LLM Engineering

Call LLM APIs with function/tool use

Fine-tuning LLMs

Prepare fine-tuning datasets

Multimodal LLMs

Use GPT-4V / Claude Vision for image understanding

Videos 19,408 Reads 5,122

Showing 5,122 reads from curated sources

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Categorical Perception in Large Language Model Hidden States: Structural Warping at Digit-Count Boundaries

arXiv:2603.28258v1 Announce Type: cross Abstract: Categorical perception (CP) -- enhanced discriminability at category boundaries -- is among the most studied p

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Merge and Conquer: Instructing Multilingual Models by Adding Target Language Weights

arXiv:2603.28263v1 Announce Type: cross Abstract: Large Language Models (LLMs) remain heavily centered on English, with limited performance in low-resource lang

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Pre-Deployment Complexity Estimation for Federated Perception Systems

arXiv:2603.28282v1 Announce Type: cross Abstract: Edge AI systems increasingly rely on federated learning to train perception models in distributed, privacy-pre

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

FI-KAN: Fractal Interpolation Kolmogorov-Arnold Networks

arXiv:2603.28288v1 Announce Type: cross Abstract: Kolmogorov-Arnold Networks (KAN) employ B-spline bases on a fixed grid, providing no intrinsic multi-scale dec

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

NeiGAD: Augmenting Graph Anomaly Detection via Spectral Neighbor Information

arXiv:2603.28300v1 Announce Type: cross Abstract: Graph anomaly detection (GAD) aims to identify irregular nodes or structures in attributed graphs. Neighbor in

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Building evidence-based knowledge graphs from full-text literature for disease-specific biomedical reasoning

arXiv:2603.28325v1 Announce Type: cross Abstract: Biomedical knowledge resources often either preserve evidence as unstructured text or compress it into flat tr

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Integrating Multimodal Large Language Model Knowledge into Amodal Completion

arXiv:2603.28333v1 Announce Type: cross Abstract: With the widespread adoption of autonomous vehicles and robotics, amodal completion, which reconstructs the oc

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Crossing the NL/PL Divide: Information Flow Analysis Across the NL/PL Boundary in LLM-Integrated Code

arXiv:2603.28345v1 Announce Type: cross Abstract: LLM API calls are becoming a ubiquitous program construct, yet they create a boundary that no existing program

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Coherent Without Grounding, Grounded Without Success: Observability and Epistemic Failure

arXiv:2603.28371v1 Announce Type: cross Abstract: When an agent can articulate why something works, we typically take this as evidence of genuine understanding.

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Membership Inference Attacks against Large Audio Language Models

arXiv:2603.28378v1 Announce Type: cross Abstract: We present the first systematic Membership Inference Attack (MIA) evaluation of Large Audio Language Models (L

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Critic-Free Deep Reinforcement Learning for Maritime Coverage Path Planning on Irregular Hexagonal Grids

arXiv:2603.28385v1 Announce Type: cross Abstract: Maritime surveillance missions, such as search and rescue and environmental monitoring, rely on the efficient

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

EdgeDiT: Hardware-Aware Diffusion Transformers for Efficient On-Device Image Generation

arXiv:2603.28405v1 Announce Type: cross Abstract: Diffusion Transformers (DiT) have established a new state-of-the-art in high-fidelity image synthesis; however

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Evolutionary Discovery of Reinforcement Learning Algorithms via Large Language Models

arXiv:2603.28416v1 Announce Type: cross Abstract: Reinforcement learning algorithms are defined by their learning update rules, which are typically hand-designe

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Spectral Higher-Order Neural Networks

arXiv:2603.28420v1 Announce Type: cross Abstract: Neural networks are fundamental tools of modern machine learning. The standard paradigm assumes binary interac

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

FeDMRA: Federated Incremental Learning with Dynamic Memory Replay Allocation

arXiv:2603.28455v1 Announce Type: cross Abstract: In federated healthcare systems, Federated Class-Incremental Learning (FCIL) has emerged as a key paradigm, en

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention

arXiv:2603.28458v1 Announce Type: cross Abstract: Token-level sparse attention mechanisms, exemplified by DeepSeek Sparse Attention (DSA), achieve fine-grained

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Courtroom-Style Multi-Agent Debate with Progressive RAG and Role-Switching for Controversial Claim Verification

arXiv:2603.28488v1 Announce Type: cross Abstract: Large language models (LLMs) remain unreliable for high-stakes claim verification due to hallucinations and sh

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Next-Token Prediction and Regret Minimization

arXiv:2603.28499v1 Announce Type: cross Abstract: We consider the question of how to employ next-token prediction algorithms in adversarial online decision-maki

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

The Unreasonable Effectiveness of Scaling Laws in AI

arXiv:2603.28507v1 Announce Type: cross Abstract: Classical AI scaling laws, especially for pre-training, describe how training loss decreases with compute in a

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Hydra: Unifying Document Retrieval and Generation in a Single Vision-Language Model

arXiv:2603.28554v1 Announce Type: cross Abstract: Visual document understanding typically requires separate retrieval and generation models, doubling memory and

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Domain-Invariant Prompt Learning for Vision-Language Models

arXiv:2603.28555v1 Announce Type: cross Abstract: Large pre-trained vision-language models like CLIP have transformed computer vision by aligning images and tex

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Fine-Tuning Large Language Models for Cooperative Tactical Deconfliction of Small Unmanned Aerial Systems

arXiv:2603.28561v1 Announce Type: cross Abstract: The growing deployment of small Unmanned Aerial Systems (sUASs) in low-altitude airspaces has increased the ne

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

CirrusBench: Evaluating LLM-based Agents Beyond Correctness in Real-World Cloud Service Environments

arXiv:2603.28569v1 Announce Type: cross Abstract: The increasing agentic capabilities of Large Language Models (LLMs) have enabled their deployment in real-worl

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Learning Partial Action Replacement in Offline MARL

arXiv:2603.28573v1 Announce Type: cross Abstract: Offline multi-agent reinforcement learning (MARL) faces a critical challenge: the joint action space grows exp

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

ChemCLIP: Bridging Organic and Inorganic Anticancer Compounds Through Contrastive Learning

arXiv:2603.28575v1 Announce Type: cross Abstract: The discovery of anticancer therapeutics has traditionally treated organic small molecules and metal-based coo

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Moving Beyond Review: Applying Language Models to Planning and Translation in Reflection

arXiv:2603.28596v1 Announce Type: cross Abstract: Reflective writing is known to support the development of students' metacognitive skills, yet learners often s

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

ResAdapt: Adaptive Resolution for Efficient Multimodal Reasoning

arXiv:2603.28610v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) achieve stronger visual understanding by scaling input fidelity, yet

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Trust-Aware Routing for Distributed Generative AI Inference at the Edge

arXiv:2603.28622v1 Announce Type: cross Abstract: Emerging deployments of Generative AI increasingly execute inference across decentralized and heterogeneous ed

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

AMIGO: Agentic Multi-Image Grounding Oracle Benchmark

arXiv:2603.28662v1 Announce Type: cross Abstract: Agentic vision-language models increasingly act through extended interactions, but most evaluations still focu

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

AdaptToken: Entropy-based Adaptive Token Selection for MLLM Long Video Understanding

arXiv:2603.28696v1 Announce Type: cross Abstract: Long video understanding remains challenging for Multi-modal Large Language Models (MLLMs) due to high memory

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Stepwise Credit Assignment for GRPO on Flow-Matching Models

arXiv:2603.28718v1 Announce Type: cross Abstract: Flow-GRPO successfully applies reinforcement learning to flow models, but uses uniform credit assignment acros

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

ParaSpeechCLAP: A Dual-Encoder Speech-Text Model for Rich Stylistic Language-Audio Pretraining

arXiv:2603.28737v1 Announce Type: cross Abstract: We introduce ParaSpeechCLAP, a dual-encoder contrastive model that maps speech and text style captions into a

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

On-the-fly Repulsion in the Contextual Space for Rich Diversity in Diffusion Transformers

arXiv:2603.28762v1 Announce Type: cross Abstract: Modern Text-to-Image (T2I) diffusion models have achieved remarkable semantic alignment, yet they often suffer

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Retrieving Classes of Causal Orders with Inconsistent Knowledge Bases

arXiv:2412.14019v4 Announce Type: replace Abstract: Traditional causal discovery methods often depend on strong, untestable assumptions, making them unreliable

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Synergizing Large Language Models and Task-specific Models for Time Series Anomaly Detection

arXiv:2501.05675v5 Announce Type: replace Abstract: In anomaly detection, methods based on large language models (LLMs) can incorporate expert knowledge by read

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Inspire or Predict? Exploring New Paradigms in Assisting Classical Planners with Large Language Models

arXiv:2508.11524v2 Announce Type: replace Abstract: Addressing large-scale planning problems has become one of the central challenges in the planning community,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Your Models Have Thought Enough: Training Large Reasoning Models to Stop Overthinking

arXiv:2509.23392v3 Announce Type: replace Abstract: Large Reasoning Models (LRMs) have achieved impressive performance on challenging tasks, yet their deep reas

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Searching Meta Reasoning Skeleton to Guide LLM Reasoning

arXiv:2510.04116v3 Announce Type: replace Abstract: Meta reasoning behaviors work as a skeleton to guide large language model (LLM) reasoning, thus help to impr

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

ShortcutBreaker: Low-Rank Noisy Bottleneck and Frequency Filtering Block for Multi-Class Unsupervised Anomaly Detection

arXiv:2510.18342v2 Announce Type: replace Abstract: Multi-class unsupervised anomaly detection (MUAD) has garnered growing research interest, as it seeks to dev

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

From Questions to Queries: An AI-powered Multi-Agent Framework for Spatial Text-to-SQL

arXiv:2510.21045v3 Announce Type: replace Abstract: The complexity of SQL and the spatial semantics of PostGIS create barriers for non-experts working with spat

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

FlipVQA: Scaling Multi-modal Instruction Tuning via Textbook-to-Knowledge Synthesis

arXiv:2511.16216v2 Announce Type: replace Abstract: Textbooks are among the richest repositories of human-verified reasoning knowledge, yet their complex layout

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

An Attention Mechanism for Robust Multimodal Integration in a Global Workspace Architecture

arXiv:2602.08597v2 Announce Type: replace Abstract: Robust multimodal systems must remain effective when some modalities are noisy, degraded, or unreliable. Exi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

AgentLeak: A Full-Stack Benchmark for Privacy Leakage in Multi-Agent LLM Systems

arXiv:2602.11510v2 Announce Type: replace Abstract: Multi-agent Large Language Model (LLM) systems create privacy risks that current benchmarks cannot measure.

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Evaluating and Understanding Scheming Propensity in LLM Agents

arXiv:2603.01608v2 Announce Type: replace Abstract: As frontier language models are increasingly deployed as autonomous agents pursuing complex, long-term objec

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Seed1.8 Model Card: Towards Generalized Real-World Agency

arXiv:2603.20633v2 Announce Type: replace Abstract: We present Seed1.8, a foundation model aimed at generalized real-world agency: going beyond single-turn pred

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Silicon Bureaucracy and AI Test-Oriented Education: Contamination Sensitivity and Score Confidence in LLM Benchmarks

arXiv:2603.21636v2 Announce Type: replace Abstract: Public benchmarks increasingly govern how large language models (LLMs) are ranked, selected, and deployed. W

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Continual Graph Learning: A Survey

arXiv:2301.12230v2 Announce Type: replace-cross Abstract: Continual Graph Learning (CGL) enables models to incrementally learn from streaming graph-structured d

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Deep Neural Networks: A Formulation Via Non-Archimedean Analysis

arXiv:2402.00094v2 Announce Type: replace-cross Abstract: We introduce a new class of deep neural networks (DNNs) with multilayered tree-like architectures. The