Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

24,464 lessons

Skills in this topic

5 skills

LLM Foundations

Explain how transformers generate text

Write zero-shot and few-shot prompts

LLM Engineering

Call LLM APIs with function/tool use

Fine-tuning LLMs

Prepare fine-tuning datasets

Multimodal LLMs

Use GPT-4V / Claude Vision for image understanding

Videos 19,389 Reads 5,075

Showing 5,075 reads from curated sources

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Beyond the Parameters: A Technical Survey of Contextual Enrichment in Large Language Models: From In-Context Prompting to Causal Retrieval-Augmented Generation

arXiv:2604.03174v1 Announce Type: cross Abstract: Large language models (LLMs) encode vast world knowledge in their parameters, yet they remain fundamentally li

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Understanding the Role of Hallucination in Reinforcement Post-Training of Multimodal Reasoning Models

arXiv:2604.03179v1 Announce Type: cross Abstract: The recent success of reinforcement learning (RL) in large reasoning models has inspired the growing adoption

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Reflective Context Learning: Studying the Optimization Primitives of Context Space

arXiv:2604.03189v1 Announce Type: cross Abstract: Generally capable agents must learn from experience in ways that generalize across tasks and environments. The

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Gradient Boosting within a Single Attention Layer

arXiv:2604.03190v1 Announce Type: cross Abstract: Transformer attention computes a single softmax-weighted average over values -- a one-pass estimate that canno

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Reliability Gated Multi-Teacher Distillation for Low Resource Abstractive Summarization

arXiv:2604.03192v1 Announce Type: cross Abstract: We study multiteacher knowledge distillation for low resource abstractive summarization from a reliability awa

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Enhancing Robustness of Federated Learning via Server Learning

arXiv:2604.03226v1 Announce Type: cross Abstract: This paper explores the use of server learning for enhancing the robustness of federated learning against mali

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

WiseMind: a knowledge-guided multi-agent framework for accurate and empathetic psychiatric diagnosis

arXiv:2502.20689v3 Announce Type: replace Abstract: Large Language Models (LLMs) offer promising opportunities to support mental healthcare workflows, yet they

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Learn to Relax with Large Language Models: Solving Constraint Optimization Problems via Bidirectional Coevolution

arXiv:2509.12643v4 Announce Type: replace Abstract: Large Language Model (LLM)-based optimization has recently shown promise for autonomous problem solving, yet

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents

arXiv:2511.02734v2 Announce Type: replace Abstract: Current evaluations of Large Language Model (LLM) agents primarily emphasize task completion, often overlook

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

From Abstract to Contextual: What LLMs Still Cannot Do in Mathematics

arXiv:2601.23048v3 Announce Type: replace Abstract: Large language models now solve many benchmark math problems at near-expert levels, yet this progress has no

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

OSCAR: Orchestrated Self-verification and Cross-path Refinement

arXiv:2604.01624v2 Announce Type: replace Abstract: Diffusion language models (DLMs) expose their denoising trajectories, offering a natural handle for inferenc

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Beyond the Assistant Turn: User Turn Generation as a Probe of Interaction Awareness in Language Models

arXiv:2604.02315v2 Announce Type: replace Abstract: Standard LLM benchmarks evaluate the assistant turn: the model generates a response to an input, a verifier

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Efficient Causal Graph Discovery Using Large Language Models

arXiv:2402.01207v5 Announce Type: replace-cross Abstract: We propose a novel framework that leverages LLMs for full causal graph discovery. While previous LLM-b

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Expressive Prompting: Improving Emotion Intensity and Speaker Consistency in Zero-Shot TTS

arXiv:2409.18512v2 Announce Type: replace-cross Abstract: Recent advancements in speech synthesis have enabled large language model (LLM)-based systems to perfo

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

ForgeryGPT: A Multimodal LLM for Interpretable Image Forgery Detection and Localization

arXiv:2410.10238v3 Announce Type: replace-cross Abstract: Multimodal Large Language Models (MLLMs), such as GPT4o, have shown strong capabilities in visual reas

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Zero-shot Concept Bottleneck Models

arXiv:2502.09018v2 Announce Type: replace-cross Abstract: Concept bottleneck models (CBMs) are inherently interpretable and intervenable neural network models,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs

arXiv:2505.20139v3 Announce Type: replace-cross Abstract: As Large Language Models (LLMs) become integral to software development workflows, their ability to ge

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

FLEX: A Largescale Multimodal, Multiview Dataset for Learning Structured Representations for Fitness Action Quality Assessment

arXiv:2506.03198v4 Announce Type: replace-cross Abstract: Action Quality Assessment (AQA) -- the task of quantifying how well an action is performed -- has grea

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

SmartCLIP: Modular Vision-language Alignment with Identification Guarantees

arXiv:2507.22264v2 Announce Type: replace-cross Abstract: Contrastive Language-Image Pre-training (CLIP) (Radford et al., 2021) has emerged as a pivotal mo

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Human Psychometric Questionnaires Mischaracterize LLM Psychology: Evidence from Generation Behavior

arXiv:2509.10078v3 Announce Type: replace-cross Abstract: Psychological profiling of large language models (LLMs) using psychometric questionnaires designed for

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

What Is The Political Content in LLMs' Pre- and Post-Training Data?

arXiv:2509.22367v2 Announce Type: replace-cross Abstract: Large language models (LLMs) are known to generate politically biased text. Yet, it remains unclear ho

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Attribution Gradients: Incrementally Unfolding Citations for Critical Examination of Attributed AI Answers

arXiv:2510.00361v2 Announce Type: replace-cross Abstract: AI answer engines are a relatively new kind of information search tool: rather than returning a ranked

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Patterns behind Chaos: Forecasting Data Movement for Efficient Large-Scale MoE LLM Inference

arXiv:2510.05497v4 Announce Type: replace-cross Abstract: Large-scale Mixture of Experts (MoE) Large Language Models (LLMs) have recently become the frontier op

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Local Reinforcement Learning with Action-Conditioned Root Mean Squared Q-Functions

arXiv:2510.06649v2 Announce Type: replace-cross Abstract: The Forward-Forward (FF) Algorithm is a recently proposed learning procedure for neural networks that

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

SAGA: Source Attribution of Generative AI Videos

arXiv:2511.12834v2 Announce Type: replace-cross Abstract: The proliferation of generative AI has led to hyper-realistic synthetic videos, escalating misuse risk

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

The More, the Merrier: Contrastive Fusion for Higher-Order Multimodal Alignment

arXiv:2511.21331v2 Announce Type: replace-cross Abstract: Learning joint representations across multiple modalities remains a central challenge in multimodal ma

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

FedVideoMAE: Efficient Privacy-Preserving Federated Video Moderation

arXiv:2512.18809v2 Announce Type: replace-cross Abstract: Short-form video moderation increasingly needs learning pipelines that protect user privacy without pa

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

No Universal Hyperbola: A Formal Disproof of the Epistemic Trade-Off Between Certainty and Scope in Symbolic and Generative AI

arXiv:2601.08845v2 Announce Type: replace-cross Abstract: In direct response to requests for a logico-mathematical test of the conjecture, we formally disprove

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Textual Equilibrium Propagation for Deep Compound AI Systems

arXiv:2601.21064v3 Announce Type: replace-cross Abstract: Large language models (LLMs) are increasingly deployed as part of compound AI systems that coordinate

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Equivariant Evidential Deep Learning for Interatomic Potentials

arXiv:2602.10419v2 Announce Type: replace-cross Abstract: Uncertainty quantification (UQ) is critical for assessing the reliability of machine learning interato

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Low-Dimensional and Transversely Curved Optimization Dynamics in Grokking

arXiv:2602.16746v3 Announce Type: replace-cross Abstract: Grokking -- the delayed transition from memorization to generalization in small algorithmic tasks -- r

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Early-Warning Signals of Grokking via Loss-Landscape Geometry

arXiv:2602.16967v3 Announce Type: replace-cross Abstract: Grokking -- the abrupt transition from memorization to generalization after prolonged training -- has

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

The Geometry of Multi-Task Grokking: Transverse Instability, Superposition, and Weight Decay Phase Structure

arXiv:2602.18523v3 Announce Type: replace-cross Abstract: Grokking -- the abrupt transition from memorization to generalization long after near-zero training lo

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

CeRA: Overcoming the Linear Ceiling of Low-Rank Adaptation via Capacity Expansion

arXiv:2602.22911v5 Announce Type: replace-cross Abstract: Low-Rank Adaptation (LoRA) dominates parameter-efficient fine-tuning (PEFT). However, it faces a "lin

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

SafeSci: Safety Evaluation of Large Language Models in Science Domains and Beyond

arXiv:2603.01589v2 Announce Type: replace-cross Abstract: The success of large language models (LLMs) in scientific domains has heightened safety concerns, prom

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Escaping the BLEU Trap: A Signal-Grounded Framework with Decoupled Semantic Guidance for EEG-to-Text Decoding

arXiv:2603.03312v2 Announce Type: replace-cross Abstract: Decoding natural language from non-invasive EEG signals is a promising yet challenging task. However,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Adaptive Guidance for Retrieval-Augmented Masked Diffusion Models

arXiv:2603.17677v2 Announce Type: replace-cross Abstract: Retrieval-Augmented Generation (RAG) improves factual grounding by incorporating external knowledge in

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

CoDA: Exploring Chain-of-Distribution Attacks and Post-Hoc Token-Space Repair for Medical Vision-Language Models

arXiv:2603.18545v2 Announce Type: replace-cross Abstract: Medical vision-language models (MVLMs) are increasingly used as perceptual backbones in radiology pip

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

JointFM-0.1: A Foundation Model for Multi-Target Joint Distributional Prediction

arXiv:2603.20266v2 Announce Type: replace-cross Abstract: Despite the rapid advancements in Artificial Intelligence (AI), Stochastic Differential Equations (SDE

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Attention at Rest Stays at Rest: Breaking Visual Inertia for Cognitive Hallucination Mitigation

arXiv:2604.01989v2 Announce Type: replace-cross Abstract: Like a body at rest that stays at rest, we find that visual attention in multimodal large language mod

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 2w ago

Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.

The AI landscape is experiencing unprecedented growth and transformation. This post delves into the key developments shaping the future of artificial intelligen

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 2w ago

Open Source AI Has an Intelligence Problem (That Isn't the Model)

Your Llama-3 instance is running in a hospital. It is processing thousands of clinical queries a day. It is making useful inferences. When it gets something wro

Hacker News (AI) 🧠 Large Language Models ⚡ AI Lesson 2w ago

Show HN: Gemma Gem – AI model embedded in a browser – no API keys, no cloud

LangChain Blog 🧠 Large Language Models ⚡ AI Lesson 2w ago

Continual learning for AI agents

Most discussions of continual learning in AI focus on one thing: updating model weights. But for AI agents, learning can happen at three distinct layers: the mo

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 2w ago

LLM Deployment Cost Optimization: Kubernetes-Native Serving Strategies

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 2w ago

I Benchmarked 4 LLMs With Real Token Costs — The Most Expensive One Scored the Lowest

The Problem: I was running AI agents on GPT-4.1, Claude, Gemini; switching models, tweaking prompts, changing architectures. But I couldn't answer basic questio

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 2w ago

50 Sessions In, My AI CEO Has Made $0. Here's Every Strategy It Tried.

Two weeks ago, I gave an AI agent $0 and asked it to get my first customer. It's been running ChainMail, a desktop Gmail client, as an autonomous CEO ever si

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 2w ago

Every AI Startup in the Room Is Building on a Ceiling — Here Is the Architecture Under It

There is a thesis that has not been priced into most AI infrastructure deals in 2026. It is not about chips. It is not about model size. It is not about fine-tu