Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

24,530 lessons
Skills in this topic
LLM Foundations (beginner): Explain how transformers generate text
Prompt Craft (beginner): Write zero-shot and few-shot prompts
LLM Engineering (intermediate): Call LLM APIs with function/tool use
Fine-tuning LLMs (advanced): Prepare fine-tuning datasets
Multimodal LLMs (advanced): Use GPT-4V / Claude Vision for image understanding

Showing 5,122 reads from curated sources

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Categorical Perception in Large Language Model Hidden States: Structural Warping at Digit-Count Boundaries
arXiv:2603.28258v1 Announce Type: cross Abstract: Categorical perception (CP) -- enhanced discriminability at category boundaries -- is among the most studied p
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Merge and Conquer: Instructing Multilingual Models by Adding Target Language Weights
arXiv:2603.28263v1 Announce Type: cross Abstract: Large Language Models (LLMs) remain heavily centered on English, with limited performance in low-resource lang
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Pre-Deployment Complexity Estimation for Federated Perception Systems
arXiv:2603.28282v1 Announce Type: cross Abstract: Edge AI systems increasingly rely on federated learning to train perception models in distributed, privacy-pre
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
FI-KAN: Fractal Interpolation Kolmogorov-Arnold Networks
arXiv:2603.28288v1 Announce Type: cross Abstract: Kolmogorov-Arnold Networks (KAN) employ B-spline bases on a fixed grid, providing no intrinsic multi-scale dec
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
NeiGAD: Augmenting Graph Anomaly Detection via Spectral Neighbor Information
arXiv:2603.28300v1 Announce Type: cross Abstract: Graph anomaly detection (GAD) aims to identify irregular nodes or structures in attributed graphs. Neighbor in
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Building evidence-based knowledge graphs from full-text literature for disease-specific biomedical reasoning
arXiv:2603.28325v1 Announce Type: cross Abstract: Biomedical knowledge resources often either preserve evidence as unstructured text or compress it into flat tr
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Integrating Multimodal Large Language Model Knowledge into Amodal Completion
arXiv:2603.28333v1 Announce Type: cross Abstract: With the widespread adoption of autonomous vehicles and robotics, amodal completion, which reconstructs the oc
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Crossing the NL/PL Divide: Information Flow Analysis Across the NL/PL Boundary in LLM-Integrated Code
arXiv:2603.28345v1 Announce Type: cross Abstract: LLM API calls are becoming a ubiquitous program construct, yet they create a boundary that no existing program
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Coherent Without Grounding, Grounded Without Success: Observability and Epistemic Failure
arXiv:2603.28371v1 Announce Type: cross Abstract: When an agent can articulate why something works, we typically take this as evidence of genuine understanding.
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Membership Inference Attacks against Large Audio Language Models
arXiv:2603.28378v1 Announce Type: cross Abstract: We present the first systematic Membership Inference Attack (MIA) evaluation of Large Audio Language Models (L
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Critic-Free Deep Reinforcement Learning for Maritime Coverage Path Planning on Irregular Hexagonal Grids
arXiv:2603.28385v1 Announce Type: cross Abstract: Maritime surveillance missions, such as search and rescue and environmental monitoring, rely on the efficient
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
EdgeDiT: Hardware-Aware Diffusion Transformers for Efficient On-Device Image Generation
arXiv:2603.28405v1 Announce Type: cross Abstract: Diffusion Transformers (DiT) have established a new state-of-the-art in high-fidelity image synthesis; however
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Evolutionary Discovery of Reinforcement Learning Algorithms via Large Language Models
arXiv:2603.28416v1 Announce Type: cross Abstract: Reinforcement learning algorithms are defined by their learning update rules, which are typically hand-designe
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Spectral Higher-Order Neural Networks
arXiv:2603.28420v1 Announce Type: cross Abstract: Neural networks are fundamental tools of modern machine learning. The standard paradigm assumes binary interac
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
FeDMRA: Federated Incremental Learning with Dynamic Memory Replay Allocation
arXiv:2603.28455v1 Announce Type: cross Abstract: In federated healthcare systems, Federated Class-Incremental Learning (FCIL) has emerged as a key paradigm, en
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention
arXiv:2603.28458v1 Announce Type: cross Abstract: Token-level sparse attention mechanisms, exemplified by DeepSeek Sparse Attention (DSA), achieve fine-grained
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Courtroom-Style Multi-Agent Debate with Progressive RAG and Role-Switching for Controversial Claim Verification
arXiv:2603.28488v1 Announce Type: cross Abstract: Large language models (LLMs) remain unreliable for high-stakes claim verification due to hallucinations and sh
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Next-Token Prediction and Regret Minimization
arXiv:2603.28499v1 Announce Type: cross Abstract: We consider the question of how to employ next-token prediction algorithms in adversarial online decision-maki
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
The Unreasonable Effectiveness of Scaling Laws in AI
arXiv:2603.28507v1 Announce Type: cross Abstract: Classical AI scaling laws, especially for pre-training, describe how training loss decreases with compute in a
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Hydra: Unifying Document Retrieval and Generation in a Single Vision-Language Model
arXiv:2603.28554v1 Announce Type: cross Abstract: Visual document understanding typically requires separate retrieval and generation models, doubling memory and
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Domain-Invariant Prompt Learning for Vision-Language Models
arXiv:2603.28555v1 Announce Type: cross Abstract: Large pre-trained vision-language models like CLIP have transformed computer vision by aligning images and tex
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Fine-Tuning Large Language Models for Cooperative Tactical Deconfliction of Small Unmanned Aerial Systems
arXiv:2603.28561v1 Announce Type: cross Abstract: The growing deployment of small Unmanned Aerial Systems (sUASs) in low-altitude airspaces has increased the ne
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
CirrusBench: Evaluating LLM-based Agents Beyond Correctness in Real-World Cloud Service Environments
arXiv:2603.28569v1 Announce Type: cross Abstract: The increasing agentic capabilities of Large Language Models (LLMs) have enabled their deployment in real-worl
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Learning Partial Action Replacement in Offline MARL
arXiv:2603.28573v1 Announce Type: cross Abstract: Offline multi-agent reinforcement learning (MARL) faces a critical challenge: the joint action space grows exp
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
ChemCLIP: Bridging Organic and Inorganic Anticancer Compounds Through Contrastive Learning
arXiv:2603.28575v1 Announce Type: cross Abstract: The discovery of anticancer therapeutics has traditionally treated organic small molecules and metal-based coo
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Moving Beyond Review: Applying Language Models to Planning and Translation in Reflection
arXiv:2603.28596v1 Announce Type: cross Abstract: Reflective writing is known to support the development of students' metacognitive skills, yet learners often s
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
ResAdapt: Adaptive Resolution for Efficient Multimodal Reasoning
arXiv:2603.28610v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) achieve stronger visual understanding by scaling input fidelity, yet
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Trust-Aware Routing for Distributed Generative AI Inference at the Edge
arXiv:2603.28622v1 Announce Type: cross Abstract: Emerging deployments of Generative AI increasingly execute inference across decentralized and heterogeneous ed
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
AMIGO: Agentic Multi-Image Grounding Oracle Benchmark
arXiv:2603.28662v1 Announce Type: cross Abstract: Agentic vision-language models increasingly act through extended interactions, but most evaluations still focu
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
AdaptToken: Entropy-based Adaptive Token Selection for MLLM Long Video Understanding
arXiv:2603.28696v1 Announce Type: cross Abstract: Long video understanding remains challenging for Multi-modal Large Language Models (MLLMs) due to high memory
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Stepwise Credit Assignment for GRPO on Flow-Matching Models
arXiv:2603.28718v1 Announce Type: cross Abstract: Flow-GRPO successfully applies reinforcement learning to flow models, but uses uniform credit assignment acros
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
ParaSpeechCLAP: A Dual-Encoder Speech-Text Model for Rich Stylistic Language-Audio Pretraining
arXiv:2603.28737v1 Announce Type: cross Abstract: We introduce ParaSpeechCLAP, a dual-encoder contrastive model that maps speech and text style captions into a
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
On-the-fly Repulsion in the Contextual Space for Rich Diversity in Diffusion Transformers
arXiv:2603.28762v1 Announce Type: cross Abstract: Modern Text-to-Image (T2I) diffusion models have achieved remarkable semantic alignment, yet they often suffer
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Retrieving Classes of Causal Orders with Inconsistent Knowledge Bases
arXiv:2412.14019v4 Announce Type: replace Abstract: Traditional causal discovery methods often depend on strong, untestable assumptions, making them unreliable
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Synergizing Large Language Models and Task-specific Models for Time Series Anomaly Detection
arXiv:2501.05675v5 Announce Type: replace Abstract: In anomaly detection, methods based on large language models (LLMs) can incorporate expert knowledge by read
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Inspire or Predict? Exploring New Paradigms in Assisting Classical Planners with Large Language Models
arXiv:2508.11524v2 Announce Type: replace Abstract: Addressing large-scale planning problems has become one of the central challenges in the planning community,
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Your Models Have Thought Enough: Training Large Reasoning Models to Stop Overthinking
arXiv:2509.23392v3 Announce Type: replace Abstract: Large Reasoning Models (LRMs) have achieved impressive performance on challenging tasks, yet their deep reas
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Searching Meta Reasoning Skeleton to Guide LLM Reasoning
arXiv:2510.04116v3 Announce Type: replace Abstract: Meta reasoning behaviors work as a skeleton to guide large language model (LLM) reasoning, thus help to impr
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
ShortcutBreaker: Low-Rank Noisy Bottleneck and Frequency Filtering Block for Multi-Class Unsupervised Anomaly Detection
arXiv:2510.18342v2 Announce Type: replace Abstract: Multi-class unsupervised anomaly detection (MUAD) has garnered growing research interest, as it seeks to dev
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
From Questions to Queries: An AI-powered Multi-Agent Framework for Spatial Text-to-SQL
arXiv:2510.21045v3 Announce Type: replace Abstract: The complexity of SQL and the spatial semantics of PostGIS create barriers for non-experts working with spat
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
FlipVQA: Scaling Multi-modal Instruction Tuning via Textbook-to-Knowledge Synthesis
arXiv:2511.16216v2 Announce Type: replace Abstract: Textbooks are among the richest repositories of human-verified reasoning knowledge, yet their complex layout
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
An Attention Mechanism for Robust Multimodal Integration in a Global Workspace Architecture
arXiv:2602.08597v2 Announce Type: replace Abstract: Robust multimodal systems must remain effective when some modalities are noisy, degraded, or unreliable. Exi
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
AgentLeak: A Full-Stack Benchmark for Privacy Leakage in Multi-Agent LLM Systems
arXiv:2602.11510v2 Announce Type: replace Abstract: Multi-agent Large Language Model (LLM) systems create privacy risks that current benchmarks cannot measure.
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Evaluating and Understanding Scheming Propensity in LLM Agents
arXiv:2603.01608v2 Announce Type: replace Abstract: As frontier language models are increasingly deployed as autonomous agents pursuing complex, long-term objec
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Seed1.8 Model Card: Towards Generalized Real-World Agency
arXiv:2603.20633v2 Announce Type: replace Abstract: We present Seed1.8, a foundation model aimed at generalized real-world agency: going beyond single-turn pred
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Silicon Bureaucracy and AI Test-Oriented Education: Contamination Sensitivity and Score Confidence in LLM Benchmarks
arXiv:2603.21636v2 Announce Type: replace Abstract: Public benchmarks increasingly govern how large language models (LLMs) are ranked, selected, and deployed. W
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Continual Graph Learning: A Survey
arXiv:2301.12230v2 Announce Type: replace-cross Abstract: Continual Graph Learning (CGL) enables models to incrementally learn from streaming graph-structured d
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Deep Neural Networks: A Formulation Via Non-Archimedean Analysis
arXiv:2402.00094v2 Announce Type: replace-cross Abstract: We introduce a new class of deep neural networks (DNNs) with multilayered tree-like architectures. The