Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

24,450 lessons

Skills in this topic

5 skills

LLM Foundations

Explain how transformers generate text

Write zero-shot and few-shot prompts

LLM Engineering

Call LLM APIs with function/tool use

Fine-tuning LLMs

Prepare fine-tuning datasets

Multimodal LLMs

Use GPT-4V / Claude Vision for image understanding

Videos 19,389 Reads 5,061

Showing 5,061 reads from curated sources

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Scaling the Scaling Logic: Agentic Meta-Synthesis of Logic Reasoning

arXiv:2602.13218v2 Announce Type: replace Abstract: Reinforcement Learning from Verifiable Rewards (RLVR) is bottlenecked by data: existing synthesis pipelines

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

KLong: Training LLM Agent for Extremely Long-horizon Tasks

arXiv:2602.17547v2 Announce Type: replace Abstract: This paper introduces KLong, an open-source LLM agent trained to solve extremely long-horizon tasks. The pri

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

DeepFact: Co-Evolving Benchmarks and Agents for Deep Research Factuality

arXiv:2603.05912v2 Announce Type: replace Abstract: Search-augmented LLM agents can produce deep research reports (DRRs), but verifying claim-level factuality r

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

A Hierarchical Error-Corrective Graph Framework for Autonomous Agents with LLM-Based Action Generation

arXiv:2603.08388v4 Announce Type: replace Abstract: We propose a Hierarchical Error-Corrective Graph Framework for Autonomous Agents with LLM-Based Action Generation (H

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Collective AI can amplify tiny perturbations into divergent decisions

arXiv:2603.09127v2 Announce Type: replace Abstract: Large language models are increasingly deployed not as single assistants but as committees whose members del

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

An Onto-Relational-Sophic Framework for Governing Synthetic Minds

arXiv:2603.18633v2 Announce Type: replace Abstract: The rapid evolution of artificial intelligence, from task-specific systems to foundation models exhibiting b

ArXiv cs.AI 🧠 Large Language Models 📄 Paper 2w ago

ClawSafety: "Safe" LLMs, Unsafe Agents

arXiv:2604.01438v2 Announce Type: replace Abstract: Personal AI agents like OpenClaw run with elevated privileges on users' local machines, where a single succe

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Domain-constrained knowledge representation: A modal framework

arXiv:2604.01770v2 Announce Type: replace Abstract: Knowledge graphs store large numbers of relations efficiently, but they remain weak at representing a quiete

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

VLBiasBench: A Comprehensive Benchmark for Evaluating Bias in Large Vision-Language Model

arXiv:2406.14194v3 Announce Type: replace-cross Abstract: The emergence of Large Vision-Language Models (LVLMs) marks significant strides towards achieving gene

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

MegaFake: A Theory-Driven Dataset of Fake News Generated by Large Language Models

arXiv:2408.11871v3 Announce Type: replace-cross Abstract: Fake news significantly influences decision-making processes by misleading individuals, organizations,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

SPRIG: Improving Large Language Model Performance by System Prompt Optimization

arXiv:2410.14826v3 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have shown impressive capabilities in many scenarios, but their performan

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

arXiv:2410.21169v5 Announce Type: replace-cross Abstract: Document parsing (DP) transforms unstructured or semi-structured documents into structured, machine-re

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Implicit Bias-Like Patterns in Reasoning Models

arXiv:2503.11572v4 Announce Type: replace-cross Abstract: Implicit biases refer to automatic mental processes that shape perceptions, judgments, and behaviors.

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

BalancedDPO: Adaptive Multi-Metric Alignment

arXiv:2503.12575v2 Announce Type: replace-cross Abstract: Diffusion models have achieved remarkable progress in text-to-image generation, yet aligning them with

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

LLMs Judging LLMs: A Simplex Perspective

arXiv:2505.21972v3 Announce Type: replace-cross Abstract: Given the challenge of automatically evaluating free-form outputs from large language models (LLMs), a

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Beyond Linear Steering: Unified Multi-Attribute Control for Language Models

arXiv:2505.24535v3 Announce Type: replace-cross Abstract: Controlling multiple behavioral attributes in large language models (LLMs) at inference time is a chal

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Large Language Models for Combinatorial Optimization of Design Structure Matrix

arXiv:2506.09749v3 Announce Type: replace-cross Abstract: In complex engineering systems, the dependencies among components or development activities are often

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

ZINA: Multimodal Fine-grained Hallucination Detection and Editing

arXiv:2506.13130v2 Announce Type: replace-cross Abstract: Multimodal Large Language Models (MLLMs) often generate hallucinations, where the output deviates from

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Making Prompts First-Class Citizens for Adaptive LLM Pipelines

arXiv:2508.05012v2 Announce Type: replace-cross Abstract: Modern LLM pipelines increasingly resemble complex data-centric applications: they retrieve data, corr

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

ShadowNPU: System and Algorithm Co-design for NPU-Centric On-Device LLM Inference

arXiv:2508.16703v2 Announce Type: replace-cross Abstract: On-device running Large Language Models (LLMs) is nowadays a critical enabler towards preserving user

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Measuring Competency, Not Performance: Item-Aware Evaluation Across Medical Benchmarks

arXiv:2509.24186v2 Announce Type: replace-cross Abstract: Accuracy-based evaluation of Large Language Models (LLMs) measures benchmark-specific performance rath

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

ACT: Agentic Classification Tree

arXiv:2509.26433v4 Announce Type: replace-cross Abstract: When used in high-stakes settings, AI systems are expected to produce decisions that are transparent,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Autonomy Reshapes How Personalization Affects Privacy Concerns and Trust in LLM Agents

arXiv:2510.04465v2 Announce Type: replace-cross Abstract: LLM agents require personal information for personalization in order to effectively act on users' beha

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

FURINA: A Fully Customizable Role-Playing Benchmark via Scalable Multi-Agent Collaboration Pipeline

arXiv:2510.06800v3 Announce Type: replace-cross Abstract: As large language models (LLMs) advance in role-playing (RP) tasks, existing benchmarks quickly become

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Fewer Weights, More Problems: A Practical Attack on LLM Pruning

arXiv:2510.07985v3 Announce Type: replace-cross Abstract: Model pruning, i.e., removing a subset of model weights, has become a prominent approach to reducing t

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

A Linguistics-Aware LLM Watermarking via Syntactic Predictability

arXiv:2510.13829v2 Announce Type: replace-cross Abstract: As large language models (LLMs) continue to advance rapidly, reliable governance tools have become cri

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

XModBench: Benchmarking Cross-Modal Capabilities and Consistency in Omni-Language Models

arXiv:2510.15148v2 Announce Type: replace-cross Abstract: Omni-modal large language models (OLLMs) aim to unify audio, vision, and text understanding within a s

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

LLMs Judge Themselves: A Game-Theoretic Framework for Human-Aligned Evaluation

arXiv:2510.15746v2 Announce Type: replace-cross Abstract: Ideal or real - that is the question. In this work, we explore whether principles from game theory can

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

A Model Can Help Itself: Reward-Free Self-Training for LLM Reasoning

arXiv:2510.18814v2 Announce Type: replace-cross Abstract: Can language models improve their reasoning performance without external rewards, using only their own

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

ATLAS: A Layered Constraint-Guided Framework for Structured Artifact Generation in LLM-Assisted MDE

arXiv:2510.25890v3 Announce Type: replace-cross Abstract: ATLAS is a constraint-guided generation framework for structured engineering artifacts whose outputs m

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

HatePrototypes: Interpretable and Transferable Representations for Implicit and Explicit Hate Speech Detection

arXiv:2511.06391v3 Announce Type: replace-cross Abstract: Optimization of offensive content moderation models for different types of hateful messages is typical

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

When AI Agents Collude Online: Financial Fraud Risks by Collaborative LLM Agents on Social Platforms

arXiv:2511.06448v2 Announce Type: replace-cross Abstract: In this work, we study the risks of collective financial fraud in large-scale multi-agent systems powe

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

FAST-CAD: A Fairness-Aware Framework for Non-Contact Stroke Diagnosis

arXiv:2511.08887v4 Announce Type: replace-cross Abstract: Stroke is an acute cerebrovascular disease, and timely diagnosis significantly improves patient surviv

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Exploration vs. Fixation: Scaffolding Divergent and Convergent Thinking for Human-AI Co-Creation with Generative Models

arXiv:2512.18388v2 Announce Type: replace-cross Abstract: Generative AI has democratized content creation, but popular chatbot-based interfaces often prioritize

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Parallel Universes, Parallel Languages: A Comprehensive Study on LLM-based Multilingual Counterfactual Example Generation

arXiv:2601.00263v2 Announce Type: replace-cross Abstract: Counterfactuals refer to minimally edited inputs that cause a model's prediction to change, serving as

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Path Integral Solution for Dissipative Generative Dynamics

arXiv:2601.00860v2 Announce Type: replace-cross Abstract: Can purely mechanical systems generate intelligent language? We prove that dissipative quantum dynamic

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Bridging the Semantic Gap for Categorical Data Clustering via Large Language Models

arXiv:2601.01162v2 Announce Type: replace-cross Abstract: Categorical data are prevalent in domains such as healthcare, marketing, and bioinformatics, where clu

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Projected Autoregression: Autoregressive Language Generation in Continuous State Space

arXiv:2601.04854v3 Announce Type: replace-cross Abstract: Standard autoregressive language models generate text by repeatedly selecting a discrete next token, c

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Vision-as-Inverse-Graphics Agent via Interleaved Multimodal Reasoning

arXiv:2601.11109v3 Announce Type: replace-cross Abstract: Vision-as-inverse-graphics, the concept of reconstructing images into editable programs, remains chall

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Self-Improving Pretraining: using post-trained models to pretrain better models

arXiv:2601.21343v3 Announce Type: replace-cross Abstract: Large language models are classically trained in stages: pretraining on raw text followed by post-trai

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Predicting Intermittent Job Failure Categories for Diagnosis Using Few-Shot Fine-Tuned Language Models

arXiv:2601.22264v2 Announce Type: replace-cross Abstract: In principle, Continuous Integration (CI) pipeline failures provide valuable feedback to developers on

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

InfoTok: Information-Theoretic Regularization for Capacity-Constrained Shared Visual Tokenization in Unified MLLMs

arXiv:2602.01554v2 Announce Type: replace-cross Abstract: Unified multimodal large language models (MLLMs) aim to unify image understanding and image generation

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

RASA: Routing-Aware Safety Alignment for Mixture-of-Experts Models

arXiv:2602.04448v2 Announce Type: replace-cross Abstract: Mixture-of-Experts (MoE) language models introduce unique challenges for safety alignment due to their

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

LLMs Encode Their Failures: Predicting Success from Pre-Generation Activations

arXiv:2602.09924v3 Announce Type: replace-cross Abstract: Running LLMs with extended reasoning on every problem is expensive, but determining which inputs actua

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

WIMLE: Uncertainty-Aware World Models with IMLE for Sample-Efficient Continuous Control

arXiv:2602.14351v2 Announce Type: replace-cross Abstract: Model-based reinforcement learning promises strong sample efficiency but often underperforms in practi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Explainable Token-level Noise Filtering for LLM Fine-tuning Datasets

arXiv:2602.14536v3 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have seen remarkable advancements, achieving state-of-the-art results in

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Flow Map Language Models: One-step Language Modeling via Continuous Denoising

arXiv:2602.16813v2 Announce Type: replace-cross Abstract: Language models based on discrete diffusion have attracted widespread interest for their potential to

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Autorubric: Unifying Rubric-based LLM Evaluation

arXiv:2603.00077v2 Announce Type: replace-cross Abstract: Techniques for reliable rubric-based LLM evaluation -- ensemble judging, bias mitigation, few-shot cal