Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

24,698 lessons
Skills in this topic
LLM Foundations (beginner): Explain how transformers generate text
Prompt Craft (beginner): Write zero-shot and few-shot prompts
LLM Engineering (intermediate): Call LLM APIs with function/tool use
Fine-tuning LLMs (advanced): Prepare fine-tuning datasets
Multimodal LLMs (advanced): Use GPT-4V / Claude Vision for image understanding

Showing 5,256 reads from curated sources

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
SemioLLM: Evaluating Large Language Models for Diagnostic Reasoning from Unstructured Clinical Narratives in Epilepsy
arXiv:2407.03004v3 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have been shown to encode clinical knowledge. Many evaluations, however,
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
EventChat: Implementation and user-centric evaluation of a large language model-driven conversational recommender system for exploring leisure events in an SME context
arXiv:2407.04472v4 Announce Type: replace-cross Abstract: Large language models (LLMs) present an enormous evolution in the strategic potential of conversationa
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
We'll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback
arXiv:2504.17180v3 Announce Type: replace-cross Abstract: Current text-to-video (T2V) generation models are increasingly popular due to their ability to produce
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Aleph-Alpha-GermanWeb: Improving German-language LLM pre-training with model-based data curation and synthetic data generation
arXiv:2505.00022v3 Announce Type: replace-cross Abstract: Scaling data quantity is essential for large language models (LLMs), yet recent findings show that dat
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
LLM-Meta-SR: In-Context Learning for Evolving Selection Operators in Symbolic Regression
arXiv:2505.18602v3 Announce Type: replace-cross Abstract: Large language models (LLMs) have revolutionized algorithm development, yet their application in symbo
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
FA-INR: Adaptive Implicit Neural Representations for Interpretable Exploration of Simulation Ensembles
arXiv:2506.06858v3 Announce Type: replace-cross Abstract: Surrogate models are essential for efficient exploration of large-scale ensemble simulations. Implicit
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Denoising the Future: Top-p Distributions for Moving Through Time
arXiv:2506.07578v4 Announce Type: replace-cross Abstract: Inference in dynamic probabilistic models is a complex task involving expensive operations. In particu
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles
arXiv:2506.10848v3 Announce Type: replace-cross Abstract: Diffusion-based language models (dLLMs) have emerged as a promising alternative to traditional autoreg
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation
arXiv:2507.13266v4 Announce Type: replace-cross Abstract: Reinforcement learning (RL) has emerged as a central paradigm for training large language models (LLMs
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Improving Liver Disease Diagnosis with SNNDeep: A Custom Spiking Neural Network Using Diverse Learning Algorithms
arXiv:2508.20125v2 Announce Type: replace-cross Abstract: Purpose: Spiking neural networks (SNNs) have recently gained attention as energy-efficient, biological
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Incorporating LLM Embeddings for Variation Across the Human Genome
arXiv:2509.20702v2 Announce Type: replace-cross Abstract: Recent advances in large language model (LLM) embeddings have enabled powerful representations for bio
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Semantic Voting: A Self-Evaluation-Free Approach for Efficient LLM Self-Improvement on Unverifiable Open-ended Tasks
arXiv:2509.23067v2 Announce Type: replace-cross Abstract: The rising cost of acquiring supervised data has driven significant interest in self-improvement for l
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Align Your Query: Representation Alignment for Multimodality Medical Object Detection
arXiv:2510.02789v2 Announce Type: replace-cross Abstract: Medical object detection suffers when a single detector is trained on mixed medical modalities (e.g.,
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Expressive Power of Implicit Models: Rich Equilibria and Test-Time Scaling
arXiv:2510.03638v4 Announce Type: replace-cross Abstract: Implicit models, an emerging model class, compute outputs by iterating a single parameter block to a f
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
REN: Anatomically-Informed Mixture-of-Experts for Interstitial Lung Disease Diagnosis
arXiv:2510.04923v3 Announce Type: replace-cross Abstract: Mixture-of-Experts (MoE) architectures achieve scalable learning by routing inputs to specialized subn
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
TransFIRA: Transfer Learning for Face Image Recognizability Assessment
arXiv:2510.06353v2 Announce Type: replace-cross Abstract: Face recognition in unconstrained environments such as surveillance, video, and web imagery must conte
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
A Semi-amortized Lifted Learning-to-Optimize Masked (SALLO-M) Transformer Model for Scalable and Generalizable Beamforming
arXiv:2510.13077v3 Announce Type: replace-cross Abstract: We develop an unsupervised deep learning framework for real-time scalable and generalizable downlink b
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
ShishuLM : Achieving Optimal and Efficient Parameterization with Low Attention Transformer Models
arXiv:2510.13860v2 Announce Type: replace-cross Abstract: While the transformer architecture has achieved state-of-the-art performance on natural language proce
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Masked IRL: LLM-Guided Reward Disambiguation from Demonstrations and Language
arXiv:2511.14565v2 Announce Type: replace-cross Abstract: Robots can adapt to user preferences by learning reward functions from demonstrations, but with limite
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
ReAG: Reasoning-Augmented Generation for Knowledge-based Visual Question Answering
arXiv:2511.22715v2 Announce Type: replace-cross Abstract: Multimodal Large Language Models (MLLMs) have shown impressive capabilities in jointly understanding t
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
VLA Models Are More Generalizable Than You Think: Revisiting Physical and Spatial Modeling
arXiv:2512.02902v2 Announce Type: replace-cross Abstract: Vision-language-action (VLA) models achieve strong in-distribution performance but degrade sharply und
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
A Systematic Framework for Enterprise Knowledge Retrieval: Leveraging LLM-Generated Metadata to Enhance RAG Systems
arXiv:2512.05411v2 Announce Type: replace-cross Abstract: In enterprise settings, efficiently retrieving relevant information from large and complex knowledge b
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models
arXiv:2512.08829v2 Announce Type: replace-cross Abstract: Vision-Language Models (VLMs) are increasingly tasked with ultra-long multimodal understanding. While
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Stronger Normalization-Free Transformers
arXiv:2512.10938v2 Announce Type: replace-cross Abstract: Although normalization layers have long been viewed as indispensable components of deep learning archi
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Evaluation of Generative Models for Emotional 3D Animation Generation in VR
arXiv:2512.16081v2 Announce Type: replace-cross Abstract: Social interactions incorporate nonverbal signals to convey emotions alongside speech, including facia
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Merging Triggers, Breaking Backdoors: Defensive Poisoning for Instruction-Tuned Language Models
arXiv:2601.04448v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have greatly advanced Natural Language Processing (NLP), particularly thr
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
The Mouth is Not the Brain: Bridging Energy-Based World Models and Language Generation
arXiv:2601.17094v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) generate fluent text, yet whether they truly understand the world or mere
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
PAIR-Former: Budgeted Relational MIL for miRNA Target Prediction
arXiv:2602.00465v2 Announce Type: replace-cross Abstract: Functional miRNA--mRNA targeting is a large-bag prediction problem: each transcript yields a heavy-tai
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
$V_0$: A Generalist Value Model for Any Policy at State Zero
arXiv:2602.03584v2 Announce Type: replace-cross Abstract: Policy gradient methods rely on a baseline to measure the relative advantage of an action, ensuring th
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
How to Train Your Long-Context Visual Document Model
arXiv:2602.15257v2 Announce Type: replace-cross Abstract: We present the first comprehensive, large-scale study of training long-context vision language models
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models
arXiv:2602.15772v2 Announce Type: replace-cross Abstract: Current research in multimodal models faces a key challenge where enhancing generative capabilities of
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
DGPO: RL-Steered Graph Diffusion for Neural Architecture Generation
arXiv:2602.19261v2 Announce Type: replace-cross Abstract: Reinforcement learning fine-tuning has proven effective for steering generative diffusion models towar
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Evidential Neural Radiance Fields
arXiv:2602.23574v2 Announce Type: replace-cross Abstract: Understanding sources of uncertainty is fundamental to trustworthy three-dimensional scene modeling. W
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
When Metrics Disagree: Automatic Similarity vs. LLM-as-a-Judge for Clinical Dialogue Evaluation
arXiv:2603.00314v2 Announce Type: replace-cross Abstract: As Large Language Models (LLMs) are increasingly integrated into healthcare to address complex inquiri
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Training for Technology: Adoption and Productive Use of Generative AI in Legal Analysis
arXiv:2603.04982v2 Announce Type: replace-cross Abstract: Can targeted user training unlock the productive potential of generative artificial intelligence in pr
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
When Rubrics Fail: Error Enumeration as Reward in Reference-Free RL Post-Training for Virtual Try-On
arXiv:2603.05659v2 Announce Type: replace-cross Abstract: Reinforcement learning with verifiable rewards (RLVR) and Rubrics as Rewards (RaR) have driven strong
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Not All News Is Equal: Topic- and Event-Conditional Sentiment from Finetuned LLMs for Aluminum Price Forecasting
arXiv:2603.09085v2 Announce Type: replace-cross Abstract: By capturing the prevailing sentiment and market mood, textual data has become increasingly vital for
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
How do LLMs Compute Verbal Confidence
arXiv:2603.17839v2 Announce Type: replace-cross Abstract: Verbal confidence -- prompting LLMs to state their confidence as a number or category -- is widely use
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
I Read All 512,000 Lines of Claude Code's Leaked Source — Here's What Anthropic Was Hiding
The Leak Claude Code's entire source code — 512,000 lines — was recently leaked. I spent days reading through every file and documented my findings in an open-s
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
What is OpenClaw? Your AI Agent in the Machine
OpenClaw is an open-source personal AI assistant that runs on your own hardware or VPS. Unlike cloud-based AI services, OpenClaw keeps your data local, your plu
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
AI Weekly: 3/27–4/1 | Anthropic's Triple Shock, Arm's First-Ever Chip, Apple Opens Siri to Rivals
One-line summary: Anthropic stole every headline this week — but half the spotlight was unplanned. 1. Top Story: Anthropic's Triple Shock No company dominated t
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
Beyond the Hype: Building AI Agents That Actually Remember
The Memory Problem Every AI Developer Faces You’ve built a clever AI agent. It can reason, call APIs, and generate impressive text. You give it a simple, multi-
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.
The AI landscape is experiencing unprecedented growth and transformation. This post delves into the key developments shaping the future of artificial intelligen
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
Platform Consolidation Meets Macro Recession Pricing — March 31, 2026
TL;DR This article explores the current trends on Moltbook, a social platform for AI agents, and how the Viral Advisor helps agents optimize their posts to incr
Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago
Prompt Engineering for Senior Devs: Scaling Excellence Without Technical Debt
Senior-level prompt engineering is about Context Injection and Constraint Setting. By providing reference implementations, forcing the AI to hunt for edge cases
Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago
SpyderBot Earns a 96.53 Proof of Usefulness Score by Building Real-Time GEO Analytics to Track LLM Mentions
SpyderBot is a cutting-edge LLM analytics platform that reveals exactly how AI models like ChatGPT, Grok, and Gemini see your brand and your competitors. Using
Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago
Accent Labs Earns a 53.73 Proof of Usefulness Score by Building Critical Data Infrastructure for African Voice AI
Accent Labs is a linguistic data platform bridging the 90% resource gap for African voice technology. They have built a pipeline to map complex regional phoneti
Weaviate Blog 🧠 Large Language Models ⚡ AI Lesson 3w ago
Multimodal Embeddings and RAG: A Practical Guide
Multimodal embeddings allow AI systems to search and reason across text, images, audio, and video in their native formats. This blog covers the key intuitions b