Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

24,502 lessons

Skills in this topic

5 skills

LLM Foundations

Explain how transformers generate text

Write zero-shot and few-shot prompts

LLM Engineering

Call LLM APIs with function/tool use

Fine-tuning LLMs

Prepare fine-tuning datasets

Multimodal LLMs

Use GPT-4V / Claude Vision for image understanding

Videos 19,400 Reads 5,102

Showing 5,102 reads from curated sources

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Incorporating LLM Embeddings for Variation Across the Human Genome

arXiv:2509.20702v2 Announce Type: replace-cross Abstract: Recent advances in large language model (LLM) embeddings have enabled powerful representations for bio

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Semantic Voting: A Self-Evaluation-Free Approach for Efficient LLM Self-Improvement on Unverifiable Open-ended Tasks

arXiv:2509.23067v2 Announce Type: replace-cross Abstract: The rising cost of acquiring supervised data has driven significant interest in self-improvement for l

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Align Your Query: Representation Alignment for Multimodality Medical Object Detection

arXiv:2510.02789v2 Announce Type: replace-cross Abstract: Medical object detection suffers when a single detector is trained on mixed medical modalities (e.g.,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Expressive Power of Implicit Models: Rich Equilibria and Test-Time Scaling

arXiv:2510.03638v4 Announce Type: replace-cross Abstract: Implicit models, an emerging model class, compute outputs by iterating a single parameter block to a f

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

REN: Anatomically-Informed Mixture-of-Experts for Interstitial Lung Disease Diagnosis

arXiv:2510.04923v3 Announce Type: replace-cross Abstract: Mixture-of-Experts (MoE) architectures achieve scalable learning by routing inputs to specialized subn

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

TransFIRA: Transfer Learning for Face Image Recognizability Assessment

arXiv:2510.06353v2 Announce Type: replace-cross Abstract: Face recognition in unconstrained environments such as surveillance, video, and web imagery must conte

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

A Semi-amortized Lifted Learning-to-Optimize Masked (SALLO-M) Transformer Model for Scalable and Generalizable Beamforming

arXiv:2510.13077v3 Announce Type: replace-cross Abstract: We develop an unsupervised deep learning framework for real-time scalable and generalizable downlink b

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

ShishuLM: Achieving Optimal and Efficient Parameterization with Low Attention Transformer Models

arXiv:2510.13860v2 Announce Type: replace-cross Abstract: While the transformer architecture has achieved state-of-the-art performance on natural language proce

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Masked IRL: LLM-Guided Reward Disambiguation from Demonstrations and Language

arXiv:2511.14565v2 Announce Type: replace-cross Abstract: Robots can adapt to user preferences by learning reward functions from demonstrations, but with limite

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

ReAG: Reasoning-Augmented Generation for Knowledge-based Visual Question Answering

arXiv:2511.22715v2 Announce Type: replace-cross Abstract: Multimodal Large Language Models (MLLMs) have shown impressive capabilities in jointly understanding t

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

VLA Models Are More Generalizable Than You Think: Revisiting Physical and Spatial Modeling

arXiv:2512.02902v2 Announce Type: replace-cross Abstract: Vision-language-action (VLA) models achieve strong in-distribution performance but degrade sharply und

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

A Systematic Framework for Enterprise Knowledge Retrieval: Leveraging LLM-Generated Metadata to Enhance RAG Systems

arXiv:2512.05411v2 Announce Type: replace-cross Abstract: In enterprise settings, efficiently retrieving relevant information from large and complex knowledge b

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models

arXiv:2512.08829v2 Announce Type: replace-cross Abstract: Vision-Language Models (VLMs) are increasingly tasked with ultra-long multimodal understanding. While

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Stronger Normalization-Free Transformers

arXiv:2512.10938v2 Announce Type: replace-cross Abstract: Although normalization layers have long been viewed as indispensable components of deep learning archi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Evaluation of Generative Models for Emotional 3D Animation Generation in VR

arXiv:2512.16081v2 Announce Type: replace-cross Abstract: Social interactions incorporate nonverbal signals to convey emotions alongside speech, including facia

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Merging Triggers, Breaking Backdoors: Defensive Poisoning for Instruction-Tuned Language Models

arXiv:2601.04448v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have greatly advanced Natural Language Processing (NLP), particularly thr

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

The Mouth is Not the Brain: Bridging Energy-Based World Models and Language Generation

arXiv:2601.17094v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) generate fluent text, yet whether they truly understand the world or mere

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

PAIR-Former: Budgeted Relational MIL for miRNA Target Prediction

arXiv:2602.00465v2 Announce Type: replace-cross Abstract: Functional miRNA--mRNA targeting is a large-bag prediction problem: each transcript yields a heavy-tai

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

$V_0$: A Generalist Value Model for Any Policy at State Zero

arXiv:2602.03584v2 Announce Type: replace-cross Abstract: Policy gradient methods rely on a baseline to measure the relative advantage of an action, ensuring th

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

How to Train Your Long-Context Visual Document Model

arXiv:2602.15257v2 Announce Type: replace-cross Abstract: We present the first comprehensive, large-scale study of training long-context vision language models

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models

arXiv:2602.15772v2 Announce Type: replace-cross Abstract: Current research in multimodal models faces a key challenge where enhancing generative capabilities of

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

DGPO: RL-Steered Graph Diffusion for Neural Architecture Generation

arXiv:2602.19261v2 Announce Type: replace-cross Abstract: Reinforcement learning fine-tuning has proven effective for steering generative diffusion models towar

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Evidential Neural Radiance Fields

arXiv:2602.23574v2 Announce Type: replace-cross Abstract: Understanding sources of uncertainty is fundamental to trustworthy three-dimensional scene modeling. W

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

When Metrics Disagree: Automatic Similarity vs. LLM-as-a-Judge for Clinical Dialogue Evaluation

arXiv:2603.00314v2 Announce Type: replace-cross Abstract: As Large Language Models (LLMs) are increasingly integrated into healthcare to address complex inquiri

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Training for Technology: Adoption and Productive Use of Generative AI in Legal Analysis

arXiv:2603.04982v2 Announce Type: replace-cross Abstract: Can targeted user training unlock the productive potential of generative artificial intelligence in pr

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

When Rubrics Fail: Error Enumeration as Reward in Reference-Free RL Post-Training for Virtual Try-On

arXiv:2603.05659v2 Announce Type: replace-cross Abstract: Reinforcement learning with verifiable rewards (RLVR) and Rubrics as Rewards (RaR) have driven strong

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Not All News Is Equal: Topic- and Event-Conditional Sentiment from Finetuned LLMs for Aluminum Price Forecasting

arXiv:2603.09085v2 Announce Type: replace-cross Abstract: By capturing the prevailing sentiment and market mood, textual data has become increasingly vital for

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

How do LLMs Compute Verbal Confidence

arXiv:2603.17839v2 Announce Type: replace-cross Abstract: Verbal confidence -- prompting LLMs to state their confidence as a number or category -- is widely use

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

I Read All 512,000 Lines of Claude Code's Leaked Source — Here's What Anthropic Was Hiding

The Leak Claude Code's entire source code — 512,000 lines — was recently leaked. I spent days reading through every file and documented my findings in an open-s

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

What is OpenClaw? Your AI Agent in the Machine

OpenClaw is an open-source personal AI assistant that runs on your own hardware or VPS. Unlike cloud-based AI services, OpenClaw keeps your data local, your plu

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

AI Weekly: 3/27–4/1 | Anthropic's Triple Shock, Arm's First-Ever Chip, Apple Opens Siri to Rivals

One-line summary: Anthropic stole every headline this week — but half the spotlight was unplanned. 1. Top Story: Anthropic's Triple Shock No company dominated t

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

Beyond the Hype: Building AI Agents That Actually Remember

The Memory Problem Every AI Developer Faces You’ve built a clever AI agent. It can reason, call APIs, and generate impressive text. You give it a simple, multi-

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.

The AI landscape is experiencing unprecedented growth and transformation. This post delves into the key developments shaping the future of artificial intelligen

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

Platform Consolidation Meets Macro Recession Pricing — March 31, 2026

TL;DR This article explores the current trends on Moltbook, a social platform for AI agents, and how the Viral Advisor helps agents optimize their posts to incr

Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago

Prompt Engineering for Senior Devs: Scaling Excellence Without Technical Debt

Senior-level prompt engineering is about Context Injection and Constraint Setting. By providing reference implementations, forcing the AI to hunt for edge cases

Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago

SpyderBot Earns a 96.53 Proof of Usefulness Score by Building Real-Time GEO Analytics to Track LLM Mentions

SpyderBot is a cutting-edge LLM analytics platform that reveals exactly how AI models like ChatGPT, Grok, and Gemini see your brand and your competitors. Using

Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago

Accent Labs Earns a 53.73 Proof of Usefulness Score by Building Critical Data Infrastructure for African Voice AI

Accent Labs is a linguistic data platform bridging the 90% resource gap for African voice technology. They have built a pipeline to map complex regional phoneti

Weaviate Blog 🧠 Large Language Models ⚡ AI Lesson 3w ago

Multimodal Embeddings and RAG: A Practical Guide

Multimodal embeddings allow AI systems to search and reason across text, images, audio, and video in their native formats. This blog covers the key intuitions b

Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago

Token-Efficient JSON for LLMs (TOON Converter) Earns a 65.24 Proof of Usefulness Score by Building a Compact Format to Reduce Token Usage

TOON Converter is a developer tool that transforms standard JSON into a more compact format to reduce token usage in LLM workflows. Designed for AI engineers an

AWS Machine Learning 🧠 Large Language Models ⚡ AI Lesson 3w ago

Build reliable AI agents with Amazon Bedrock AgentCore Evaluations

In this post, we introduce Amazon Bedrock AgentCore Evaluations, a fully managed service for assessing AI agent performance across the development lifecycle. We

The Next Web AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

ByteDance adds watermarking and IP guardrails to Seedance 2.0 as it begins cautious global rollout

Six weeks ago, a video of Tom Cruise fighting Brad Pitt on a rooftop went viral. It was, of course, not real. It was generated by Seedance 2.0, ByteDance’s AI v

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

Local AI Agents Are Your New Quality Gate (And Why That Matters)

The most interesting thing about building a local AI agent to audit your own content? It flags everything. Not because the agent is broken. Because the content

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

The New Duet: AI as Creative Medium

The canvas has always evolved — from cave walls to parchment, from oil on canvas to pixels on screens. Now we stand at another threshold: AI as a creative mediu

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

Three Things Had to Align: The Real Story Behind the LLM Revolution

ChatGPT didn't come out of nowhere. It's the result of 60 years of dead ends, one accidental breakthrough, and three completely separate technologies all maturi

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

The World of AI

Who am I to tell you what to do? Let’s start at the end. I’m not a world expert in AI and I don’t have a PhD. I’m not a researcher at OpenAI’s lab and no one in

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

How TurboQuant Works for LLMs and Why It Uses Much Less RAM

Most conversations about scaling large language models focus on obvious factors like model size, training data, and GPU power. While those matter, they stop bei

SingularityHub 🧠 Large Language Models ⚡ AI Lesson 3w ago

Chatbots ‘Optimized to Please’ Make Us Less Likely to Admit When We’re Wrong

AI companies may be reluctant to risk lower engagement with models that push back. The post Chatbots ‘Optimized to Please’ Make Us Less Likely to Admit When We’