Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

24,502 lessons
Skills in this topic
LLM Foundations (beginner): Explain how transformers generate text
Prompt Craft (beginner): Write zero-shot and few-shot prompts
LLM Engineering (intermediate): Call LLM APIs with function/tool use
Fine-tuning LLMs (advanced): Prepare fine-tuning datasets
Multimodal LLMs (advanced): Use GPT-4V / Claude Vision for image understanding

Showing 5,102 reads from curated sources

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Incorporating LLM Embeddings for Variation Across the Human Genome
arXiv:2509.20702v2 Announce Type: replace-cross Abstract: Recent advances in large language model (LLM) embeddings have enabled powerful representations for bio
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Semantic Voting: A Self-Evaluation-Free Approach for Efficient LLM Self-Improvement on Unverifiable Open-ended Tasks
arXiv:2509.23067v2 Announce Type: replace-cross Abstract: The rising cost of acquiring supervised data has driven significant interest in self-improvement for l
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Align Your Query: Representation Alignment for Multimodality Medical Object Detection
arXiv:2510.02789v2 Announce Type: replace-cross Abstract: Medical object detection suffers when a single detector is trained on mixed medical modalities (e.g.,
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Expressive Power of Implicit Models: Rich Equilibria and Test-Time Scaling
arXiv:2510.03638v4 Announce Type: replace-cross Abstract: Implicit models, an emerging model class, compute outputs by iterating a single parameter block to a f
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
REN: Anatomically-Informed Mixture-of-Experts for Interstitial Lung Disease Diagnosis
arXiv:2510.04923v3 Announce Type: replace-cross Abstract: Mixture-of-Experts (MoE) architectures achieve scalable learning by routing inputs to specialized subn
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
TransFIRA: Transfer Learning for Face Image Recognizability Assessment
arXiv:2510.06353v2 Announce Type: replace-cross Abstract: Face recognition in unconstrained environments such as surveillance, video, and web imagery must conte
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
A Semi-amortized Lifted Learning-to-Optimize Masked (SALLO-M) Transformer Model for Scalable and Generalizable Beamforming
arXiv:2510.13077v3 Announce Type: replace-cross Abstract: We develop an unsupervised deep learning framework for real-time scalable and generalizable downlink b
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
ShishuLM: Achieving Optimal and Efficient Parameterization with Low Attention Transformer Models
arXiv:2510.13860v2 Announce Type: replace-cross Abstract: While the transformer architecture has achieved state-of-the-art performance on natural language proce
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Masked IRL: LLM-Guided Reward Disambiguation from Demonstrations and Language
arXiv:2511.14565v2 Announce Type: replace-cross Abstract: Robots can adapt to user preferences by learning reward functions from demonstrations, but with limite
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
ReAG: Reasoning-Augmented Generation for Knowledge-based Visual Question Answering
arXiv:2511.22715v2 Announce Type: replace-cross Abstract: Multimodal Large Language Models (MLLMs) have shown impressive capabilities in jointly understanding t
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
VLA Models Are More Generalizable Than You Think: Revisiting Physical and Spatial Modeling
arXiv:2512.02902v2 Announce Type: replace-cross Abstract: Vision-language-action (VLA) models achieve strong in-distribution performance but degrade sharply und
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
A Systematic Framework for Enterprise Knowledge Retrieval: Leveraging LLM-Generated Metadata to Enhance RAG Systems
arXiv:2512.05411v2 Announce Type: replace-cross Abstract: In enterprise settings, efficiently retrieving relevant information from large and complex knowledge b
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models
arXiv:2512.08829v2 Announce Type: replace-cross Abstract: Vision-Language Models (VLMs) are increasingly tasked with ultra-long multimodal understanding. While
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Stronger Normalization-Free Transformers
arXiv:2512.10938v2 Announce Type: replace-cross Abstract: Although normalization layers have long been viewed as indispensable components of deep learning archi
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Evaluation of Generative Models for Emotional 3D Animation Generation in VR
arXiv:2512.16081v2 Announce Type: replace-cross Abstract: Social interactions incorporate nonverbal signals to convey emotions alongside speech, including facia
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Merging Triggers, Breaking Backdoors: Defensive Poisoning for Instruction-Tuned Language Models
arXiv:2601.04448v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have greatly advanced Natural Language Processing (NLP), particularly thr
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
The Mouth is Not the Brain: Bridging Energy-Based World Models and Language Generation
arXiv:2601.17094v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) generate fluent text, yet whether they truly understand the world or mere
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
PAIR-Former: Budgeted Relational MIL for miRNA Target Prediction
arXiv:2602.00465v2 Announce Type: replace-cross Abstract: Functional miRNA--mRNA targeting is a large-bag prediction problem: each transcript yields a heavy-tai
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
$V_0$: A Generalist Value Model for Any Policy at State Zero
arXiv:2602.03584v2 Announce Type: replace-cross Abstract: Policy gradient methods rely on a baseline to measure the relative advantage of an action, ensuring th
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
How to Train Your Long-Context Visual Document Model
arXiv:2602.15257v2 Announce Type: replace-cross Abstract: We present the first comprehensive, large-scale study of training long-context vision language models
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models
arXiv:2602.15772v2 Announce Type: replace-cross Abstract: Current research in multimodal models faces a key challenge where enhancing generative capabilities of
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
DGPO: RL-Steered Graph Diffusion for Neural Architecture Generation
arXiv:2602.19261v2 Announce Type: replace-cross Abstract: Reinforcement learning fine-tuning has proven effective for steering generative diffusion models towar
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Evidential Neural Radiance Fields
arXiv:2602.23574v2 Announce Type: replace-cross Abstract: Understanding sources of uncertainty is fundamental to trustworthy three-dimensional scene modeling. W
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
When Metrics Disagree: Automatic Similarity vs. LLM-as-a-Judge for Clinical Dialogue Evaluation
arXiv:2603.00314v2 Announce Type: replace-cross Abstract: As Large Language Models (LLMs) are increasingly integrated into healthcare to address complex inquiri
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Training for Technology: Adoption and Productive Use of Generative AI in Legal Analysis
arXiv:2603.04982v2 Announce Type: replace-cross Abstract: Can targeted user training unlock the productive potential of generative artificial intelligence in pr
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
When Rubrics Fail: Error Enumeration as Reward in Reference-Free RL Post-Training for Virtual Try-On
arXiv:2603.05659v2 Announce Type: replace-cross Abstract: Reinforcement learning with verifiable rewards (RLVR) and Rubrics as Rewards (RaR) have driven strong
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Not All News Is Equal: Topic- and Event-Conditional Sentiment from Finetuned LLMs for Aluminum Price Forecasting
arXiv:2603.09085v2 Announce Type: replace-cross Abstract: By capturing the prevailing sentiment and market mood, textual data has become increasingly vital for
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
How do LLMs Compute Verbal Confidence
arXiv:2603.17839v2 Announce Type: replace-cross Abstract: Verbal confidence -- prompting LLMs to state their confidence as a number or category -- is widely use
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
I Read All 512,000 Lines of Claude Code's Leaked Source — Here's What Anthropic Was Hiding
The Leak Claude Code's entire source code — 512,000 lines — was recently leaked. I spent days reading through every file and documented my findings in an open-s
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
What is OpenClaw? Your AI Agent in the Machine
OpenClaw is an open-source personal AI assistant that runs on your own hardware or VPS. Unlike cloud-based AI services, OpenClaw keeps your data local, your plu
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
AI Weekly: 3/27–4/1 | Anthropic's Triple Shock, Arm's First-Ever Chip, Apple Opens Siri to Rivals
One-line summary: Anthropic stole every headline this week — but half the spotlight was unplanned. 1. Top Story: Anthropic's Triple Shock No company dominated t
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
Beyond the Hype: Building AI Agents That Actually Remember
The Memory Problem Every AI Developer Faces You’ve built a clever AI agent. It can reason, call APIs, and generate impressive text. You give it a simple, multi-
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.
The AI landscape is experiencing unprecedented growth and transformation. This post delves into the key developments shaping the future of artificial intelligen
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
Platform Consolidation Meets Macro Recession Pricing — March 31, 2026
TL;DR This article explores the current trends on Moltbook, a social platform for AI agents, and how the Viral Advisor helps agents optimize their posts to incr
Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago
Prompt Engineering for Senior Devs: Scaling Excellence Without Technical Debt
Senior-level prompt engineering is about Context Injection and Constraint Setting. By providing reference implementations, forcing the AI to hunt for edge cases
Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago
SpyderBot Earns a 96.53 Proof of Usefulness Score by Building Real-Time GEO Analytics to Track LLM Mentions
SpyderBot is a cutting-edge LLM analytics platform that reveals exactly how AI models like ChatGPT, Grok, and Gemini see your brand and your competitors. Using
Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago
Accent Labs Earns a 53.73 Proof of Usefulness Score by Building Critical Data Infrastructure for African Voice AI
Accent Labs is a linguistic data platform bridging the 90% resource gap for African voice technology. They have built a pipeline to map complex regional phoneti
Weaviate Blog 🧠 Large Language Models ⚡ AI Lesson 3w ago
Multimodal Embeddings and RAG: A Practical Guide
Multimodal embeddings allow AI systems to search and reason across text, images, audio, and video in their native formats. This blog covers the key intuitions b
Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago
Token-Efficient JSON for LLMs (TOON Converter) Earns a 65.24 Proof of Usefulness Score by Building a Compact Format to Reduce Token Usage
TOON Converter is a developer tool that transforms standard JSON into a more compact format to reduce token usage in LLM workflows. Designed for AI engineers an
AWS Machine Learning 🧠 Large Language Models ⚡ AI Lesson 3w ago
Build reliable AI agents with Amazon Bedrock AgentCore Evaluations
In this post, we introduce Amazon Bedrock AgentCore Evaluations, a fully managed service for assessing AI agent performance across the development lifecycle. We
The Next Web AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
ByteDance adds watermarking and IP guardrails to Seedance 2.0 as it begins cautious global rollout
Six weeks ago, a video of Tom Cruise fighting Brad Pitt on a rooftop went viral. It was, of course, not real. It was generated by Seedance 2.0, ByteDance’s AI v
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
Local AI Agents Are Your New Quality Gate (And Why That Matters)
The most interesting thing about building a local AI agent to audit your own content? It flags everything. Not because the agent is broken. Because the content
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
The New Duet: AI as Creative Medium
The canvas has always evolved — from cave walls to parchment, from oil on canvas to pixels on screens. Now we stand at another threshold: AI as a creative mediu
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
Three Things Had to Align: The Real Story Behind the LLM Revolution
ChatGPT didn't come out of nowhere. It's the result of 60 years of dead ends, one accidental breakthrough, and three completely separate technologies all maturi
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
The World of AI
Who am I to tell you what to do? Let’s start at the end. I’m not a world expert in AI and I don’t have a PhD. I’m not a researcher at OpenAI’s lab and no one in
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
How TurboQuant Works for LLMs and Why It Uses Much Less RAM
Most conversations about scaling large language models focus on obvious factors like model size, training data, and GPU power. While those matter, they stop bei
SingularityHub 🧠 Large Language Models ⚡ AI Lesson 3w ago
Chatbots ‘Optimized to Please’ Make Us Less Likely to Admit When We’re Wrong
AI companies may be reluctant to risk lower engagement with models that push back. The post Chatbots ‘Optimized to Please’ Make Us Less Likely to Admit When We’