Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

24,534 lessons

Skills in this topic

5 skills

LLM Foundations

Explain how transformers generate text

Write zero-shot and few-shot prompts

LLM Engineering

Call LLM APIs with function/tool use

Fine-tuning LLMs

Prepare fine-tuning datasets

Multimodal LLMs

Use GPT-4V / Claude Vision for image understanding

Videos 19,407 Reads 5,127

Showing 5,127 reads from curated sources

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Does Tone Change the Answer? Evaluating Prompt Politeness Effects on Modern LLMs: GPT, Gemini, and LLaMA

arXiv:2512.12812v2. Abstract: Prompt engineering has emerged as a critical factor influencing large language model (LLM) performance

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Measuring all the noises of LLM Evals

arXiv:2512.21326v2. Abstract: Separating signal from noise is central to experiments. Applying well-established statistical methods

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

JMedEthicBench: A Multi-Turn Conversational Benchmark for Evaluating Medical Safety in Japanese Large Language Models

arXiv:2601.01627v2. Abstract: As Large Language Models (LLMs) are increasingly deployed in healthcare field, it becomes essential to

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Symphonym: Universal Phonetic Embeddings for Cross-Script Name Matching

arXiv:2601.06932v4. Abstract: Matching place names across writing systems is a persistent obstacle to the integration of multilingua

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Sparse-RL: Breaking the Memory Wall in LLM Reinforcement Learning via Stable Sparse Rollouts

arXiv:2601.10079v2. Abstract: Reinforcement Learning (RL) has become essential for eliciting complex reasoning capabilities in Large

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

LLMs versus the Halting Problem: Revisiting Program Termination Prediction

arXiv:2601.18987v4. Abstract: Determining whether a program terminates is a central problem in computer science. Turing's foundation

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Does My Chatbot Have an Agenda? Understanding Human and AI Agency in Human-Human-like Chatbot Interaction

arXiv:2601.22452v2. Abstract: As AI chatbots shift from tools to companions, critical questions arise: who controls the conversation

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

TextBFGS: A Case-Based Reasoning Approach to Code Optimization via Error-Operator Retrieval

arXiv:2602.00059v2. Abstract: Iterative code generation with Large Language Models (LLMs) can be viewed as an optimization process g

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Can Small Language Models Handle Context-Summarized Multi-Turn Customer-Service QA? A Synthetic Data-Driven Comparative Evaluation

arXiv:2602.00665v2. Abstract: Customer-service question answering (QA) systems increasingly rely on conversational language understa

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Unveiling Implicit Advantage Symmetry: Why GRPO Struggles with Exploration and Difficulty Adaptation

arXiv:2602.05548v3. Abstract: Reinforcement Learning with Verifiable Rewards (RLVR), particularly GRPO, has become the standard for

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

A Theoretical Analysis of Test-Driven LLM Code Generation

arXiv:2602.06098v2. Abstract: Coding assistants are increasingly utilized in test-driven software development, yet the theoretical m

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

CLEAR: A Knowledge-Centric Vessel Trajectory Analysis Platform

arXiv:2602.08482v2. Abstract: Vessel trajectory data from the Automatic Identification System (AIS) is used widely in maritime analy

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

CoPE-VideoLM: Leveraging Codec Primitives For Efficient Video Language Modeling

arXiv:2602.13191v2. Abstract: Video Language Models (VideoLMs) enable AI systems to understand temporal dynamics in videos. To fit w

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

MALLVI: A Multi-Agent Framework for Integrated Generalized Robotics Manipulation

arXiv:2602.16898v4. Abstract: Task planning for robotic manipulation with large language models (LLMs) is an emerging area. Prior ap

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

CCCaption: Dual-Reward Reinforcement Learning for Complete and Correct Image Captioning

arXiv:2602.21655v2. Abstract: Image captioning remains a fundamental task for vision language understanding, yet ground-truth superv

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Efficient Encoder-Free Fourier-based 3D Large Multimodal Model

arXiv:2602.23153v2. Abstract: Large Multimodal Models (LMMs) that process 3D data typically rely on heavy, pre-trained visual encode

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

AG-VAS: Anchor-Guided Zero-Shot Visual Anomaly Segmentation with Large Multimodal Models

arXiv:2603.01305v2. Abstract: Large multimodal models (LMMs) exhibit strong task generalization capabilities, offering new opportuni

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

MetaState: Persistent Working Memory Enhances Reasoning in Discrete Diffusion Language Models

arXiv:2603.01331v2. Abstract: Discrete diffusion language models (dLLMs) generate text by iteratively denoising a masked sequence. H

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Towards Privacy-Preserving LLM Inference via Covariant Obfuscation (Technical Report)

arXiv:2603.01499v2. Abstract: The rapid development of large language models (LLMs) has driven the widespread adoption of cloud-base

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Thin Keys, Full Values: Reducing KV Cache via Low-Dimensional Attention Selection

arXiv:2603.04427v4. Abstract: Standard Transformer attention uses identical dimensionality for queries, keys, and values, yet these

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Nwāchā Munā: A Devanagari Speech Corpus and Proximal Transfer Benchmark for Nepal Bhasha ASR

arXiv:2603.07554v2. Abstract: Nepal Bhasha (Newari), an endangered language of the Kathmandu Valley, remains digitally marginalized

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Distributional Regression with Tabular Foundation Models: Evaluating Probabilistic Predictions via Proper Scoring Rules

arXiv:2603.08206v4. Abstract: Tabular foundation models such as TabPFN and TabICL already produce full predictive distributions, yet

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Understanding the Use of a Large Language Model-Powered Guide to Make Virtual Reality Accessible for Blind and Low Vision People

arXiv:2603.09964v2. Abstract: As social virtual reality (VR) grows more popular, addressing accessibility for blind and low vision (

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

GhanaNLP Parallel Corpora: Comprehensive Multilingual Resources for Low-Resource Ghanaian Languages

arXiv:2603.13793v2. Abstract: Low resource languages present unique challenges for natural language processing due to the limited av

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Deconfounded Lifelong Learning for Autonomous Driving via Dynamic Knowledge Spaces

arXiv:2603.14354v2. Abstract: End-to-End autonomous driving (E2E-AD) systems face challenges in lifelong learning, including catastr

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

EngGPT2: Sovereign, Efficient and Open Intelligence

arXiv:2603.16430v3. Abstract: EngGPT2-16B-A3B is the latest iteration of Engineering Group's Italian LLM and it's built to be a Sove

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

SpecMoE: Spectral Mixture-of-Experts Foundation Model for Cross-Species EEG Decoding

arXiv:2603.16739v2. Abstract: Decoding the orchestration of neural activity in electroencephalography (EEG) signals is a central cha

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Scaling Sim-to-Real Reinforcement Learning for Robot VLAs with Generative 3D Worlds

arXiv:2603.18532v2. Abstract: The strong performance of large vision-language models (VLMs) trained with reinforcement learning (RL)

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

SmaAT-QMix-UNet: A Parameter-Efficient Vector-Quantized UNet for Precipitation Nowcasting

arXiv:2603.21879v2. Abstract: Weather forecasting supports critical socioeconomic activities and complements environmental protectio

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

I finally stopped wasting tokens with Universal Claude.md

Key Takeaways: Universal Claude.md can cut token use by up to 63%, which means you actually spend way less money using LLMs. Developers are fed up with prompt ha

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

Dev quietly rebels against Claude’s polite padding in AI outputs

Key Takeaways: Devs have been quietly frustrated with Claude’s overly polite, wordy answers for a while. Trimming Claude’s output isn’t just about saving tokens,

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

Universal Claude.md lets devs hack verbosity but risks breaking Claude

Key Takeaways: Devs are using Universal Claude.md to cut down Claude's wordiness and save on tokens, which means lower API bills. Cutting Claude’s longer answers

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

Open source Claude.md tool just slashed my token costs

Key Takeaways: An open-source tool called Claude.md just helped someone cut their AI token costs by 63%, which is wild. Most LLMs like Claude spit out a ton of u

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.

The AI landscape is experiencing unprecedented growth and transformation. This post delves into the key developments shaping the future of artificial intelligen

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

My AI remembered the wrong thing and broke my build. So I built memory governance.

Six weeks ago I gave my AI assistant a memory. It worked. No more re-explaining the project every session. Bugs got fixed once and stayed fixed. Then it follow

ZDNet 🧠 Large Language Models ⚡ AI Lesson 3w ago

This privacy-first chatbot is taking off - here's why and how to try it

Users are flocking to Duck.ai. Is it a reaction to increasing concerns about AI companies and privacy? Here's what you should know.

Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago

The Crow-9b-heretic Model by Crownelius: Here's What You Need to Know

Crow-9B-HERETIC is a 9-billion-parameter language model built on the Qwen 3.5 architecture and distilled from Claude Opus 4.6. The model excels at reasoning tas

Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago

What Is LMEB? Long-Horizon Memory Embedding Benchmark Explained

The benchmark itself isn't the solution. It's the beginning of a new research direction, one forced by reality rather than chosen by preference. Models that loo

Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago

AI Doesn’t Lie - It Reflects: How Fragmented Signals Distort What LLMs Think Your Company Is

AI systems don’t “understand” your company—they reconstruct it from public signals. When those signals are fragmented, outdated, or inconsistent, AI outputs bec

TechCrunch AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

15% of Americans say they’d be willing to work for an AI boss, according to new poll

According to a Quinnipiac University poll, 15% of Americans say they'd be willing to have a job where their direct supervisor was an AI program that assigned ta

TechCrunch AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

Popular AI gateway startup LiteLLM ditches controversial startup Delve

LiteLLM had obtained two security compliance certifications via Delve and fell victim to some horrific credential-stealing malware last week.

Machine Learning Mastery 🧠 Large Language Models ⚡ AI Lesson 3w ago

From Prompt to Prediction: Understanding Prefill, Decode, and the KV Cache in LLMs

This article is divided into three parts; they are: • How Attention Works During Prefill • The Decode Phase of LLM Inference • KV Cache: How to Make Decode More

Forbes Innovation 🧠 Large Language Models ⚡ AI Lesson 3w ago

Apple Just Released iOS 26.5 For Developers, But 1 Major iPhone Feature Is Missing

Another iPhone update has just reached its first developer beta. There was a chance it would include the first glimpse of the brand-new Siri, but so far there’s

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

Five Hundred Copies of the Same Message in Your Agent's Brain

You send your AI agent a message. The upstream model returns a 429 — rate limited, try again later. Your agent framework dutifully retries. And retries. And ret

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

How to Get Cited within AI Searches

4 core pillars to get cited within AI searches. You must shift your strategy from traditional SEO to Generative Engine Optimization (GEO). AI engines do not read

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

How We Built an AI Layer That Understands an Entire Agency Workspace (Not Just One Module)

We shipped the AI layer for Kobin today — an agency operating system that replaces Slack, Notion, HubSpot, Linear, and Buffer. This is the technical story of ho

The Next Web AI 🧠 Large Language Models ⚡ AI Lesson 3w ago

How AI’s capital explosion signals opportunity but also reveals a critical need for measurable ROI and meaningful impact

The current wave of investment in artificial intelligence reflects one of the largest capital shifts in modern technology, yet questions around financial return