Core AI
Large Language Models
Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI
Skills in this topic
5 skills
LLM Foundations
beginner
Explain how transformers generate text
Prompt Craft
beginner
Write zero-shot and few-shot prompts
LLM Engineering
intermediate
Call LLM APIs with function/tool use
Fine-tuning LLMs
advanced
Prepare fine-tuning datasets
Multimodal LLMs
advanced
Use GPT-4V / Claude Vision for image understanding
Showing 5,127 reads from curated sources
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
Does Tone Change the Answer? Evaluating Prompt Politeness Effects on Modern LLMs: GPT, Gemini, and LLaMA
arXiv:2512.12812v2 Announce Type: replace-cross Abstract: Prompt engineering has emerged as a critical factor influencing large language model (LLM) performance
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
Measuring all the noises of LLM Evals
arXiv:2512.21326v2 Announce Type: replace-cross Abstract: Separating signal from noise is central to experiments. Applying well-established statistical methods
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
JMedEthicBench: A Multi-Turn Conversational Benchmark for Evaluating Medical Safety in Japanese Large Language Models
arXiv:2601.01627v2 Announce Type: replace-cross Abstract: As Large Language Models (LLMs) are increasingly deployed in the healthcare field, it becomes essential to
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
Symphonym: Universal Phonetic Embeddings for Cross-Script Name Matching
arXiv:2601.06932v4 Announce Type: replace-cross Abstract: Matching place names across writing systems is a persistent obstacle to the integration of multilingua
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
Sparse-RL: Breaking the Memory Wall in LLM Reinforcement Learning via Stable Sparse Rollouts
arXiv:2601.10079v2 Announce Type: replace-cross Abstract: Reinforcement Learning (RL) has become essential for eliciting complex reasoning capabilities in Large
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
LLMs versus the Halting Problem: Revisiting Program Termination Prediction
arXiv:2601.18987v4 Announce Type: replace-cross Abstract: Determining whether a program terminates is a central problem in computer science. Turing's foundation
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
Does My Chatbot Have an Agenda? Understanding Human and AI Agency in Human-Human-like Chatbot Interaction
arXiv:2601.22452v2 Announce Type: replace-cross Abstract: As AI chatbots shift from tools to companions, critical questions arise: who controls the conversation
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
TextBFGS: A Case-Based Reasoning Approach to Code Optimization via Error-Operator Retrieval
arXiv:2602.00059v2 Announce Type: replace-cross Abstract: Iterative code generation with Large Language Models (LLMs) can be viewed as an optimization process g
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
Can Small Language Models Handle Context-Summarized Multi-Turn Customer-Service QA? A Synthetic Data-Driven Comparative Evaluation
arXiv:2602.00665v2 Announce Type: replace-cross Abstract: Customer-service question answering (QA) systems increasingly rely on conversational language understa
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
Unveiling Implicit Advantage Symmetry: Why GRPO Struggles with Exploration and Difficulty Adaptation
arXiv:2602.05548v3 Announce Type: replace-cross Abstract: Reinforcement Learning with Verifiable Rewards (RLVR), particularly GRPO, has become the standard for
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
A Theoretical Analysis of Test-Driven LLM Code Generation
arXiv:2602.06098v2 Announce Type: replace-cross Abstract: Coding assistants are increasingly utilized in test-driven software development, yet the theoretical m
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
CLEAR: A Knowledge-Centric Vessel Trajectory Analysis Platform
arXiv:2602.08482v2 Announce Type: replace-cross Abstract: Vessel trajectory data from the Automatic Identification System (AIS) is used widely in maritime analy
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
CoPE-VideoLM: Leveraging Codec Primitives For Efficient Video Language Modeling
arXiv:2602.13191v2 Announce Type: replace-cross Abstract: Video Language Models (VideoLMs) enable AI systems to understand temporal dynamics in videos. To fit w
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
MALLVI: A Multi-Agent Framework for Integrated Generalized Robotics Manipulation
arXiv:2602.16898v4 Announce Type: replace-cross Abstract: Task planning for robotic manipulation with large language models (LLMs) is an emerging area. Prior ap
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
CCCaption: Dual-Reward Reinforcement Learning for Complete and Correct Image Captioning
arXiv:2602.21655v2 Announce Type: replace-cross Abstract: Image captioning remains a fundamental task for vision language understanding, yet ground-truth superv
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
Efficient Encoder-Free Fourier-based 3D Large Multimodal Model
arXiv:2602.23153v2 Announce Type: replace-cross Abstract: Large Multimodal Models (LMMs) that process 3D data typically rely on heavy, pre-trained visual encode
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
AG-VAS: Anchor-Guided Zero-Shot Visual Anomaly Segmentation with Large Multimodal Models
arXiv:2603.01305v2 Announce Type: replace-cross Abstract: Large multimodal models (LMMs) exhibit strong task generalization capabilities, offering new opportuni
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
MetaState: Persistent Working Memory Enhances Reasoning in Discrete Diffusion Language Models
arXiv:2603.01331v2 Announce Type: replace-cross Abstract: Discrete diffusion language models (dLLMs) generate text by iteratively denoising a masked sequence. H
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
Towards Privacy-Preserving LLM Inference via Covariant Obfuscation (Technical Report)
arXiv:2603.01499v2 Announce Type: replace-cross Abstract: The rapid development of large language models (LLMs) has driven the widespread adoption of cloud-base
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
Thin Keys, Full Values: Reducing KV Cache via Low-Dimensional Attention Selection
arXiv:2603.04427v4 Announce Type: replace-cross Abstract: Standard Transformer attention uses identical dimensionality for queries, keys, and values, yet these
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
Nwāchā Munā: A Devanagari Speech Corpus and Proximal Transfer Benchmark for Nepal Bhasha ASR
arXiv:2603.07554v2 Announce Type: replace-cross Abstract: Nepal Bhasha (Newari), an endangered language of the Kathmandu Valley, remains digitally marginalized
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
Distributional Regression with Tabular Foundation Models: Evaluating Probabilistic Predictions via Proper Scoring Rules
arXiv:2603.08206v4 Announce Type: replace-cross Abstract: Tabular foundation models such as TabPFN and TabICL already produce full predictive distributions, yet
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
Understanding the Use of a Large Language Model-Powered Guide to Make Virtual Reality Accessible for Blind and Low Vision People
arXiv:2603.09964v2 Announce Type: replace-cross Abstract: As social virtual reality (VR) grows more popular, addressing accessibility for blind and low vision (
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
GhanaNLP Parallel Corpora: Comprehensive Multilingual Resources for Low-Resource Ghanaian Languages
arXiv:2603.13793v2 Announce Type: replace-cross Abstract: Low resource languages present unique challenges for natural language processing due to the limited av
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
Deconfounded Lifelong Learning for Autonomous Driving via Dynamic Knowledge Spaces
arXiv:2603.14354v2 Announce Type: replace-cross Abstract: End-to-End autonomous driving (E2E-AD) systems face challenges in lifelong learning, including catastr
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
EngGPT2: Sovereign, Efficient and Open Intelligence
arXiv:2603.16430v3 Announce Type: replace-cross Abstract: EngGPT2-16B-A3B is the latest iteration of Engineering Group's Italian LLM and it's built to be a Sove
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
SpecMoE: Spectral Mixture-of-Experts Foundation Model for Cross-Species EEG Decoding
arXiv:2603.16739v2 Announce Type: replace-cross Abstract: Decoding the orchestration of neural activity in electroencephalography (EEG) signals is a central cha
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
Scaling Sim-to-Real Reinforcement Learning for Robot VLAs with Generative 3D Worlds
arXiv:2603.18532v2 Announce Type: replace-cross Abstract: The strong performance of large vision-language models (VLMs) trained with reinforcement learning (RL)
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
3w ago
SmaAT-QMix-UNet: A Parameter-Efficient Vector-Quantized UNet for Precipitation Nowcasting
arXiv:2603.21879v2 Announce Type: replace-cross Abstract: Weather forecasting supports critical socioeconomic activities and complements environmental protectio
Dev.to AI
🧠 Large Language Models
⚡ AI Lesson
3w ago
I finally stopped wasting tokens with Universal Claude.md
Key Takeaways Universal Claude.md can cut token use by up to 63%, which means you actually spend way less money using LLMs. Developers are fed up with prompt ha
Dev.to AI
🧠 Large Language Models
⚡ AI Lesson
3w ago
Dev quietly rebels against Claude’s polite padding in AI outputs
Key Takeaways Devs have been quietly frustrated with Claude’s overly polite, wordy answers for a while. Trimming Claude’s output isn’t just about saving tokens,
Dev.to AI
🧠 Large Language Models
⚡ AI Lesson
3w ago
Universal Claude.md lets devs hack verbosity but risks breaking Claude
Key Takeaways Devs are using Universal Claude.md to cut down Claude's wordiness and save on tokens, which means lower API bills. Cutting Claude’s longer answers
Dev.to AI
🧠 Large Language Models
⚡ AI Lesson
3w ago
Open source Claude.md tool just slashed my token costs
Key Takeaways An open-source tool called Claude.md just helped someone cut their AI token costs by 63%, which is wild. Most LLMs like Claude spit out a ton of u
Dev.to AI
🧠 Large Language Models
⚡ AI Lesson
3w ago
Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.
The AI landscape is experiencing unprecedented growth and transformation. This post delves into the key developments shaping the future of artificial intelligen
Dev.to AI
🧠 Large Language Models
⚡ AI Lesson
3w ago
My AI remembered the wrong thing and broke my build. So I built memory governance.
Six weeks ago I gave my AI assistant a memory. It worked. No more re-explaining the project every session. Bugs got fixed once and stayed fixed. Then it follow
ZDNet
🧠 Large Language Models
⚡ AI Lesson
3w ago
This privacy-first chatbot is taking off - here's why and how to try it
Users are flocking to Duck.ai. Is it a reaction to increasing concerns about AI companies and privacy? Here's what you should know.

Hackernoon
🧠 Large Language Models
⚡ AI Lesson
3w ago
The Crow-9b-heretic Model by Crownelius: Here's What You Need to Know
Crow-9B-HERETIC is a 9-billion-parameter language model built on the Qwen 3.5 architecture and distilled from Claude Opus 4.6. The model excels at reasoning tas

Hackernoon
🧠 Large Language Models
⚡ AI Lesson
3w ago
What Is LMEB? Long-Horizon Memory Embedding Benchmark Explained
The benchmark itself isn't the solution. It's the beginning of a new research direction, one forced by reality rather than chosen by preference. Models that loo

Hackernoon
🧠 Large Language Models
⚡ AI Lesson
3w ago
AI Doesn’t Lie - It Reflects: How Fragmented Signals Distort What LLMs Think Your Company Is
AI systems don’t “understand” your company—they reconstruct it from public signals. When those signals are fragmented, outdated, or inconsistent, AI outputs bec
TechCrunch AI
🧠 Large Language Models
⚡ AI Lesson
3w ago
15% of Americans say they’d be willing to work for an AI boss, according to new poll
According to a Quinnipiac University poll, 15% of Americans say they'd be willing to have a job where their direct supervisor was an AI program that assigned ta
TechCrunch AI
🧠 Large Language Models
⚡ AI Lesson
3w ago
Popular AI gateway startup LiteLLM ditches controversial startup Delve
LiteLLM had obtained two security compliance certifications via Delve and fell victim to some horrific credential-stealing malware last week.

Machine Learning Mastery
🧠 Large Language Models
⚡ AI Lesson
3w ago
From Prompt to Prediction: Understanding Prefill, Decode, and the KV Cache in LLMs
This article is divided into three parts; they are: • How Attention Works During Prefill • The Decode Phase of LLM Inference • KV Cache: How to Make Decode More

Forbes Innovation
🧠 Large Language Models
⚡ AI Lesson
3w ago
Apple Just Released iOS 26.5 For Developers, But 1 Major iPhone Feature Is Missing
Another iPhone update has just reached its first developer beta. There was a chance it would include the first glimpse of the brand-new Siri, but so far there’s
Dev.to AI
🧠 Large Language Models
⚡ AI Lesson
3w ago
Five Hundred Copies of the Same Message in Your Agent's Brain
You send your AI agent a message. The upstream model returns a 429 — rate limited, try again later. Your agent framework dutifully retries. And retries. And ret
Dev.to AI
🧠 Large Language Models
⚡ AI Lesson
3w ago
How to Get Cited within AI Searches
4 core pillars to get cited within AI searches You must shift your strategy from traditional SEO to Generative Engine Optimization (GEO). AI engines do not read
Dev.to AI
🧠 Large Language Models
⚡ AI Lesson
3w ago
How We Built an AI Layer That Understands an Entire Agency Workspace (Not Just One Module)
We shipped the AI layer for Kobin today — an agency operating system that replaces Slack, Notion, HubSpot, Linear, and Buffer. This is the technical story of ho

The Next Web AI
🧠 Large Language Models
⚡ AI Lesson
3w ago
How AI’s capital explosion signals opportunity but also reveals a critical need for measurable ROI and meaningful impact
The current wave of investment in artificial intelligence reflects one of the largest capital shifts in modern technology, yet questions around financial return
DeepCamp AI