Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

25,025 lessons

Skills in this topic

5 skills

LLM Foundations

Explain how transformers generate text

Write zero-shot and few-shot prompts

LLM Engineering

Call LLM APIs with function/tool use

Fine-tuning LLMs

Prepare fine-tuning datasets

Multimodal LLMs

Use GPT-4V / Claude Vision for image understanding

Videos: 19,465. Reads: 5,560.

Showing 5,560 reads from curated sources

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 4y ago

We’ve created an improved version of OpenAI Codex, our AI system that translates natural language to code, and we are releasing it through our API in private be

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 4y ago

Introducing Triton: Open-source GPU programming for neural networks

We’re releasing Triton 1.0, an open-source Python-like programming language which enables researchers with no CUDA experience to write highly efficient GPU code

Distill.pub 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4y ago

After five years, Distill will be taking a break.

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 4y ago

Improving language model behavior by training on a curated dataset

Our latest research finds we can improve language model behavior with respect to specific behavioral values by fine-tuning on a small, curated dataset.

Hugging Face Blog 🧠 Large Language Models ⚡ AI Lesson 4y ago

Few-shot learning in practice: GPT-Neo and the 🤗 Accelerated Inference API

Lilian Weng's Blog 🧠 Large Language Models ⚡ AI Lesson 4y ago

Contrastive Representation Learning

The goal of contrastive representation learning is to learn such an embedding space in which similar sample pairs stay close to each other while dissimilar ones

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 4y ago

OpenAI Scholars 2021: Final projects

We’re proud to announce that the 2021 class of OpenAI Scholars has completed our six-month mentorship program and has produced an open-source research project

Distill.pub 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 4y ago

Adversarial Reprogramming of Neural Cellular Automata

Reprogramming Neural CA to exhibit novel behaviour, using adversarial attacks.

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 4y ago

Will Hurd joins OpenAI’s board of directors

OpenAI is committed to developing general-purpose artificial intelligence that benefits all humanity, and we believe that achieving our goal requires expertise

Distill.pub 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5y ago

Branch Specialization

When a neural network layer is divided into multiple branches, neurons self-organize into coherent groupings.

Weaviate Blog 🧠 Large Language Models ⚡ AI Lesson 5y ago

Weaviate 1.2 release - transformer models

Weaviate v1.2 introduced support for transformers (DistilBERT, BERT, RoBERTa, Sentence-BERT, etc) to vectorize and semantically search through your data.

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 5y ago

GPT-3 powers the next generation of apps

Over 300 applications are delivering GPT-3–powered search, conversation, text completion, and other advanced AI features through our API.

Hugging Face Blog 🧠 Large Language Models ⚡ AI Lesson 5y ago

The Partnership: Amazon SageMaker and Hugging Face

Lilian Weng's Blog 🧠 Large Language Models ⚡ AI Lesson 5y ago

Reducing Toxicity in Language Models

Large pretrained language models are trained over a sizable collection of online data. They unavoidably acquire certain toxic behavior

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 5y ago

Multimodal neurons in artificial neural networks

We’ve discovered neurons in CLIP that respond to the same concept whether presented literally, symbolically, or conceptually. This may explain CLIP’s accuracy i

Hugging Face Blog 🧠 Large Language Models ⚡ AI Lesson 5y ago

Simple considerations for simple people building fancy neural networks

Hugging Face Blog 🧠 Large Language Models ⚡ AI Lesson 5y ago

Retrieval Augmented Generation with Huggingface Transformers and Ray

Hugging Face Blog 🧠 Large Language Models ⚡ AI Lesson 5y ago

Hugging Face on PyTorch / XLA TPUs

Hugging Face Blog 🧠 Large Language Models ⚡ AI Lesson 5y ago

Faster TensorFlow models in Hugging Face Transformers

Hugging Face Blog 🧠 Large Language Models ⚡ AI Lesson 5y ago

Fit More and Train Faster With ZeRO via DeepSpeed and FairScale

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 5y ago

Organizational update from OpenAI

It’s been a year of dramatic change and growth at OpenAI.

Distill.pub 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5y ago

Understanding RL Vision

With diverse environments, we can analyze, diagnose and edit deep reinforcement learning models using attribution.

Hugging Face Blog 🧠 Large Language Models ⚡ AI Lesson 5y ago

Hyperparameter Search with Transformers and Ray Tune

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 5y ago

OpenAI licenses GPT-3 technology to Microsoft

OpenAI has agreed to license GPT-3 to Microsoft for their own products and services.

Distill.pub 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5y ago

Thread: Differentiable Self-organizing Systems

A collection of articles and comments with the goal of understanding how to design robust and general purpose self-organizing systems.

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 5y ago

OpenAI Scholars 2020: Final projects

Our third class of OpenAI Scholars presented their final projects at virtual Demo Day, showcasing their research results from over the past five months.

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 5y ago

Procgen and MineRL Competitions

We’re excited to announce that OpenAI is co-organizing two NeurIPS 2020 competitions with AIcrowd, Carnegie Mellon University, and DeepMind, using Procgen Bench

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 5y ago

We’re releasing an API for accessing new AI models developed by OpenAI.

Lilian Weng's Blog 🧠 Large Language Models ⚡ AI Lesson 5y ago

Exploration Strategies in Deep Reinforcement Learning

[Updated on 2020-06-17: Add “exploration via disagreement” in the “Forward Dynamics” section. Exploitation versus ex

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 5y ago

AI and efficiency

We’re releasing an analysis showing that since 2012 the amount of compute needed to train a neural net to the same performance on ImageNet classification has be

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 5y ago

We’re introducing Jukebox, a neural net that generates music, including rudimentary singing, as raw audio in a variety of genres and artist styles. We’re releas

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 6y ago

Improving verifiability in AI development

We’ve contributed to a multi-stakeholder report by 58 co-authors at 30 organizations, including the Centre for the Future of Intelligence, Mila, Schwartz Reisma

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 6y ago

OpenAI standardizes on PyTorch

We are standardizing OpenAI’s deep learning framework on PyTorch.

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 6y ago

Procgen Benchmark

We’re releasing Procgen Benchmark, 16 simple-to-use procedurally-generated environments which provide a direct measure of how quickly a reinforcement learning a

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 6y ago

We’re releasing Safety Gym, a suite of environments and tools for measuring progress towards reinforcement learning agents that respect safety constraints while

Lilian Weng's Blog 🧠 Large Language Models ⚡ AI Lesson 6y ago

Self-Supervised Representation Learning

[Updated on 2020-01-09: add a new section on Contrastive Predictive Coding]. [Updated on 2020-04-13: add a “Momentum Contra

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 6y ago

GPT-2: 1.5B release

As the final model release of GPT-2’s staged release, we’re releasing the largest version (1.5B parameters) of GPT-2 along with code and model weights to facili

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 6y ago

Solving Rubik’s Cube with a robot hand

We’ve trained a pair of neural networks to solve the Rubik’s Cube with a human-like robot hand. The neural networks are trained entirely in simulation, using th

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 6y ago

OpenAI Scholars 2020: Applications open

We are now accepting applications for our third class of OpenAI Scholars.

Distill.pub 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6y ago

The Paths Perspective on Value Learning

A closer look at how Temporal Difference Learning merges paths of experience for greater statistical efficiency

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 6y ago

Fine-tuning GPT-2 from human preferences

We’ve fine-tuned the 774M parameter GPT-2 language model using human feedback for various tasks, successfully matching the preferences of the external human lab

Lilian Weng's Blog 🧠 Large Language Models ⚡ AI Lesson 6y ago

Evolution Strategies

Stochastic gradient descent is a universal choice for optimizing deep learning models. However, it is not the only option. With bl

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 6y ago

Testing robustness against unforeseen adversaries

We’ve developed a method to assess whether a neural network classifier can reliably defend against adversarial attacks not seen during training. Our method yiel

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 6y ago

GPT-2: 6-month follow-up

We’re releasing the 774 million parameter GPT-2 language model after the release of our small 124M model in February, staged release of our medium 355M model in

Distill.pub 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6y ago

A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Learning from Incorrectly Labeled Data

Section 3.2 of Ilyas et al. (2019) shows that training a model on only adversarial errors leads to non-trivial generalization on the original test set. We show

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 6y ago

Microsoft invests in and partners with OpenAI to support us building beneficial AGI

Microsoft is investing $1 billion in OpenAI to support us building artificial general intelligence (AGI) with widely distributed economic benefits. We’re partne

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 6y ago

Why responsible AI development needs cooperation on safety

We’ve written a policy research paper identifying four strategies that can be used today to improve the likelihood of long-term industry cooperation on safety n

Lilian Weng's Blog 🧠 Large Language Models ⚡ AI Lesson 6y ago

Meta Reinforcement Learning

In my earlier post on meta-learning, the problem is mainly defined in the context of few-shot classificati