Mesh-TensorFlow: Model Parallelism for Supercomputers (TF Dev Summit ‘19)
Batch-splitting (data-parallelism) is the dominant distributed Deep Neural Network (DNN) training strategy, due to its universal applicability and its amenability to Single-Program-Multiple-Data (SPMD) programming. However, batch-splitting suffers from problems including the inability to train very large models (due to memory constraints), high latency, and inefficiency at small batch sizes. All of these can be solved by more general distribution strategies (model-parallelism). Unfortunately, efficient model-parallel algorithms tend to be complicated to discover, describe, and to implement, particularly on large clusters. We introduce Mesh-TensorFlow, a language for specifying a general class of distributed tensor computations. Where data-parallelism can be viewed as splitting tensors and operations along the "batch" dimension, in Mesh-TensorFlow, the user can specify any tensor-dimensions to be split across any dimensions of a multi-dimensional mesh of processors. A Mesh-TensorFlow graph compiles into a SPMD program consisting of parallel operations coupled with collective communication primitives such as Allreduce. We use Mesh-TensorFlow to implement an efficient data-parallel, model-parallel version of the Transformer sequence-to-sequence model. Using TPU meshes of up to 512 cores, we train Transformer models with up to 5 billion parameters, surpassing state of the art results on WMT'14 English-to-French translation task and the one-billion-word language modeling benchmark. Mesh-Tensorflow is available at https://github.com/tensorflow/mesh .
Speaker: Noam Shazeer, Google
See the revamped dev site → https://www.tensorflow.org/
Watch all TensorFlow Dev Summit '19 sessions → http://bit.ly/TFDS19Sessions
Subscribe to the TensorFlow YouTube channel → https://bit.ly/TensorFlow1
Music by Terra Monk → http://bit.ly/TerraMonkTFDS
Event Photo Album → http://bit.ly/TFSummit19
event: TensorFlow Dev Summit 2019; re_ty: Publish; product: TensorFlow - TensorFlow Cor
Watch on YouTube ↗
(saves to browser)
Sign in to unlock AI tutor explanation · ⚡30
Playlist
Uploads from TensorFlow · TensorFlow · 0 of 60
← Previous
Next →
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
The TensorFlow YouTube Channel is Here!
TensorFlow
Answering Your TF Questions #AskTensorFlow
TensorFlow
Chatting With the TensorFlow Community (TensorFlow Meets)
TensorFlow
All About TensorFlow Code (Coding TensorFlow)
TensorFlow
TensorFlow: an ML platform for solving impactful and challenging problems
TensorFlow
Keynote (TensorFlow Dev Summit 2018)
TensorFlow
tf.data: Fast, flexible, and easy-to-use input pipelines (TensorFlow Dev Summit 2018)
TensorFlow
Eager Execution (TensorFlow Dev Summit 2018)
TensorFlow
Machine Learning in JavaScript (TensorFlow Dev Summit 2018)
TensorFlow
Training Performance: A user’s guide to converge faster (TensorFlow Dev Summit 2018)
TensorFlow
The Practitioner's Guide with TF High Level APIs (TensorFlow Dev Summit 2018)
TensorFlow
Distributed TensorFlow (TensorFlow Dev Summit 2018)
TensorFlow
Debugging TensorFlow with TensorBoard plugins (TensorFlow Dev Summit 2018)
TensorFlow
TensorFlow Lite (TensorFlow Dev Summit 2018)
TensorFlow
Searching Over Ideas (TensorFlow Dev Summit 2018)
TensorFlow
Reconstructing Fusion Plasmas (TensorFlow Dev Summit 2018)
TensorFlow
Nucleus: TensorFlow toolkit for Genomics (TensorFlow Dev Summit 2018)
TensorFlow
Open Source Collaboration (TensorFlow Dev Summit 2018)
TensorFlow
Swift for TensorFlow - TFiwS (TensorFlow Dev Summit 2018)
TensorFlow
TensorFlow Hub (TensorFlow Dev Summit 2018)
TensorFlow
Applied AI at The Coca-Cola Company (TensorFlow Dev Summit 2018)
TensorFlow
Real-World Robot Learning (TensorFlow Dev Summit 2018)
TensorFlow
TensorFlow Extended (TFX) (TensorFlow Dev Summit 2018)
TensorFlow
Project Magenta (TensorFlow Dev Summit 2018)
TensorFlow
TensorFlow Dev Summit 2018 - Livestream
TensorFlow
Introducing TensorFlow Lite (Coding TensorFlow)
TensorFlow
TensorFlow Dev Summit 2018 Highlights
TensorFlow
Jeff Dean, Head of AI at Google discusses the impact of ML (TensorFlow Meets)
TensorFlow
TensorFlow Mobile vs. TF Lite and More! #AskTensorFlow
TensorFlow
Using TensorFlow to enable research & production across many fields (TensorFlow Meets)
TensorFlow
Teaching TensorFlow for Deep Learning at Stanford University (TensorFlow Meets)
TensorFlow
TensorFlow Lite for Android (Coding TensorFlow)
TensorFlow
Using the tf.data API to build input pipelines (TensorFlow Meets)
TensorFlow
Training Models in the Cloud & the Benefits of AI Toolkits #AskTensorFlow
TensorFlow
Execute operations immediately with TensorFlow's Eager Execution (TensorFlow Meets)
TensorFlow
TensorFlow Lite for iOS (Coding TensorFlow)
TensorFlow
Get started with TensorFlow's High-Level APIs (Google I/O '18)
TensorFlow
TensorFlow for JavaScript (Google I/O '18)
TensorFlow
TensorFlow in production: TF Extended, TF Hub, and TF Serving (Google I/O '18)
TensorFlow
Get started with TensorFlow's High-Level APIs in 5 mins | Google I/O 2018
TensorFlow
TensorFlow and deep reinforcement learning, without a PhD (Google I/O '18)
TensorFlow
TensorFlow Lite for mobile developers (Google I/O '18)
TensorFlow
Advances in machine learning and TensorFlow (Google I/O '18)
TensorFlow
Distributed TensorFlow training (Google I/O '18)
TensorFlow
Classification using neural networks & ML regression models #AskTensorFlow
TensorFlow
TensorFlow and Keras in R - Josh Gordon meets with J.J. Allaire (TensorFlow Meets)
TensorFlow
Focus on your experiment with TensorFlow Estimators (TensorFlow Meets)
TensorFlow
How to get started with AI/ML, retraining models, & more! #AskTensorFlow
TensorFlow
TensorFlow - the deep learning solution for mobile platforms (TensorFlow Meets)
TensorFlow
MiniGo: TensorFlow Meets Andrew Jackson (TensorFlow Meets)
TensorFlow
The growth of TensorFlow with added support for JS & Swift (TensorFlow Meets)
TensorFlow
At the intersection of TensorFlow & nuclear physics (TensorFlow Meets)
TensorFlow
NVidia TensorRT: high-performance deep learning inference accelerator (TensorFlow Meets)
TensorFlow
Try TensorFlow.js in your browser (Coding TensorFlow)
TensorFlow
TensorFlow Hub: reusing machine learning modules (TensorFlow Meets)
TensorFlow
How to use TensorFlow in PyCharm (TensorFlow Tip of the Week)
TensorFlow
Training models faster with TensorFlow Hub (TensorFlow Meets)
TensorFlow
Prepare your dataset for machine learning (Coding TensorFlow)
TensorFlow
Using ML to predict insulin use for Type 1 Diabetes (TensorFlow Meets)
TensorFlow
TFX: an end-to-end machine learning platform for TensorFlow (TensorFlow Meets)
TensorFlow
More on: LLM Engineering
View skill →Related AI Lessons
⚡
⚡
⚡
⚡
Tokenization in LLMs — The First Step Every Language Model Takes Before Understanding Anything |…
Medium · Deep Learning
Tokenization in LLMs — The First Step Every Language Model Takes Before Understanding Anything |…
Medium · LLM
LLMs in Market Forecasting — A Quantitative Research Primer
Medium · Machine Learning
Figure out AI before the rest of the Bank does
Medium · RAG
🎓
Tutor Explanation
DeepCamp AI