Writing in his blog, Google AI software engineer Yanping Huang said boffins could "easily" scale performance.
Huang and colleagues have penned an accompanying paper with the catchy and racy title GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism. GPipe uses synchronous stochastic gradient descent, an optimisation algorithm used to update a given AI model's parameters during training. It also uses pipeline parallelism - a task-execution scheme in which one step's output is streamed as input to the next step.
According to the paper, GPipe's performance gains come from better memory allocation for AI models on second-generation Google Cloud tensor processing units (TPUs). Each TPU contains eight processor cores and 64 GB of memory (8 GB per core).
Without GPipe, a single core can only train up to 82 million model parameters.
GPipe partitions models across different accelerators and automatically splits small batches of training examples (i.e., "mini-batches") into smaller "micro-batches", then pipelines execution across those micro-batches. This means the accelerators can run in parallel while gradients are accumulated across the micro-batches.
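The gradient-accumulation idea above can be sketched in a few lines. This is a minimal toy illustration, not Google's implementation: it assumes a simple linear model with a mean-squared-error loss, splits one mini-batch into micro-batches, and sums the per-micro-batch gradients so the final synchronous update matches what a single full mini-batch step would produce. The function names are made up for this example.

```python
import numpy as np

def grad_mse(w, X, y):
    # Gradient of mean-squared error for a linear model y_hat = X @ w.
    return 2.0 * X.T @ (X @ w - y) / len(y)

def accumulate_micro_batch_grads(w, X, y, num_micro_batches):
    # Split the mini-batch into micro-batches and accumulate their
    # gradients, weighting each by its share of examples, so the result
    # equals the gradient over the whole mini-batch (synchronous update).
    total = np.zeros_like(w)
    for X_mb, y_mb in zip(np.array_split(X, num_micro_batches),
                          np.array_split(y, num_micro_batches)):
        total += grad_mse(w, X_mb, y_mb) * (len(y_mb) / len(y))
    return total

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))
y = rng.normal(size=32)
w = rng.normal(size=4)

g_full = grad_mse(w, X, y)
g_accum = accumulate_micro_batch_grads(w, X, y, num_micro_batches=4)
# Accumulated micro-batch gradients match the full mini-batch gradient,
# so micro-batching changes the schedule, not the maths of the update.
assert np.allclose(g_full, g_accum)
```

In GPipe itself the pay-off is that while one accelerator works on micro-batch k of its model partition, the next accelerator can already be processing micro-batch k-1, keeping the pipeline busy.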