Making large AI models cheaper, faster and more accessible
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
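As a rough illustration of how DeepSpeed is typically driven, the sketch below wraps a toy model with `deepspeed.initialize` and runs one training step; the model, ZeRO stage, and batch size are illustrative assumptions, not a recommended configuration.

```python
# Illustrative DeepSpeed training step (launched with the deepspeed runner,
# e.g. `deepspeed train.py`). The toy model, ZeRO stage, and hyperparameters
# below are assumptions for the sketch, not a tuned configuration.
import torch
import deepspeed

model = torch.nn.Linear(1024, 10)
ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},   # partition optimizer state and gradients
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize returns (engine, optimizer, dataloader, lr_scheduler)
engine, _, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

inputs = torch.randn(32, 1024, device=engine.device).half()
targets = torch.randint(0, 10, (32,), device=engine.device)

loss = torch.nn.functional.cross_entropy(engine(inputs), targets)
engine.backward(loss)   # handles loss scaling and gradient all-reduce
engine.step()
```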
Distributed deep learning using Keras and Apache Spark, with a focus on distributed training.
PaddlePaddle (飞桨) large model development suite, providing an end-to-end development toolchain for large language models, cross-modal large models, biocomputing large models, and other domains.
LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.
Ternary Gradients to Reduce Communication in Distributed Deep Learning (TensorFlow)
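The core trick in that paper is quantizing each gradient to three levels before it is communicated; the snippet below is a minimal PyTorch restatement of that idea (the repository itself is TensorFlow), not the authors' code.

```python
import torch

def ternarize(grad: torch.Tensor) -> torch.Tensor:
    """Stochastically quantize a gradient to {-s, 0, +s} (TernGrad-style sketch)."""
    s = grad.abs().max()
    if s == 0:
        return torch.zeros_like(grad)
    keep_prob = grad.abs() / s            # P(keep element i) = |g_i| / s
    mask = torch.bernoulli(keep_prob)     # unbiased: E[output] equals the input
    return s * grad.sign() * mask
```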
A state-of-the-art multithreading runtime: message-passing based, fast, scalable, ultra-low overhead
Large-scale 4D parallelism pre-training for 🤗 Transformers with Mixture of Experts *(still a work in progress)*
Distributed Keras Engine: make Keras faster with only one line of code.
Distributed training (multi-node) of a Transformer model
SC23 Deep Learning at Scale Tutorial Material
Orkhon: ML Inference Framework and Server Runtime
☕ Implementation of Parallel Matrix Multiplication Methods Using Fox's Algorithm on Peking University's High-Performance Computing System
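For readers unfamiliar with the method, here is a minimal mpi4py sketch of Fox's algorithm; the block size, matrix contents, and q × q process grid are assumptions for illustration and are not taken from that repository.

```python
# Run with a square process count, e.g.: mpiexec -n 4 python fox.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
q = int(np.sqrt(comm.Get_size()))            # process grid is q x q
n = 2                                        # local block size (assumed)

cart = comm.Create_cart(dims=[q, q], periods=[True, True])
row, col = cart.Get_coords(cart.Get_rank())
row_comm = cart.Sub([False, True])           # communicator across one grid row

# Each rank owns one n x n block of A and B and accumulates its block of C.
A = np.random.rand(n, n)
B = np.random.rand(n, n)
C = np.zeros((n, n))

for step in range(q):
    k = (row + step) % q                     # column whose A block is broadcast
    A_bcast = A.copy() if col == k else np.empty((n, n))
    row_comm.Bcast(A_bcast, root=k)          # broadcast A[row][k] along the row
    C += A_bcast @ B                         # accumulate the partial product
    src, dst = cart.Shift(0, -1)             # roll B blocks one step up the column
    cart.Sendrecv_replace(B, dest=dst, source=src)
```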
Official Repository for the paper: Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation
This repository provides hands-on labs for PyTorch-based distributed training and SageMaker distributed training. It is written so that beginners can get started easily, and it walks you through step-by-step code modifications starting from the most basic BERT use cases.
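A minimal PyTorch `DistributedDataParallel` loop, of the kind such labs build toward, might look like the following; the toy model, hyperparameters, and launch command are placeholders rather than the labs' actual code.

```python
# Minimal DistributedDataParallel sketch. Launch with, for example:
#   torchrun --nproc_per_node=4 train.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group("nccl")                 # or "gloo" on CPU-only hosts
local_rank = int(os.environ["LOCAL_RANK"])      # set by torchrun
torch.cuda.set_device(local_rank)

model = torch.nn.Linear(128, 2).cuda(local_rank)
model = DDP(model, device_ids=[local_rank])     # gradients are all-reduced automatically
opt = torch.optim.SGD(model.parameters(), lr=0.01)

for _ in range(10):                             # stand-in for a real dataloader loop
    x = torch.randn(32, 128, device=local_rank)
    y = torch.randint(0, 2, (32,), device=local_rank)
    loss = torch.nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

dist.destroy_process_group()
```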
Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation
Torch Automatic Distributed Neural Network (TorchAD-NN) training library. Built on top of TorchMPI, this module automatically parallelizes neural network training.
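Under the hood, "automatically parallelizes" usually means averaging gradients across workers after every backward pass; the helper below sketches that step with `torch.distributed` rather than TorchMPI, purely for illustration.

```python
import torch
import torch.distributed as dist

def average_gradients(model: torch.nn.Module) -> None:
    """All-reduce each parameter gradient and divide by the world size."""
    world_size = dist.get_world_size()
    for p in model.parameters():
        if p.grad is not None:
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= world_size
```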
A mostly POSIX-compliant utility that scans a given interval for vampire numbers.
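For context, a vampire number factors into two equal-length "fangs" whose digits are a permutation of its own; the Python check below only restates that definition and is not the utility's own code.

```python
from itertools import permutations

def is_vampire(n: int) -> bool:
    """Return True if n factors into two equal-length fangs using its own digits."""
    digits = str(n)
    if len(digits) % 2:
        return False
    half = len(digits) // 2
    for perm in set(permutations(digits)):
        a, b = int("".join(perm[:half])), int("".join(perm[half:]))
        # Fangs must keep their full length (no leading zeros)
        # and must not both end in zero.
        if a * b == n and len(str(a)) == half and len(str(b)) == half \
                and not (a % 10 == 0 and b % 10 == 0):
            return True
    return False

# 1260 = 21 * 60 is the smallest vampire number.
assert is_vampire(1260)
```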