Starred repositories
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
(ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training
[CVPR 2023 Best Paper] Planning-oriented Autonomous Driving
Repository hosting code used to reproduce results in "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152, I…
Ongoing research training transformer models at scale
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
Hackable and optimized Transformers building blocks, supporting a composable construction.
Fast and Easy Infinite Neural Networks in Python
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch
RAFT contains fundamental, widely used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing …
Transformers with Arbitrarily Large Context
📖A curated list of Awesome LLM Inference Paper with codes, TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention etc.
Code examples and resources for DBRX, a large language model developed by Databricks
High performance distributed framework for training deep learning recommendation models based on PyTorch.
The official PyTorch implementation of Google's Gemma models
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
The AI-native database built for LLM applications, providing incredibly fast full-text and vector search
microsoft / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM. Ongoing research training transformer language models at scale, including: BERT & GPT-2
Universal LLM Deployment Engine with ML Compilation
FlashInfer: Kernel Library for LLM Serving
SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.