Repository hosting code used to reproduce results in "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152, I…

Python 483 80 Updated Jul 2, 2024

NVIDIA / Megatron-LM

Ongoing research training transformer models at scale

Python 9,264 2,088 Updated Jul 1, 2024

InternLM / xtuner

An efficient, flexible and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Python 3,173 255 Updated Jul 2, 2024

PKU-YuanGroup / Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Python 10,866 971 Updated Jul 1, 2024

InternLM / InternEvo

Python 225 37 Updated Jul 2, 2024

cuda-mode / ring-attention

ring-attention experiments

Python 75 10 Updated Apr 10, 2024

facebookresearch / xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 7,971 564 Updated Jul 2, 2024

google / neural-tangents

Fast and Easy Infinite Neural Networks in Python

Jupyter Notebook 2,250 228 Updated Mar 1, 2024

lucidrains / ring-attention-pytorch

Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch

Python 397 22 Updated Jul 2, 2024

exists-forall / striped_attention

Python 30 2 Updated Nov 10, 2023

rapidsai / raft

RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing …

Cuda 669 178 Updated Jul 3, 2024

gregorbachmann / Next-Token-Failures

Python 49 3 Updated Mar 12, 2024

lhao499 / ringattention

Transformers with Arbitrarily Large Context

Python 563 43 Updated Jun 22, 2024

DefTruth / Awesome-LLM-Inference

📖 A curated list of Awesome LLM Inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

1,869 131 Updated Jul 3, 2024

databricks / dbrx

Code examples and resources for DBRX, a large language model developed by Databricks

Python 2,473 231 Updated May 1, 2024

HyperGAI / HPT

HPT - Open Multimodal LLMs from HyperGAI

Python 301 14 Updated Jun 6, 2024

PersiaML / PERSIA

High performance distributed framework for training deep learning recommendation models based on PyTorch.

Rust 388 51 Updated Feb 7, 2024

google / gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Python 5,138 488 Updated Jul 2, 2024

jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 7,288 423 Updated May 3, 2024

openai / transformer-debugger

Python 3,968 231 Updated Jun 4, 2024

infiniflow / infinity

The AI-native database built for LLM applications, providing incredibly fast full-text and vector search

C++ 2,033 171 Updated Jul 3, 2024

openppl-public / ppl.nn

A primitive library for neural networks

C++ 1,239 209 Updated Jun 21, 2024

mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation

Python 17,684 1,407 Updated Jul 2, 2024

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

Cuda 753 62 Updated Jul 2, 2024

sgl-project / sglang

SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.

Python 2,761 179 Updated Jul 2, 2024

QwenLM / Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

axiom whuaxiom

