Stars
A low-latency & high-throughput serving engine for LLMs
MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24)
A list of tutorials, papers, talks, and open-source projects for emerging compilers and architectures
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
A scalable and robust tree-based speculative decoding algorithm
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
A "large" language model running on a microcontroller
[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl
Optimized primitives for collective multi-GPU communication
[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
An extension of TVMScript for writing simple, high-performance GPU kernels with Tensor Cores.
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
Modern C++ Programming Course (C++03/11/14/17/20/23/26)
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
Universal LLM Deployment Engine with ML Compilation
18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
TileFlow is a performance analysis tool based on Timeloop for fusion dataflows
FlashInfer: Kernel Library for LLM Serving
Ongoing research training transformer models at scale
🦜🔗 Build context-aware reasoning applications
Open Source LLM toolkit to build trustworthy LLM applications. TigerArmor (AI safety), TigerRAG (embedding, RAG), TigerTune (fine-tuning)
LlamaIndex is a data framework for your LLM applications