A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning …

C++ 16,516 3,821 Updated Sep 8, 2024

exaloop / codon

A high-performance, zero-overhead, extensible Python compiler using LLVM

C++ 14,312 502 Updated Sep 7, 2024

microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

C++ 14,061 2,833 Updated Sep 8, 2024

ggerganov / ggml

Tensor library for machine learning

C++ 10,810 998 Updated Sep 8, 2024

google / sentencepiece

Unsupervised text tokenizer for Neural Network-based text generation.

C++ 10,043 1,161 Updated Sep 1, 2024

NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 8,116 896 Updated Sep 3, 2024

SJTU-IPADS / PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

C++ 7,853 401 Updated Sep 6, 2024

wang-xinyu / tensorrtx

Implementation of popular deep learning networks with TensorRT network definition API

C++ 6,871 1,755 Updated Aug 28, 2024

NVIDIA / FasterTransformer

Transformer related optimization, including BERT, GPT

C++ 5,760 884 Updated Mar 27, 2024

OpenNMT / CTranslate2

Fast inference engine for Transformer models

C++ 3,207 282 Updated Sep 6, 2024

bytedance / lightseq

LightSeq: A High Performance Library for Sequence Processing and Generation

C++ 3,165 328 Updated May 16, 2023

openxla / xla

A machine learning compiler for GPUs, CPUs, and ML accelerators

C++ 2,561 400 Updated Sep 8, 2024

infiniflow / infinity

The AI-native database built for LLM applications, providing incredibly fast hybrid search of dense vector, sparse vector, tensor (multi-vector), and full-text

C++ 2,414 255 Updated Sep 6, 2024

astojanov / Clover

Clover: Quantized 4-bit Linear Algebra Library

C++ 110 5 Updated May 29, 2018

MITIBMxGraph / SALIENT

The official SALIENT system described in the paper "Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and Pipelining".

C++ 38 7 Updated Jun 28, 2023

jaewoosong / pocketnn

The official, proof-of-concept C++ implementation of PocketNN.

C++ 31 9 Updated Jun 7, 2024

Bruno Pio blap

Stars

nomic-ai / gpt4all

ggerganov / llama.cpp

google / flatbuffers

microsoft / LightGBM