
LLM inference in C/C++

C++ 62,553 8,970 Updated Jul 27, 2024

A framework that helps implement swizzle GPU kernels

Racket 37 6 Updated Feb 29, 2020
Python 26 4 Updated Apr 22, 2024

A benchmark framework for decision forest inference

Python 9 1 Updated Jan 19, 2024

UNet diffusion model in pure CUDA

Cuda 538 26 Updated Jun 28, 2024

Framework for evaluating ANNS algorithms on billion scale datasets.

Jupyter Notebook 317 106 Updated Jul 25, 2024

A library for k-nearest neighbor search

C++ 373 85 Updated Apr 24, 2024

Examples of how to call collective operation functions in multi-GPU environments, including simple uses of the broadcast, reduce, allGather, reduceScatter and sendRecv operations.

21 7 Updated Aug 28, 2023

[ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs.

Python 50 7 Updated May 16, 2024

FlowDroid Static Data Flow Tracker

Java 1,032 295 Updated Jul 25, 2024
Python 1 1 Updated Jul 18, 2024

MPI benchmark to test and measure collective performance

C 50 19 Updated Jun 29, 2021

Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial

Cuda 156 51 Updated May 28, 2024

Examples demonstrating available options to program multiple GPUs in a single node or a cluster

Cuda 492 102 Updated Jul 22, 2024

Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite

Cuda 57 13 Updated Sep 12, 2018

Compressed Log Processor (CLP) is a free log management tool capable of compressing text logs and searching the compressed logs without decompression.

C++ 773 65 Updated Jul 26, 2024

GPU model checker

Cuda 10 3 Updated Apr 17, 2019

The LTSmin model checking toolset

C 51 30 Updated Mar 8, 2024

The Git repository for the mCRL2 toolset.

C++ 87 36 Updated Jul 23, 2024
C++ 25 13 Updated Jul 7, 2024

Data for the TACO paper

Python 2 Updated Feb 26, 2024

A library for easy and efficient manipulation of tensor networks.

Python 1,790 352 Updated Sep 4, 2023

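AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.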

Jupyter Notebook 9,754 1,407 Updated Jul 17, 2024
Cuda 8 2 Updated Apr 28, 2023

Hummingbird compiles trained ML models into tensor computation for faster inference.

Python 3,325 275 Updated Jul 26, 2024