Tsinghua University
Starred repositories
⏰ Collaboratively track deadlines of conferences recommended by CCF (Website, Python CLI, WeChat Applet) / If you find it useful, please star this project, thanks~
ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch
Official code for the paper "Attention as a Hypernetwork"
Sevoii / StriveFrameViewer
Forked from Procdox/StriveFrameViewer. A mod for GGST that adds training mode features
🏡 Open source home automation that puts local control and privacy first.
MambaOut: Do We Really Need Mamba for Vision?
Standalone Flash Attention v2 kernel without libtorch dependency
A simple and efficient Mamba implementation in pure PyTorch and MLX.
An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
An easy to use PyTorch to TensorRT converter
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
Kyriection / llama-recipes
Forked from meta-llama/llama-recipes. Scripts for fine-tuning Llama2 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization & question answering. S…
The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch
"Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu,…
Easy control for Key-Value Constrained Generative LLM Inference (https://arxiv.org/abs/2402.06262)
Implementation of paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
Running large language models on a single GPU for throughput-oriented scenarios.