Standalone module of the Gated Linear Attention (GLA) Transformer layer from the paper "Gated Linear Attention Transformers with Hardware-Efficient Training".
This repo is no longer maintained; it only exists to track some useful git commits. Please refer to flash-linear-attention instead.
Requirements: torch, triton (nightly release)
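
For orientation, below is a minimal PyTorch sketch of the gated linear attention recurrence the layer implements: S_t = (alpha_t 1^T) * S_{t-1} + k_t^T v_t and o_t = q_t S_t. All names, shapes, and the function signature are illustrative assumptions, not this repo's API; the actual module computes this with a hardware-efficient chunked Triton kernel rather than a per-step Python loop.

```python
# Naive sequential GLA recurrence (illustrative sketch only; the repo's
# module uses a fused, chunk-parallel Triton kernel instead).
import torch

def gla_recurrence(q, k, v, alpha):
    """q, k, alpha: (batch, seq_len, d_k); alpha in (0, 1) is the forget gate.
    v: (batch, seq_len, d_v). Returns o: (batch, seq_len, d_v)."""
    B, T, d_k = q.shape
    d_v = v.shape[-1]
    S = q.new_zeros(B, d_k, d_v)  # recurrent state S_t
    outs = []
    for t in range(T):
        # S_t = (alpha_t 1^T) * S_{t-1} + k_t^T v_t  (gate applied per key dim)
        S = alpha[:, t].unsqueeze(-1) * S \
            + k[:, t].unsqueeze(-1) * v[:, t].unsqueeze(-2)
        # o_t = q_t S_t
        outs.append(torch.einsum('bd,bde->be', q[:, t], S))
    return torch.stack(outs, dim=1)

if __name__ == "__main__":
    B, T, d_k, d_v = 2, 16, 64, 64
    q, k = torch.randn(B, T, d_k), torch.randn(B, T, d_k)
    v = torch.randn(B, T, d_v)
    alpha = torch.sigmoid(torch.randn(B, T, d_k))   # gates in (0, 1)
    print(gla_recurrence(q, k, v, alpha).shape)     # torch.Size([2, 16, 64])
```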