Starred repositories
MiniCPM-2B: An end-side LLM outperforming Llama2-13B.
SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
Repo for ShenNong-TCM-LLM (the "ShenNong" large model, the first Chinese-language large model for traditional Chinese medicine)
Ongoing research training transformer models at scale
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
The first-ever Traditional Chinese Medicine large language model, "CMLM-ZhongJing". Inspired by the profound wisdom of the ancient Chinese medical master Zhang Zhongjing, it is a pre-trained large language model built specifically for the field of traditional Chinese medicine.
Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.
A machine learning compiler for GPUs, CPUs, and ML accelerators
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Flax is a neural network library for JAX that is designed for flexibility.
Minimal library to train LLMs on TPU in JAX with pjit().
Official JAX implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Copies the MLP of Llama 3 eight times to serve as 8 experts, creates a randomly initialized router, and adds a load-balancing loss to construct an 8x8B MoE model based on Llama 3.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
AGIGuest is an ambitious project aimed at advancing Artificial General Intelligence (AGI) by teaching AI to solve programming problems on LeetCode.
Mouse and keyboard recording and automation, similar to 按键精灵: simulates clicks and typing to automate mouse and keyboard input.
An open-source LLM based on a Mixture-of-Experts (MoE) architecture.
Python implementations of common linear algebra algorithms, including lu, svd, qr, diagonalization, inv, pinv, lstsq, solve Ax, and solve nullspace.
Locating and editing factual associations in GPT (NeurIPS 2022)
A series of large language models trained from scratch by developers @01-ai
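500 Questions on Deep Learning: covers frequently asked topics in probability, linear algebra, machine learning, deep learning, computer vision, and related areas in question-and-answer form, to help the author and any readers who need it. The book has 18 chapters and more than 500,000 characters. Given the author's limited expertise, readers are kindly asked to point out any shortcomings. To be continued. For collaboration, contact scutjy2015@163.com. All rights reserved; infringement will be prosecuted. Tan 2018.06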
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Representation learning on large graphs using stochastic graph convolutions.