Stars
🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gan…
openpilot is an open source driver assistance system. openpilot performs the functions of Automated Lane Centering and Adaptive Cruise Control for 250+ supported car makes and models.
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
LabelImg is now part of the Label Studio community. The popular image annotation tool created by Tzutalin is no longer actively being developed, but you can check out Label Studio, the open source …
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.
Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.
Large Language Model Text Generation Inference
Commonly used path planning algorithms with animations.
A framework for few-shot evaluation of language models.
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
Segmentation models with pretrained backbones. Keras and TensorFlow Keras.
An unofficial PyTorch implementation of the audio LM VALL-E
A framework for prompt tuning using Intent-based Prompt Calibration
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
Official repository of Evolutionary Optimization of Model Merging Recipes
Strong and Open Vision Language Assistant for Mobile Devices
[A toolbox for fun.] Transform Image into Unique Paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, ControlNet.
D-Adaptation for SGD, Adam and AdaGrad
When do we not need larger vision models?
pdf-translator translates English PDF files into Japanese, preserving the original layout.
[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.org/abs/2404.12390