Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supportin…

Jupyter Notebook 10,607 1,510 Updated Jul 23, 2024

brown-palm / AntGPT

Official code implementation of the paper AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?

Python 17 1 Updated Mar 16, 2024

meta-llama / llama

Inference code for Llama models

Python 54,389 9,335 Updated Jul 23, 2024

OFA-Sys / OFA

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Python 2,376 247 Updated Apr 24, 2024

yrcong / STTran

Spatial-Temporal Transformer for Dynamic Scene Graph Generation, ICCV2021

Jupyter Notebook 181 34 Updated Aug 22, 2022

state-spaces / mamba

Mamba SSM architecture

Python 11,863 987 Updated Jul 24, 2024

hkproj / mamba-notes

Notes on the Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces)

132 9 Updated Jan 7, 2024

dmlc / decord

An efficient video loader for deep learning with smart shuffling that's super easy to digest

C++ 1,742 150 Updated Jul 17, 2024

ollama / ollama

Get up and running with Llama 3, Mistral, Gemma 2, and other large language models.

Go 79,837 6,097 Updated Jul 24, 2024

ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 32,202 5,487 Updated Jul 24, 2024

UKPLab / sentence-transformers

Multilingual Sentence & Image Embeddings with BERT

Python 14,517 2,403 Updated Jul 22, 2024

EgocentricVision / EgocentricVision

🔍 Explore Egocentric Vision: research, data, challenges, real-world apps. Stay updated & contribute to our dynamic repository! Work-in-progress; join us!

59 6 Updated Jul 8, 2024

gradio-app / gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

Python 31,139 2,325 Updated Jul 24, 2024

amitsou / DeepEgoHA-3R

A collection of the forefront of Egocentric Human Activity Recognition (HAR) and Action Anticipation through Deep Learning

6 Updated Feb 7, 2024

yukw777 / EILEV

EILEV: Efficient In-Context Learning in Vision-Language Models for Egocentric Videos

Python 104 9 Updated Jun 13, 2024

PKU-YuanGroup / LanguageBind

【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

Python 631 48 Updated Mar 25, 2024

PKU-YuanGroup / Video-LLaVA

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Python 2,721 192 Updated Jul 20, 2024

Yangyi-Chen / Multimodal-AND-Large-Language-Models

Paper list about multimodal and large language models, only used to record papers I read in the daily arxiv for personal needs.

478 28 Updated Jul 24, 2024

microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 19,229 2,452 Updated Jul 15, 2024

DAMO-NLP-SG / Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Anirudh Thatipelli Anirudh257

Starred repositories

visual-text-fusion

efficient-training

action-anticipation

video-language

egocentric-vision

video-text-retrieval

open-set-recognition

online-action-detection

vision-and-language

foundation-model