- University of California, Berkeley
- Berkeley, CA
- https://wilson1yan.github.io/
- @wilson1yan
Stars
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.
The official repository of "Video assistant towards large language model makes everything easy"
A framework for few-shot evaluation of language models.
Youtube-8m Videos, Frames and Ids Generator. Extract videos from youtube-8m. Extract frames from youtube-8m.
Video-P2P: Video Editing with Cross-attention Control
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models
jax-triton contains integrations between JAX and OpenAI Triton
Language Quantized AutoEncoders
The simplest, fastest repository for training/finetuning medium-sized GPTs.
[CVPR 2024 Highlight] Official PyTorch implementation of CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
🔥 CNN for Watermark Removal using Deep Image Prior with Pytorch 🔥.
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Using advances in generative modeling to learn reward functions from unlabeled videos.
Ongoing research training transformer models at scale
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
An Open-Ended Embodied Agent with Large Language Models