Starred repositories
Foundational Models for State-of-the-Art Speech and Text Translation
SAM-PT: Extending SAM to zero-shot video segmentation with point-based tracking.
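A fully automatic (audio) video translation project. It uses Whisper to recognize speech, a large AI model to translate the subtitles, and finally merges the subtitles with the video to produce the translated video.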
Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.
Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.
Faster Whisper transcription with CTranslate2
Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.
Portrait4D: Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data (CVPR 24); Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer (ECCV 2024)
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
Multilingual Voice Understanding Model
Bark Voice Cloning and Voice Cloning for Chinese Speech
Multilingual large voice generation model with full-stack inference, training, and deployment capabilities.
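Deploys YOLOv8 with OpenCV for face and keypoint detection plus face quality assessment. Includes both C++ and Python versions, runs with only the OpenCV library, and removes any dependency on deep learning frameworks.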
PINTO0309 / EasyFace
Forked from sithu31296/EasyFace
Easy-to-use Face Analysis Tool
Populate library namespace without incurring immediate import costs
A modular graph-based Retrieval-Augmented Generation (RAG) system
Enjoy the magic of Diffusion models!
Code for FreeTraj, a tuning-free method for trajectory-controllable video generation
Luma Web Examples, use lumalabs.ai captures directly in your three.js or other WebGL projects!
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
Access large archives as a filesystem efficiently, e.g., TAR, RAR, ZIP, GZ, BZ2, XZ, ZSTD archives