🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Python 24,403 5,039 Updated Aug 5, 2024

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

Python 165,461 43,878 Updated Aug 4, 2024

so-vits-svc fork with realtime support, improved interface and more features.

Python 8,594 1,140 Updated Aug 4, 2024

Easily train a good VC model with voice data <= 10 mins!

Python 21,858 3,315 Updated Aug 4, 2024

AI Toolkit for Healthcare Imaging

Python 5,575 1,020 Updated Aug 4, 2024

Production First and Production Ready End-to-End Speech Recognition Toolkit

Python 3,984 1,046 Updated Aug 4, 2024

Whisper with Medusa heads

Python 372 19 Updated Aug 4, 2024

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 30,238 3,475 Updated Aug 4, 2024

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 4,344 373 Updated Aug 4, 2024

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 11,157 2,324 Updated Aug 4, 2024

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs

Python 9,921 2,162 Updated Aug 4, 2024

Tools for handling speech data in machine learning projects.

Python 913 207 Updated Aug 3, 2024

A generative speech model for daily dialogue.

Python 28,731 3,143 Updated Aug 3, 2024

Brand new TTS solution

Python 6,894 530 Updated Aug 3, 2024

High-quality multilingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.

Python 4,166 515 Updated Aug 3, 2024

✨✨Latest Advances on Multimodal Large Language Models

10,996 724 Updated Aug 2, 2024
C++ 3,821 530 Updated Aug 2, 2024

Multilingual Voice Understanding Model

Python 1,880 178 Updated Aug 2, 2024

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 10,625 1,027 Updated Aug 1, 2024

Robust Speech Recognition via Large-Scale Weak Supervision

Python 65,554 7,684 Updated Jul 31, 2024

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 4,521 360 Updated Jul 31, 2024

Paragraph-by-paragraph close reading of classic and new deep learning papers

25,437 2,350 Updated Jul 31, 2024

Foundational model for human-like, expressive TTS

Python 3,598 626 Updated Jul 30, 2024

vits2 backbone with multilingual-bert

Python 7,623 1,082 Updated Jul 30, 2024

A multi-voice TTS system trained with an emphasis on quality

Jupyter Notebook 12,606 1,756 Updated Jul 30, 2024

⏩ Generating speech in a single forward pass without any attention!

Python 578 112 Updated Jul 29, 2024

A simple local web interface that uses ChatTTS to synthesize text into speech, and also exposes an API for external use.

Python 5,490 606 Updated Jul 29, 2024

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Python 1,860 496 Updated Jul 27, 2024

pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and LLaMA to error-correction scenarios and works out of the box.

Python 5,387 1,080 Updated Jul 27, 2024

GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.

Python 35,227 5,840 Updated Jul 26, 2024