nangongmujd

2 followers · 26 following

Stars

huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Python 24,403 5,039 Updated Aug 5, 2024

Significant-Gravitas / AutoGPT

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

Python 165,461 43,878 Updated Aug 4, 2024

voicepaw / so-vits-svc-fork

so-vits-svc fork with realtime support, improved interface and more features.

Python 8,594 1,140 Updated Aug 4, 2024

RVC-Project / Retrieval-based-Voice-Conversion-WebUI

Easily train a good VC model with voice data <= 10 mins!

Python 21,858 3,315 Updated Aug 4, 2024

Project-MONAI / MONAI

AI Toolkit for Healthcare Imaging

Python 5,575 1,020 Updated Aug 4, 2024

wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Python 3,984 1,046 Updated Aug 4, 2024

aiola-lab / whisper-medusa

Whisper with Medusa heads

Python 372 19 Updated Aug 4, 2024

RVC-Boss / GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 30,238 3,475 Updated Aug 4, 2024

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 4,344 373 Updated Aug 4, 2024

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 11,157 2,324 Updated Aug 4, 2024

Rudrabha / Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs

Python 9,921 2,162 Updated Aug 4, 2024

lhotse-speech / lhotse

Tools for handling speech data in machine learning projects.

Python 913 207 Updated Aug 3, 2024

2noise / ChatTTS

A generative speech model for daily dialogue.

Python 28,731 3,143 Updated Aug 3, 2024

fishaudio / fish-speech

Brand new TTS solution

Python 6,894 530 Updated Aug 3, 2024

myshell-ai / MeloTTS

High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.

Python 4,166 515 Updated Aug 3, 2024

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

10,996 724 Updated Aug 2, 2024

GuijiAI / duix.ai

C++ 3,821 530 Updated Aug 2, 2024

FunAudioLLM / SenseVoice

Multilingual Voice Understanding Model

Python 1,880 178 Updated Aug 2, 2024

facebookresearch / seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 10,625 1,027 Updated Aug 1, 2024

openai / whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Python 65,554 7,684 Updated Jul 31, 2024

yl4579 / StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 4,521 360 Updated Jul 31, 2024

mli / paper-reading

Paragraph-by-paragraph close readings of classic and new deep learning papers

25,437 2,350 Updated Jul 31, 2024

metavoiceio / metavoice-src

Foundational model for human-like, expressive TTS

Python 3,598 626 Updated Jul 30, 2024

fishaudio / Bert-VITS2

vits2 backbone with multilingual-bert

Python 7,623 1,082 Updated Jul 30, 2024

neonbjb / tortoise-tts

A multi-voice TTS system trained with an emphasis on quality

Jupyter Notebook 12,606 1,756 Updated Jul 30, 2024

as-ideas / ForwardTacotron

Forked from fatchord/WaveRNN

⏩ Generating speech in a single forward pass without any attention!

Python 578 112 Updated Jul 29, 2024

jianchang512 / ChatTTS-ui

A simple local web interface that uses ChatTTS to synthesize text into speech, with support for exposing an external API.

Python 5,490 606 Updated Jul 29, 2024

jik876 / hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Python 1,860 496 Updated Jul 27, 2024

shibing624 / pycorrector

pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and LLaMA to error-correction scenarios and works out of the box.

Python 5,387 1,080 Updated Jul 27, 2024

TencentARC / GFPGAN

GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.

Python 35,227 5,840 Updated Jul 26, 2024