forked from Kyubyong/tacotron

A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model

License

Notifications You must be signed in to change notification settings

xuerq/tacotron

 
 


A (Heavily Documented) TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model

Warning

As of May 17, 2017, this is still a first draft. You can run it by following the steps below, but you will probably get poor results. I'll be working on debugging this weekend. (Code reviews and/or contributions are more than welcome!)

Requirements

  • NumPy >= 1.11.1
  • TensorFlow >= 1.0
  • librosa
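
The version floors above can be checked before training. The following is a minimal sketch, not part of the repository: the helper names `parse_version`, `meets_minimum`, and `check_requirements` are made up for illustration, and it assumes each package exposes a `__version__` attribute. Versions are compared as integer tuples because a plain string comparison would rank "1.9.3" above "1.11.1".

```python
import importlib
import re

# Minimum versions from the requirements list; None means "any version".
MINIMUMS = {"numpy": "1.11.1", "tensorflow": "1.0", "librosa": None}


def parse_version(text):
    """Turn '1.11.1' or '1.0.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad with zeros so that (1, 0) and (1, 0, 0) compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements():
    """Report 'ok', 'too old', or 'missing' for each required package."""
    report = {}
    for name, minimum in MINIMUMS.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        if minimum is None or meets_minimum(module.__version__, minimum):
            report[name] = "ok"
        else:
            report[name] = "too old"
    return report
```

Calling `check_requirements()` returns a dictionary keyed by package name, so any entry that is not "ok" can be fixed before moving on to the work flow.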

Data

Since the original paper was based on the authors' internal data, I use a freely available dataset instead.

The World English Bible is a public domain update of the American Standard Version of 1901 into modern English. Its text and audio recordings are freely available here. Unfortunately, each audio file corresponds to a chapter, not a verse, so most files are too long. I sliced them by verse manually. You can get them on my dropbox.
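
The slicing itself was done by hand, but the idea of cutting a chapter recording into verse clips can be sketched in code. This is only an illustration under stated assumptions: the verse boundary times are hypothetical inputs (the source does not provide them), and the function names are invented here. The helpers work on any sliceable sequence of samples, so a waveform loaded with `librosa.load(path, sr=None)` could be passed in the same way.

```python
def verse_sample_ranges(boundaries_sec, sample_rate):
    """Convert verse boundaries in seconds to (start, end) sample indices.

    boundaries_sec is a sorted list [t0, t1, ..., tn] (hypothetical input);
    verse i spans t_i to t_(i+1), so n + 1 boundaries give n verses.
    """
    pairs = zip(boundaries_sec, boundaries_sec[1:])
    if any(later < earlier for earlier, later in pairs):
        raise ValueError("boundaries must be sorted in ascending order")
    indices = [int(round(t * sample_rate)) for t in boundaries_sec]
    return list(zip(indices[:-1], indices[1:]))


def slice_verses(samples, boundaries_sec, sample_rate):
    """Cut one chapter-length sample sequence into per-verse segments."""
    ranges = verse_sample_ranges(boundaries_sec, sample_rate)
    return [samples[start:end] for start, end in ranges]
```

For example, boundaries of 0.0, 1.5, and 4.0 seconds at a sample rate of 22050 Hz describe two verses, the first of which covers samples 0 up to 33075.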

Work Flow

  • STEP 1. Adjust hyperparameters in hyperparams.py if necessary.
  • STEP 2. Download the data and extract it.
  • STEP 3. Run train.py.
  • STEP 4. Run eval.py to get samples.
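
Steps 3 and 4 above can be chained in a small driver so that evaluation only runs if training finishes cleanly. This is a sketch, not a script shipped with the repository: it assumes that `train.py` and `eval.py` take no command-line arguments, that it is run from the repository root, and that the data from step 2 is already extracted. The function names are made up for illustration.

```python
import subprocess
import sys


def workflow_commands(python=sys.executable):
    """Return the commands for STEP 3 and STEP 4, in order."""
    return [[python, "train.py"], [python, "eval.py"]]


def run_workflow(dry_run=True):
    """Print the commands, or run them when dry_run is False."""
    for command in workflow_commands():
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError if a step fails,
            # so eval.py is never run after a failed training step.
            subprocess.run(command, check=True)
```

With the default `dry_run=True` the driver only prints what it would run, which is a cheap way to confirm the interpreter path before starting a long training job.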

Languages

  • Python 100.0%