Add LJU dataset, pretrained model and more #670
- Remove Multi-Band PWGAN
+ Add Multi-Band MelGAN-HF
+ Add LJSpeech Ultimate processor
+ Add postnet extraction scripts for Tac2 and FS2.
+ Add configs.
@ZDisket Is it finished?
@dathudeptrai Almost done, except for two things: (1) adding instructions to the Tacotron2 readme on using MFA with mfa_extraction to generate durations, which are then turned into masks, and (2) a script to phonemize filelists, since the LJSpeechUltimate processor takes in a file called … I was thinking I could either submit the PR without the filelist phonemizing tool and link to it from the Tumblr blog post instead, or include it here, but I'm not sure which.
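The "durations turned into masks" step mentioned above can be sketched roughly as follows. This is a minimal illustration, not the exporter script from this PR: the function name `durations_to_mask` is hypothetical, durations are assumed to be integer mel-frame counts per input token (as a forced aligner such as MFA would yield after conversion to frames), and the row/column orientation (tokens by frames) is an assumption.

```python
def durations_to_mask(durations):
    """Build a binary alignment mask from per-token durations (in mel frames).

    Returns one row per input token; mask[i][t] is 1 when frame t falls
    inside token i's aligned span and 0 otherwise.
    """
    if any(d < 0 for d in durations):
        raise ValueError("durations must be non-negative")

    total_frames = sum(durations)
    mask = []
    start = 0  # first frame belonging to the current token
    for d in durations:
        row = [0] * total_frames
        for t in range(start, start + d):
            row[t] = 1
        mask.append(row)
        start += d
    return mask


if __name__ == "__main__":
    # Three tokens lasting 2, 3 and 1 frames respectively.
    for row in durations_to_mask([2, 3, 1]):
        print(row)
    # [1, 1, 0, 0, 0, 0]
    # [0, 0, 1, 1, 1, 0]
    # [0, 0, 0, 0, 0, 1]
```

Each frame is claimed by exactly one token, so every column of the mask sums to 1; a token with zero duration gets an all-zero row.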
@ZDisket can you use …
…s strings in processing
@dathudeptrai It's done. Now the processor automatically phonemizes strings when processing, and inference is as easy as this: … I've tested everything I could think of, so it should be ready to merge, although you are free to look it over if you've got the time. Four eyes are better than two.
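As a rough illustration of what automatically phonemizing a filelist line into ARPAbet involves, here is a toy sketch. It is not the LJSpeechUltimate processor's code: `TOY_LEXICON`, `arpabetize` and `process_filelist_line` are hypothetical names, the two-word lexicon stands in for a full pronunciation dictionary or G2P model, and both the `id|text` line layout and the curly-brace wrapping of phoneme groups are assumptions.

```python
import re

# Hypothetical mini-lexicon for illustration only; a real setup would rely on
# a full pronunciation dictionary or a grapheme-to-phoneme model.
TOY_LEXICON = {
    "hello": "HH AH0 L OW1",
    "world": "W ER1 L D",
}


def arpabetize(text, lexicon=TOY_LEXICON):
    """Replace known words with brace-wrapped ARPAbet; keep everything else."""
    # Split into words (letters and apostrophes) and single punctuation marks.
    tokens = re.findall(r"[A-Za-z']+|[.,!?;:]", text)
    out = []
    for tok in tokens:
        phones = lexicon.get(tok.lower())
        if phones is not None:
            out.append("{" + phones + "}")
        else:
            out.append(tok)  # unknown words and punctuation pass through
    return " ".join(out)


def process_filelist_line(line, lexicon=TOY_LEXICON):
    """Split an assumed 'id|text' filelist line and phonemize the text part."""
    utt_id, text = line.rstrip("\n").split("|", 1)
    return utt_id, arpabetize(text, lexicon)


if __name__ == "__main__":
    print(process_filelist_line("utt0001|Hello, world!"))
```

Doing the conversion inside the processor means the filelist can stay in plain text, so no separate phonemizing pass over the dataset is needed before preprocessing.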
LGTM :D
commit 1368771 Merge: ab6efe4 07b49e9 Author: dathudeptrai <43868663+dathudeptrai@users.noreply.github.com> Date: Thu Mar 10 15:34:25 2022 +0700 Merge pull request TensorSpeech#748 from NeonBohdan/gt-repo Error german_transliterate only when using german_cleaners
commit 07b49e9 Author: NeonBohdan <bohdan@neon.ai> Date: Mon Mar 7 13:49:21 2022 +0200 Fix german_transliterate module error
commit ab6efe4 Merge: 05b059e d0e7d72 Author: dathudeptrai <43868663+dathudeptrai@users.noreply.github.com> Date: Thu Feb 10 11:28:28 2022 +0700 Merge pull request TensorSpeech#742 from hertz-pj/japenese Support Japenese TTS, and fix some bug.
commit d0e7d72 Merge: eb6db12 05b059e Author: dathudeptrai <43868663+dathudeptrai@users.noreply.github.com> Date: Tue Feb 8 19:05:27 2022 +0700 Merge branch 'master' into japenese
commit eb6db12 Author: hertz-pj <peiji.yang@foxmail.com> Date: Tue Feb 8 16:33:44 2022 +0800 fix a japenese fastspeech bug
commit 3f921a5 Author: dathudeptrai <nguyenquananhminh@gmail.com> Date: Mon Jan 24 20:34:29 2022 +0700 😭 Upgrade to TF 2.7.0
commit c8f6b38 Author: hertz-pj <peiji.yang@foxmail.com> Date: Fri Jan 28 09:17:53 2022 +0800 fix bug of jsut dataset, add pyopenjtalk to setup.py
commit 691c76a Author: hertz-pj <peiji.yang@foxmail.com> Date: Tue Jan 25 15:56:15 2022 +0800 resolve the conflicts
commit c6ce93c Author: hertz-pj <peiji.yang@foxmail.com> Date: Mon Jan 24 16:16:00 2022 +0800 Support Japenese TTS
commit 05b059e Merge: 070f9cd 9260b7f Author: dathudeptrai <43868663+dathudeptrai@users.noreply.github.com> Date: Sun Feb 6 12:21:26 2022 +0700 Merge pull request TensorSpeech#738 from hertz-pj/japenese Add support for Japenese TTS with JSUT dataset
commit 9260b7f Merge: 0d05c18 070f9cd Author: hertz <peiji.yang@foxmail.com> Date: Sat Jan 29 23:32:36 2022 +0800 Merge branch 'TensorSpeech:master' into japenese
commit 0d05c18 Author: hertz-pj <peijiyang@foxmail.com> Date: Fri Jan 28 09:17:53 2022 +0800 fix bug of jsut dataset, add pyopenjtalk to setup.py
commit e771444 Author: hertz-pj <peijiyang@foxmail.com> Date: Tue Jan 25 15:56:15 2022 +0800 resolve the conflicts
commit 070f9cd Author: dathudeptrai <nguyenquananhminh@gmail.com> Date: Mon Jan 24 20:34:29 2022 +0700 😭 Upgrade to TF 2.7.0
commit 4b3bc31 Author: hertz-pj <peijiyang@foxmail.com> Date: Mon Jan 24 16:16:00 2022 +0800 Support Japenese TTS
commit 34358d8 Merge: 8786f59 cd3a5e1 Author: dathudeptrai <nguyenquananhminh@gmail.com> Date: Tue Oct 19 20:24:30 2021 +0700 Merge branch 'master' of https://github.com/TensorSpeech/TensorFlowTTS
commit 8786f59 Author: dathudeptrai <nguyenquananhminh@gmail.com> Date: Tue Oct 19 20:20:33 2021 +0700 👜 Update README
commit cd3a5e1 Merge: b77dffe 59e27bd Author: dathudeptrai <43868663+dathudeptrai@users.noreply.github.com> Date: Tue Sep 21 08:59:31 2021 +0700 Merge pull request TensorSpeech#670 from ZDisket/lju Add LJU dataset, pretrained model and more
commit 59e27bd Author: ZDisket <30500847+ZDisket@users.noreply.github.com> Date: Sun Sep 19 10:11:51 2021 -0300 📈Reformat with black
commit 2883b6e Author: ZDisket <30500847+ZDisket@users.noreply.github.com> Date: Sun Sep 19 10:05:47 2021 -0300 🔌LJU processor now takes in filelist.txt and automatically ARPAbetizes strings in processing
commit 4ce7de9 Author: ZDisket <30500847+ZDisket@users.noreply.github.com> Date: Sat Sep 18 01:52:51 2021 -0300 📑 Document FAL on Tacotron2
commit 221f1cd Author: ZDisket <30500847+ZDisket@users.noreply.github.com> Date: Thu Sep 16 00:23:11 2021 -0300 🌱 Add duration to mask exporter, modify Tacotron2 and dataloader to accept
commit 5b15bb9 Author: ZDisket <30500847+ZDisket@users.noreply.github.com> Date: Wed Sep 15 13:03:29 2021 -0300 🚩 Adjust configs and readmes
commit 2493011 Merge: 7c1a0d9 b77dffe Author: ZDisket <30500847+ZDisket@users.noreply.github.com> Date: Wed Sep 15 12:13:53 2021 -0300 Merge remote-tracking branch 'upstream/master' into lju
commit 7c1a0d9 Merge: a4b3d64 2959501 Author: ZDisket <30500847+ZDisket@users.noreply.github.com> Date: Tue Aug 17 21:49:15 2021 -0300 Merge remote-tracking branch 'upstream/master' into lju
commit a4b3d64 Author: ZDisket <30500847+ZDisket@users.noreply.github.com> Date: Tue Aug 10 06:32:45 2021 -0300 😏 Add LJU to AutoProcessor
commit 9f288f8 Author: ZDisket <30500847+ZDisket@users.noreply.github.com> Date: Mon Aug 9 01:32:04 2021 -0300 🌙 Add and remove many things along with LJU processor
    - Remove Multi-Band PWGAN
    + Add Multi-Band MelGAN-HF
    + Add LJSpeech Ultimate processor
    + Add postnet extraction scripts for Tac2 and FS2.
    + Add configs.
This PR includes:
1. LJSpeech Ultimate dataloader and pretrained Tacotron2 and vocoder (audio samples)
2. Forced Alignment Guided Attention Loss (FAL) from paper (WIP)
3. Multi-Band MelGAN-HF (Multi-Band MelGAN generator + HiFi-GAN discriminator), with both train-from-scratch and proven finetuning configurations. Also removes multiband_pwgan.
The pretrained model was trained for 100k steps with regular training and then for 20k steps with the technique described in item 2. As far as I am aware, this would make TensorFlowTTS the first and only open-source TTS repo with a high sampling rate (44.1 kHz) pretrained model available.
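To give an intuition for how a forced-alignment mask can guide attention (item 2), here is a minimal sketch of a masked attention penalty. It is an assumption-laden illustration, not the loss from the referenced paper or this PR's implementation: the name `masked_attention_loss` is hypothetical, and the formulation (mean attention weight falling outside the mask) is a generic guided-attention-style penalty chosen for simplicity.

```python
def masked_attention_loss(attention, mask):
    """Mean attention weight that falls outside the forced-alignment mask.

    `attention` and `mask` are equally shaped nested lists; mask entries are
    1 where attention is allowed and 0 where it should be penalized.
    """
    if len(attention) != len(mask):
        raise ValueError("attention and mask must have the same number of rows")

    total = 0.0
    count = 0
    for a_row, m_row in zip(attention, mask):
        if len(a_row) != len(m_row):
            raise ValueError("attention and mask rows must have equal length")
        for a, m in zip(a_row, m_row):
            total += a * (1 - m)  # only weight outside the mask contributes
            count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    mask = [[1, 0], [0, 1]]
    print(masked_attention_loss([[1.0, 0.0], [0.0, 1.0]], mask))  # 0.0
    print(masked_attention_loss([[0.5, 0.5], [0.5, 0.5]], mask))  # 0.25
```

Attention that already matches the alignment incurs no penalty, while diffuse attention is pushed toward the aligned region, which is the general idea behind finetuning with an alignment-guided term after regular training.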