Conversation

@ZDisket
Collaborator

@ZDisket ZDisket commented Sep 15, 2021

This PR includes:
1. LJSpeech Ultimate dataloader and pretrained Tacotron2 and vocoder (audio samples)
2. Forced Alignment Guided Attention Loss (FAL) from paper (WIP)
3. Multi-Band MelGAN-HF (MB-MelGAN generator + HiFi-GAN discriminator), with both train-from-scratch and proven fine-tuning configurations. Also removes multiband_pwgan.

The pretrained model was trained for 100k steps with regular training and then for 20k steps with the technique described in item 2. As far as I am aware, this makes TensorFlowTTS the first and only open-source TTS repo with a high-sampling-rate (44.1 kHz) pretrained model available.

  • Modify Tacotron2 dataloader to allow option for using premade attention masks
  • Add alignment mask generation function from MFA durations
  • Finish up READMEs
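As a rough illustration of the "alignment mask from MFA durations" task above, here is a minimal sketch in plain Python. It assumes durations are already expressed as integer frame counts per input token, that the mask is laid out as [tokens][frames], and that the guided loss penalizes attention weight falling outside the forced alignment. The function names and the exact penalty are assumptions for illustration, not necessarily what this PR implements.

```python
from typing import List


def durations_to_mask(durations: List[int]) -> List[List[float]]:
    """Build a hard alignment mask of shape [num_tokens][num_frames].

    mask[i][t] is 1.0 when frame t falls inside token i's duration span.
    (Hypothetical helper; the layout is an assumption.)
    """
    num_frames = sum(durations)
    mask = []
    start = 0
    for dur in durations:
        row = [0.0] * num_frames
        for t in range(start, start + dur):
            row[t] = 1.0
        mask.append(row)
        start += dur
    return mask


def off_alignment_penalty(
    attention: List[List[float]], mask: List[List[float]]
) -> float:
    """Mean attention weight placed outside the forced alignment.

    Assumed form of the guided loss: average of attention * (1 - mask).
    """
    total = 0.0
    count = 0
    for att_row, mask_row in zip(attention, mask):
        for a, m in zip(att_row, mask_row):
            total += a * (1.0 - m)
            count += 1
    return total / count if count else 0.0


# Three tokens lasting 2, 1 and 3 frames respectively.
durations = [2, 1, 3]
mask = durations_to_mask(durations)

# An attention map that matches the alignment exactly incurs no penalty.
perfect_penalty = off_alignment_penalty(mask, mask)
```

A hard 0/1 mask is the simplest choice; a soft mask that decays with distance from the aligned span would fit the same interface.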

@ZDisket ZDisket self-assigned this Sep 15, 2021
@ZDisket ZDisket added the enhancement 🚀 New feature or request label Sep 16, 2021
@dathudeptrai
Collaborator

@ZDisket Is it finished?

@ZDisket
Collaborator Author

ZDisket commented Sep 17, 2021

@dathudeptrai Almost done, except for two things: (1) instructions in the Tacotron2 README on using MFA with mfa_extraction to generate durations, which are then turned into masks; and (2) a script to phonemize filelists, since the LJSpeechUltimate processor takes a file called filelist_p.txt in which each line has the format wavs/filename.wav|ARPA'd transcript. For example:

wavs/bztwyzpugz.wav|{ AH1 DH ER0 Z B IH0 L AO1 NG D T UW0 DH IY0 AE0 B EY1 Z AH1 V T UH1 R }
wavs/exlxkuhttq.wav|{ DH AH0 T D IH1 S AH0 P L AH0 N } , { IH0 N T EH1 N T AO1 N S AH1 T AH0 L T IY0 Z AH1 V L AA1 JH IH0 K AE1 N D M IY1 T AH0 } - { F IH1 Z IH0 K } , { W AO1 Z IH0 N D IH1 F ER0 AH0 N T T UW0 L IH1 T ER0 EH2 R IY0 F AO1 R M } , { AE1 N D S UW1 N B IH0 K EY1 M EH0 N K AH1 M B ER0 D W IH1 TH DH IY0 T EH1 K N IH0 K AH0 L }
wavs/aqzqsgksho.wav|{ T UW0 L IH1 B ER0 AH0 L AY2 Z M AH0 N AE1 S T IH0 K S T AH1 D IY0 Z }

I was thinking I could either do the PR without the filelist phonemizing tool and instead link it in the Tumblr blog post, or include it here, but I'm not sure.
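For readers who want to inspect a filelist_p.txt by hand, the line format shown above can be parsed with a few lines of Python. This is only a sketch based on the three example lines: it assumes the path and transcript are separated by the first `|`, and that phoneme runs are wrapped in `{ }` with punctuation left outside. The helper names are hypothetical and not part of the processor.

```python
import re
from typing import List, Tuple


def parse_filelist_line(line: str) -> Tuple[str, str]:
    """Split one filelist line into (wav_path, transcript) at the first '|'."""
    wav_path, transcript = line.rstrip("\n").split("|", 1)
    return wav_path, transcript.strip()


def phoneme_groups(transcript: str) -> List[List[str]]:
    """Return the ARPAbet symbols inside each { ... } span, in order."""
    return [group.split() for group in re.findall(r"\{([^}]*)\}", transcript)]


line = "wavs/aqzqsgksho.wav|{ T UW0 L IH1 B ER0 AH0 L AY2 Z M AH0 N AE1 S T IH0 K S T AH1 D IY0 Z }"
wav_path, transcript = parse_filelist_line(line)
groups = phoneme_groups(transcript)
```

Splitting on only the first `|` keeps the parser safe if a transcript ever contains that character.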

@dathudeptrai
Collaborator

dathudeptrai commented Sep 19, 2021

@ZDisket Can you run python -m black FILENAME on all your changed files?

@ZDisket
Collaborator Author

ZDisket commented Sep 19, 2021

@dathudeptrai It's done. Now the processor will automatically phonemize strings when processing, and inference is as easy as this:

arpa_txt = processor.to_arpa(text)
ids = processor.text_to_sequence(arpa_txt)

I've tested everything I could think of, so it should be ready to merge, although you are free to look at it if you've got the time. Four eyes are better than two.
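To make the two-step flow above concrete without depending on the library, here is a self-contained toy stand-in. ToyProcessor, its tiny lexicon, and its symbol ids are all hypothetical; it only mimics the shape of the calls (text to ARPA string, ARPA string to id sequence) and the `{ ... }` transcript style from the filelist examples. The real processor's lexicon, symbol table, and handling of unknown words will differ.

```python
import re
from typing import Dict, List

# Hypothetical mini-lexicon; a real processor would rely on a full
# pronunciation dictionary or a G2P model.
TOY_LEXICON: Dict[str, str] = {
    "hello": "HH AH0 L OW1",
    "world": "W ER1 L D",
}

PUNCTUATION = {",", "."}


class ToyProcessor:
    """Toy stand-in that mirrors the two calls used for inference."""

    def __init__(self, lexicon: Dict[str, str]) -> None:
        self.lexicon = lexicon
        phones = sorted({p for pron in lexicon.values() for p in pron.split()})
        # Id 0 is left unused here (commonly reserved for padding).
        self.symbol_to_id = {s: i + 1 for i, s in enumerate(phones + [",", "."])}

    def to_arpa(self, text: str) -> str:
        """Wrap runs of words in { ... } and keep punctuation outside."""
        tokens = re.findall(r"[A-Za-z']+|[,.]", text)
        out: List[str] = []
        run: List[str] = []
        for tok in tokens:
            if tok in PUNCTUATION:
                if run:
                    out.append("{ " + " ".join(run) + " }")
                    run = []
                out.append(tok)
            else:
                # Raises KeyError for words missing from the toy lexicon.
                run.append(self.lexicon[tok.lower()])
        if run:
            out.append("{ " + " ".join(run) + " }")
        return " ".join(out)

    def text_to_sequence(self, arpa_txt: str) -> List[int]:
        """Map each symbol to its id, skipping the brace markers."""
        return [
            self.symbol_to_id[s] for s in arpa_txt.split() if s not in ("{", "}")
        ]


processor = ToyProcessor(TOY_LEXICON)
arpa_txt = processor.to_arpa("Hello, world.")
ids = processor.text_to_sequence(arpa_txt)
```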

@dathudeptrai
Collaborator

LGTM :D

@dathudeptrai dathudeptrai merged commit cd3a5e1 into TensorSpeech:master Sep 21, 2021
coolseaweed added a commit to coolseaweed/TensorFlowTTS that referenced this pull request Aug 5, 2022
commit 1368771
Merge: ab6efe4 07b49e9
Author: dathudeptrai <43868663+dathudeptrai@users.noreply.github.com>
Date:   Thu Mar 10 15:34:25 2022 +0700

    Merge pull request TensorSpeech#748 from NeonBohdan/gt-repo

    Error german_transliterate only when using german_cleaners

commit 07b49e9
Author: NeonBohdan <bohdan@neon.ai>
Date:   Mon Mar 7 13:49:21 2022 +0200

    Fix german_transliterate module error

commit ab6efe4
Merge: 05b059e d0e7d72
Author: dathudeptrai <43868663+dathudeptrai@users.noreply.github.com>
Date:   Thu Feb 10 11:28:28 2022 +0700

    Merge pull request TensorSpeech#742 from hertz-pj/japenese

    Support Japenese TTS, and fix some bug.

commit d0e7d72
Merge: eb6db12 05b059e
Author: dathudeptrai <43868663+dathudeptrai@users.noreply.github.com>
Date:   Tue Feb 8 19:05:27 2022 +0700

    Merge branch 'master' into japenese

commit eb6db12
Author: hertz-pj <peiji.yang@foxmail.com>
Date:   Tue Feb 8 16:33:44 2022 +0800

    fix a japenese fastspeech bug

commit 3f921a5
Author: dathudeptrai <nguyenquananhminh@gmail.com>
Date:   Mon Jan 24 20:34:29 2022 +0700

    😭 Upgrade to TF 2.7.0

commit c8f6b38
Author: hertz-pj <peiji.yang@foxmail.com>
Date:   Fri Jan 28 09:17:53 2022 +0800

    fix bug of jsut dataset, add pyopenjtalk to setup.py

commit 691c76a
Author: hertz-pj <peiji.yang@foxmail.com>
Date:   Tue Jan 25 15:56:15 2022 +0800

    resolve the conflicts

commit c6ce93c
Author: hertz-pj <peiji.yang@foxmail.com>
Date:   Mon Jan 24 16:16:00 2022 +0800

    Support Japenese TTS

commit 05b059e
Merge: 070f9cd 9260b7f
Author: dathudeptrai <43868663+dathudeptrai@users.noreply.github.com>
Date:   Sun Feb 6 12:21:26 2022 +0700

    Merge pull request TensorSpeech#738 from hertz-pj/japenese

    Add support for Japenese TTS with JSUT dataset

commit 9260b7f
Merge: 0d05c18 070f9cd
Author: hertz <peiji.yang@foxmail.com>
Date:   Sat Jan 29 23:32:36 2022 +0800

    Merge branch 'TensorSpeech:master' into japenese

commit 0d05c18
Author: hertz-pj <peijiyang@foxmail.com>
Date:   Fri Jan 28 09:17:53 2022 +0800

    fix bug of jsut dataset, add pyopenjtalk to setup.py

commit e771444
Author: hertz-pj <peijiyang@foxmail.com>
Date:   Tue Jan 25 15:56:15 2022 +0800

    resolve the conflicts

commit 070f9cd
Author: dathudeptrai <nguyenquananhminh@gmail.com>
Date:   Mon Jan 24 20:34:29 2022 +0700

    😭 Upgrade to TF 2.7.0

commit 4b3bc31
Author: hertz-pj <peijiyang@foxmail.com>
Date:   Mon Jan 24 16:16:00 2022 +0800

    Support Japenese TTS

commit 34358d8
Merge: 8786f59 cd3a5e1
Author: dathudeptrai <nguyenquananhminh@gmail.com>
Date:   Tue Oct 19 20:24:30 2021 +0700

    Merge branch 'master' of https://github.com/TensorSpeech/TensorFlowTTS

commit 8786f59
Author: dathudeptrai <nguyenquananhminh@gmail.com>
Date:   Tue Oct 19 20:20:33 2021 +0700

     👜 Update README

commit cd3a5e1
Merge: b77dffe 59e27bd
Author: dathudeptrai <43868663+dathudeptrai@users.noreply.github.com>
Date:   Tue Sep 21 08:59:31 2021 +0700

    Merge pull request TensorSpeech#670 from ZDisket/lju

    Add LJU dataset, pretrained model and more

commit 59e27bd
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date:   Sun Sep 19 10:11:51 2021 -0300

    📈Reformat with black

commit 2883b6e
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date:   Sun Sep 19 10:05:47 2021 -0300

    🔌LJU processor now takes in filelist.txt and automatically ARPAbetizes strings in processing

commit 4ce7de9
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date:   Sat Sep 18 01:52:51 2021 -0300

    📑 Document FAL on Tacotron2

commit 221f1cd
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date:   Thu Sep 16 00:23:11 2021 -0300

    🌱 Add duration to mask exporter, modify Tacotron2 and dataloader to accept

commit 5b15bb9
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date:   Wed Sep 15 13:03:29 2021 -0300

    🚩 Adjust configs and readmes

commit 2493011
Merge: 7c1a0d9 b77dffe
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date:   Wed Sep 15 12:13:53 2021 -0300

    Merge remote-tracking branch 'upstream/master' into lju

commit 7c1a0d9
Merge: a4b3d64 2959501
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date:   Tue Aug 17 21:49:15 2021 -0300

    Merge remote-tracking branch 'upstream/master' into lju

commit a4b3d64
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date:   Tue Aug 10 06:32:45 2021 -0300

    😏 Add LJU to AutoProcessor

commit 9f288f8
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date:   Mon Aug 9 01:32:04 2021 -0300

    🌙 Add and remove many things along with LJU processor

    - Remove Multi-Band PWGAN
    + Add Multi-Band MelGAN-HF
    + Add LJSpeech Ultimate processor
    + Add postnet extraction scripts for Tac2 and FS2.
    + Add configs.
