SeCATrity/Foice

Foice: Can I Hear Your Face? Pervasive Attack on Voice Authentication Systems with a Single Face Image

This repository provides a PyTorch implementation of Foice.

Foice is a generative text-to-speech model that produces multiple synthetic audio samples from a single image of a person’s face, without requiring any voice sample from that person.

Feel free to check out our demo video👉: https://drive.google.com/file/d/1Be1fgyDookg839UyV7DJbdBgx-YlD9ge/view?usp=sharing

Dependencies

  • face_alignment (install with: pip install face-alignment)
  • numpy
  • cv2 (provided by the opencv-python package)
  • torch
  • torchvision
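
As a quick sanity check before opening the notebook, the sketch below reports which of the listed dependencies cannot be found in the current environment. It is not part of the Foice repository; the helper name find_missing is hypothetical, and the module names are simply the import names from the list above.

```python
import importlib.util

# Import names taken from the dependency list above.
REQUIRED_MODULES = ["face_alignment", "numpy", "cv2", "torch", "torchvision"]


def find_missing(modules=REQUIRED_MODULES):
    """Return the import names that cannot be located in this environment."""
    # find_spec looks a top-level module up without importing it.
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

The check only confirms that each module can be located; it does not verify versions, which this README does not specify.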

Pre-trained models

  • Face-dependent Voice Feature Extractor: link
  • Face-independent Voice Feature Generator: link

Voice generation

Foice reuses the synthesizer and vocoder from SV2TTS. You can find the pre-trained synthesizer using the link.

Put all pre-trained models in the folder "../F2V_models/".
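
To confirm the models landed in the expected place, a minimal sketch such as the following lists whatever files are present in that folder. The helper name list_model_files is hypothetical, and no specific checkpoint file names are assumed, since this README does not state them.

```python
from pathlib import Path

# Folder named in this README; the path is relative to the working directory.
MODEL_DIR = Path("../F2V_models")


def list_model_files(model_dir=MODEL_DIR):
    """Return the sorted names of files in model_dir, or [] if the folder is absent."""
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        return []
    # Only regular files are reported; subfolders are ignored.
    return sorted(p.name for p in model_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    files = list_model_files()
    if files:
        print("Found model files:", ", ".join(files))
    else:
        print(f"No files found in {MODEL_DIR}")
```

Because the path is relative, run the check from the same directory as the notebook.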

Run End-to-End.ipynb to generate voice recordings from a face image.

Third-party related projects

TODO List

  • Add pre-trained model
  • Add training process
