10 Innovative Uses for Voice Cloning Technology

Published On Tue May 07 2024

Erik Bjorgan Makes Voice Cloning Easy with the Applio- and Piper-Based TextyMcSpeechy Project

Maker Erik Bjorgan has found a new use for his Raspberry Pi: voice cloning with a software workflow he calls TextyMcSpeechy. The project lets users clone their own voice, or any other voice, and use the result for text-to-speech (TTS) applications.

Bjorgan explains that TextyMcSpeechy was born out of his need for a simple way to clone voices for use with Piper's TTS functionality. Unable to find an easy-to-use tool, he decided to create one himself. The workflow pairs Piper, a neural text-to-speech system that runs on-device, with Applio, a voice conversion tool built on Retrieval-based Voice Conversion (RVC), to produce custom Piper voices.

The Technology Behind TextyMcSpeechy

TextyMcSpeechy combines Piper and Applio to create custom voice models. Applio converts the recordings in an existing voice dataset so that they sound like the target voice, and the converted dataset is then used to train Piper. This means that users can create TTS voices that sound like themselves or any other person, even if the original dataset does not contain recordings of the target voice.

Bjorgan emphasizes the importance of having a voice dataset with similar tone and accent to the target voice for optimal results. He also notes that some datasets may include audio from multiple speakers, which can be both a challenge and an opportunity for voice cloning.
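
One way to spot a multi-speaker dataset before training is to count utterances per speaker in its metadata file. The sketch below is not taken from TextyMcSpeechy. It assumes the LJSpeech-style, pipe-delimited layout described in Piper's training guide (`id|text` for one speaker, `id|speaker|text` for several), and the function name `count_speakers` is hypothetical.

```python
from collections import Counter


def count_speakers(metadata_lines, multi_speaker):
    """Count utterances per speaker in pipe-delimited metadata rows.

    Assumed row layout (an assumption, not confirmed by the article):
      single speaker:  id|text
      multi speaker:   id|speaker|text
    """
    counts = Counter()
    expected_fields = 3 if multi_speaker else 2
    for line in metadata_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Limit the split so that a "|" inside the transcript is preserved.
        fields = line.split("|", expected_fields - 1)
        if len(fields) != expected_fields:
            raise ValueError(f"unexpected metadata row: {line!r}")
        speaker = fields[1] if multi_speaker else "default"
        counts[speaker] += 1
    return counts
```

A dataset dominated by one speaker with similar tone and accent to the target is, per Bjorgan's advice, the better starting point; a count like this makes it easy to pick or filter speakers.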

Hardware Requirements and Applications

While the training process may require a powerful workstation with an NVIDIA GPU for acceleration, the speech generation itself can be done on more modest hardware like a Raspberry Pi single-board computer. Bjorgan shares his plans to use the software to build smart home assistants that speak in celebrity voices, using Home Assistant's OpenAI Conversation integration.
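
As a rough illustration of the generation step, the sketch below calls the Piper command-line tool from Python, passing text on standard input and writing a WAV file. It assumes a `piper` executable is on the PATH and uses Piper's `--model` and `--output_file` options; the helper names and the model file name in the usage note are hypothetical, not part of TextyMcSpeechy.

```python
import subprocess


def build_piper_command(model_path, output_path, executable="piper"):
    """Return the argument list for one Piper synthesis run."""
    return [executable, "--model", model_path, "--output_file", output_path]


def synthesize(text, model_path, output_path, executable="piper"):
    """Pipe `text` to Piper on stdin and write speech to `output_path`.

    Assumes Piper is installed and `model_path` points to a trained
    .onnx voice with its matching JSON config stored alongside it.
    """
    cmd = build_piper_command(model_path, output_path, executable)
    # check=True raises CalledProcessError if Piper exits with an error.
    subprocess.run(cmd, input=text.encode("utf-8"), check=True)
    return output_path
```

With a custom voice exported as, say, `my_voice.onnx`, calling `synthesize("The kettle is ready.", "my_voice.onnx", "kettle.wav")` on the Raspberry Pi would produce the audio file without needing the training workstation.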

For those interested in exploring TextyMcSpeechy, the project is available on Erik Bjorgan's GitHub repository under the permissive MIT license. More information can be found in the Reddit thread dedicated to the project.

With Erik Bjorgan's TextyMcSpeechy project, voice cloning for TTS applications becomes more accessible and versatile, opening up a world of possibilities for personalized voice synthesis.