Project Introduction
Fish Speech is an open-source text-to-speech (TTS) tool developed by the FishAudio team. Along with ChatTTS, it was one of the most popular open-source TTS projects of its period (June-July 2024). Its team members include several singing voice conversion (SVC) developers well known on GitHub, who were among the pioneers of AI voice cloning.
Main Features
• Zero-shot & Few-shot TTS: A voice sample of just 10-30 seconds is enough to generate high-quality speech, which makes the model well suited to voice cloning.
• Strong generalization without phoneme dependency: The Fish Speech model does not rely on phonemes and can handle text in any language, which broadens the range of TTS application scenarios.
• High accuracy: On a 5-minute English text, both the character error rate (CER) and the word error rate (WER) are about 2%.
• User-friendly interfaces:
  • WebUI: A Gradio-based web interface, compatible with mainstream browsers (Chrome, Firefox, Edge).
  • GUI Inference: A PyQt6 graphical interface that works together with the API server.
• Easy deployment: Can be deployed quickly either locally or in the cloud with minimal loss of speed, which is convenient for developers.
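To make the accuracy figures above concrete, here is a minimal sketch of how CER and WER are commonly computed: the edit distance between a reference text and a transcript, divided by the length of the reference. The source does not say how the roughly 2% figures were measured; a typical setup, assumed here, is to transcribe the generated audio with a speech recognizer and compare the transcript against the input text. The function names `edit_distance`, `wer`, and `cer` are illustrative and not part of Fish Speech.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitution, insertion, deletion)."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference prefix
    for i, r in enumerate(ref, 1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by the reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one wrong word out of four.
reference = "the quick brown fox"
transcript = "the quick brown box"
print(wer(reference, transcript))  # 0.25
print(cer(reference, transcript))  # 1 wrong character out of 19
```

Under this definition, a WER of about 2% means roughly two word-level errors per hundred reference words. Real evaluations usually normalize case and punctuation before comparing, which this sketch leaves out.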
Official Website: https://fish.audio
GitHub Repository: https://github.com/fishaudio/fish-speech
HF Demo: https://huggingface.co/spaces/fishaudio/fish-speech-1