Style-Bert-VITS2 is a text-to-speech system based on Bert-VITS2 that focuses on highly controllable voice styles and emotional expression. It extends the original Bert-VITS2 v2.1 and its Japanese-Extra variant so that emotion and speaking style can be controlled with fine-grained intensity, rather than selected from a few generic tones.

The project targets both beginners and power users: Windows users without Git or Python can install and run it through bundled .bat scripts, while advanced users can work with virtual environments, uv, and standard Python tooling. A full GUI editor lets you script dialogue, assign a different style to each line, edit dictionaries, and save/load projects; a separate web UI and Colab notebooks cover training and experimentation.

For synthesis-only use, the project is also published as a Python library (pip install style-bert-vits2) and runs on CPU without an NVIDIA GPU, although training still requires GPU hardware.
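As a rough sketch of library-based synthesis (the model, config, and style-vector paths below are placeholders for files you would download or train yourself, and the exact `TTSModel` interface should be checked against the library documentation for your installed version):

```python
# pip install style-bert-vits2
import soundfile as sf

from style_bert_vits2.constants import Languages
from style_bert_vits2.nlp import bert_models
from style_bert_vits2.tts_model import TTSModel

# Load the Japanese BERT model and tokenizer used for prosody features.
bert_models.load_model(Languages.JP, "ku-nlp/deberta-v2-large-japanese-char-wwm")
bert_models.load_tokenizer(Languages.JP, "ku-nlp/deberta-v2-large-japanese-char-wwm")

# Placeholder paths to a trained model's assets.
model = TTSModel(
    model_path="model_assets/your-model/your-model.safetensors",
    config_path="model_assets/your-model/config.json",
    style_vec_path="model_assets/your-model/style_vectors.npy",
    device="cpu",  # CPU-only inference; no NVIDIA GPU required
)

# Synthesize with a named style and an intensity weight.
sample_rate, audio = model.infer(
    text="こんにちは、今日もいい天気ですね。",
    style="Neutral",    # style name defined in the model's style vectors
    style_weight=5.0,   # higher values strengthen the style/emotion
)
sf.write("output.wav", audio, sample_rate)
```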
## Features
- Fine-grained control of emotion and speaking style via learned style vectors
- Beginner-friendly Windows installers plus advanced Python/uv setup options
- Graphical editor for multi-line scripts, per-line style settings, and dictionary editing
- Dataset tools for slicing long recordings, auto-transcribing, and preparing training corpora
- CPU-only inference support via the style-bert-vits2 Python package for synthesis-focused use cases
- Built-in FastAPI server to expose text-to-speech as an HTTP API for external integrations (see the client sketch after this list)
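A minimal client sketch for the bundled API server. The endpoint name, port, and query parameters below are assumptions based on the default server configuration; verify them against the auto-generated FastAPI docs (e.g. http://127.0.0.1:5000/docs) on your installation.

```python
import requests

# Assumes the server was started (e.g. via server_fastapi.py) and exposes
# a GET /voice endpoint returning audio/wav bytes; parameter names are
# assumptions to be checked against the server's /docs page.
resp = requests.get(
    "http://127.0.0.1:5000/voice",
    params={
        "text": "こんにちは",
        "model_id": 0,        # index of the loaded model
        "style": "Neutral",   # style name from the model's style vectors
        "style_weight": 5.0,  # style intensity
    },
)
resp.raise_for_status()

with open("output.wav", "wb") as f:
    f.write(resp.content)
```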