
Setting Up Coqui Text-to-Speech

Learn how to set up and configure Coqui as your local text-to-speech backend for the Arcania platform, with both manual and Docker installation methods.

Introduction to Coqui

Coqui.ai's hosted service has been discontinued, but the open-source Coqui TTS engine can still run locally as a text-to-speech backend for Arcania. Use either of the installation methods below.

Installation Methods

Method 1: Manual Setup

  1. Create a Coqui directory and change into it:

    mkdir ~/coqui && cd ~/coqui
  2. Install Miniconda (the installer below is for macOS on Apple Silicon; use the matching installer on other platforms):

    curl https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh -o miniconda3.sh
    chmod +x ./miniconda3.sh
    ./miniconda3.sh
  3. Set up Python environment:

    conda create --name coqui python=3.10
    conda activate coqui
  4. Clone the repository:

    git clone https://github.com/coqui-ai/TTS.git
  5. Install dependencies:

    brew install mecab espeak
    pip install numpy==1.21.6 flask_cors
    conda install scipy scikit-learn Cython
  6. Complete installation:

    cd TTS && make install
  7. Launch the server:

    python3 TTS/server/server.py --model_name tts_models/en/vctk/vits
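
Once the server is running, a quick synthesis request confirms that it works. The sketch below is a minimal client, not part of Coqui or Arcania. It assumes the server listens on port 5002 (the port mapped in the Docker method) and exposes an `/api/tts` endpoint that takes `text` and `speaker_id` query parameters; check `TTS/server/server.py` in your checkout if either differs. The helper names `build_tts_url` and `synthesize` are hypothetical.

```python
import urllib.request
from urllib.parse import urlencode

# Assumption: the server listens on port 5002, as in the Docker port mapping.
DEFAULT_BASE_URL = "http://localhost:5002"


def build_tts_url(text, speaker_id=None, base_url=DEFAULT_BASE_URL):
    """Build the GET URL for one synthesis request."""
    params = {"text": text}
    if speaker_id:
        # Multi-speaker models such as tts_models/en/vctk/vits expect a speaker.
        params["speaker_id"] = speaker_id
    # Assumption: the endpoint path is /api/tts.
    return f"{base_url}/api/tts?{urlencode(params)}"


def synthesize(text, out_path, speaker_id=None, base_url=DEFAULT_BASE_URL):
    """Request speech for `text` and write the returned audio bytes to `out_path`."""
    url = build_tts_url(text, speaker_id, base_url)
    with urllib.request.urlopen(url, timeout=120) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

For example, `synthesize("Hello from Coqui", "hello.wav", speaker_id="p225")` should write an audio file if the server is up; `p225` is used here as an assumed VCTK speaker label, so substitute one that your model actually lists.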

Method 2: Docker Setup

  1. Pull the Docker image:

    docker pull --platform linux/amd64 ghcr.io/coqui-ai/tts
  2. Launch container:

    docker run --rm -it -p 5002:5002 --entrypoint /bin/bash ghcr.io/coqui-ai/tts
  3. Inside the container, install flask_cors and start the server:

    pip install flask_cors
    python3 TTS/server/server.py --model_name tts_models/en/vctk/vits
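
The first launch can take a while because the model has to be downloaded and loaded before the server answers. The sketch below polls the server over HTTP until it responds, which is more reliable than a bare port check when Docker is forwarding the port. It assumes the server's root URL on port 5002 returns status 200 once it is ready; the function name `wait_for_server` is hypothetical.

```python
import http.client
import time
import urllib.request


def wait_for_server(url="http://localhost:5002/", timeout=120.0,
                    interval=2.0, request_timeout=5.0):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass."""
    # The server is local, so bypass any system-wide HTTP proxy settings.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener.open(url, timeout=request_timeout) as response:
                if response.status == 200:
                    return True
        except (OSError, http.client.HTTPException):
            # Connection refused, reset, or an HTTP error: not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_for_server()` before pointing Arcania at the backend avoids confusing a still-loading model with a broken setup.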

Configuration

After installation, complete the following configuration steps:

CORS Configuration

Add these lines to TTS/server/server.py, after the Flask app object is created, then restart the server so the change takes effect:

    from flask_cors import CORS

    CORS(app)
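
If you would rather not edit the file by hand (for example, inside a fresh Docker container on every run), the change can be scripted. This is a minimal sketch under one loud assumption: that `server.py` creates the app with a top-level line reading exactly `app = Flask(__name__)`. The helper `add_cors` is hypothetical and raises an error instead of guessing if that line is not found.

```python
CORS_IMPORT = "from flask_cors import CORS"
CORS_CALL = "CORS(app)"
# Assumption: server.py creates the Flask app with exactly this top-level line.
APP_LINE = "app = Flask(__name__)"


def add_cors(source):
    """Return `source` with the CORS import and call inserted after the app line."""
    if CORS_CALL in source:
        return source  # already patched, nothing to do
    lines = source.splitlines()
    for i, line in enumerate(lines):
        # Match only an unindented app line so the inserted lines stay top-level.
        if line.rstrip() == APP_LINE:
            lines[i + 1:i + 1] = [CORS_IMPORT, CORS_CALL]
            return "\n".join(lines) + "\n"
    raise ValueError(f"could not find {APP_LINE!r}; edit the file by hand")
```

To apply it, read `TTS/server/server.py` as text, pass the contents to `add_cors`, and write the result back. Running it twice is harmless because an already patched file is returned unchanged.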

Enable in Arcania

  1. Navigate to Settings
  2. Select Text-to-Speech
  3. Choose TTS Backend
  4. Select Coqui
  5. Set the Voice ID once the backend is working

Technical Resources

Additional Information

Coqui provides a local text-to-speech backend with multiple voice models and customization options. Follow the installation steps in order to get a working server.