Install Prerequisites

Handling audio content in Botium Box is based on Botium Speech Processing, a unified, developer-friendly API to the best available free and Open-Source Speech-To-Text and Text-To-Speech services as well as cloud services.

Launch Botium Speech Processing Service

Botium Speech Processing comes with a reasonable default configuration.


The default Speech-To-Text and Text-To-Speech engines are both free and Open Source. They are a good match for getting started with voice technologies and are among the best free voice tools available.

The service can be launched with a few command-line calls:

$ git clone
$ cd botium-speech-processing
$ docker-compose up -d


Depending on network speed and hardware this step can take a while.

Using a Cloud Speech Services Provider

Botium supports all major cloud services. In the Botium Speech Processing service, a default provider is configured.

For the major cloud providers there are additional docker-compose files. With those, the installation is slimmer, because only the frontend service is required. For instance, add your Azure subscription key and Azure region key to the file docker-compose-azure.yml and start the services:

$ docker-compose -f docker-compose-azure.yml up -d

Test Service

Pointing your browser to http://localhost will show the API explorer for Botium Speech Processing.
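
Because the first start can take a while, the same check can be scripted so that it waits until the service answers. The following is a minimal sketch, assuming curl is installed and the service listens on http://localhost as described above. The function name wait_for_service and its retry parameters are illustrative and not part of Botium.

```shell
#!/bin/sh
# Hypothetical helper (not part of Botium): poll a URL until it answers
# or the number of attempts is used up.
# Usage: wait_for_service URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_service() {
  url=$1
  attempts=${2:-30}
  delay=${3:-10}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -f: treat HTTP errors as failure, -o: discard the body
    if curl -sf -o /dev/null "$url"; then
      echo "service is up: $url"
      return 0
    fi
    echo "attempt $i/$attempts: $url not reachable yet"
    i=$((i + 1))
    sleep "$delay"
  done
  echo "giving up on $url" >&2
  return 1
}

# Example call, assuming the default setup from this page:
# wait_for_service http://localhost 30 10
```

The function returns 0 as soon as the URL responds and 1 after the last failed attempt, so it can be chained in a script before starting anything that depends on the service.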

Connect Botium Speech Processing Service to Botium Box

To enable audio capabilities in Botium Box, add two environment variables to the Botium Box Server (see here for details):

  • SPEECH_PROCESSING_ENDPOINT - the URL to your Botium Speech Processing installation

  • SPEECH_PROCESSING_APIKEY - the API key, required only if you configured API key protection for Botium Speech Processing
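
How these variables reach the Botium Box Server depends on the deployment. As a minimal sketch, assuming the server process is started from a shell on the same machine and Botium Speech Processing runs on http://localhost as above, they could be exported like this. The API key value is a placeholder, not a real key.

```shell
# Assumption: Botium Speech Processing is reachable on http://localhost
export SPEECH_PROCESSING_ENDPOINT="http://localhost"

# Only needed if API key protection is configured; placeholder value
export SPEECH_PROCESSING_APIKEY="replace-with-your-api-key"

# Print the endpoint variable to confirm it is set before starting Botium Box
env | grep '^SPEECH_PROCESSING_ENDPOINT='
```

Variables exported this way are only visible to processes started from the same shell session, so the Botium Box Server has to be started afterwards from that session.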