
Reference: Test Set Insights

The Test Set Insights in Botium Box provide a rough estimate of the quality of your training data. The analysis is static: the data is not evaluated with an external chatbot engine like IBM Watson or Dialogflow, but only with an integrated, generic language model. Nevertheless, it gives useful first insights into what might be wrong with your data.
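To make the idea of a static analysis more concrete, the following is a minimal sketch of one kind of check that can run without any chatbot engine: flagging utterances that belong to different intents but look very similar. It is not Botium's implementation. Plain bag-of-words vectors stand in for the generic language model, and the function names, the sample data and the threshold are all hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def vectorize(utterance):
    """Bag-of-words counts; a simple stand-in for a language-model embedding."""
    return Counter(re.findall(r"[a-z0-9']+", utterance.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def confusable_pairs(training_data, threshold=0.7):
    """Return utterance pairs from different intents with similarity >= threshold."""
    items = [
        (intent, utterance, vectorize(utterance))
        for intent, utterances in training_data.items()
        for utterance in utterances
    ]
    hits = []
    for (i1, u1, v1), (i2, u2, v2) in combinations(items, 2):
        if i1 != i2:
            score = cosine(v1, v2)
            if score >= threshold:
                hits.append((round(score, 2), i1, u1, i2, u2))
    return sorted(hits, reverse=True)


# Hypothetical training data: intent name -> user examples
training_data = {
    "order_status": ["where is my order", "track my order"],
    "cancel_order": ["cancel my order", "where can i cancel my order"],
}

for hit in confusable_pairs(training_data, threshold=0.6):
    print(hit)
```

Pairs reported by such a check are candidates for rewording or for moving to another intent. Whether they actually confuse a particular chatbot engine can only be confirmed by evaluating against that engine.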

Test Set Insights vs Botium Coach

Test Set Insights provides:

  • static analysis against an in-memory generic language model

  • quick evaluation

  • metrics on the data itself, the keywords and the similarity

Botium Coach provides:

  • evaluation against your chatbot engine of choice

  • possibly long-running execution sessions

  • metrics on how the data behaves in real life

Test Data vs Training Data

Generally speaking, you should use all the training data available to get the most out of the Test Set Insights. Running this kind of evaluation on test data only will not yield good results.

Enable Botium Coach Worker

To use the Test Set Insights, you have to enable the optional Coach Worker component. In docker-compose.yml or docker-compose-standalone.yml:

  • Uncomment the environment variable COACH_WORKER_ENDPOINT

  • Uncomment the coach container section to enable the container


If the worker is protected with an API key, also add the environment variable COACH_WORKER_APIKEY.
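As a quick sanity check after editing the compose file, a small helper along the following lines can report whether the two variables appear on uncommented lines. This is only a sketch: the function name is hypothetical, the sample text is an invented fragment rather than the real Botium Box compose file, and the values in angle brackets are placeholders.

```python
import re

VARIABLES = ("COACH_WORKER_ENDPOINT", "COACH_WORKER_APIKEY")


def coach_worker_settings(compose_text):
    """Report, per variable, whether it appears on an uncommented line."""
    status = {name: False for name in VARIABLES}
    for line in compose_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # the whole line is still commented out
        active = stripped.split(" #")[0]  # ignore a trailing comment
        for name in VARIABLES:
            if re.search(rf"\b{name}\b", active):
                status[name] = True
    return status


# Invented fragment for illustration; service name and values are placeholders
sample = """\
services:
  box:
    environment:
      # COACH_WORKER_ENDPOINT: <worker-url>
      COACH_WORKER_APIKEY: <api-key>
"""

print(coach_worker_settings(sample))
```

To check a real file, pass in the contents of docker-compose.yml or docker-compose-standalone.yml. The check looks at the text only and does not verify that the coach container is running.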

Using Test Set Insights

The test set insights are available in the Insights section of the test set. You can see descriptions of the available metrics and the history there.

The Test Set Insights are recalculated automatically every time the test statistics are updated on the Test Set Dashboard. You can also trigger a recalculation by clicking the Update Insights with Latest Test Set Data button in the Insights section.

If you are not interested in the insights, you can skip the implicit calculation with the Skip Auto-Calculation for Test Set Insights switch in the Test Set Settings.