TextToSpeech
TextToSpeech (TTS) is a text-to-speech TANGO device server. It is based on Amazon Web Services (AWS) and outputs its audio via PulseAudio. The device server currently includes simple file caching to reduce interaction with AWS.
The main text-to-speech and audio playback code has been separated out into a static library that is linked to both the device server and the unit tests. This separation allows the TTS functionality to be tested per class and the TTS code to be developed in isolation from the device server. It also reduces the dependencies of the unit tests.
Version
The current release version is 1.2.0
Building
Ensure the dependencies are installed before building.
Dependencies
TextToSpeech has a number of dependencies, both in its toolchain and in its shared library linkage. They are grouped as follows:
Libraries
- Amazon AWS SDK for C++. This is available from GitHub.
- The AWS SDK has several dependencies of its own, including libcurl, libopenssl and zlib. To compile the AWS SDK, the development versions of these dependencies must be installed. Compiled libraries are provided under /libs. These were compiled and linked on Debian 9; if using them, copy them to an appropriate location within the library search path.
- PulseAudio development library libpulse-dev and headers.
- Installing from a Debian package is suggested, since this will resolve the dependencies correctly.
- Tango Controls 9 or higher.
- omniORB release 4 - libomniorb4 and libomnithread.
- libzmq - libzmq3-dev or libzmq5-dev.
Toolchain Dependencies
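- C++14 compliant compiler and standard library. The device server uses some modern C++ features, including threading, lambdas and futures (from C++11) and the filesystem API (standardised in C++17; C++14-era toolchains such as GCC on Debian 9 provide it as std::experimental::filesystem).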
- CMake 3.1 or greater is required to perform the build.
- Debian 9 (Stretch) as a compilation environment.
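As a quick check of the CMake requirement above, the following sketch compares the installed CMake version against 3.1. The `version_ge` helper is a hypothetical name introduced here, and the comparison relies on `sort -V` from GNU coreutils, which is assumed to be available (as it is on Debian).

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 >= version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="3.1"
if command -v cmake >/dev/null 2>&1; then
    # The first line of `cmake --version` reads "cmake version X.Y.Z".
    found="$(cmake --version | head -n 1 | awk '{print $3}')"
    if version_ge "$found" "$required"; then
        echo "cmake $found satisfies the >= $required requirement"
    else
        echo "cmake $found is older than $required"
    fi
else
    echo "cmake not found"
fi
```

The same helper can be reused for other version checks, for example against the compiler version.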
Build Flags
Custom build flags for the TextToSpeech device server:
Flag | Default | Use |
---|---|---|
TEXT_TO_SPEECH_BUILD_TESTS | ON | Build unit tests |
TEXT_TO_SPEECH_BUILD_DEBUG_SYMBOLS | OFF | Build the device server with debug symbols |
TEXT_TO_SPEECH_LOG_TO_TANGO | ON | Send tts_library logging to the Tango logging system |
TTS_ENABLE_DEBUG | ON | Build the subsystem tts_library with debug messages; these are piped to the Tango output |
TTS_ENABLE_TRACING | OFF | Build the subsystem tts_library with full code tracing, when ON implies TTS_ENABLE_DEBUG=ON |
TTS_UNIT_TEST_ENABLE_DEBUG | OFF | Build the unit tests with debug messages |
TTS_UNIT_TEST_ENABLE_TRACING | OFF | Build the unit tests with full code tracing, when ON implies TTS_UNIT_TEST_ENABLE_DEBUG=ON |
The following is a list of common useful CMake flags and their use:
Flag | Use |
---|---|
CMAKE_PREFIX_PATH | Used to pass a prefix path for pkg-config to search for the tango.pc file |
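The flags in the tables above are passed to CMake with `-D` at configure time. As a rough illustration, the sketch below assembles such a command line and prints it instead of running it. The `cmake_define_args` helper is hypothetical, and the flag values chosen are only an example, not a recommended configuration.

```shell
#!/bin/sh
# Hypothetical helper: turn NAME=VALUE pairs into CMake -D arguments.
cmake_define_args() {
    args=""
    for pair in "$@"; do
        args="${args:+$args }-D$pair"
    done
    printf '%s\n' "$args"
}

# Example selection: skip the unit tests, enable full tracing in tts_library.
configure_cmd="cmake $(cmake_define_args \
    TEXT_TO_SPEECH_BUILD_TESTS=OFF \
    TTS_ENABLE_TRACING=ON) .."

# Printed for review only; run it from the build directory once it looks right.
echo "$configure_cmd"
```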
Build
Example Build Sequence
The build must be out of source, so first create a build directory:
mkdir build
cd build
This device server uses the tango.pc package config file to configure the build system. When Tango is installed on Linux (by package or source install), the tango.pc file is also installed. This will be picked up by default when CMake attempts to configure the build system.
It is possible to redirect CMake's search to find a particular tango.pc file. This can be done simply by setting an environment variable:
export PKG_CONFIG_PATH=/segfs/tango/release/debian9/lib/pkgconfig
cmake ..
Or as a temporary environment variable:
PKG_CONFIG_PATH=/segfs/tango/release/debian9/lib/pkgconfig cmake ..
Or by passing the search location prefix in via CMAKE_PREFIX_PATH:
cmake -DCMAKE_PREFIX_PATH=/segfs/tango/release/debian9 ..
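Before configuring, it can help to confirm which tango.pc pkg-config actually resolves. The sketch below uses the standard `pkg-config --exists` and `--modversion` options; the `check_tango_pc` function name is made up for this example, and the check only reflects PKG_CONFIG_PATH, not a prefix passed through CMAKE_PREFIX_PATH.

```shell
#!/bin/sh
# Report whether pkg-config can resolve the tango package with the
# current PKG_CONFIG_PATH. Returns 0 when tango.pc is found.
check_tango_pc() {
    if ! command -v pkg-config >/dev/null 2>&1; then
        echo "pkg-config is not installed"
        return 1
    fi
    if pkg-config --exists tango; then
        echo "tango $(pkg-config --modversion tango) found"
        return 0
    fi
    echo "tango.pc not found; check PKG_CONFIG_PATH"
    return 1
}

check_tango_pc || echo "fix the search path before running cmake"
```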
Now build with make:
make
Assets
A second target exists to copy and convert the audio assets into raw PCM files (currently the only format supported by the TextToSpeech device server). This must be run before testing the device server from the build directory, since it provides a number of jingles for the device server. Once converted, these audio assets should be deployed with the device server. To run the copy and conversion:
make convert-audio
Installation
Along with the binary, the TextToSpeech device server requires a small script to configure the AWS keys and some pre-converted audio files. This script is located at scripts/setup.sh and must be run before the device server is started.
The audio files are converted via the build system and placed under the build directory. See Assets. These must be in place for the device server to load jingles from.
Deployment
A suggested deployment strategy is as follows:
- Make a directory called TextToSpeechDir in the server's bin directory.
- Copy the TextToSpeech binary into TextToSpeechDir.
- Copy script scripts/setup.sh into TextToSpeechDir.
- Copy the build/jingles (see Assets) to TextToSpeechDir.
- Copy the script scripts/TextToSpeech to TextToSpeechDir's parent directory (the server's bin directory).
- Set up with Astor as normal, except use the script as the executable.
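The copy steps above could be scripted roughly as follows. This is a sketch only: `deploy_tts` is a hypothetical function, and every path is supplied by the caller, since the location of the built binary and of the server's bin directory depends on the local setup. The Astor step is not covered.

```shell
#!/bin/sh
# Sketch of the suggested deployment layout.
# Arguments (all supplied by the caller, nothing is guessed):
#   $1 = path to the built TextToSpeech binary
#   $2 = source tree root (contains scripts/)
#   $3 = build directory (contains jingles/ after `make convert-audio`)
#   $4 = the server's bin directory
deploy_tts() {
    binary="$1"; src="$2"; build="$3"; bin_dir="$4"
    target="$bin_dir/TextToSpeechDir"

    mkdir -p "$target" || return 1
    cp "$binary" "$target/" || return 1
    cp "$src/scripts/setup.sh" "$target/" || return 1
    cp -r "$build/jingles" "$target/" || return 1
    # The launcher script sits next to TextToSpeechDir, not inside it.
    cp "$src/scripts/TextToSpeech" "$bin_dir/" || return 1
}
```

It would be called with four paths, in the order listed in the comment, once `make` and `make convert-audio` have completed.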
Running Tests
The tts_library is covered by a number of unit tests to verify its functionality. They are based on the Catch2 unit test framework and are built by default (since TEXT_TO_SPEECH_BUILD_TESTS is set to ON). The tests require both a working internet connection (for AWS Polly) and sound hardware (for PulseAudio). To run the tests from the build directory:
./tests/unit-tests
To run only a subset of the test suite (for example, because you do not have sound hardware), first look at the available tests and tags. The tests can be listed:
./bin/unit-tests --list-tests
Or:
./bin/unit-tests --list-tags
To see more options for the unit-test command line binary:
./bin/unit-tests --help
Note: Some unit tests require valid audio, so the convert-audio target must be built before running the tests.
Note: Polly unit tests require the setup.sh script be sourced first.
License
The code is released under the GPL3 license and a copy of this license is provided with the code.