The Mod9 ASR WebSocket Interface is a higher-level interface than the protocol described in the TCP reference documentation, enabling client-side code in web browsers to access the full functionality of the ASR Engine.
Because it is intended for communication over the public Internet, a WebSocket connection should usually be protected with an encryption layer (i.e. using wss:// instead of ws:// in production).
The provided WebSocket server can be deployed in a Docker container that (optionally registers and) loads an SSL certificate to encrypt communication.
Try websocket-demo.html to use the WebSocket interface directly from this browser.
Inspect the source code for that HTML page, noting in particular these lines of embedded JS:
function createWebSocket({ // Line 164
const websocket = new WebSocket(uri); // Line 173
websocket.onopen = async () => { // Line 199
websocket.send(optionsJSON); // Line 233
websocket.onmessage = async event => { // Line 246
const replyJSON = await event.data.text(); // Line 247
function closeWebSocket(websocket) { // Line 344
websocket.send(emptyMessage); // Line 351
function startStreamingAudio( // Line 365
audioSenderNode.port.onmessage = event => { // Line 397
webSocket.send(event.data); // Line 398
The Mod9 ASR WebSocket interface enables the straightforward code above to communicate with the ASR Engine.
The protocol for communicating with an ASR Engine indirectly via the WebSocket interface is very similar to communicating with the ASR Engine directly via its custom application-level protocol over TCP.
- The client establishes a WebSocket connection with the server, enabling duplex communication in browsers.
- In its first WebSocket message, the client sends a JSON-formatted object indicating request options. Unlike the protocol over TCP, this does not need to be formatted as a single line terminated by a newline.
- Next, two processes may happen concurrently:
  - The server will send one or more WebSocket messages to the client, each a single-line JSON-formatted object. These replies will be formatted exactly as in the protocol over TCP, except without any newlines.
  - Depending on the specified request options, the client may send audio data to the server. The data bytes should be sent in non-empty WebSocket messages.
- The client should terminate its audio data by sending an empty WebSocket message. The request may also be terminated as in the protocol over TCP, e.g. timing out or sending an end-of-file byte sequence.
- The WebSocket server will reply with a final message and close the WebSocket connection.
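The client side of the sequence above can be sketched in Python. This is a minimal illustration rather than the SDK's implementation: the helper names `client_messages` and `parse_reply` and the 4096-byte chunk size are hypothetical, and it assumes the terminating empty message is sent as an empty binary message. The sketch only builds and parses message payloads; actually sending them requires a WebSocket client library.

```python
import json


def client_messages(options, audio=None, chunk_size=4096):
    """Yield the payloads a client sends, in protocol order."""
    # 1. Request options go in the first message; no trailing newline is needed.
    yield json.dumps(options)
    if audio is None:
        # Requests without audio data have nothing further to send.
        return
    # 2. Audio data follows in non-empty messages.
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]
    # 3. An empty message marks the end of the audio data
    #    (assumed here to be an empty binary message).
    yield b""


def parse_reply(message):
    """Decode one server reply, which holds a single JSON object."""
    if isinstance(message, bytes):
        message = message.decode("utf-8")
    return json.loads(message)
```

A caller would pass each yielded payload to its WebSocket library's send function and feed every received message through `parse_reply` until the server closes the connection.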
The WebSocket server can be run via Docker (recommended) or as a standalone Python application.
Similar to the REST API, use the http-engine entrypoint command to run the WebSocket server locally:
# Runs WebSocket server at ws://localhost:8080 (and also REST API at http://localhost:8080/rest/api)
docker run -it --rm -p 8080:80 mod9/asr http-engine
Or use https-engine for a remote server with interactive SSL certificate registration:
# Runs WebSocket server at wss://example.com (and https://example.com/rest/api, with encrypted transport).
docker run -it --rm -p 80:80 -p 443:443 mod9/asr https-engine
The WebSocket server is distributed within the Mod9 ASR Python SDK, which can be installed from PyPI:
pip3 install mod9-asr
This will install mod9-asr-websocket-server in pip's local scripts directory; it might need to be added to your PATH:
export PATH=~/.local/bin:${PATH}
which mod9-asr-websocket-server
A Mod9 ASR Engine server is expected to be run locally (i.e. at localhost), listening for TCP connections on port 9900.
These defaults may be reconfigured with the MOD9_ASR_ENGINE_HOST and MOD9_ASR_ENGINE_PORT environment variables.
Follow the Python SDK's instructions to connect to the Mod9 ASR Engine and set the environment variables accordingly.
Alternatively, the WebSocket server can be passed command-line arguments to override the environment variables:
mod9-asr-websocket-server --engine-host=$HOST --engine-port=$PORT
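The precedence described above (command-line arguments over environment variables over built-in defaults) can be sketched as follows. This is a hypothetical illustration of the lookup order, not the server's actual source: the function name `resolve_engine_address` is an assumption, while the flag and variable names are the ones documented here.

```python
import argparse
import os


def resolve_engine_address(argv=None, environ=None):
    """Return (host, port) for the ASR Engine using the documented precedence."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    # Defaults come from the environment, falling back to localhost:9900.
    parser.add_argument(
        "--engine-host",
        default=environ.get("MOD9_ASR_ENGINE_HOST", "localhost"),
    )
    parser.add_argument(
        "--engine-port",
        type=int,  # also converts a string default taken from the environment
        default=environ.get("MOD9_ASR_ENGINE_PORT", 9900),
    )
    # Any flags present in argv override the defaults set above.
    args = parser.parse_args(argv)
    return args.engine_host, args.engine_port
```

Passing explicit `argv` and `environ` values keeps the function easy to exercise without touching the real process environment.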
The WebSocket server will listen at host address 127.0.0.1 and port 9980 by default; these may be set by command-line arguments.
For example, to allow external access on a standard HTTP port (which may require root permissions):
mod9-asr-websocket-server --host=0.0.0.0 --port=80
The Mod9 ASR Python SDK also installs a command-line WebSocket client that can facilitate development:
pip3 install mod9-asr
The arguments to the tool are the WebSocket server URI and JSON-encoded request options. For example:
mod9-asr-websocket-client wss://mod9.io '{"command": "ping"}'
Audio data may be relayed from stdin:
curl -sL mod9.io/hi.wav | mod9-asr-websocket-client wss://mod9.io
In this case, the default request options '{"command": "recognize"}' were implied.
To stream live audio from your microphone (using sox) to a remote WebSocket server:
sox -dqV1 -traw -r16000 -c1 -b16 - | mod9-asr-websocket-client wss://mod9.io '{"format":"raw","rate":16000}'
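As a rough guide to how much data that sox pipeline produces, the byte rate follows from the flags `-r16000 -c1 -b16`. The helper name `bytes_for_duration` is hypothetical, and the arithmetic assumes uncompressed PCM samples.

```python
def bytes_for_duration(milliseconds, rate=16000, channels=1, bits=16):
    """Number of raw PCM bytes covering the given duration of audio."""
    bytes_per_sample = bits // 8
    return rate * channels * bytes_per_sample * milliseconds // 1000


# One second of 16 kHz, mono, 16-bit audio is 32000 bytes.
print(bytes_for_duration(1000))  # 32000
# A 100 ms chunk of the same audio is 3200 bytes.
print(bytes_for_duration(100))   # 3200
```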
Note that the examples above are similar to using nc, a command-line TCP client:
nc mod9.io 9900 <<< '{"command": "ping"}'
curl -sL mod9.io/hi.wav | nc mod9.io 9900
(echo '{"format":"raw","rate":16000}'; sox -dqV1 -traw -r16000 -c1 -b16 - ) | nc mod9.io 9900
The critical distinction is that this communication to the server's port 9900 is over unencrypted TCP transport, whereas the communication to wss://mod9.io was using the WebSocket protocol with encryption (on port 443).
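To make the scheme and port relationship concrete, the sketch below derives whether a WebSocket URI is encrypted and which port it implies. The function name `describe_endpoint` is hypothetical; the default ports (80 for ws, 443 for wss) are those defined by the WebSocket protocol, RFC 6455.

```python
from urllib.parse import urlsplit

# Default ports defined by the WebSocket protocol (RFC 6455).
DEFAULT_PORTS = {"ws": 80, "wss": 443}


def describe_endpoint(uri):
    """Return (encrypted, host, port) for a ws:// or wss:// URI."""
    parts = urlsplit(uri)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"not a WebSocket URI: {uri!r}")
    # Use the explicit port if one is given, else the scheme's default.
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.scheme == "wss", parts.hostname, port


print(describe_endpoint("wss://mod9.io"))        # (True, 'mod9.io', 443)
print(describe_endpoint("ws://localhost:8080"))  # (False, 'localhost', 8080)
```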
©2019-2022 Mod9 Technologies (Engine 1.9.5 : Python SDK 1.11.6)