Hannes Heikinheimo
Sep 19, 2023
1 min read
The main new feature in Speechly Browser Client v2.0 is the ability to flexibly choose an audio source for the client via the Media Capture and Streams API. This is a significant evolution of the client, and it is also a breaking change.
Previously, the only way to provide the client with audio was to use the device’s default microphone. If several microphones were available, it was hard to control which one was used: when the Speechly Browser Client was initialized, it silently chose the first audio capture device it found.
Moreover, if you had audio available, for example as a live MediaStream or in an audio file, and thus no need for a microphone, you had to resort to elaborate workarounds.
The Speechly Browser Client v2.0 fixes these issues.
Using the default microphone is still straightforward. It now lives in a separate BrowserMicrophone class that you can initialize and attach to the client, and everything works as before. A nice bonus of separating the default microphone implementation from the client is that you ask for the microphone permission only when the microphone is initialized, instead of when the client is created!
Furthermore, you can now attach any existing MediaStream object from which the audio will be read. This makes it easy to integrate Speechly into, for example, WebRTC applications that expose incoming and outgoing audio as MediaStreams.
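As a rough sketch of the WebRTC case, the helper below attaches the first stream announced by a peer connection's `track` event. It assumes only that the client exposes `attach(stream)`, as in this post's migration code. The name `attachFirstRemoteStream` and the three interfaces are hypothetical stand-ins, declared locally so the snippet type-checks without the Speechly package or DOM typings; in a browser the stream type would be `MediaStream`.

```typescript
// Hypothetical structural stand-ins; in a browser, S would be MediaStream.
interface AttachableClient<S> {
  attach(stream: S): Promise<void>;
}

interface TrackEventLike<S> {
  readonly streams: readonly S[];
}

interface TrackSource<S> {
  addEventListener(
    type: 'track',
    listener: (event: TrackEventLike<S>) => void,
  ): void;
}

// Attaches the first remote stream announced by the peer to the client.
// Resolves with that stream once attach() has completed.
function attachFirstRemoteStream<S>(
  client: AttachableClient<S>,
  peer: TrackSource<S>,
): Promise<S> {
  return new Promise<S>((resolve, reject) => {
    let attached = false;
    peer.addEventListener('track', (event) => {
      const stream = event.streams[0];
      // Ignore later tracks and tracks that arrive without a stream.
      if (attached || stream === undefined) return;
      attached = true;
      client.attach(stream).then(() => resolve(stream), reject);
    });
  });
}
```

In an application you would pass your existing peer connection and the Speechly client, then call `start()` on the client once the returned promise resolves.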
Finally, to make things easier when dealing with audio files, we've added an uploadAudioData function, which decodes the given audio data and uploads it to the API. This currently works with popular file types such as WAV, MP3, M4A and others.
Check out Speechly Browser Client v2.0 on NPM.
// Using Yarn
yarn add @speechly/browser-client
// Using NPM
npm install --save @speechly/browser-client
Speechly Browser Client v2.0 extracts the microphone into a separate class, and as a result the initialization looks a bit different. Also note that startContext and stopContext have been renamed to start and stop.
// Before
import { Client, Segment } from '@speechly/browser-client';

const client = new Client({ appId: 'your-app-id' });
await client.initialize();

client.onSegmentChange((segment: Segment) => {
  console.log(
    'Received new segment from the API:',
    segment.intent,
    segment.entities,
    segment.words,
    segment.isFinal,
  );
});

await client.startContext();
setTimeout(async function () {
  await client.stopContext();
}, 3000);
// After
import {
  BrowserClient,
  BrowserMicrophone,
  Segment,
} from '@speechly/browser-client';

const client = new BrowserClient({ appId: 'your-app-id' });
const microphone = new BrowserMicrophone();
await microphone.initialize(); // must be called from a user triggered event!
await client.attach(microphone.mediaStream);

client.onSegmentChange((segment: Segment) => {
  console.log(
    'Received new segment from the API:',
    segment.intent,
    segment.entities,
    segment.words,
    segment.isFinal,
  );
});

await client.start();
setTimeout(async function () {
  await client.stop();
}, 3000);
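Because the microphone has to be initialized from a user-triggered event, one option is to defer all microphone setup to a click handler. The sketch below is one possible way to do that, not the library's prescribed pattern. `createStartHandler` and both interfaces are hypothetical stand-ins that mirror only the calls used in the migration code above (`initialize`, `mediaStream`, `attach`, `start`); treating `mediaStream` as possibly undefined before initialization is an assumption.

```typescript
// Hypothetical structural stand-ins for BrowserMicrophone and BrowserClient.
interface MicrophoneLike<S> {
  initialize(): Promise<void>;
  mediaStream?: S; // assumed to be unset until initialize() succeeds
}

interface ListeningClient<S> {
  attach(stream: S): Promise<void>;
  start(): Promise<unknown>;
}

// Returns a handler intended for a user gesture such as a button click.
// Microphone setup runs on the first call only; later calls just start listening.
function createStartHandler<S>(
  client: ListeningClient<S>,
  microphone: MicrophoneLike<S>,
): () => Promise<void> {
  let ready: Promise<void> | undefined;

  const setUp = async (): Promise<void> => {
    await microphone.initialize(); // this is where the permission prompt appears
    const stream = microphone.mediaStream;
    if (stream === undefined) {
      throw new Error('Microphone produced no media stream');
    }
    await client.attach(stream);
  };

  return async () => {
    if (ready === undefined) {
      // Forget a failed attempt so that the next gesture can retry.
      ready = setUp().catch((error) => {
        ready = undefined;
        throw error;
      });
    }
    await ready;
    await client.start();
  };
}
```

In a page, the returned function could be registered as the click listener of a "start listening" button, so that no permission prompt appears until the user asks for it.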
You can now use the new uploadAudioData function to send the contents of an audio file as an ArrayBuffer directly to the client without using the microphone.
const client = new BrowserClient({ appId: 'your-app-id' });
const response = await fetch('url-to-audio-file');
const buffer = await response.arrayBuffer();
await client.uploadAudioData(buffer);
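The same call can be fed from a file the user picks rather than from a URL. The following is a minimal sketch under stated assumptions: `uploadAudioFile` and both interfaces are hypothetical, the client is assumed to accept an `ArrayBuffer` as in the snippet above, and the return type of `uploadAudioData` is left generic because this post does not state it. A browser `File` or `Blob` satisfies `BinarySource` through its `arrayBuffer()` method.

```typescript
// Hypothetical structural stand-ins; a browser File or Blob satisfies BinarySource.
interface BinarySource {
  arrayBuffer(): Promise<ArrayBuffer>;
}

interface UploadingClient<R> {
  uploadAudioData(data: ArrayBuffer): Promise<R>;
}

// Reads the whole file into memory and hands the raw bytes to the client,
// which is responsible for decoding them.
async function uploadAudioFile<R>(
  client: UploadingClient<R>,
  file: BinarySource,
): Promise<R> {
  const data = await file.arrayBuffer();
  if (data.byteLength === 0) {
    // Fail early instead of sending an empty payload to the API.
    throw new Error('Audio file is empty');
  }
  return client.uploadAudioData(data);
}
```

With a file input, this might be called with the first entry of the input's `files` list once the user has made a selection.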
For more details, check out our GitHub repository. Happy developing!
Speechly is a YC-backed company building tools for speech recognition and natural language understanding. Speechly offers flexible deployment options (cloud, on-premise, and on-device), highly accurate custom models for any domain, and privacy and scalability for hundreds of thousands of hours of audio.