Way to get audio stream data during web call? Vapi AI #support

AKG

04/02/2025, 3:53 PM

Hey folks! Is there a way to get the incoming audio stream from Vapi's SDK so that I can combine it with my own screen recording stream? I am using React.js.

Vapi Support Bot

04/02/2025, 3:54 PM

Hey there! 👋 I don't have specific information on how to directly combine Vapi's audio stream with a screen recording stream, but I can definitely help you with accessing the incoming audio stream from Vapi's SDK during a web call!

According to the [Vapi documentation](https://docs.vapi.ai/sdks), the Vapi Web SDK automatically configures audio streaming to and from the client, making it perfect for web applications. There are a couple of ways you can access the audio stream:

## Using the Call Listen Feature

You can use Vapi's "Call Listen" feature to get real-time audio data via WebSocket connections:

1. First, obtain the `listenUrl` by triggering the `/call` endpoint
2. Use this URL to connect to a WebSocket and stream the audio data in real-time
3. Process the audio data as needed

Here's a JavaScript example of how to handle the incoming audio stream:

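```javascript
const WebSocket = require('ws');
const fs = require('fs');

// Accumulates the raw PCM audio received over the socket
let pcmBuffer = Buffer.alloc(0);

// Example URL; replace it with the listenUrl returned for your own call
const ws = new WebSocket("wss://aws-us-west-2-production1-phone-call-websocket.vapi.ai/7420f27a-30fd-4f49-a995-5549ae7cc00d/transport");

ws.on('open', () => console.log('WebSocket connection established'));

ws.on('message', (data, isBinary) => {
  if (isBinary) {
    // Binary frames carry audio; append them to the buffer
    pcmBuffer = Buffer.concat([pcmBuffer, data]);
    console.log(`Received PCM data, buffer size: ${pcmBuffer.length}`);
    // This is where you would process the audio data
    // or combine it with your screen recording stream
  } else {
    // Non-binary frames are parsed as JSON and logged
    console.log('Received message:', JSON.parse(data.toString()));
  }
});

// Assumption: save the captured audio when the socket closes (the reason fs is imported)
ws.on('close', () => {
  if (pcmBuffer.length > 0) fs.writeFileSync('audio.pcm', pcmBuffer);
});
```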
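
The hard-coded WebSocket URL in the example would normally be read from the response of the `/call` request described in the steps above. The following is only a sketch of that step: `extractListenUrl` and `createCallAndGetListenUrl` are hypothetical helpers, and the base URL `https://api.vapi.ai`, the bearer-token header, and the `monitor.listenUrl` response field are assumptions that should be checked against the Vapi API reference.

```javascript
// Hypothetical helper: pull the listen URL out of a call-creation response.
// Assumption: the response exposes it as `monitor.listenUrl`.
function extractListenUrl(callResponse) {
  const url = callResponse?.monitor?.listenUrl;
  if (typeof url !== "string" || !url.startsWith("wss://")) {
    throw new Error("No listenUrl found in call response");
  }
  return url;
}

// Hypothetical helper: create a call and return its listen URL.
// Assumptions: base URL, bearer auth, and a JSON body; `payload` is whatever
// your call configuration requires. `fetchImpl` is injectable for testing.
async function createCallAndGetListenUrl(apiKey, payload, fetchImpl = fetch) {
  const res = await fetchImpl("https://api.vapi.ai/call", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  });
  if (!res.ok) {
    throw new Error(`Call creation failed with status ${res.status}`);
  }
  return extractListenUrl(await res.json());
}

// Usage (not executed here):
// const listenUrl = await createCallAndGetListenUrl(apiKey, payload);
// const ws = new WebSocket(listenUrl);
```

Keeping the extraction in its own function means a missing or malformed URL fails loudly before a WebSocket connection is attempted.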

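## Audio Stream Format

According to the [Vapi documentation](https://docs.vapi.ai/documentation/advanced-concepts/customization/custom-transcriber), the audio stream is in raw stereo PCM format (16-bit) with the following specifications:

- Encoding: linear16
- Container: raw
- Sample Rate: 16000 Hz
- Channels: 2

For browser-based applications, you might want to use the `AudioContext` API to process the audio. Because the stream is raw PCM rather than an encoded file, `decodeAudioData` cannot decode it; the sketch below fills an `AudioBuffer` by hand instead, assuming the two channels are interleaved:

```javascript
const ws = new WebSocket(listenUrl);
const audioContext = new AudioContext();
ws.binaryType = "arraybuffer";

const SAMPLE_RATE = 16000; // from the format listed above
const CHANNELS = 2;
let nextStart = 0; // lets chunks play back to back instead of overlapping

ws.onmessage = (event) => {
  const data = event.data;
  if (!(data instanceof ArrayBuffer)) return;
  try {
    // Assumption: 16-bit little-endian samples, channels interleaved (L, R, L, R, ...)
    const samples = new Int16Array(data, 0, Math.floor(data.byteLength / 2));
    const frames = Math.floor(samples.length / CHANNELS);
    if (frames === 0) return;
    const audioBuffer = audioContext.createBuffer(CHANNELS, frames, SAMPLE_RATE);
    for (let ch = 0; ch < CHANNELS; ch++) {
      const out = audioBuffer.getChannelData(ch);
      for (let i = 0; i < frames; i++) out[i] = samples[i * CHANNELS + ch] / 32768;
    }
    const source = audioContext.createBufferSource();
    source.buffer = audioBuffer;
    source.connect(audioContext.destination);
    nextStart = Math.max(nextStart, audioContext.currentTime);
    source.start(nextStart);
    nextStart += audioBuffer.duration;
  } catch (error) {
    console.error("Error playing audio:", error);
  }
};
```

Is there anything specific about accessing the audio stream that you'd like me to elaborate on?

Sources:
- [Vapi documentation](https://docs.vapi.ai/sdks)
- [Vapi documentation](https://docs.vapi.ai/calls/call-features)
- [Vapi documentation](https://docs.vapi.ai/documentation/advanced-concepts/customization/custom-transcriber)
- [Discord discussion](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1335246255705690265)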
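
Given the format listed above (linear16, 16000 Hz, 2 channels), one way to make a captured `pcmBuffer` usable by ordinary players or muxing tools is to prepend a WAV header. This is a minimal Node.js sketch rather than anything taken from Vapi's SDK: `pcmToWav` is a hypothetical helper, and it assumes the samples are little-endian, which is what the WAV PCM format expects.

```javascript
// Hypothetical helper: wrap raw linear16 PCM in a minimal 44-byte WAV (RIFF) header.
// Defaults follow the stream format listed above; little-endian samples are assumed.
function pcmToWav(pcm, sampleRate = 16000, channels = 2, bitsPerSample = 16) {
  const blockAlign = channels * (bitsPerSample / 8); // bytes per frame
  const byteRate = sampleRate * blockAlign;          // bytes per second

  const header = Buffer.alloc(44);
  header.write("RIFF", 0, "ascii");
  header.writeUInt32LE(36 + pcm.length, 4);  // file size minus the first 8 bytes
  header.write("WAVE", 8, "ascii");
  header.write("fmt ", 12, "ascii");
  header.writeUInt32LE(16, 16);              // size of the fmt chunk
  header.writeUInt16LE(1, 20);               // audio format 1 = uncompressed PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write("data", 36, "ascii");
  header.writeUInt32LE(pcm.length, 40);      // size of the sample data

  return Buffer.concat([header, pcm]);
}

// Usage (not executed here):
// require("fs").writeFileSync("call.wav", pcmToWav(pcmBuffer));
```

At these settings one second of audio is 16000 frames × 2 channels × 2 bytes = 64000 bytes, which is the byte rate the header records.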

Sahil

04/06/2025, 9:19 AM

Checking in: is this resolved for you?
