
Building a Custom Voice Widget

This guide walks through building a full voice widget from scratch — a floating button on your website that lets visitors speak with a Truedy AI agent in real time, with live captions.

What you’ll build

  • A “Talk to AI” button on your website
  • A server-side endpoint that fetches a joinUrl from Truedy (keeps your API key safe)
  • Browser JavaScript that connects audio using the Ultravox SDK
  • A live caption display during the call

Prerequisites

  • A Truedy API key (create one in Settings → API Keys)
  • A Truedy agent (create one in the dashboard)
  • A backend server (examples use Express on Node.js 18 or later, which provides the global fetch used in Step 1; any server works)

Step 1 — Create a server-side endpoint

Your backend holds the API key and proxies the joinUrl request. Never expose your Truedy API key in frontend code.
// server.ts (Express)
import express from 'express'
const app = express()
app.use(express.json())

app.post('/api/start-voice-call', async (req, res) => {
  const { agentId, variables } = req.body

  try {
    // Uses the global fetch built into Node.js 18+
    const response = await fetch('https://api.truedy.ai/api/public/v1/webrtc/call', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.TRUEDY_API_KEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        agent_id: agentId,
        variables,           // optional: personalise the agent prompt
      }),
    })

    if (!response.ok) {
      return res.status(response.status).json({ error: 'Failed to start call' })
    }

    const data = await response.json()
    // Only return joinUrl + agentName; never forward your API key
    res.json({ joinUrl: data.joinUrl, agentName: data.agentName })
  } catch (err) {
    // Network failure or unparseable upstream response: answer instead of hanging
    console.error(err)
    res.status(502).json({ error: 'Failed to start call' })
  }
})

app.listen(3000) // example port; adjust for your deployment
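Because this endpoint forwards whatever the browser sends, it may be worth validating the body before calling Truedy. The sketch below assumes agent IDs are UUIDs (as the YOUR_AGENT_UUID placeholder in Step 3 suggests) and that variables is a flat object of string values. validateStartCallBody is a hypothetical helper name, not part of Express or any Truedy SDK.

```javascript
// Hypothetical helper: checks the body of POST /api/start-voice-call.
// Assumes agent IDs are UUIDs and variables are a flat object of strings.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i

function validateStartCallBody(body) {
  if (typeof body !== 'object' || body === null) {
    return { ok: false, error: 'Body must be a JSON object' }
  }
  const { agentId, variables = {} } = body
  if (typeof agentId !== 'string' || !UUID_RE.test(agentId)) {
    return { ok: false, error: 'agentId must be a UUID' }
  }
  if (typeof variables !== 'object' || variables === null || Array.isArray(variables)) {
    return { ok: false, error: 'variables must be an object' }
  }
  for (const [key, value] of Object.entries(variables)) {
    if (typeof value !== 'string') {
      return { ok: false, error: `variables.${key} must be a string` }
    }
  }
  return { ok: true, agentId, variables }
}
```

In the route handler, this could run first, responding with status 400 and the returned error when ok is false, so that malformed requests never reach the Truedy API.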

Step 2 — Install the Ultravox SDK

npm install ultravox-client

Step 3 — Frontend widget

<!-- index.html — add before </body> -->
<button id="voice-btn">🎙 Talk to AI</button>
<div id="captions" style="display:none"></div>
<script type="module" src="/widget.js"></script>
// widget.js
import { UltravoxSession } from 'ultravox-client'

const AGENT_ID = 'YOUR_AGENT_UUID'

let session = null
const btn = document.getElementById('voice-btn')
const captions = document.getElementById('captions')

btn.addEventListener('click', async () => {
  if (session) {
    // End active call
    await session.leaveCall()
    session = null
    btn.textContent = '🎙 Talk to AI'
    captions.style.display = 'none'
    return
  }

  btn.textContent = 'Connecting...'
  btn.disabled = true

  try {
    // 1. Get joinUrl from your backend (never call Truedy directly from the browser)
    const res = await fetch('/api/start-voice-call', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        agentId: AGENT_ID,
        variables: {
          // Personalise per visitor if you have the data
          page_url: window.location.href,
        },
      }),
    })
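    // Surface backend failures instead of trying to join with an undefined joinUrl
    if (!res.ok) throw new Error(`Failed to start call (HTTP ${res.status})`)
    const { joinUrl, agentName } = await res.json()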

    // 2. Connect audio
    session = new UltravoxSession()

    session.addEventListener('status', () => {
      const statusLabels = {
        connecting: 'Connecting...',
        idle: `Connected to ${agentName}`,
        listening: '🎤 Listening...',
        thinking: '💭 Thinking...',
        speaking: '🤖 Speaking...',
        disconnected: 'Call ended',
      }
      btn.textContent = statusLabels[session.status] ?? session.status
    })

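    session.addEventListener('transcripts', () => {
      captions.style.display = 'block'
      // Build nodes with textContent so transcript text is never parsed as HTML
      captions.replaceChildren(
        ...session.transcripts.slice(-6).map(t => {   // last 6 lines
          const p = document.createElement('p')
          p.className = t.speaker
          p.textContent = t.text
          return p
        })
      )
    })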

    await session.joinCall(joinUrl)
    btn.textContent = '⏹ End call'
    btn.disabled = false

  } catch (err) {
    console.error(err)
    session = null   // otherwise the next click would try to leave a call that never started
    btn.textContent = '🎙 Talk to AI'
    btn.disabled = false
  }
})
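
One case the click handler does not cover is the call ending from the other side (the agent hangs up, or the connection drops): the status becomes disconnected, but session stays set, so the next click tries to leave a call that is already over. One possible approach is to derive the button state from the status in a single place. buttonStateFor is a hypothetical helper; the status names are the ones used in the statusLabels map in widget.js.

```javascript
// Hypothetical helper: maps a session status to what the button should show.
// `active` tells the caller whether a call is still in progress.
function buttonStateFor(status, agentName) {
  const labels = {
    connecting: 'Connecting...',
    idle: `Connected to ${agentName}`,
    listening: '🎤 Listening...',
    thinking: '💭 Thinking...',
    speaking: '🤖 Speaking...',
  }
  if (status === 'disconnected') {
    // Call is over: restore the idle button so the next click starts a new call
    return { label: '🎙 Talk to AI', active: false }
  }
  // Unknown statuses fall back to the raw status string
  return { label: labels[status] ?? status, active: true }
}

// Possible use inside the 'status' listener in widget.js:
//   const state = buttonStateFor(session.status, agentName)
//   btn.textContent = state.label
//   if (!state.active) { session = null; captions.style.display = 'none' }
```

Keeping the mapping in a pure function also makes it easy to test without a browser or a live call.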

Step 4 — Style the widget (optional)

#voice-btn {
  position: fixed;
  bottom: 24px;
  right: 24px;
  background: #6366f1;
  color: white;
  border: none;
  border-radius: 999px;
  padding: 14px 24px;
  font-size: 15px;
  cursor: pointer;
  box-shadow: 0 4px 20px rgba(99, 102, 241, 0.4);
  transition: background 0.2s;
  z-index: 9999;
}

#voice-btn:hover { background: #4f46e5; }
#voice-btn:disabled { opacity: 0.6; cursor: default; }

#captions {
  position: fixed;
  bottom: 90px;
  right: 24px;
  width: 320px;
  background: rgba(0,0,0,0.8);
  color: white;
  border-radius: 12px;
  padding: 12px 16px;
  font-size: 13px;
  line-height: 1.6;
  z-index: 9999;
}

#captions .agent { color: #a5b4fc; }
#captions .user  { color: #d1fae5; }

Passing context to the agent

Use variables to give the agent information about the current visitor. Any {{placeholder}} in your agent’s prompt is replaced at call time.

Agent instructions (set in the Truedy dashboard):
You are a support agent for {{company_name}}.
The visitor is on page: {{page_url}}.
{{#plan}}Their current plan is {{plan}}.{{/plan}}
From your backend:
body: JSON.stringify({
  agent_id: AGENT_ID,
  variables: {
    company_name: 'Acme Inc',
    page_url: req.body.pageUrl,
    plan: req.user?.subscriptionPlan ?? '',
  },
})
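
To get a feel for how a prompt might read with a given set of variables, a rough local preview can help. The sketch below is not Truedy's implementation: it assumes Mustache-like rules, where {{name}} is replaced by the value and a {{#name}}...{{/name}} section is kept only when the value is non-empty (which is what passing '' for a missing plan appears to rely on). previewPrompt is a hypothetical helper, and the page URL is example data.

```javascript
// Rough local preview of placeholder substitution. NOT Truedy's implementation;
// assumes Mustache-like rules: {{name}} is replaced by its value, and
// {{#name}}...{{/name}} is kept only when the value is non-empty.
function previewPrompt(template, variables) {
  // Pass 1: keep or drop conditional sections
  const withSections = template.replace(
    /\{\{#(\w+)\}\}([\s\S]*?)\{\{\/\1\}\}/g,
    (_match, key, inner) => (variables[key] ? inner : '')
  )
  // Pass 2: substitute plain placeholders; missing keys become empty strings
  return withSections.replace(
    /\{\{(\w+)\}\}/g,
    (_match, key) => String(variables[key] ?? '')
  )
}

const template = [
  'You are a support agent for {{company_name}}.',
  'The visitor is on page: {{page_url}}.',
  '{{#plan}}Their current plan is {{plan}}.{{/plan}}',
].join('\n')

const base = { company_name: 'Acme Inc', page_url: 'https://example.com/pricing' }
const withPlan = previewPrompt(template, { ...base, plan: 'Pro' })
const withoutPlan = previewPrompt(template, { ...base, plan: '' })

console.log(withPlan)
console.log(withoutPlan)
```

Under these assumptions the third line reads "Their current plan is Pro." in the first preview and is empty in the second. The actual substitution happens on Truedy's side at call time, so treat this only as a way to sanity-check prompt wording.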

Keeping the API key safe

| ✅ Do                                           | ❌ Don’t                                 |
|------------------------------------------------|-----------------------------------------|
| Store API key in server env vars               | Put API key in frontend JS              |
| Backend proxies /api/public/v1/webrtc/call     | Call Truedy API directly from browser   |
| Return only joinUrl + agentName to frontend    | Return full Truedy response to frontend |
| Rate-limit your /api/start-voice-call endpoint | Leave it open to abuse                  |
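
Rate limiting can be done in many ways, and in production a maintained package or a gateway-level limit is probably a better fit. As a minimal sketch, here is an in-memory fixed-window limiter written as Express-style middleware. createRateLimiter, its defaults (5 calls per minute), and keying by req.ip are illustrative assumptions; the counters reset on restart, are not shared across processes, and are never evicted.

```javascript
// Minimal in-memory fixed-window rate limiter (illustrative only).
// `limit`, `windowMs`, and keying by req.ip are assumptions; tune for your traffic.
// `now` is injectable so the window logic can be tested without waiting.
function createRateLimiter({ limit = 5, windowMs = 60_000, now = Date.now } = {}) {
  const hits = new Map() // key -> { count, windowStart }

  return function rateLimit(req, res, next) {
    const key = req.ip
    const t = now()
    const entry = hits.get(key)

    if (!entry || t - entry.windowStart >= windowMs) {
      // First request from this key, or the previous window has expired
      hits.set(key, { count: 1, windowStart: t })
      return next()
    }
    if (entry.count < limit) {
      entry.count += 1
      return next()
    }
    // Over the limit: tell the client when the window resets
    const retryAfterSec = Math.ceil((entry.windowStart + windowMs - t) / 1000)
    res.set('Retry-After', String(retryAfterSec))
    return res.status(429).json({ error: 'Too many requests' })
  }
}

// Possible use with the endpoint from Step 1:
//   app.post('/api/start-voice-call', createRateLimiter(), async (req, res) => { ... })
```

A fixed window is the simplest scheme to reason about, at the cost of allowing short bursts around a window boundary. If the server sits behind a proxy, req.ip only reflects the visitor's address when Express's trust proxy setting is configured.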

Next steps