Voice AI has matured significantly. VAPI makes it straightforward to build voice assistants that can actually do things—not just chat, but call APIs, look up data, and take actions.

Why VAPI?

VAPI handles the hard parts of voice:

  • Speech-to-text transcription
  • LLM integration (OpenAI, Anthropic, custom)
  • Text-to-speech with natural voices (ElevenLabs, etc.)
  • Real-time streaming for low latency
  • Tool/function calling during conversations

You focus on what your assistant does. VAPI handles how it speaks and listens.

Basic Architecture

User speaks → VAPI (STT) → LLM → Tool calls → Your webhook → Results → LLM → VAPI (TTS) → User hears

Your webhook receives tool calls, executes them, and returns results. The LLM incorporates results into its response.
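For orientation, a tool-call webhook payload looks roughly like this (a sketch based on VAPI's server-message shape; exact fields may vary by API version, so check the current reference before relying on them):

```json
{
  "message": {
    "type": "tool-calls",
    "toolCalls": [
      {
        "id": "call_abc123",
        "function": {
          "name": "get_weather",
          "arguments": "{\"location\": \"Boston\"}"
        }
      }
    ]
  }
}
```

Note that `arguments` arrives as a JSON-encoded string, which is why the handler below parses it with `json.loads` before dispatching.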

Setting Up an Assistant

# Create assistant via API
curl -X POST "https://api.vapi.ai/assistant" \
  -H "Authorization: Bearer $VAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Atlas",
    "model": {
      "provider": "openai",
      "model": "gpt-4o",
      "messages": [{
        "role": "system",
        "content": "You are Atlas, a helpful voice assistant. Keep responses concise for voice."
      }],
      "temperature": 0.7
    },
    "voice": {
      "provider": "11labs",
      "voiceId": "your-voice-id"
    },
    "serverUrl": "https://your-domain.com/vapi/webhook"
  }'

Adding Tools

Define tools the assistant can call:

{
  "model": {
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string"}
            },
            "required": ["location"]
          }
        }
      },
      {
        "type": "function", 
        "function": {
          "name": "check_email",
          "description": "Check for unread emails",
          "parameters": {"type": "object", "properties": {}}
        }
      }
    ]
  }
}

Webhook Handler (FastAPI)

import json

from fastapi import FastAPI
import httpx

app = FastAPI()

@app.post("/vapi/webhook")
async def vapi_webhook(request: dict):
    message = request.get("message", {})
    msg_type = message.get("type")
    
    if msg_type == "tool-calls":
        tool_calls = message.get("toolCalls", [])
        results = []
        
        for tc in tool_calls:
            func = tc.get("function", {})
            name = func.get("name")
            args = json.loads(func.get("arguments", "{}"))
            
            result = await handle_tool(name, args)
            results.append({
                "toolCallId": tc.get("id"),
                "result": json.dumps(result)
            })
        
        return {"results": results}
    
    return {"status": "ok"}


async def handle_tool(name: str, args: dict) -> dict:
    if name == "get_weather":
        location = args.get("location", "New York")
        # Call weather API
        async with httpx.AsyncClient() as client:
            resp = await client.get(f"https://api.weather.com/...")
            data = resp.json()
            return {"response": f"Currently {data['temp']}°F in {location}"}
    
    elif name == "check_email":
        # Check email via IMAP or API
        count = await get_unread_count()
        return {"response": f"You have {count} unread emails"}
    
    return {"response": "Unknown tool"}

Webhook Security

VAPI sends requests to your webhook—make sure it’s secure:

import hmac
import hashlib
import os

from fastapi import HTTPException, Request

WEBHOOK_SECRET = os.environ["VAPI_WEBHOOK_SECRET"]

def verify_vapi_signature(payload: bytes, signature: str, secret: str) -> bool:
    expected = hmac.new(
        secret.encode(),
        payload,
        hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, signature)

@app.post("/vapi/webhook")
async def webhook(request: Request):
    body = await request.body()
    signature = request.headers.get("x-vapi-signature", "")
    
    if not verify_vapi_signature(body, signature, WEBHOOK_SECRET):
        raise HTTPException(status_code=401)
    
    # Process request...

Exposing Your Webhook

For development, use a tunnel:

# cloudflared (Cloudflare Tunnel)
cloudflared tunnel --url http://localhost:8095

# Or ngrok for quick testing
ngrok http 8095

For production, deploy behind a proper domain with SSL.

Choosing an LLM Provider

VAPI supports multiple providers:

// OpenAI
{"provider": "openai", "model": "gpt-4o"}

// Anthropic (requires API key in VAPI credentials)
{"provider": "anthropic", "model": "claude-3-5-sonnet-20241022"}

// Custom (route to your own endpoint)
{"provider": "custom-llm", "url": "https://your-llm.com/v1/chat"}

Note: Using Anthropic requires adding your API key to VAPI’s credential settings—it’s not automatic.

Voice Selection

Natural-sounding voices matter for UX:

{
  "voice": {
    "provider": "11labs",
    "voiceId": "onwK4e9ZLuTAKqWW03F9",
    "stability": 0.5,
    "similarityBoost": 0.75
  }
}

ElevenLabs is a popular choice for natural-sounding voices; VAPI also supports Azure, PlayHT, and others.

Handling Conversation Context

VAPI maintains conversation state, but you can inject context:

@app.post("/vapi/webhook")
async def webhook(request: dict):
    msg_type = request.get("message", {}).get("type")
    
    # Inject customer data at call start
    if msg_type == "assistant-request":
        caller = request.get("call", {}).get("customer", {})
        phone = caller.get("number", "")
        
        # Look up customer
        customer = await get_customer_by_phone(phone)
        
        return {
            "assistant": {
                "model": {
                    "messages": [{
                        "role": "system",
                        "content": f"Customer: {customer['name']}. Account status: {customer['status']}."
                    }]
                }
            }
        }

Error Handling

Voice UX requires graceful degradation:

import logging

logger = logging.getLogger(__name__)

async def handle_tool(name: str, args: dict) -> dict:
    try:
        result = await execute_tool(name, args)
        return {"response": result}
    except TimeoutError:
        return {"response": "That's taking longer than expected. Let me try again."}
    except Exception as e:
        logger.error(f"Tool {name} failed: {e}")
        return {"response": "I wasn't able to complete that action."}

Testing Your Assistant

VAPI provides a web-based testing interface, but you can also test programmatically:

import os
import requests

VAPI_API_KEY = os.environ["VAPI_API_KEY"]

# Start a test call
response = requests.post(
    "https://api.vapi.ai/call/web",
    headers={"Authorization": f"Bearer {VAPI_API_KEY}"},
    json={"assistantId": "your-assistant-id"}
)

call_url = response.json()["webCallUrl"]
print(f"Test at: {call_url}")
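You can also poll a call afterward to see how it went. A minimal sketch using only the standard library, assuming VAPI exposes a `GET /call/{id}` endpoint whose response includes a `status` field (verify both against the current API reference):

```python
import json
import urllib.request

VAPI_BASE = "https://api.vapi.ai"

def call_status_url(call_id: str) -> str:
    """Build the URL for fetching a single call's metadata."""
    return f"{VAPI_BASE}/call/{call_id}"

def get_call_status(api_key: str, call_id: str) -> str:
    """Fetch a call and return its status (e.g. queued, in-progress, ended)."""
    req = urllib.request.Request(
        call_status_url(call_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("status", "unknown")
```

Polling status after a test call is a cheap way to confirm the call actually connected and ended cleanly, not just that the web URL was issued.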

Production Checklist

  1. Webhook reliability — Use a queue for async tool execution
  2. Latency — Keep tool responses under 3 seconds
  3. Error messages — Make them conversational, not technical
  4. Logging — Record all interactions for debugging
  5. Rate limiting — Protect against abuse
  6. Fallbacks — Have defaults when tools fail
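Item 2 is the one that bites first: if a tool blocks, the caller sits in silence. One way to enforce the budget is to wrap each tool coroutine in a timeout that degrades to a conversational fallback (a sketch; `slow_tool` is a stand-in for a real tool, and the 3-second budget is an assumption you should tune):

```python
import asyncio

TOOL_TIMEOUT_S = 3.0  # voice latency budget per tool call

async def run_tool_with_timeout(coro, fallback: str) -> dict:
    """Await a tool coroutine; return a conversational fallback on timeout."""
    try:
        result = await asyncio.wait_for(coro, timeout=TOOL_TIMEOUT_S)
        return {"response": result}
    except asyncio.TimeoutError:
        return {"response": fallback}

async def slow_tool() -> str:
    await asyncio.sleep(10)  # stand-in for a slow upstream API
    return "done"
```

In the webhook handler, wrapping each dispatch as `await run_tool_with_timeout(handle_tool(name, args), "That's taking longer than expected.")` keeps the conversation moving even when a backend stalls.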

Voice AI is no longer science fiction. With VAPI handling the voice pipeline and your webhook handling the logic, you can build surprisingly capable voice assistants in a weekend.

The key insight: treat tools like API endpoints. If you can build a REST API, you can build a voice assistant.