How to Build a Prompt for Voice AI with Contact and Memory: A Developer's Guide

Discover practical steps to create effective voice AI prompts with memory. Learn prompt engineering techniques and boost your conversational AI skills.

Misal Azeem

Voice AI Engineer & Creator

TL;DR

Voice AI prompts fail when they lack context injection and memory state management. Build prompts that reference conversation history, user metadata, and call state—not generic instructions. Use system prompts with explicit turn-taking rules, instruction following patterns for function calls, and conversational memory to prevent repetition. This prevents the bot from asking "What's your name?" twice and keeps latency under 200ms on follow-ups.

Prerequisites

API Keys & Credentials

You'll need a VAPI API key (find it in your dashboard; it's the Bearer token for every request). If you're integrating Twilio for phone routing, you'll also need your Account SID and Auth Token from console.twilio.com. Store these in .env files, never hardcoded.

System Requirements

Node.js 18+ (for native fetch support; the examples below use the global fetch API). A code editor (VS Code works). Postman or curl for testing webhook payloads.

VAPI Knowledge

Familiarity with VAPI's assistant configuration structure—specifically how systemPrompt, model, voice, and transcriber objects work together. You don't need to be an expert, but you should understand that the system prompt is where instruction following happens.

Conversational Memory Basics

Understand the difference between stateless (single-turn) and stateful (multi-turn) conversations. Know that memory requires either session storage or external databases to persist context between calls.

Optional: Twilio Integration

If routing calls through Twilio, basic familiarity with webhooks and how Twilio forwards call events to your server.

Step-by-Step Tutorial

Configuration & Setup

Most voice AI prompts fail because developers treat them like chatbot instructions. Phone conversations need context retention, interruption handling, and explicit memory management. Here's how to build a production-grade prompt system.

First, set up your Vapi assistant with memory-aware configuration:

javascript
// Assistant config with conversation memory and contact context
const assistantConfig = {
  name: "Memory-Enabled Voice Agent",
  model: {
    provider: "openai",
    model: "gpt-4",
    temperature: 0.7,
    messages: [
      {
        role: "system",
        content: `You are a voice assistant with access to conversation history and contact information.

MEMORY MANAGEMENT:
- Reference previous interactions using {{contact.lastCallDate}} and {{contact.notes}}
- Update contact context after each call
- Acknowledge returning callers: "Welcome back, I see we last spoke on {{contact.lastCallDate}}"

CONVERSATION RULES:
- Keep responses under 30 words (voice constraint)
- Use natural speech patterns, avoid lists
- Handle interruptions gracefully - don't restart sentences
- Confirm understanding before taking action

CONTACT CONTEXT AVAILABLE:
- Name: {{contact.name}}
- Phone: {{contact.phone}}
- Previous issues: {{contact.history}}
- Preferences: {{contact.preferences}}`
      }
    ]
  },
  voice: {
    provider: "11labs",
    voiceId: "21m00Tcm4TlvDq8ikWAM"
  },
  firstMessage: "Hi {{contact.name}}, thanks for calling. How can I help you today?"
};

Critical prompt engineering pattern: Use template variables for dynamic context injection. The {{contact.*}} syntax pulls real-time data from your CRM or database, making each conversation personalized without bloating the system prompt.
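
To make that concrete, here's one way to supply values for those placeholders at call time. Treat it as a minimal sketch: it assumes Vapi's assistantOverrides.variableValues mechanism for dynamic variables plus a hypothetical getContactByPhone() lookup against your own CRM, and you should confirm how your Vapi version resolves dotted keys like contact.name before relying on it.

javascript
// Sketch: resolve {{contact.*}} placeholders when creating the call.
// getContactByPhone() is a hypothetical lookup against YOUR database/CRM.
async function startCallWithContext(phoneNumber) {
  const contact = await getContactByPhone(phoneNumber);

  const response = await fetch('https://api.vapi.ai/call', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.VAPI_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      assistant: assistantConfig,
      assistantOverrides: {
        // Assumption: variableValues keys must match the placeholder names exactly
        variableValues: {
          'contact.name': contact?.name || 'there',                // fallback keeps the greeting intact
          'contact.lastCallDate': contact?.lastCallDate || 'no previous call',
          'contact.history': contact?.history || 'none',
          'contact.preferences': contact?.preferences || 'none'
        }
      },
      customer: { number: phoneNumber }
    })
  });

  return response.json();
}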

Architecture & Flow

The memory system requires three components: prompt template, contact data store, and conversation state manager. When a call starts, Vapi injects contact context into the system prompt. During the conversation, the assistant references this context naturally. After the call, you update the contact record with new information.

Race condition warning: If you update contact data mid-call, the assistant won't see changes until the next session. Design your prompt to work with stale data or implement real-time context updates via function calling.
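
If you do need fresh data mid-call, the function-calling route looks roughly like this on the server. It's a sketch only: refreshContact is a hypothetical tool you'd define on the assistant, db stands in for your own database client, and the exact webhook message type and response shape vary between Vapi versions, so check the current docs before copying it.

javascript
// Sketch: serve fresh contact data when the assistant calls a (hypothetical)
// refreshContact tool mid-conversation. Message and response shapes are assumptions.
const express = require('express');
const app = express();
app.use(express.json());

app.post('/webhook/vapi', async (req, res) => {
  const msg = req.body.message;

  if (msg?.type === 'function-call' && msg.functionCall?.name === 'refreshContact') {
    const phone = msg.call?.customer?.number;
    const contact = await db.contacts.findOne({ phone }); // your database, not Vapi's

    // Return the lookup result so the model can use it on its next turn
    return res.json({
      result: contact
        ? `Latest notes: ${contact.notes || 'none'}. Status: ${contact.accountStatus}.`
        : 'No contact record found.'
    });
  }

  res.sendStatus(200);
});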

Step-by-Step Implementation

Step 1: Structure your system prompt with memory sections

Break your prompt into: Identity (who the assistant is), Memory Access (how to reference past data), Conversation Rules (voice-specific constraints), and Action Protocols (when to call functions).

Step 2: Implement contact context injection

Before creating a call, fetch contact data from your database and inject it into the assistant config. Use the metadata field to pass additional context that doesn't fit in the prompt:

javascript
const callConfig = {
  assistant: assistantConfig,
  customer: {
    number: "+1234567890"
  },
  metadata: {
    contactId: "cust_123",
    lastCallSummary: "Requested product demo, scheduled for next week",
    accountStatus: "premium"
  }
};

Step 3: Design memory-aware response patterns

Train your prompt to acknowledge context: "I see you called about X last time" instead of generic greetings. Use conditional logic in your prompt: "If {{contact.lastCallDate}} is within 7 days, reference the previous issue. Otherwise, treat as new conversation."
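
Because language models are unreliable at date arithmetic, a practical variant is to resolve the 7-day check in your own code and inject the result as plain instructions. A sketch, assuming a contact record with lastCallDate and history fields:

javascript
// Sketch: resolve the 7-day recency rule in code, then inject the result
// as plain text so the model never has to do date math.
function buildMemoryPreamble(contact) {
  const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;
  const lastCall = contact?.lastCallDate ? new Date(contact.lastCallDate) : null;
  const isRecent = lastCall && Date.now() - lastCall.getTime() < SEVEN_DAYS_MS;

  if (isRecent && contact.history) {
    return `Returning caller, last spoke on ${lastCall.toDateString()}. Previous issue: ${contact.history}. Reference it once, then move on.`;
  }
  return 'Treat this as a new conversation. Do not claim any previous interaction.';
}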

Step 4: Implement post-call memory updates

Use Vapi's webhook events to capture conversation summaries and update your contact database. The end-of-call-report webhook contains the full transcript and assistant analysis.

Error Handling & Edge Cases

Missing contact data: Always provide fallback values in your prompt template. If {{contact.name}} is null, default to "there" instead of breaking the greeting.

Memory hallucination: GPT-4 will sometimes invent past conversations if your prompt says "reference previous calls" but no history exists. Add explicit guards: "Only reference past interactions if {{contact.history}} is not empty."

Context window overflow: Long conversation histories exceed token limits. Implement summarization: store only the last 3 call summaries, not full transcripts.
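
A minimal way to enforce that cap, assuming you keep per-contact call summaries in your own store:

javascript
// Sketch: cap stored history at the three most recent call summaries.
function appendCallSummary(contact, summary) {
  const summaries = [...(contact.callSummaries || []), summary];
  return { ...contact, callSummaries: summaries.slice(-3) }; // keep the last 3 only
}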

Testing & Validation

Test with three contact profiles: new user (no history), returning user (1-2 past calls), and power user (10+ interactions). Verify the assistant adjusts its tone and references appropriately. Monitor for false memory references using your call logs.

Common Issues & Fixes

Issue: Assistant repeats information already in contact notes.
Fix: Add to prompt: "Do not ask for information already in {{contact.preferences}}. Confirm instead: 'I have your email as X, is that still correct?'"

Issue: Memory context makes responses too long for voice.
Fix: Separate detailed context (for assistant reasoning) from response instructions (keep under 30 words). Use a two-part prompt structure.
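
Here's a sketch of that two-part structure, separating the context the model may reason over from hard output constraints:

javascript
// Sketch: two-part system prompt - reasoning context vs. spoken-output rules.
const twoPartPrompt = `
CONTEXT (for your reasoning only, never read aloud verbatim):
- Caller: {{contact.name}}, preferences: {{contact.preferences}}
- Open issues: {{contact.history}}

RESPONSE RULES (always apply):
- One or two short sentences, under 30 words total.
- Confirm stored details instead of re-asking for them.
`;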

System Diagram

Audio processing pipeline from microphone input to speaker output.

mermaid
graph LR
    A[Microphone] --> B[Audio Buffer]
    B --> C[Voice Activity Detection]
    C -->|Speech Detected| D[Speech-to-Text]
    C -->|No Speech| E[Error: No Input Detected]
    D --> F[Intent Detection]
    F --> G[Response Generation]
    G --> H[Text-to-Speech]
    H --> I[Speaker]
    D -->|Error: Unrecognized Speech| J[Error Handling]
    J --> F
    F -->|Error: Intent Not Found| K[Fallback Response]
    K --> G

Testing & Validation

Most prompt failures surface during live calls, not in theory. Test with real voice input—text-based testing misses latency, barge-in, and memory persistence issues.

Local Testing

Use the Vapi dashboard's Call button to validate your assistant configuration before production. This tests the complete flow: greeting delivery, variable extraction from speech, and memory injection into subsequent responses.

javascript
// Test assistant with memory injection via metadata
const testCallConfig = {
  assistant: {
    name: "Support Agent",
    model: {
      provider: "openai",
      model: "gpt-4",
      temperature: 0.7,
      messages: [
        {
          role: "system",
          content: `You are a support agent. Customer context: {{customer.Name}} from {{customer.Phone}}. Previous issues: {{customer.issues}}. Last call summary: {{metadata.lastCallSummary}}`
        }
      ]
    },
    voice: {
      provider: "11labs",
      voiceId: "21m00Tcm4TlvDq8ikWAM"
    },
    firstMessage: "Hi {{customer.Name}}, I see you called about {{customer.issues}}. How can I help today?"
  },
  customer: {
    number: "+14155551234"
  },
  metadata: {
    contactId: "test_001",
    lastCallSummary: "Billing inquiry resolved",
    accountStatus: "active"
  }
};

// Trigger test call via dashboard or SDK
// Verify: Does greeting use customer.Name? Does prompt reference lastCallSummary?

Validation checklist:

  • Memory injection: Say "What did we discuss last time?" → Assistant should reference lastCallSummary
  • Variable substitution: Greeting must use actual customer.Name, not literal {{customer.Name}}
  • Barge-in handling: Interrupt mid-sentence → Does context persist after interruption?

Webhook Validation

If using server-side memory updates, validate webhook delivery with request logging. Most memory bugs stem from webhook timeouts (>5s) or malformed payloads.

javascript
// Server endpoint to receive call events
app.post('/webhook/vapi', (req, res) => {
  const event = req.body;
  
  console.log('Event received:', {
    type: event.message?.type,
    callId: event.message?.call?.id,
    timestamp: new Date().toISOString()
  });
  
  // Validate memory payload structure
  if (event.message?.type === 'end-of-call-report') {
    const summary = event.message.call?.analysis?.summary;
    if (!summary) {
      console.error('Missing call summary in webhook');
    }
  }
  
  // Respond within 5s to avoid timeout
  res.status(200).json({ received: true });
});

Common webhook failures:

  • Timeout: Processing takes >5s → Vapi retries, causing duplicate memory writes
  • Missing signature validation: Implement x-vapi-signature header check (see Vapi webhook docs)
  • Race condition: Call ends while assistant is mid-response → Summary may be incomplete
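
Because a retried delivery looks identical to the original, a simple idempotency guard keyed on the call id prevents those duplicate memory writes. A sketch using an in-memory set (swap it for Redis or your database in production):

javascript
// Sketch: idempotency guard so webhook retries don't double-write memory.
const express = require('express');
const app = express();
app.use(express.json());

const processedCalls = new Set(); // replace with Redis/DB in production

app.post('/webhook/vapi', (req, res) => {
  const msg = req.body.message;

  if (msg?.type === 'end-of-call-report') {
    const callId = msg.call?.id;
    if (processedCalls.has(callId)) {
      // Already handled - acknowledge without writing again
      return res.status(200).json({ received: true, duplicate: true });
    }
    processedCalls.add(callId);
    // ...persist the summary exactly once here...
  }

  res.status(200).json({ received: true });
});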

Test with ngrok: ngrok http 3000 → Update assistant's serverUrl in dashboard → Make test call → Check ngrok logs for webhook delivery timing.

Real-World Example

Barge-In Scenario

Production voice AI breaks when users interrupt mid-sentence. Here's what happens when a customer cuts off your agent during a billing explanation:

javascript
// Streaming STT handler with partial transcript processing
const handleStreamingTranscript = (event) => {
  const { partialTranscript, isFinal, timestamp } = event;
  
  // Detect interruption: user speaks while agent is talking
  if (isAgentSpeaking && partialTranscript.length > 3) {
    // Cancel TTS immediately - don't wait for full transcript
    flushAudioBuffer();
    isAgentSpeaking = false;
    
    // Update conversation context with interruption point
    assistantConfig.messages.push({
      role: 'system',
      content: `User interrupted at: "${lastAgentMessage.substring(0, 50)}..."`
    });
  }
  
  // Only process final transcripts to avoid duplicate responses
  if (isFinal) {
    processUserInput(partialTranscript);
  }
};

// Buffer flush prevents old audio playing after interrupt
const flushAudioBuffer = () => {
  if (audioQueue.length > 0) {
    audioQueue = []; // Clear queued TTS chunks
    currentPlayback?.stop(); // Kill active audio stream
  }
};

Event Logs

Real event sequence from a barge-in at 14:32:18 UTC:

14:32:18.234 - speech-started: { partialTranscript: "Actually I need to—" }
14:32:18.235 - agent-speaking: true (TTS playing: "Your account balance is...")
14:32:18.240 - interrupt-detected: Flushing 3 audio chunks from buffer
14:32:18.890 - speech-final: { transcript: "Actually I need to update my card" }
14:32:19.120 - function-call: updatePaymentMethod({ contactId: "c_abc123" })

The 650ms gap between interrupt detection (18.240) and the final transcript (18.890) is why you MUST act on partials. Waiting for isFinal means the agent keeps talking for another half-second.

Edge Cases

Multiple rapid interrupts: User says "wait—no actually—" within 2 seconds. Solution: 300ms debounce on partialTranscript events before flushing buffer. Otherwise you cancel TTS on breathing sounds.
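
A minimal debounce sketch under those assumptions: isAgentSpeaking and flushAudioBuffer() are the same pieces used above, and onSpeechEnded() stands in for whatever end-of-speech signal your VAD or STT stream exposes.

javascript
// Sketch: require ~300ms of sustained speech before treating partials as a barge-in.
let bargeInTimer = null;

const onPartialTranscript = (partialTranscript) => {
  if (!isAgentSpeaking || partialTranscript.length <= 3) return;

  // Start one timer on the first meaningful partial; if speech stops
  // before it fires, onSpeechEnded() cancels the flush.
  if (!bargeInTimer) {
    bargeInTimer = setTimeout(() => {
      flushAudioBuffer();
      isAgentSpeaking = false;
      bargeInTimer = null;
    }, 300);
  }
};

const onSpeechEnded = () => {
  clearTimeout(bargeInTimer); // noise bursts rarely sustain 300ms of partials
  bargeInTimer = null;
};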

False positive from background noise: Dog barks trigger VAD. Check partialTranscript.length > 3 before treating it as real speech. Single-word fragments are usually noise.

Context loss on interrupt: Agent was explaining a 3-step process, user cuts in at step 2. Store lastAgentMessage in assistantConfig.messages so the LLM knows what was already said: "I was explaining payment options when you interrupted."

Common Issues & Fixes

Prompt Drift After Multiple Turns

Problem: After 5-7 conversation turns, assistants ignore system instructions and start hallucinating or breaking character. This happens because the context window fills with conversation history, pushing the original system prompt out of the model's attention.

Fix: Implement conversation memory compression. Instead of sending the full transcript, summarize completed turns and inject fresh context:

javascript
// Compress conversation history before hitting token limits
const compressMemory = (messages, metadata) => {
  if (messages.length < 10) return messages;
  
  const recentMessages = messages.slice(-5); // Keep last 5 turns
  const olderMessages = messages.slice(0, -5);
  
  // Summarize older context
  const userTurns = olderMessages.filter(m => m.role === 'user').length;
  const summary = {
    role: "system",
    content: `Previous conversation summary: customer asked ${userTurns} questions about ${metadata.issues.join(', ')}. Preferences: ${metadata.Preferences}.`
  };
  
  return [summary, ...recentMessages];
};

// Update assistantConfig before each call, passing the contact's
// { issues, Preferences } record fetched from your database
assistantConfig.model.messages = [
  { role: "system", content: assistantConfig.model.messages[0].content },
  ...compressMemory(conversationHistory, contactMetadata)
];

Why this breaks: the original GPT-4 has an 8K context window. English runs roughly 1.3 tokens per word, so a 10-turn conversation at ~200 words per turn is already ~2,600 tokens; add the system prompt, contact context, and function definitions and you approach the limit fast. Once you exceed it, oldest-first truncation drops your system prompt first.

Contact Data Not Persisting Between Calls

Problem: You pass metadata.contactId and metadata.lastCallSummary, but the assistant forgets the customer on the next call. This happens because Vapi doesn't store metadata server-side—you must retrieve it on every inbound call.

Fix: Use webhook events to fetch contact data when calls start:

javascript
// Webhook handler - fetch contact context on call start
app.post('/webhook/vapi', async (req, res) => {
  const event = req.body;
  
  if (event.type === 'call-started') {
    const phoneNumber = event.call.customer.number;
    
    // Fetch from YOUR database (not Vapi's)
    const contact = await db.contacts.findOne({ phone: phoneNumber });
    
    if (contact) {
      // Inject memory into assistant context
      const contextUpdate = {
        assistant: {
          model: {
            messages: [
              ...assistantConfig.model.messages,
              {
                role: "system",
                content: `Contact: ${contact.Name}. Last call summary: ${contact.lastCallSummary}. Account status: ${contact.accountStatus}.`
              }
            ]
          }
        }
      };
      
      // Update assistant mid-call (if supported) or store for next turn
      await fetch(`https://api.vapi.ai/call/${event.call.id}`, {
        method: 'PATCH',
        headers: {
          'Authorization': `Bearer ${process.env.VAPI_API_KEY}`,
          'Content-Type': 'application/json'
        },
        body: JSON.stringify(contextUpdate)
      });
    }
  }
  
  res.sendStatus(200);
});

Critical: Vapi's metadata field is for YOUR tracking only. It isn't persisted against the caller or auto-populated on inbound calls. You must query your CRM/database using customer.number as the lookup key.

Voice Interruptions Losing Context

Problem: When users barge in mid-sentence, the assistant restarts its response from scratch instead of continuing the thought. This happens because flushAudioBuffer() clears TTS state but doesn't preserve the assistant's reasoning.

Fix: Store partial responses before flushing:

javascript
let partialResponse = "";

// Here `text` is the agent's streamed response (the words being sent to TTS),
// not the caller's transcript; tracking it lets the agent resume after an interrupt
const handleStreamingTranscript = (text, isFinal) => {
  if (!isFinal) {
    partialResponse += text; // Accumulate the agent's partial response
  }
  
  if (bargeInDetected) {
    flushAudioBuffer();
    
    // Resume with context instead of restarting
    const resumePrompt = {
      role: "system",
      content: `User interrupted. You were saying: "${partialResponse}". Acknowledge interruption and continue or pivot based on user input.`
    };
    
    assistantConfig.model.messages.push(resumePrompt);
    partialResponse = ""; // Reset after injection
  }
};

Latency impact: Storing partials adds 15-30ms per turn but prevents the "robotic restart" UX that tanks CSAT scores.

Complete Working Example

Here's a production-ready voice assistant that handles contact memory, conversation context, and streaming responses. This combines all the patterns from previous sections into a single deployable server.

Full Server Code

This example shows a complete Express server with contact lookup, memory compression, and streaming transcript handling. All routes are included: webhook handler, outbound call trigger, and health check.

javascript
// server.js - Production voice assistant with contact memory
const express = require('express');
const crypto = require('crypto');
const app = express();

app.use(express.json());

// In-memory contact database (replace with your CRM)
const contacts = {
  '+14155551234': {
    Name: 'Sarah Chen',
    Phone: '+14155551234',
    issues: ['billing dispute', 'service outage'],
    Preferences: 'prefers email follow-ups',
    accountStatus: 'premium'
  }
};

// Session memory store with TTL cleanup
const sessions = {};
const SESSION_TTL = 3600000; // 1 hour

// Compress conversation history to prevent token overflow
function compressMemory(recentMessages, olderMessages) {
  const summary = olderMessages
    .filter(msg => msg.role === 'assistant')
    .map(msg => msg.content.substring(0, 100))
    .join('; ');
  
  return {
    role: 'system',
    content: `Previous context: ${summary}. Continue naturally.`
  };
}

// Webhook handler - receives all call events
app.post('/webhook/vapi', async (req, res) => {
  const event = req.body;
  const callId = event.call?.id;
  
  // Verify webhook signature (production requirement)
  const signature = req.headers['x-vapi-signature'];
  const expectedSig = crypto
    .createHmac('sha256', process.env.VAPI_SERVER_SECRET)
    .update(JSON.stringify(req.body))
    .digest('hex');
  
  if (signature !== expectedSig) {
    return res.status(401).json({ error: 'Invalid signature' });
  }

  // Handle assistant-request: inject contact context
  if (event.message?.type === 'assistant-request') {
    const phoneNumber = event.call?.customer?.number;
    const contact = contacts[phoneNumber] || { Name: 'Unknown Caller' };
    
    // Build context-aware system prompt
    const contextUpdate = {
      assistant: {
        model: {
          provider: 'openai',
          model: 'gpt-4',
          messages: [
            {
              role: 'system',
              content: `You are a support agent. Customer: ${contact.Name}. 
Account status: ${contact.accountStatus || 'standard'}. 
Previous issues: ${contact.issues?.join(', ') || 'none'}. 
Preferences: ${contact.Preferences || 'none'}.
Use this context naturally - don't recite it.`
            }
          ]
        }
      }
    };
    
    return res.json(contextUpdate);
  }

  // Handle transcript updates: compress memory if needed
  if (event.message?.type === 'transcript') {
    if (!sessions[callId]) {
      sessions[callId] = { messages: [], created: Date.now() };
    }
    
    const session = sessions[callId];
    session.messages.push({
      role: event.message.role,
      content: event.message.transcript
    });
    
    // Compress if history exceeds 20 messages
    if (session.messages.length > 20) {
      const recentMessages = session.messages.slice(-10);
      const olderMessages = session.messages.slice(0, -10);
      const summary = compressMemory(recentMessages, olderMessages);
      
      session.messages = [summary, ...recentMessages];
    }
  }

  // Handle end-of-call-report: persist summary
  if (event.message?.type === 'end-of-call-report') {
    const phoneNumber = event.call?.customer?.number;
    const summary = event.message.summary;
    
    if (contacts[phoneNumber]) {
      contacts[phoneNumber].lastCallSummary = summary;
      contacts[phoneNumber].lastCallAt = new Date().toISOString();
    }
    
    // Cleanup session after 5 minutes
    setTimeout(() => delete sessions[callId], 300000);
  }

  res.sendStatus(200);
});

// Trigger outbound call with pre-loaded context
app.post('/call/outbound', async (req, res) => {
  const { phoneNumber, metadata } = req.body;
  const contact = contacts[phoneNumber];
  
  if (!contact) {
    return res.status(404).json({ error: 'Contact not found' });
  }

  try {
    const response = await fetch('https://api.vapi.ai/call', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.VAPI_API_KEY}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        assistant: {
          model: {
            provider: 'openai',
            model: 'gpt-4',
            messages: [
              {
                role: 'system',
                content: `Calling ${contact.Name}. Account: ${contact.accountStatus}. 
Reason: ${metadata?.reason || 'follow-up'}. Be concise and respectful.`
              }
            ]
          },
          voice: {
            provider: 'elevenlabs',
            voiceId: '21m00Tcm4TlvDq8ikWAM'
          },
          firstMessage: `Hi ${contact.Name}, this is a follow-up call.`
        },
        customer: {
          number: phoneNumber
        }
      })
    });

    if (!response.ok) {
      throw new Error(`HTTP ${response.status}: ${await response.text()}`);
    }

    const callData = await response.json();
    res.json({ callId: callData.id, status: 'initiated' });
  } catch (error) {
    console.error('Outbound call failed:', error);
    res.status(500).json({ error: error.message });
  }
});

// Health check
app.get('/health', (req, res) => {
  res.json({ 
    status: 'ok', 
    sessions: Object.keys(sessions).length,
    contacts: Object.keys(contacts).length
  });
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Voice assistant server running on port ${PORT}`);
});

Run Instructions

Prerequisites:

  • Node.js 18+
  • Vapi account with API key
  • Public HTTPS endpoint (use ngrok for testing)

Setup:

bash
npm install express
export VAPI_API_KEY='your_api_key_here'
export VAPI_SERVER_SECRET='your_webhook_secret'
node server.js

Configure Vapi webhook:

  1. Go to dashboard.vapi.ai → Settings → Server URL
  2. Set to https://your-domain.com/webhook/vapi
  3. Copy the server secret to VAPI_SERVER_SECRET

Test the flow:

bash
# Trigger outbound call
curl -X POST http://localhost:3000/call/outbound \
  -H "Content-Type: application/json" \
  -d '{"phoneNumber": "+14155551234", "metadata": {"reason": "billing follow-up"}}'

The assistant will inject contact context automatically, compress memory after 20 messages, and persist call summaries. Replace the in-memory contacts object with your CRM integration for production use.

FAQ

Technical Questions

How do I structure a system prompt for voice AI to maintain context across multiple calls?

Use a hierarchical prompt structure with three layers: (1) Core instructions defining the agent's role and constraints, (2) Contact context injected from your database (name, account status, previous issues), and (3) Conversation memory containing recent exchanges. In the assistantConfig, set the messages array with a system role containing your core instructions, then append contact metadata and conversation history before each call. This prevents the model from forgetting caller identity or previous resolutions mid-conversation.

What's the difference between prompt engineering for voice vs. text AI?

Voice AI requires shorter, more direct instructions because latency compounds with every token. Text prompts can be verbose; voice prompts must be concise. Include explicit turn-taking rules ("Wait for the caller to finish before responding"), silence handling ("If silence exceeds 2 seconds, ask a clarifying question"), and error recovery ("If you don't understand, ask the caller to repeat once"). Voice also demands personality consistency—the model's tone must match your brand across interruptions and context switches.

How do I prevent the model from hallucinating contact information?

Inject verified contact data into the metadata field of callConfig, not into the system prompt as free text. Use structured JSON with explicit fields: { contactId, Name, Phone, accountStatus, lastCallSummary }. In your system prompt, reference these fields by name: "Use the caller's Name from metadata. Never invent account details." This forces the model to reference injected data rather than generate plausible-sounding information.

Performance

Why does my voice AI response latency spike after 5+ turns in a conversation?

The messages array grows with each exchange. By turn 10, you're sending 20+ previous messages to the LLM, increasing token count and latency by 200-400ms. Implement compressMemory() to summarize old exchanges into a single lastCallSummary field after 5 turns. Keep only the last 3-4 exchanges in the active messages array. This reduces token overhead while preserving context.

How do I handle memory limits for long-running calls?

Store full conversation history in your database, not in the prompt. Keep only a rolling window of 5-10 recent messages in assistantConfig.messages. After each turn, append the exchange to your database and prune the in-memory array. For calls exceeding 30 minutes, generate a summary every 10 minutes and replace older messages with the summary. This prevents memory bloat and keeps latency flat.
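
A sketch of that rolling-window prune, assuming each turn is persisted to your own store first (db.transcripts is a placeholder for that storage):

javascript
// Sketch: persist every turn, but keep only a rolling window in the live prompt.
const WINDOW = 8; // recent messages kept in assistantConfig

async function recordTurn(callId, message) {
  await db.transcripts.insertOne({ callId, ...message, at: new Date() }); // full history in your DB

  const [systemPrompt, ...rest] = assistantConfig.model.messages;
  assistantConfig.model.messages = [systemPrompt, ...rest.slice(-WINDOW)];
}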

Platform Comparison

Should I use Twilio or VAPI for voice AI with memory?

VAPI handles prompt management, transcription, and TTS natively—use it for the core voice pipeline. Twilio excels at call routing, PSTN integration, and compliance (HIPAA, PCI). In production, VAPI manages the AI conversation; Twilio manages the phone infrastructure. Use Twilio's CallSid to link calls to your database, then pass contact context to VAPI via metadata. This separation prevents vendor lock-in and lets each platform do what it does best.
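
For the CallSid linking step, here's a sketch of a standard Twilio voice webhook handler (Twilio posts CallSid and From as form fields). db.contacts and db.callLinks are placeholders for your own storage, and the hand-off to Vapi itself depends on how your numbers are wired:

javascript
// Sketch: link Twilio's CallSid to your contact record so later Vapi events
// and analytics can be joined back to the same caller.
const express = require('express');
const app = express();
app.use(express.urlencoded({ extended: false })); // Twilio posts form-encoded bodies

app.post('/webhook/twilio/voice', async (req, res) => {
  const { CallSid, From } = req.body; // standard Twilio voice webhook fields

  const contact = await db.contacts.findOne({ phone: From });
  await db.callLinks.insertOne({ callSid: CallSid, contactId: contact?._id, at: new Date() });

  // Hand off to your Vapi-connected number / TwiML from here (integration-specific)
  res.type('text/xml').send('<Response/>');
});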

Can I use the same prompt across voice and text channels?

No. Voice prompts must be 30-40% shorter and include explicit silence/interruption handling. Text prompts can be verbose and assume the user reads carefully. Maintain two versions of your system prompt: one for voice (concise, turn-taking rules), one for text (detailed, formatting-aware). Store both in your database and select the correct version based on the channel in assistantConfig.messages.

Resources

Twilio: Get Twilio Voice API → https://www.twilio.com/try-twilio


Written by

Misal Azeem

Voice AI Engineer & Creator

Building production voice AI systems and sharing what I learn. Focused on VAPI, LLM integrations, and real-time communication. Documenting the challenges most tutorials skip.

VAPI · Voice AI · LLM Integration · WebRTC

Found this helpful?

Share it with other developers building voice AI.