How to Set Up Voice AI for Scheduling Appointments with Calendly Using Twilio

TL;DR

Voice AI scheduling breaks when Twilio's call state and Calendly's availability drift out of sync. Build a conversational AI agent using VAPI that handles real-time calendar queries via Calendly's API, processes voice commands through Twilio, and manages double-booking race conditions with state locking. Result: callers book appointments mid-conversation without manual confirmation loops.

Prerequisites

API Keys & Credentials

You need a Twilio Account SID and Auth Token (grab them from console.twilio.com). Generate a Twilio API Key for programmatic access; don't use the account Auth Token in production. For Calendly, create a personal access token via calendly.com/integrations/api (requires Calendly Professional or higher). Store all credentials in a .env file and read them through process.env.
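A startup check can catch a missing credential before the first call arrives. The sketch below assumes the variable names used in this guide's code samples (TWILIO_ACCOUNT_SID, VAPI_API_KEY, CALENDLY_TOKEN); the helper name findMissingEnv is hypothetical.

```javascript
// Names taken from the code samples in this guide; adjust to match your own .env.
const REQUIRED_ENV = ['TWILIO_ACCOUNT_SID', 'VAPI_API_KEY', 'CALENDLY_TOKEN'];

// Returns the names of required variables that are unset or blank.
function findMissingEnv(required, env = process.env) {
  return required.filter((name) => !env[name] || env[name].trim() === '');
}

const missing = findMissingEnv(REQUIRED_ENV);
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(', ')}`);
}
```

Running this at the top of the server file turns a confusing 401 in the middle of a call into a clear warning at boot.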

VAPI Setup

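Sign up at vapi.ai and generate an API key. You'll need this for authentication headers on all API calls. The examples in this guide call the global fetch API, which is built into Node.js 18 and later; on Node.js 16, install a fetch polyfill or use axios 1.4+ instead.

System Requirements

Node.js 18+ (LTS recommended; Node.js 16 works only with a fetch polyfill or axios)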
Twilio SDK 3.x or higher
ngrok or similar tunneling tool for local webhook testing
HTTPS endpoint (required for Twilio webhooks)

Permissions

Calendly token must have calendars:read and event_types:read scopes. Twilio account needs Voice permissions enabled.


Step-by-Step Tutorial

Configuration & Setup

First, provision a Twilio phone number and configure it to forward calls to VAPI. This creates the bridge between Twilio's telephony network and VAPI's voice AI engine.

javascript

// Twilio webhook configuration - YOUR server receives calls here
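const express = require('express');
const app = express();

// Twilio posts webhooks as application/x-www-form-urlencoded
app.use(express.urlencoded({ extended: false }));

app.post('/webhook/twilio-incoming', async (req, res) => {
  const { From, To } = req.body;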
  
  // Forward to VAPI for voice AI processing
  const vapiResponse = await fetch('https://api.vapi.ai/call', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.VAPI_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      assistant: {
        model: { provider: 'openai', model: 'gpt-4' },
        voice: { provider: 'elevenlabs', voiceId: 'rachel' },
        firstMessage: 'Hi, I can help you schedule an appointment. What date works for you?'
      },
      phoneNumber: { twilioPhoneNumber: To, twilioAccountSid: process.env.TWILIO_ACCOUNT_SID },
      customer: { number: From }
    })
  });
  
  if (!vapiResponse.ok) {
    console.error(`VAPI call failed: ${vapiResponse.status}`);
    return res.status(500).send('Call setup failed');
  }
  
  res.status(200).send('Call forwarded to VAPI');
});

Critical: Configure Twilio's webhook URL to point to YOUR server endpoint (/webhook/twilio-incoming), NOT a VAPI endpoint. Twilio calls YOUR server, then YOUR server initiates a VAPI call.

Architecture & Flow

The call flow separates responsibilities cleanly:

Twilio handles telephony (SIP trunking, PSTN connectivity, call routing)
VAPI processes voice AI (STT, LLM reasoning, TTS, function calling)
Your server orchestrates Calendly API calls via VAPI function tools

When the user says "Book me for Tuesday at 2pm", VAPI's function calling triggers your Calendly integration endpoint. Your server queries Calendly's availability API, returns slots to VAPI, and VAPI speaks the options back to the user.

Function Tool Implementation

Configure VAPI to call your Calendly integration when scheduling intent is detected:

javascript

// Assistant config with Calendly function tool
const assistantConfig = {
  model: { provider: 'openai', model: 'gpt-4' },
  voice: { provider: 'elevenlabs', voiceId: 'rachel' },
  tools: [{
    type: 'function',
    function: {
      name: 'check_calendly_availability',
      description: 'Check available time slots for appointment booking',
      parameters: {
        type: 'object',
        properties: {
          date: { type: 'string', description: 'Requested date (YYYY-MM-DD)' },
          duration: { type: 'number', description: 'Meeting duration in minutes' }
        },
        required: ['date', 'duration']
      }
    },
    server: {
      url: `${process.env.SERVER_URL}/calendly/availability`,
      secret: process.env.WEBHOOK_SECRET
    }
  }]
};

Your server endpoint handles the function call:

javascript

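app.post('/calendly/availability', async (req, res) => {
  const toolCall = req.body.message.toolCallList[0];
  // Arguments may arrive as a JSON string or as an object
  const rawArgs = toolCall.function.arguments;
  const { date } = typeof rawArgs === 'string' ? JSON.parse(rawArgs) : rawArgs;

  // Calendly API call - note: Calendly endpoint, not VAPI.
  // fetch has no `params` option, so the query string is built explicitly.
  // The endpoint expects a time range; this assumes one UTC day in the future.
  const query = new URLSearchParams({
    event_type: process.env.CALENDLY_EVENT_TYPE, // full event type URI
    start_time: `${date}T00:00:00Z`,
    end_time: `${date}T23:59:59Z`
  });
  const response = await fetch(`https://api.calendly.com/event_type_available_times?${query}`, {
    headers: { 'Authorization': `Bearer ${process.env.CALENDLY_TOKEN}` }
  });

  if (!response.ok) {
    return res.json({
      results: [{ toolCallId: toolCall.id, result: 'Availability lookup failed. Please try again.' }]
    });
  }

  const slots = await response.json();

  // Return formatted slots to VAPI
  res.json({
    results: [{
      toolCallId: toolCall.id,
      result: `Available slots: ${slots.collection.map(s => s.start_time).join(', ')}`
    }]
  });
});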

Error Handling & Edge Cases

Race condition: User interrupts while VAPI is speaking available slots. Configure transcriber.endpointing to 200ms for faster barge-in detection. Do NOT write manual interruption handlers—VAPI's native config handles this.

Calendly rate limits: Implement exponential backoff if you hit 429 errors. Cache availability responses for 60 seconds to reduce API calls during the same conversation.
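A minimal sketch of both ideas follows. The helper names (backoffDelayMs, TtlCache, withBackoff), the 500 ms base delay, and the 8-second cap are assumptions chosen for illustration; only the 60-second TTL comes from the text above.

```javascript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt, baseMs = 500, capMs = 8000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Minimal TTL cache; `now` is injectable so expiry can be tested without waiting.
class TtlCache {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.entries = new Map();
  }

  get(key, now = Date.now()) {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (now - hit.storedAt >= this.ttlMs) {
      this.entries.delete(key); // expired
      return undefined;
    }
    return hit.value;
  }

  set(key, value, now = Date.now()) {
    this.entries.set(key, { value, storedAt: now });
  }
}

// Retry a request function while it reports HTTP 429, up to maxRetries retries.
async function withBackoff(requestFn, maxRetries = 4) {
  for (let attempt = 0; ; attempt++) {
    const response = await requestFn();
    if (response.status !== 429 || attempt >= maxRetries) return response;
    await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
  }
}

// Cache availability per requested date for 60 seconds.
const availabilityCache = new TtlCache(60000);
```

In the availability handler, a lookup would first try availabilityCache.get(date) and only call Calendly through withBackoff on a miss.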

Webhook signature validation: Always verify VAPI's webhook signature to prevent spoofed function calls:

javascript

const crypto = require('crypto');

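function validateWebhook(req) {
  const signature = req.headers['x-vapi-signature'];
  if (typeof signature !== 'string') return false; // missing header

  const hash = crypto.createHmac('sha256', process.env.WEBHOOK_SECRET)
    .update(JSON.stringify(req.body))
    .digest('hex');

  // Constant-time comparison; timingSafeEqual requires equal-length buffers
  const expected = Buffer.from(hash, 'utf8');
  const received = Buffer.from(signature, 'utf8');
  return expected.length === received.length && crypto.timingSafeEqual(expected, received);
}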

This architecture keeps Twilio handling telephony, VAPI managing conversational AI, and your server orchestrating Calendly—no component does double duty.

System Diagram

Call flow from phone call initiation through speech processing to call termination, including error paths.

mermaid

graph LR
    Start[Phone Call Initiation]
    Number[Phone Number Setup]
    Inbound[Inbound Call Handling]
    Outbound[Outbound Call Handling]
    VAD[Voice Activity Detection]
    STT[Speech-to-Text]
    NLU[Intent Detection]
    LLM[Response Generation]
    TTS[Text-to-Speech]
    End[Call Termination]
    Error[Error Handling]

    Start-->Number
    Number-->Inbound
    Number-->Outbound
    Inbound-->VAD
    Outbound-->VAD
    VAD-->STT
    STT-->NLU
    NLU-->LLM
    LLM-->TTS
    TTS-->End
    Inbound-->|Connection Error|Error
    Outbound-->|Connection Error|Error
    STT-->|Recognition Error|Error
    NLU-->|Intent Error|Error
    Error-->End

Testing & Validation

Local Testing

Most voice AI integrations break because developers skip local webhook testing. Use ngrok to expose your Express server and validate the full flow before deploying.

javascript

// Start ngrok tunnel (run in terminal)
// ngrok http 3000

const crypto = require('crypto');

// Test webhook signature validation
const testPayload = {
  message: {
    type: 'function-call',
    functionCall: {
      name: 'bookAppointment',
      parameters: { date: '2024-01-15T14:00:00Z', duration: 30 }
    }
  }
};

const testSignature = crypto
  .createHmac('sha256', process.env.VAPI_SERVER_SECRET)
  .update(JSON.stringify(testPayload))
  .digest('hex');

// Send test request (wrapped in an async function because CommonJS has no top-level await)
async function sendTestWebhook() {
  const response = await fetch('https://YOUR_NGROK_URL/webhook/vapi', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-vapi-signature': testSignature
    },
    body: JSON.stringify(testPayload)
  });

  console.log('Webhook status:', response.status); // Must be 200
  console.log('Response:', await response.json());
}

sendTestWebhook().catch(console.error);

This will bite you: Webhook signature validation fails if you modify the request body before validation. Always validate FIRST, then parse.
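One way to follow the validate-first rule is to compute the HMAC over the raw request bytes rather than over a re-serialized object. The sketch below assumes the hex HMAC-SHA256 scheme and the secret handling used elsewhere in this guide; the function names are hypothetical.

```javascript
const crypto = require('crypto');

// Hex HMAC-SHA256 of the raw request bytes.
function signRawBody(rawBody, secret) {
  return crypto.createHmac('sha256', secret).update(rawBody).digest('hex');
}

// Constant-time comparison; rejects a missing or wrong-length signature.
function isValidSignature(rawBody, signature, secret) {
  if (typeof signature !== 'string') return false;
  const expected = Buffer.from(signRawBody(rawBody, secret), 'utf8');
  const received = Buffer.from(signature, 'utf8');
  return expected.length === received.length &&
    crypto.timingSafeEqual(expected, received);
}

// Validate first, parse second. Returns null when the signature does not match.
function parseVerifiedWebhook(rawBody, signature, secret) {
  if (!isValidSignature(rawBody, signature, secret)) return null;
  return JSON.parse(rawBody.toString('utf8'));
}
```

In Express, the raw bytes can be captured with the verify option of express.json(), which receives the request buffer before it is parsed. Any whitespace or key-order change between the bytes that were signed and the bytes that are checked produces a different hash.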

Webhook Validation

Click Call in the Vapi dashboard to trigger a live test. Monitor your server logs for function-call events. If Calendly slots don't appear, check that results array matches the exact structure from your /availability endpoint—mismatched property names cause silent failures.
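Because shape mismatches fail silently, a quick structural check in development can surface them. This sketch assumes the { results: [{ toolCallId, result }] } shape returned by the /calendly/availability handler in this guide; the function name is hypothetical.

```javascript
// Returns a list of problems with a tool response body; an empty list means it looks right.
function findToolResponseProblems(body) {
  if (!body || !Array.isArray(body.results) || body.results.length === 0) {
    return ['results must be a non-empty array'];
  }
  const problems = [];
  body.results.forEach((entry, index) => {
    if (!entry || typeof entry !== 'object') {
      problems.push(`results[${index}] must be an object`);
      return;
    }
    if (typeof entry.toolCallId !== 'string' || entry.toolCallId === '') {
      problems.push(`results[${index}].toolCallId must be a non-empty string`);
    }
    if (typeof entry.result !== 'string') {
      problems.push(`results[${index}].result must be a string`);
    }
  });
  return problems;
}
```

Logging the returned problems just before res.json() makes a misspelled property name such as toolCallID visible in the server logs.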

Real-World Example

Barge-In Scenario

User calls in: "Schedule a meeting with John next Tuesday at 2pm." Mid-sentence, the agent starts listing available slots. User interrupts: "No, I said Tuesday, not Thursday."

This breaks 80% of voice scheduling implementations. Here's why: STT fires partial transcripts while TTS is still streaming audio. Without proper turn-taking logic, you get overlapping responses or the agent ignoring the correction.

javascript

// Handle barge-in with turn-taking state machine
let isAgentSpeaking = false;
let pendingUserInput = null;

app.post('/webhook', async (req, res) => {
  const { message } = req.body;
  
  if (message.type === 'speech-update') {
    // User started speaking while agent talks
    if (isAgentSpeaking && message.status === 'started') {
      isAgentSpeaking = false;
      pendingUserInput = message.transcript;
      
      // Cancel current TTS stream
      return res.json({
        action: 'interrupt',
        response: '' // Stop agent immediately
      });
    }
  }
  
  if (message.type === 'function-call' && message.functionCall.name === 'checkAvailability') {
    isAgentSpeaking = true;
    const { date } = message.functionCall.parameters;
    
    // Check if user interrupted with correction
    if (pendingUserInput?.includes('Tuesday') && date.includes('Thursday')) {
      pendingUserInput = null;
      return res.json({
        results: [{ error: 'User corrected date to Tuesday' }],
        message: "Got it, checking Tuesday instead."
      });
    }
  }
  
  res.sendStatus(200);
});

Event Logs

Real webhook payload when user interrupts:

json

{
  "message": {
    "type": "speech-update",
    "status": "started",
    "transcript": "No I said Tuesday",
    "timestamp": "2024-01-15T14:23:47.382Z",
    "call": { "id": "call_abc123" }
  }
}

200ms later, function call arrives with wrong date. Your webhook MUST check pendingUserInput before querying Calendly.

Edge Cases

Multiple rapid interruptions: User says "Tuesday... wait, Wednesday... actually Thursday." Solution: debounce STT partials with 500ms window. Only process final transcript.
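The debounce idea can be sketched as a pure function over timestamped partials. The { text, at } input shape and the name collapseBursts are assumptions made for illustration; a live handler would get the same effect with setTimeout and clearTimeout.

```javascript
// Each partial is { text, at }, where `at` is a millisecond timestamp.
// Partials closer together than windowMs belong to one burst; only the
// last partial of each burst is kept as the transcript to act on.
function collapseBursts(partials, windowMs = 500) {
  const settled = [];
  for (let i = 0; i < partials.length; i++) {
    const next = partials[i + 1];
    const burstEnded = !next || next.at - partials[i].at >= windowMs;
    if (burstEnded) settled.push(partials[i].text);
  }
  return settled;
}

const partials = [
  { text: 'Tuesday', at: 0 },
  { text: 'wait, Wednesday', at: 300 },
  { text: 'actually Thursday', at: 650 },
  { text: 'at 2pm', at: 2000 }
];
console.log(collapseBursts(partials)); // [ 'actually Thursday', 'at 2pm' ]
```

The first three partials arrive less than 500 ms apart, so only the last of them survives; the fourth arrives after a long pause and forms its own burst.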

False positives: Background noise triggers barge-in. Validate transcript length (min 3 words) before canceling agent speech.
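A minimal sketch of the word-count check follows. The function names are hypothetical, and the 3-word minimum is the threshold suggested above.

```javascript
// Counts whitespace-separated words in a transcript.
function wordCount(transcript) {
  const trimmed = (transcript || '').trim();
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

// Treat speech as a real interruption only when it has at least minWords words.
function isLikelyBargeIn(transcript, minWords = 3) {
  return wordCount(transcript) >= minWords;
}
```

The trade-off is that a short but valid correction such as "No, Tuesday" falls under the threshold, so the minimum may need tuning against real call transcripts.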

Calendly rate limits: User interrupts 5 times in 10 seconds. Cache availability results for 30s to avoid hitting Calendly's 100 req/min limit.

Common Issues & Fixes

Race Condition: Calendly API Called Before User Confirms

Most implementations break when the assistant fires the Calendly API call while the user is still speaking. This happens because Vapi's function calling triggers on partial transcripts, not final confirmation.

The Problem: User says "Book me for Tuesday at 2pm" → Function fires → User adds "Actually, make it 3pm" → Two slots get reserved.

javascript

// Production fix: Add confirmation state guard
let pendingUserInput = null;
let isAgentSpeaking = false;

app.post('/webhook/vapi', (req, res) => {
  const { message } = req.body;
  
  if (message.type === 'function-call' && message.functionCall.name === 'scheduleAppointment') {
    // Block if agent is mid-sentence or waiting for confirmation
    if (isAgentSpeaking || pendingUserInput) {
      return res.json({ 
        error: 'Agent still processing previous request' 
      });
    }
    
    // Store params, don't execute yet
    pendingUserInput = message.functionCall.parameters;
    isAgentSpeaking = true;
    
    return res.json({
      result: 'Confirming details before booking...'
    });
  }
  
  // Only execute after explicit "yes" transcript
  if (message.type === 'transcript' && 
      message.transcript.toLowerCase().includes('yes') && 
      pendingUserInput) {
    
    // NOW call Calendly API
    const params = pendingUserInput;
    pendingUserInput = null;
    isAgentSpeaking = false;
    
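    // Proceed with booking using params...
    return res.sendStatus(200);
  }

  // Acknowledge all other events so the request does not hang
  res.sendStatus(200);
});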

Why This Breaks: Vapi's endpointing config defaults to 300ms silence detection. On mobile networks, jitter causes false triggers at 150-400ms variance. Increase to 500ms minimum for phone calls.

Twilio Webhook Timeout (5s Hard Limit)

Calendly's /scheduling_links endpoint averages 2.8s response time. Add Vapi processing (800ms) + network overhead = timeout.

Fix: Return 200 immediately, process async:

javascript

app.post('/webhook/vapi', async (req, res) => {
  res.status(200).json({ result: 'Processing...' });
  
  // Process after response sent
  setImmediate(async () => {
    try {
      const response = await fetch('https://api.calendly.com/scheduling_links', {
        method: 'POST',
        headers: {
          'Authorization': 'Bearer ' + process.env.CALENDLY_TOKEN,
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({ max_event_count: 1, owner: process.env.CALENDLY_USER })
      });
      
      if (!response.ok) throw new Error(`Calendly API error: ${response.status}`);
      
      // Send result back via Vapi message endpoint
    } catch (error) {
      console.error('Async booking failed:', error);
    }
  });
});

Invalid Phone Number Format (E.164 Violations)

Malformed numbers are a common cause of rejected outbound calls. Users say "555-1234", but Twilio expects E.164 format such as "+15555551234": a plus sign, the country code, and the digits, with no hyphens or spaces.

Production validator:

javascript

function normalizePhone(input) {
  // Strip everything except digits
  const digits = input.replace(/\D/g, '');
  
  // US numbers: add +1 if missing
  if (digits.length === 10) return `+1${digits}`;
  if (digits.length === 11 && digits[0] === '1') return `+${digits}`;
  
  throw new Error('Invalid phone format. Need 10-digit US number.');
}

Summary:

Guard function calls with confirmation state to prevent double-booking
Return webhook responses under 5s (use async processing for slow APIs)
Validate phone numbers to E.164 before passing to Twilio/Calendly
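The confirmation guard protects a single call, but two different callers can still race for the same slot. A minimal sketch of slot-level locking follows, assuming a single Node.js process; the SlotLocks class, the slot key format, and the 30-second TTL are hypothetical.

```javascript
// In-memory lock table: slot key -> { owner, expiresAt }.
// Single-process only; several servers would need a shared store instead.
class SlotLocks {
  constructor(ttlMs = 30000) {
    this.ttlMs = ttlMs;
    this.locks = new Map();
  }

  // Try to reserve a slot for a call. Returns true if this call now holds it.
  acquire(slotKey, callId, now = Date.now()) {
    const held = this.locks.get(slotKey);
    if (held && held.expiresAt > now && held.owner !== callId) return false;
    this.locks.set(slotKey, { owner: callId, expiresAt: now + this.ttlMs });
    return true;
  }

  // Release only if the caller still owns the lock.
  release(slotKey, callId) {
    const held = this.locks.get(slotKey);
    if (held && held.owner === callId) this.locks.delete(slotKey);
  }
}
```

A handler would call acquire before reading the slot back to the caller and release after the booking succeeds or the caller declines. The TTL covers calls that drop without releasing.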

Complete Working Example

Here's the full production server that handles Twilio voice calls, VAPI assistant creation, and Calendly webhook processing. This code runs on Node.js with Express and processes real appointment scheduling requests.

Full Server Code

javascript

const express = require('express');
const crypto = require('crypto');
const app = express();

app.use(express.json());
app.use(express.urlencoded({ extended: true }));

// Session state tracking
const sessions = new Map();
const SESSION_TTL = 1800000; // 30 minutes

// Validate VAPI webhook signatures
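function validateWebhook(req) {
  const signature = req.headers['x-vapi-signature'];
  if (typeof signature !== 'string') return false;
  const hash = crypto
    .createHmac('sha256', process.env.VAPI_SERVER_SECRET)
    .update(JSON.stringify(req.body))
    .digest('hex');
  // Constant-time comparison; buffers must be the same length
  const expected = Buffer.from(hash, 'utf8');
  const received = Buffer.from(signature, 'utf8');
  return expected.length === received.length && crypto.timingSafeEqual(expected, received);
}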

// Normalize phone numbers to E.164
function normalizePhone(digits) {
  const cleaned = digits.replace(/\D/g, '');
  return cleaned.startsWith('1') ? `+${cleaned}` : `+1${cleaned}`;
}

// Twilio incoming call handler
app.post('/voice/incoming', async (req, res) => {
  const callSid = req.body.CallSid;
  const from = normalizePhone(req.body.From);
  
  sessions.set(callSid, {
    phoneNumber: from,
    createdAt: Date.now(),
    isAgentSpeaking: false,
    pendingUserInput: null
  });
  
  setTimeout(() => sessions.delete(callSid), SESSION_TTL);
  
  // Create VAPI assistant for this call
  const assistantConfig = {
    model: {
      provider: "openai",
      model: "gpt-4",
      messages: [{
        role: "system",
        content: "You are a scheduling assistant. Ask for preferred date, time, and duration. Use the scheduleAppointment function when you have all details."
      }]
    },
    voice: {
      provider: "11labs",
      voiceId: "21m00Tcm4TlvDq8ikWAM"
    },
    firstMessage: "Hi! I can help you schedule an appointment. What date works best for you?",
    transcriber: {
      provider: "deepgram",
      model: "nova-2",
      language: "en"
    },
    serverUrl: process.env.SERVER_URL + '/webhook/vapi',
    serverUrlSecret: process.env.VAPI_SERVER_SECRET
  };
  
  try {
    const vapiResponse = await fetch('https://api.vapi.ai/assistant', {
      method: 'POST',
      headers: {
        'Authorization': 'Bearer ' + process.env.VAPI_API_KEY,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify(assistantConfig)
    });
    
    if (!vapiResponse.ok) {
      throw new Error(`VAPI API error: ${vapiResponse.status}`);
    }
    
    const assistant = await vapiResponse.json();
    sessions.get(callSid).assistantId = assistant.id;
    
    // Start VAPI call
    const callResponse = await fetch('https://api.vapi.ai/call/phone', {
      method: 'POST',
      headers: {
        'Authorization': 'Bearer ' + process.env.VAPI_API_KEY,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        assistantId: assistant.id,
        customer: { number: from },
        phoneNumber: { twilioPhoneNumber: req.body.To }
      })
    });
    
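    if (!callResponse.ok) {
      throw new Error(`Call creation failed: ${callResponse.status}`);
    }

    // VAPI webhooks identify the call by VAPI's call ID, so index the session under it too
    const vapiCall = await callResponse.json();
    sessions.set(vapiCall.id, sessions.get(callSid));
    setTimeout(() => sessions.delete(vapiCall.id), SESSION_TTL);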
    
    res.type('text/xml').send('<Response><Say>Connecting you to our scheduling assistant.</Say></Response>');
  } catch (error) {
    console.error('Call setup error:', error);
    res.type('text/xml').send('<Response><Say>System error. Please try again.</Say><Hangup/></Response>');
  }
});

// VAPI webhook handler
app.post('/webhook/vapi', async (req, res) => {
  if (!validateWebhook(req)) {
    return res.status(401).json({ error: 'Invalid signature' });
  }
  
  const { message } = req.body;
  // The call object is nested inside message, as in the event log shown earlier
  const session = sessions.get(message.call?.id);
  
  if (!session) {
    return res.status(404).json({ error: 'Session not found' });
  }
  
  // Handle function calls
  if (message.type === 'function-call') {
    const { functionCall } = message;
    
    if (functionCall.name === 'scheduleAppointment') {
      const params = functionCall.parameters;
      
      try {
        // Create Calendly event via API
        const response = await fetch('https://api.calendly.com/scheduled_events', {
          method: 'POST',
          headers: {
            'Authorization': 'Bearer ' + process.env.CALENDLY_TOKEN,
            'Content-Type': 'application/json'
          },
          body: JSON.stringify({
            event_type: process.env.CALENDLY_EVENT_TYPE_UUID,
            start_time: params.date + 'T' + params.time + ':00',
            invitee: {
              name: params.name || 'Phone Caller',
              email: params.email || `caller-${Date.now()}@placeholder.com`,
              phone: session.phoneNumber
            }
          })
        });
        
        if (!response.ok) {
          const error = await response.json();
          return res.json({
            results: [{
              status: 'failed',
              error: error.message || 'Booking failed'
            }]
          });
        }
        
        const booking = await response.json();
        
        return res.json({
          results: [{
            status: 'success',
            message: `Appointment confirmed for ${params.date} at ${params.time}. You'll receive a confirmation shortly.`,
            bookingUrl: booking.resource.scheduling_url
          }]
        });
      } catch (error) {
        console.error('Calendly API error:', error);
        return res.json({
          results: [{
            status: 'failed',
            error: 'Unable to create appointment. Please try again.'
          }]
        });
      }
    }
  }
  
  // Track agent speaking state
  if (message.type === 'speech-start') {
    session.isAgentSpeaking = true;
  } else if (message.type === 'speech-end') {
    session.isAgentSpeaking = false;
  }
  
  res.json({ received: true });
});

// Health check
app.get('/health', (req, res) => {
  res.json({ 
    status: 'ok',
    activeSessions: sessions.size,
    uptime: process.uptime()
  });
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`);
  console.log(`Webhook URL: ${process.env.SERVER_URL}/webhook/vapi`);
});

Run Instructions

Environment variables (create .env file):

bash

VAPI_API_KEY=your_vapi_key
VAPI_SERVER_SECRET=your_webhook_secret
CALENDLY_TOKEN=your_calendly_pat
CALENDLY_EVENT_TYPE_UUID=your_event_type_id
SERVER_URL=https://your-domain.ngrok.io
PORT=3000

Start the server:

bash

npm install express dotenv
node -r dotenv/config server.js


FAQ

Technical Questions

How does voice command scheduling work with Twilio and Calendly?

When a user calls your Twilio number, the call routes to VAPI, which handles speech-to-text transcription and voice synthesis. VAPI's function calling triggers your server endpoint, which queries Calendly's API for available slots matching the user's requested date and duration. Your server returns available times, VAPI reads them aloud, and when the user confirms, your server books the appointment via Calendly's API and returns confirmation details back through VAPI to the caller.

What authentication method does Calendly require?

Calendly uses personal access token authentication. You generate a token in your Calendly account settings and pass it in the Authorization: Bearer header when making API requests. Store this token in process.env.CALENDLY_TOKEN, the variable name used in the code samples. Never expose it in client-side code: all Calendly API calls must originate from your backend server to keep credentials secure.

Why do I need both Twilio and VAPI if they both handle voice?

Twilio handles the telephony layer (receiving calls, managing phone numbers, DTMF input). VAPI handles the conversational AI layer (speech recognition, natural language understanding, function calling, voice synthesis). Twilio sends the inbound call webhook to your server, your server starts the VAPI call, and VAPI manages the conversation flow. They're complementary: Twilio is the carrier, VAPI is the brain.

Performance

What's the typical latency for booking an appointment?

End-to-end latency depends on three factors: STT processing (200-800ms), Calendly API response (300-600ms), and TTS synthesis (400-1200ms). Total user-perceived delay is usually 1.5-3 seconds from when they finish speaking to when they hear confirmation. Network jitter on mobile can add 200-400ms. Use VAPI's partial transcript feature to start reading available slots before the user finishes speaking—this masks latency.

How many concurrent calls can this handle?

Scaling depends on your Calendly plan (API rate limits) and server capacity. Calendly's standard tier allows ~100 requests/minute. If each call makes 2-3 API requests (check availability, book, confirm), that budget covers roughly 33-50 calls per minute; how many calls can be in progress at once also depends on call length. Use connection pooling and async/await to prevent blocking. Monitor webhook response times, and if they exceed 5 seconds, implement async processing with job queues.
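The estimate comes from dividing the rate limit by the requests each call makes, which bounds calls per minute rather than calls in progress at the same moment. A small sketch of that arithmetic, using the 100 requests/minute figure quoted above (the function name is hypothetical):

```javascript
// Whole calls per minute that fit inside an API rate limit.
function callsPerMinute(rateLimitPerMinute, requestsPerCall) {
  return Math.floor(rateLimitPerMinute / requestsPerCall);
}

const limit = 100; // requests/minute figure quoted above
console.log(callsPerMinute(limit, 3)); // 33
console.log(callsPerMinute(limit, 2)); // 50
```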

Platform Comparison

Should I use Calendly's native integrations instead?

Calendly's native Zapier/Make integrations don't support voice input—they're designed for form submissions and email triggers. Voice AI scheduling requires custom logic to handle ambiguous user input ("next Tuesday afternoon" → parse to specific date/time), handle conflicts gracefully, and read options aloud. Building with VAPI + Twilio gives you full control over the conversation flow and error handling that Calendly's native tools can't provide.
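As one example of that custom logic, a spoken weekday has to be resolved to a concrete date before any availability query. The sketch below is a simplified, UTC-only resolver; the function name is hypothetical, and the choice that a weekday matching today means the following week is an assumption.

```javascript
const WEEKDAYS = ['sunday', 'monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday'];

// Resolve a weekday name to the next matching date (YYYY-MM-DD, in UTC).
function nextWeekday(weekdayName, from = new Date()) {
  const target = WEEKDAYS.indexOf(weekdayName.toLowerCase());
  if (target === -1) throw new Error(`Unknown weekday: ${weekdayName}`);

  // Days ahead is always 1..7, so "Tuesday" said on a Tuesday means next week.
  const daysAhead = ((target - from.getUTCDay() + 7) % 7) || 7;
  const result = new Date(Date.UTC(
    from.getUTCFullYear(),
    from.getUTCMonth(),
    from.getUTCDate() + daysAhead
  ));
  return result.toISOString().slice(0, 10);
}

// 2024-01-15 is a Monday.
console.log(nextWeekday('Tuesday', new Date('2024-01-15T12:00:00Z'))); // 2024-01-16
```

A production version would resolve dates in the caller's time zone and handle phrases like "afternoon", which this sketch leaves to the LLM or a dedicated date parser.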

Resources

Twilio: Get Twilio Voice API → https://www.twilio.com/try-twilio

Official Documentation

VAPI Voice AI API – Assistant configuration, function calling, webhook events
Twilio Voice API – Call handling, DTMF routing, webhook integration
Calendly API – Availability endpoints, event creation, personal access token authentication

GitHub & Integration Examples

VAPI Twilio Integration – Server SDK for webhook handling
Calendly Node.js Client – Community library for real-time calendar availability queries

Key Integration Patterns

Webhook signature validation using crypto.createHmac() for security
Function calling payloads for conversational AI agents to trigger appointment booking
Session state management across Twilio call lifecycle and VAPI agent responses

