How to Build a Voice Bot for HVAC Customer Inquiries with VAPI
TL;DR
Most HVAC support lines drop 40% of inbound calls during peak season. Build a voice bot that handles appointment scheduling, service history lookups, and emergency triage without human intervention. Stack VAPI's LLM-powered agent with Twilio's SIP trunking to route calls directly into your existing phone infrastructure. Result: 24/7 availability, zero dropped calls, and your team handles only complex issues.
Prerequisites
API Keys & Credentials
You'll need a VAPI API key (grab it from your VAPI dashboard) and a Twilio account with an active phone number and API credentials (Account SID and Auth Token). Store these in a .env file—never hardcode them.
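A minimal sketch of a startup check for those credentials, assuming the variable names used in the configuration later in this tutorial. `findMissingEnv` is a hypothetical helper, not part of the VAPI or Twilio SDKs.

```javascript
// Variable names assumed from the config object used later in this tutorial.
const REQUIRED_ENV = [
  'VAPI_API_KEY',
  'TWILIO_ACCOUNT_SID',
  'TWILIO_AUTH_TOKEN',
  'TWILIO_PHONE_NUMBER',
];

// Hypothetical helper: list the required variables that are absent or blank.
function findMissingEnv(env, required = REQUIRED_ENV) {
  return required.filter((name) => !env[name] || env[name].trim() === '');
}

// Usage at startup: fail fast instead of discovering a missing key mid-call.
// const missing = findMissingEnv(process.env);
// if (missing.length > 0) throw new Error(`Missing env vars: ${missing.join(', ')}`);
```

Checking once at boot keeps a missing key from surfacing as a confusing 401 during a live call.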
System Requirements
Node.js 18+ with npm or yarn (the barge-in example relies on the built-in fetch, which Node 16 lacks). A server capable of receiving webhooks (ngrok works for local testing, but use a real domain in production). HTTPS is mandatory: Twilio and VAPI reject HTTP callbacks.
Knowledge Assumptions
You should be comfortable with REST APIs, async/await in JavaScript, and basic webhook handling. Understanding SIP trunking concepts helps but isn't required. Familiarity with prompt engineering for LLMs is useful for tuning bot responses, but we'll cover the essentials.
Optional but Recommended
Postman or similar tool for testing API requests before integration. A basic understanding of how voice AI agents handle call routing and state management will accelerate your implementation.
Step-by-Step Tutorial
Configuration & Setup
First, provision a Twilio phone number and grab your VAPI API key from the dashboard. You'll need both to route calls through VAPI's infrastructure.
// Environment configuration
const config = {
  vapiApiKey: process.env.VAPI_API_KEY,
  twilioAccountSid: process.env.TWILIO_ACCOUNT_SID,
  twilioAuthToken: process.env.TWILIO_AUTH_TOKEN,
  twilioPhoneNumber: process.env.TWILIO_PHONE_NUMBER,
  serverUrl: process.env.SERVER_URL // Your ngrok/production URL
};
Critical: VAPI handles the voice AI layer. Twilio handles telephony routing. Don't try to merge these responsibilities—you'll create race conditions where both platforms try to process the same audio stream.
Architecture & Flow
flowchart LR
A[Customer Calls] --> B[Twilio Number]
B --> C[VAPI Assistant]
C --> D[Function Call: Check Availability]
D --> E[Your Server /webhook]
E --> F[HVAC Scheduling DB]
F --> E
E --> C
C --> A
The flow is linear: Twilio receives the call and forwards it to VAPI. When the customer wants to schedule, VAPI triggers a function call to your server. Your server queries availability and returns slots, and VAPI speaks the response.
Step-by-Step Implementation
Step 1: Create the HVAC Assistant
Configure the assistant with domain-specific context. This isn't a generic chatbot—it needs to understand HVAC terminology and handle service requests.
const assistantConfig = {
  name: "HVAC Support Assistant",
  model: {
    provider: "openai",
    model: "gpt-4",
    temperature: 0.7,
    // The system prompt goes in messages, matching the complete example later on
    messages: [{
      role: "system",
      content: `You are an HVAC service assistant. Handle:
- Emergency repairs (no heat/AC, gas leaks)
- Routine maintenance scheduling
- Service history inquiries
- Warranty questions
For emergencies, prioritize same-day dispatch. For maintenance, offer 3 available slots.
Always collect: address, issue description, preferred contact method.`
    }]
  },
  voice: {
    provider: "11labs",
    voiceId: "21m00Tcm4TlvDq8ikWAM", // ElevenLabs voice ID; replace with your preferred voice
    stability: 0.5,
    similarityBoost: 0.75
  },
  transcriber: {
    provider: "deepgram",
    model: "nova-2",
    language: "en-US"
  },
  firstMessage: "Thanks for calling ABC HVAC. Are you experiencing an emergency, or would you like to schedule maintenance?"
};
Step 2: Add Function Calling for Scheduling
The assistant needs to query your scheduling system. Define the function schema:
assistantConfig.functions = [
{
name: "check_availability",
description: "Check available service slots for HVAC appointments",
parameters: {
type: "object",
properties: {
serviceType: {
type: "string",
enum: ["emergency", "maintenance", "inspection"],
description: "Type of service needed"
},
zipCode: {
type: "string",
description: "Customer's zip code for technician routing"
},
preferredDate: {
type: "string",
description: "ISO date string (YYYY-MM-DD)"
}
},
required: ["serviceType", "zipCode"]
}
}
];
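The config object still has to be registered with VAPI before a call can use it. The sketch below builds that request, assuming VAPI's REST API accepts the assistant config as a JSON body on `POST https://api.vapi.ai/assistant` with the same Bearer-token header used elsewhere in this tutorial. Treat the endpoint path and the helper name `buildCreateAssistantRequest` as assumptions to check against the API reference.

```javascript
// Assumed endpoint; confirm the path and body shape in the VAPI API reference.
const VAPI_ASSISTANT_URL = 'https://api.vapi.ai/assistant';

// Hypothetical helper: build the HTTP request without sending it,
// so the payload can be inspected or logged first.
function buildCreateAssistantRequest(assistantConfig, apiKey) {
  return {
    url: VAPI_ASSISTANT_URL,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(assistantConfig),
    },
  };
}

// Usage (Node 18+, where fetch is global):
// const { url, options } = buildCreateAssistantRequest(assistantConfig, process.env.VAPI_API_KEY);
// const response = await fetch(url, options);
// if (!response.ok) throw new Error(`Assistant creation failed: ${response.status}`);
```

Separating request construction from sending makes it easy to review exactly what leaves your server before the first live call.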
Step 3: Build the Webhook Handler
Your server receives function calls from VAPI and returns available slots:
// YOUR server endpoint (not a VAPI API endpoint)
app.post('/webhook/vapi', async (req, res) => {
  const { message } = req.body;
  if (message?.type === 'function-call' && message.functionCall?.name === 'check_availability') {
    const { serviceType, zipCode, preferredDate } = message.functionCall.parameters;
    try {
      // Query your scheduling database. db is assumed to be a node-postgres
      // style client whose query() resolves to { rows }.
      const { rows } = await db.query(`
        SELECT time_slot FROM technician_schedule
        WHERE zip_code = $1 AND service_type = $2 AND date >= $3
        ORDER BY date, time_slot
        LIMIT 3
      `, [zipCode, serviceType, preferredDate || new Date()]);
      const slots = rows.map(s => s.time_slot);
      return res.json({
        result: {
          slots,
          message: serviceType === 'emergency'
            ? "We have a technician available within 2 hours."
            : `Available slots: ${slots.join(', ')}`
        }
      });
    } catch (err) {
      console.error('Availability lookup failed:', err);
      return res.status(500).json({ error: 'Availability lookup failed' });
    }
  }
  // Acknowledge every other event so the request never hangs
  res.sendStatus(200);
});
Error Handling & Edge Cases
Barge-in during slot reading: Configure transcriber.endpointing to 200ms. Customers interrupt when they hear their preferred time—don't make them wait through all three slots.
No available slots: Return a fallback message with callback scheduling: "All technicians are booked today. I can have dispatch call you within 30 minutes to arrange service."
Zip code validation: Reject invalid formats immediately. Don't waste LLM tokens on malformed data: if (!/^\d{5}$/.test(zipCode)) return res.status(400).json({error: "Invalid zip"});
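The no-slots fallback can be folded into a small result builder. This is a sketch under the assumption that your lookup returns a plain array of time strings. `buildSlotResult` and the `callbackOffered` flag are hypothetical names, not part of the VAPI payload format.

```javascript
const NO_SLOTS_MESSAGE =
  'All technicians are booked today. I can have dispatch call you within 30 minutes to arrange service.';

// Hypothetical helper: turn a slot list into something the assistant can say.
function buildSlotResult(slots) {
  if (!Array.isArray(slots) || slots.length === 0) {
    // Nothing to offer: fall back to the callback script instead of silence.
    return { slots: [], callbackOffered: true, message: NO_SLOTS_MESSAGE };
  }
  const offered = slots.slice(0, 3); // the system prompt offers three slots
  return {
    slots: offered,
    callbackOffered: false,
    message: `Available slots: ${offered.join(', ')}`,
  };
}
```

Returning a speakable message in both branches means the assistant always has a next line, even when the schedule is full.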
Testing & Validation
Call your Twilio number. Test the emergency path first—it should skip availability checks and route to immediate dispatch. Then test maintenance scheduling with valid/invalid zip codes. Monitor VAPI's dashboard for function call latency. Anything over 800ms feels sluggish to callers.
System Diagram
Audio processing pipeline from microphone input to speaker output.
graph LR
Start[User Input]
AudioCapture[Audio Capture]
NoiseReduction[Noise Reduction]
VAD[Voice Activity Detection]
STT[Speech-to-Text]
IntentDetection[Intent Detection]
DialogManager[Dialog Management]
APIIntegration[API Integration]
ResponseGen[Response Generation]
TTS[Text-to-Speech]
End[User Output]
Error[Error Handling]
Start-->AudioCapture
AudioCapture-->NoiseReduction
NoiseReduction-->VAD
VAD-->STT
STT-->IntentDetection
IntentDetection-->DialogManager
DialogManager-->APIIntegration
APIIntegration-->ResponseGen
ResponseGen-->TTS
TTS-->End
VAD-->|No Voice Detected|Error
STT-->|Transcription Error|Error
APIIntegration-->|API Error|Error
Error-->End
Testing & Validation
Most HVAC voice bots fail in production because devs skip webhook validation. Here's how to test locally before deploying.
Local Testing
Use ngrok to expose your webhook server for real-time testing with VAPI. This catches race conditions that curl tests miss—like when a customer interrupts mid-sentence during service type selection.
// Test webhook handler locally with ngrok
const express = require('express');
const crypto = require('crypto');
const app = express();
app.use(express.json());

// Sample data so the handler runs standalone (replace with a real DB query)
const availableSlots = [
  { date: '2024-01-15', zipCode: '90210', times: ['09:00', '11:00', '14:00'] }
];

app.post('/webhook/vapi', (req, res) => {
  const signature = req.headers['x-vapi-signature'];
  // Note: re-serializing the parsed body only matches if the sender's JSON is byte-identical
  const payload = JSON.stringify(req.body);
  // Validate webhook signature (CRITICAL - rejects requests not signed with your secret)
  const expectedSignature = crypto
    .createHmac('sha256', process.env.VAPI_SERVER_SECRET)
    .update(payload)
    .digest('hex');
  if (signature !== expectedSignature) {
    return res.status(401).json({ error: 'Invalid signature' });
  }
  // Log function call parameters for debugging
  if (req.body.message?.type === 'function-call') {
    const { serviceType, zipCode, preferredDate } = req.body.message.functionCall.parameters;
    console.log('HVAC Request:', { serviceType, zipCode, preferredDate });
    // Simulate slot lookup against the sample data above
    const result = availableSlots.find(slot =>
      slot.date === preferredDate && slot.zipCode === zipCode
    );
    return res.json({ result: result || { available: false } });
  }
  res.sendStatus(200);
});

app.listen(3000, () => console.log('Webhook server running on port 3000'));
Start ngrok: ngrok http 3000. Copy the HTTPS URL to your assistant's serverUrl config. Test with a real call—don't rely on curl alone.
Webhook Validation
Production failure: webhooks time out after 5 seconds. If your slot lookup hits a slow database, VAPI hangs up. One option is to acknowledge immediately and process in the background. Note that the acknowledgement then carries no function result, so the assistant has nothing to read back until you deliver the result some other way:
// Handle slow operations asynchronously
app.post('/webhook/vapi', async (req, res) => {
  // Acknowledge immediately (prevents timeout, but returns no result to the assistant)
  res.sendStatus(200);
  // Process in background
  if (req.body.message?.type === 'function-call') {
    processHVACRequest(req.body.message.functionCall.parameters)
      .catch(err => console.error('Background processing failed:', err));
  }
});
Test edge cases: invalid zip codes (should return error), past dates (should reject), concurrent calls (check for race conditions in session state). Use VAPI's dashboard to inspect call logs—look for function-call events with your parameters.
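The zip and past-date checks can live in one pure function that is easy to unit test before any call is placed. A sketch, assuming `preferredDate` arrives as `YYYY-MM-DD` and that comparing against the current UTC date is acceptable for your service area. `validateAvailabilityRequest` and its error codes are hypothetical names.

```javascript
// Hypothetical validator: returns a list of error codes, empty when the request is usable.
function validateAvailabilityRequest({ zipCode, preferredDate }, today = new Date()) {
  const errors = [];
  if (!/^\d{5}$/.test(zipCode ?? '')) {
    errors.push('invalid_zip');
  }
  if (preferredDate !== undefined) {
    if (!/^\d{4}-\d{2}-\d{2}$/.test(preferredDate)) {
      errors.push('invalid_date');
    } else {
      // YYYY-MM-DD strings sort chronologically, so a string comparison is enough.
      const todayIso = today.toISOString().slice(0, 10); // UTC date
      if (preferredDate < todayIso) {
        errors.push('past_date');
      }
    }
  }
  return errors;
}
```

Passing `today` as a parameter keeps the function deterministic in tests. In the handler you would call it with the function-call parameters and respond early when the list is non-empty.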
Real-World Example
Barge-In Scenario
Customer calls at 2 PM on a Tuesday. Agent starts: "Thank you for calling Arctic Air HVAC. I can help you schedule a service appointment or answer questions about—" Customer interrupts: "My AC stopped working."
This is where most HVAC bots fail. The agent keeps talking over the customer, or worse, processes both the canned intro AND the customer's urgent issue as separate intents. Here's what actually happens in production:
// Streaming STT handler - processes partial transcripts
app.post('/webhook/vapi', async (req, res) => {
const payload = req.body;
if (payload.message?.type === 'transcript' && payload.message.transcriptType === 'partial') {
const partialText = payload.message.transcript.toLowerCase();
// Detect interruption keywords
if (partialText.includes('stopped') || partialText.includes('broken') || partialText.includes('not working')) {
// Cancel current TTS immediately - don't wait for full transcript
await fetch('https://api.vapi.ai/call/' + payload.call.id, {
method: 'PATCH',
headers: {
'Authorization': 'Bearer ' + process.env.VAPI_API_KEY,
'Content-Type': 'application/json'
},
body: JSON.stringify({
assistant: {
firstMessage: null, // Stop intro message
model: {
messages: [{ role: 'system', content: 'Customer has urgent AC failure. Skip intro. Ask: When did it stop working?' }]
}
}
})
});
}
}
res.sendStatus(200);
});
Event Logs (Real Timestamps):
14:00:01.234 - call.started - callId: abc123
14:00:01.456 - assistant.speech.started - "Thank you for calling..."
14:00:03.120 - transcript.partial - "my ac" (confidence: 0.72)
14:00:03.340 - transcript.partial - "my ac stopped" (confidence: 0.89)
14:00:03.341 - PATCH /call/abc123 - Cancel intro, inject urgent prompt
14:00:03.567 - assistant.speech.stopped - Intro cancelled mid-sentence
14:00:03.890 - assistant.speech.started - "When did it stop working?"
Edge Cases
Multiple rapid interruptions: Customer says "stopped working" then immediately adds "and it's 95 degrees." Your partial handler fires twice. Solution: debounce interruptions with 300ms window. If two partials arrive within 300ms, only process the second (more complete) one.
False positives: Background noise triggers VAD. Customer's dog barks, bot thinks they spoke. Set transcriber.endpointing.minVolume to 0.5 (default 0.3 catches breathing). Add confidence threshold: only act on partials with confidence > 0.85.
Latency jitter on mobile: Customer on LTE sees 200-600ms STT delays, so a partial can arrive AFTER the agent finishes speaking. Use server-side timestamp comparison with both values in milliseconds: if (partialTimestamp < speechEndTimestamp + 500) { /* treat as interruption */ }.
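The three edge cases above can be sketched as a small filter: drop low-confidence partials, debounce rapid ones so only the latest fires, and accept partials that land within a grace window after speech ends. All names here (`createInterruptionFilter`, `isInterruption`, the `confidence` field on the partial) are assumptions for illustration. The thresholds are the ones quoted in the text.

```javascript
const DEBOUNCE_MS = 300;     // window for collapsing rapid partials
const MIN_CONFIDENCE = 0.85; // act only on partials above this confidence
const LATE_GRACE_MS = 500;   // tolerance for partials delayed by the network

// Returns a handler that invokes onInterruption once per burst of partials.
// schedule/cancel default to real timers and can be swapped out in tests.
function createInterruptionFilter(onInterruption, schedule = setTimeout, cancel = clearTimeout) {
  let pending = null;
  return function handlePartial(partial) {
    if (!(partial.confidence > MIN_CONFIDENCE)) return false; // likely noise
    if (pending !== null) cancel(pending); // newer partial supersedes the older one
    pending = schedule(() => {
      pending = null;
      onInterruption(partial);
    }, DEBOUNCE_MS);
    return true;
  };
}

// Both timestamps are server-side epoch milliseconds.
function isInterruption(partialTimestampMs, speechEndTimestampMs) {
  return partialTimestampMs < speechEndTimestampMs + LATE_GRACE_MS;
}
```

Keep one filter per call, for example in a Map keyed by call ID, so that bursts from different callers do not cancel each other.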
Common Issues & Fixes
Most HVAC voice bots break when customers interrupt mid-sentence or when STT misinterprets technical terms like "SEER rating" as "sear rating". Here's what actually fails in production.
Race Condition: Duplicate Service Requests
Problem: Customer says "schedule maintenance" while bot is still processing the previous utterance → two calendar entries get created.
Why it breaks: VAPI's webhook fires function-call events asynchronously. If your server doesn't track request state, concurrent calls to your scheduling API will duplicate bookings.
// Production fix: Request deduplication
const processingRequests = new Map();
app.post('/webhook/vapi', async (req, res) => {
  const { call, message } = req.body;
  if (message?.type !== 'function-call' || !message.functionCall) {
    return res.sendStatus(200); // Nothing to deduplicate
  }
  // One in-flight request per call and function name
  const requestKey = `${call.id}-${message.functionCall.name}`;
  // Guard against duplicate processing
  if (processingRequests.has(requestKey)) {
    console.warn(`Duplicate request blocked: ${requestKey}`);
    return res.json({ result: "Processing previous request" });
  }
  processingRequests.set(requestKey, true);
  try {
    const result = await scheduleService(message.functionCall.parameters);
    return res.json({ result });
  } catch (err) {
    console.error('Scheduling failed:', err);
    return res.status(500).json({ error: 'Scheduling failed' });
  } finally {
    // Hold the key for 5s so a late duplicate is still blocked, then allow retries
    setTimeout(() => processingRequests.delete(requestKey), 5000);
  }
});
STT Misinterpretation: HVAC Jargon
Problem: "R-22 refrigerant" transcribes as "are twenty-two refrigerant" → function call fails validation.
Fix: Boost domain terms through the transcriber's keywords list, and mention them in the assistant's firstMessage so callers hear the expected vocabulary:
const assistantConfig = {
model: { provider: "openai", model: "gpt-4" },
transcriber: {
provider: "deepgram",
language: "en",
keywords: ["SEER", "R-22", "R-410A", "HVAC", "BTU"] // Boost recognition
},
firstMessage: "I can help with HVAC service. Common terms: SEER ratings, R-22 refrigerant, BTU capacity."
};
Webhook Timeout: Slow CRM Lookups
Problem: Fetching customer history from Salesforce takes 3-8 seconds → VAPI times out after 5s → call drops.
Fix: Return immediately, process async:
app.post('/webhook/vapi', async (req, res) => {
  const { call, message } = req.body;
  // Acknowledge immediately
  res.json({ result: "Looking up your account..." });
  // Process in background
  fetchCustomerHistory(message.functionCall.parameters.phoneNumber)
    .then(history => {
      // Deliver the history to the active call here; the follow-up request
      // to VAPI is not shown in this tutorial
    })
    .catch(err => console.error('Customer history lookup failed:', err));
});
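An alternative to acknowledging immediately is to race the lookup against a deadline shorter than the 5-second limit quoted above, and answer with a fallback only if the lookup loses. The sketch below shows that pattern. `withTimeout` is a hypothetical helper and the 4000 ms budget in the usage comment is an assumed safety margin, not a VAPI setting.

```javascript
// Resolve with the promise's value if it settles within ms, otherwise with fallback.
function withTimeout(promise, ms, fallback) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  // Clear the timer either way so it never keeps the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Usage inside the webhook handler (fetchCustomerHistory as in the snippet above):
// const history = await withTimeout(
//   fetchCustomerHistory(message.functionCall.parameters.phoneNumber),
//   4000,
//   null
// );
// res.json({ result: history ?? "I'm still pulling up your account. Can I confirm your address?" });
```

With this approach the fast path returns real data in the same response, and only slow lookups degrade to the holding message. The losing lookup keeps running in the background, so cache its result if a later turn can use it.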
Complete Working Example
Here's a full reference server that handles HVAC inquiries, schedules appointments, and processes customer requests. It combines all routes into a single Express server with error handling and webhook validation. The slot table is in-memory, so swap in your database before going live.
Full Server Code
// server.js - Complete HVAC Voice Bot Server
const express = require('express');
const crypto = require('crypto');
const app = express();
app.use(express.json());
// In-memory storage (replace with database in production)
const availableSlots = {
'2024-01-15': ['09:00', '11:00', '14:00', '16:00'],
'2024-01-16': ['10:00', '13:00', '15:00'],
'2024-01-17': ['09:00', '12:00', '14:00', '16:00']
};
const processingRequests = new Map(); // Prevent duplicate processing
// Assistant configuration
const assistantConfig = {
name: "HVAC Support Assistant",
model: {
provider: "openai",
model: "gpt-4",
temperature: 0.7,
messages: [{
role: "system",
content: "You are an HVAC service assistant. Ask for service type, location, and preferred date. Be concise and professional."
}]
},
voice: {
provider: "11labs",
voiceId: "21m00Tcm4TlvDq8ikWAM",
stability: 0.5,
similarityBoost: 0.75
},
transcriber: {
provider: "deepgram",
model: "nova-2",
language: "en-US",
keywords: ["HVAC", "furnace", "AC", "air conditioning", "heating", "repair", "maintenance"]
},
firstMessage: "Hi, this is HVAC Support. What service do you need today?",
serverUrl: process.env.SERVER_URL || "https://your-domain.ngrok.io",
serverUrlSecret: process.env.VAPI_SERVER_SECRET
};
// Function definitions for VAPI
const functions = [{
name: "checkAvailability",
description: "Check available appointment slots for HVAC service",
parameters: {
type: "object",
properties: {
serviceType: {
type: "string",
enum: ["repair", "maintenance", "installation", "emergency"],
description: "Type of HVAC service needed"
},
zipCode: {
type: "string",
description: "Customer's ZIP code"
},
preferredDate: {
type: "string",
description: "Preferred date in YYYY-MM-DD format"
}
},
required: ["serviceType", "zipCode", "preferredDate"]
}
}];
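// Webhook signature validation
function validateWebhookSignature(payload, signature) {
const secret = process.env.VAPI_SERVER_SECRET;
if (!secret) {
console.error('VAPI_SERVER_SECRET not configured');
return false;
}
// A missing header would make Buffer.from() throw
if (typeof signature !== 'string') {
return false;
}
const expectedSignature = crypto
.createHmac('sha256', secret)
.update(JSON.stringify(payload))
.digest('hex');
const received = Buffer.from(signature);
const expected = Buffer.from(expectedSignature);
// timingSafeEqual throws on length mismatch, so compare lengths first
return received.length === expected.length &&
crypto.timingSafeEqual(received, expected);
}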
// Main webhook handler
app.post('/webhook/vapi', async (req, res) => {
const signature = req.headers['x-vapi-signature'];
const payload = req.body;
// Validate webhook signature
if (!validateWebhookSignature(payload, signature)) {
console.error('Invalid webhook signature');
return res.status(401).json({ error: 'Unauthorized' });
}
// Prevent duplicate processing
const requestId = payload.message?.id || `${Date.now()}-${Math.random()}`;
if (processingRequests.has(requestId)) {
return res.status(200).json({ status: 'already_processing' });
}
processingRequests.set(requestId, true);
setTimeout(() => processingRequests.delete(requestId), 30000); // Cleanup after 30s
try {
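const { call, message } = payload;
// The earlier handlers read the event type from the message object; accept either location
const type = message?.type ?? payload.type;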
// Handle function calls
if (type === 'function-call' && message?.functionCall) {
const { name, parameters } = message.functionCall;
if (name === 'checkAvailability') {
const { serviceType, zipCode, preferredDate } = parameters;
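// Validate service area (example: only serve 90210-90299)
// Malformed ZIPs become NaN and are rejected instead of slipping past the range check
const zipNum = /^\d{5}$/.test(zipCode) ? Number(zipCode) : NaN;
if (Number.isNaN(zipNum) || zipNum < 90210 || zipNum > 90299) {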
return res.json({
result: {
success: false,
message: `Sorry, we don't service the ${zipCode} area yet. We currently serve ZIP codes 90210-90299.`
}
});
}
// Check availability
const slots = availableSlots[preferredDate];
if (!slots || slots.length === 0) {
return res.json({
result: {
success: false,
message: `No slots available on ${preferredDate}. Would you like to try ${Object.keys(availableSlots)[0]}?`
}
});
}
// Return available slots
return res.json({
result: {
success: true,
serviceType,
date: preferredDate,
availableSlots: slots,
message: `We have ${slots.length} slots available on ${preferredDate}: ${slots.join(', ')}. Which time works best?`
}
});
}
}
// Handle partial transcripts (for barge-in detection)
if (type === 'transcript' && message?.transcriptType === 'partial') {
const partialText = message.transcript.toLowerCase();
// Detect urgent keywords
if (partialText.includes('emergency') || partialText.includes('urgent')) {
console.log(`[URGENT] Detected emergency keyword in call ${call.id}`);
// Trigger priority routing (implement your logic here)
}
}
// Handle call status updates
if (type === 'status-update') {
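// Status may sit on the message object, like the event type in the earlier handlers
const status = message?.status ?? payload.status;
console.log(`Call ${call.id} status: ${status}`);
if (status === 'ended') {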
// Cleanup session data
processingRequests.delete(call.id);
console.log(`Call ${call.id} ended. Duration: ${call.duration}s`);
}
}
res.status(200).json({ received: true });
} catch (error) {
console.error('Webhook processing error:', error);
processingRequests.delete(requestId);
res.status(500).json({ error: 'Internal server error' });
}
});
// Health check endpoint
app.get('/health', (req, res) => {
res.json({
status: 'healthy',
timestamp: new Date().toISOString(),
activeRequests: processingRequests.size
});
});
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
console.log(`HVAC Voice Bot server running on port ${PORT}`);
console.log(`Webhook URL: ${process.env.SERVER_URL}/webhook/vapi`);
});
Run Instructions
1. Install dependencies (crypto ships with Node.js, so only Express is needed):
npm install express
2. Set environment variables:
export VAPI_SERVER_SECRET="your_webhook_secret_from_vapi_dashboard"
export SERVER_URL="https://your-domain.ngrok.io"
export PORT=3000
3. Start the server:
node server.js
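To exercise the running server without placing a call, you can sign a test payload the same way the server verifies it. This sketch assumes the HMAC-SHA256 over `JSON.stringify(payload)` scheme and the `x-vapi-signature` header used in the server code above. `signPayload`, `buildSignedRequest`, and the sample payload are hypothetical and only shaped like the function-call messages handled earlier in this tutorial.

```javascript
const crypto = require('node:crypto');

// Compute the hex HMAC the server's validateWebhookSignature() expects.
function signPayload(payload, secret) {
  return crypto
    .createHmac('sha256', secret)
    .update(JSON.stringify(payload))
    .digest('hex');
}

// Build fetch options for a signed POST to the local webhook.
function buildSignedRequest(payload, secret) {
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-vapi-signature': signPayload(payload, secret),
    },
    body: JSON.stringify(payload),
  };
}

// Sample payload for local testing only.
const samplePayload = {
  message: {
    type: 'function-call',
    functionCall: {
      name: 'checkAvailability',
      parameters: { serviceType: 'maintenance', zipCode: '90210', preferredDate: '2024-01-15' },
    },
  },
};

// Usage (Node 18+):
// const options = buildSignedRequest(samplePayload, process.env.VAPI_SERVER_SECRET);
// const response = await fetch('http://localhost:3000/webhook/vapi', options);
// console.log(response.status, await response.json());
```

Sending the same payload with a wrong secret is a quick way to confirm the 401 path.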
## FAQ
### Technical Questions
**How do I connect VAPI to Twilio for inbound HVAC calls?**
VAPI integrates with Twilio via SIP trunking. Configure your Twilio phone number to route inbound calls to VAPI's SIP endpoint, then set up your `assistantConfig` with the appropriate model, voice, and transcriber settings. VAPI handles the voice AI logic while Twilio manages the telephony layer. You'll need your Twilio Account SID, Auth Token, and a dedicated SIP trunk endpoint. The connection is bidirectional—VAPI can also initiate outbound calls through Twilio for appointment confirmations.
**What's the difference between using VAPI's native function calling versus building a custom proxy?**
Native function calling lets you define `functions` directly in your `assistantConfig` with parameters like `serviceType`, `zipCode`, and `preferredDate`. VAPI handles the LLM reasoning and invokes your webhook automatically. A custom proxy means you build middleware that intercepts all VAPI requests and responses, giving you control over prompt engineering, request validation, and response transformation. Native is faster (lower latency); custom proxy is more flexible but adds complexity. For HVAC scheduling, native function calling is sufficient unless you need advanced prompt engineering or multi-step orchestration.
**How do I validate webhook signatures from VAPI?**
Use HMAC-SHA256 validation. The examples in this tutorial read the signature from the `x-vapi-signature` header. Extract it, recompute the HMAC of the payload with your `VAPI_SERVER_SECRET`, and compare the two inside `validateWebhookSignature()`, which uses a constant-time comparison. If the signatures don't match, reject the request immediately with a 401. This prevents spoofed webhook calls that could corrupt your scheduling database.
### Performance
**What latency should I expect for HVAC appointment scheduling?**
End-to-end latency typically breaks down as: STT processing (200-400ms), LLM inference (500-1500ms), function execution (100-300ms), TTS generation (300-800ms). Total: 1.1-3.0 seconds from user speech to bot response. Network jitter on mobile adds 100-200ms. For HVAC scheduling, this is acceptable—customers expect natural conversation pauses. If latency exceeds 3 seconds consistently, check your function execution time; database queries for `availableSlots` should complete in <200ms.
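The totals follow directly from summing the per-stage ranges. A small sketch of that arithmetic, using the figures quoted above (the object and function names are illustrative only):

```javascript
// Per-stage latency ranges in milliseconds: [best case, worst case].
const stageBudgetsMs = {
  stt: [200, 400],
  llm: [500, 1500],
  functionCall: [100, 300],
  tts: [300, 800],
};

// Sum the lower and upper bounds across all stages.
function totalLatencyMs(budgets) {
  return Object.values(budgets).reduce(
    ([low, high], [stageLow, stageHigh]) => [low + stageLow, high + stageHigh],
    [0, 0]
  );
}

console.log(totalLatencyMs(stageBudgetsMs)); // [ 1100, 3000 ]
```

Replacing the ranges with your own measurements shows which stage to optimize first. With these figures, LLM inference accounts for about half of the total at both ends of the range.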
**How many concurrent calls can a single VAPI instance handle?**
VAPI scales horizontally. A single assistant configuration can handle hundreds of concurrent calls, but your backend (the webhook handler for scheduling) is the bottleneck. If you're querying a database for `availableSlots`, ensure your database connection pool supports concurrent requests. For HVAC businesses with <50 concurrent calls, a standard Node.js Express server with 10-20 database connections is sufficient. Monitor webhook response times; if they exceed 5 seconds, implement async processing with job queues.
### Platform Comparison
**Should I use VAPI or build voice AI directly with Twilio?**
Twilio's Voice API requires you to build STT, LLM integration, and TTS orchestration yourself. VAPI abstracts this complexity—you define your `assistantConfig` once, and VAPI handles the entire voice pipeline. VAPI is faster to deploy (days vs. weeks) and cheaper at scale because it optimizes model selection and caching. Twilio is better if you need deep telephony control (call transfer, IVR trees, DTMF). For HVAC scheduling, VAPI wins on time-to-market and cost.
**Can I use VAPI with other phone providers besides Twilio?**
Yes. VAPI supports multiple carriers via SIP trunking. You can use Twilio, Bandwidth, Vonage, or any SIP-compliant provider. The integration method is identical—route inbound calls to VAPI's SIP endpoint, configure your `assistantConfig`, and handle webhooks. Switching providers requires only updating your SIP trunk configuration; your VAPI code remains unchanged.
## Resources
**VAPI**: Get Started with VAPI → [https://vapi.ai/?aff=misal](https://vapi.ai/?aff=misal)
**VAPI Documentation:** [Official VAPI API Reference](https://docs.vapi.ai) – Complete endpoint specs, assistant configuration schemas, webhook event payloads, and voice provider integrations (ElevenLabs, Google, Azure).
**Twilio SIP Trunking:** [Twilio SIP Trunks Guide](https://www.twilio.com/docs/sip-trunking) – Configure inbound/outbound call routing, number management, and failover strategies for production HVAC call centers.
**GitHub Reference:** [VAPI + Twilio Integration Examples](https://github.com/VapiAI) – Sample implementations for webhook signature validation, function calling patterns, and session state management.
**Prompt Engineering for HVAC:** OpenAI's [Prompt Engineering Guide](https://platform.openai.com/docs/guides/prompt-engineering) – Techniques for structuring system prompts to handle service scheduling, availability queries, and escalation logic.
Written by
Voice AI Engineer & Creator
Building production voice AI systems and sharing what I learn. Focused on VAPI, LLM integrations, and real-time communication. Documenting the challenges most tutorials skip.