Why this is annoying
Most real estate teams lose 40% of inbound leads because CRM updates happen manually, hours after the call ends. By the time an agent logs "interested in 3BR condo, $400K budget," the prospect has already called two competitors. Worse, partial data entry creates ghost leads (phone numbers with no timeline, budgets with no location) that sales teams can't prioritize. Voice AI fixes this by writing structured lead records to Salesforce in real time, but only if you handle OAuth access token expiration (about 2 hours by default, tied to the org's session timeout), race conditions from overlapping webhooks, and function-calling validation that prevents incomplete records.
What you need first
API Credentials:
- VAPI API key from dashboard.vapi.ai
- Salesforce OAuth 2.0 client ID and secret (Connected Apps settings)
- Twilio Account SID and Auth Token (optional, for PSTN routing)
Software:
- Node.js 18+ with npm (the code below uses the built-in fetch, which Node 16 does not provide)
- @vapi-ai/server-sdk v0.20+
- jsforce v2.0+ for Salesforce REST calls
- ngrok for local webhook testing
Salesforce Config:
- API-enabled user account
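- Custom fields: Property_Type__c, Budget__c, Location__c, Timeline__c, Engagement_Score__c (plus Interrupt_Count__c and Last_Interrupt_Text__c if you use the barge-in tracking in step 4)
- Webhook receiver URL whitelisted in IP allowlist (if enforced)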
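Since a missing credential tends to surface only when the first real call arrives, it can help to check the environment at startup. The following is a minimal sketch, assuming the environment variable names that the walkthrough code reads; the missingEnv helper itself is hypothetical and not part of any SDK.

```javascript
// Environment variables read by the walkthrough code in this article.
const REQUIRED_ENV = [
  'VAPI_API_KEY',
  'VAPI_SERVER_SECRET',
  'WEBHOOK_URL',
  'SF_CLIENT_ID',
  'SF_CLIENT_SECRET',
  'SF_REFRESH_TOKEN',
  'SF_INSTANCE_URL'
];

// Returns the names that are unset or blank in the given env object.
// (Hypothetical helper, shown for illustration.)
function missingEnv(env, names = REQUIRED_ENV) {
  return names.filter((name) => !env[name] || String(env[name]).trim() === '');
}

// Possible startup usage: fail fast instead of failing mid-call.
// const missing = missingEnv(process.env);
// if (missing.length > 0) {
//   console.error(`Missing environment variables: ${missing.join(', ')}`);
//   process.exit(1);
// }
```

Passing the env object in as an argument, rather than reading process.env inside the function, keeps the check easy to test.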
Under the hood
Every call follows this sequence: (1) Inbound call hits VAPI assistant, (2) Deepgram transcribes speech in real time, (3) GPT-4 extracts lead fields (firstName, lastName, phone, propertyType, budget, location, timeline), (4) When all required fields are collected, the assistant invokes the create_lead function, (5) VAPI sends a function-call webhook to your server, (6) Your server calls the Salesforce REST API with an OAuth token, (7) Salesforce returns HTTP 201 + lead ID, (8) Your server responds to VAPI with a confirmation message, (9) Agent tells caller "I've logged your information."
flowchart LR
A[Inbound Call] --> B[VAPI Assistant]
B --> C[Deepgram Transcription]
C --> D[GPT-4 Extraction]
D --> E{All Fields?}
E -->|Yes| F[Function Call: create_lead]
E -->|No| C
F --> G[Webhook to Your Server]
G --> H[Salesforce REST API]
H --> I[Lead Created]
I --> J[Confirm to Caller]
The critical path is step 6: if your OAuth access token has expired (Salesforce access tokens last about 2 hours by default), the API returns HTTP 401 and the lead is never written. The ensureValidToken() function checks token age before every Salesforce call and refreshes when needed, which prevents most mid-call authentication failures.
Walkthrough
1. Server initialization with token refresh
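OAuth access tokens expire after about 2 hours by default. Implementations that fetch a token once at startup break as soon as it lapses. This setup caches the token and requests a new one whenever the cached copy is older than 90 minutes (5,400,000 ms), which leaves a safety margin before expiry.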
const express = require('express');
const crypto = require('crypto');
const app = express();
app.use(express.json());
const salesforceConfig = {
clientId: process.env.SF_CLIENT_ID,
clientSecret: process.env.SF_CLIENT_SECRET,
redirectUri: process.env.SF_REDIRECT_URI,
instanceUrl: process.env.SF_INSTANCE_URL,
tokenEndpoint: 'https://login.salesforce.com/services/oauth2/token'
};
let accessToken = null;
let tokenTimestamp = Date.now();
async function ensureValidToken() {
const tokenAge = Date.now() - tokenTimestamp;
if (accessToken && tokenAge < 5400000) {
return accessToken;
}
const response = await fetch(salesforceConfig.tokenEndpoint, {
method: 'POST',
headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
body: new URLSearchParams({
grant_type: 'refresh_token',
refresh_token: process.env.SF_REFRESH_TOKEN,
client_id: salesforceConfig.clientId,
client_secret: salesforceConfig.clientSecret
})
});
if (!response.ok) {
throw new Error(`Token refresh failed: ${response.status}`);
}
const data = await response.json();
accessToken = data.access_token;
tokenTimestamp = Date.now();
return accessToken;
}
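The age check covers normal expiry, but a token can also be rejected earlier, for example if it was revoked or the org uses a shorter session timeout. One way to cover that case is to retry once on HTTP 401 with a freshly fetched token. This is a sketch under assumptions: requestWithAuthRetry, doRequest, and getToken are hypothetical names, and getToken(true) assumes you extend ensureValidToken with a flag that bypasses the cache.

```javascript
// Sketch: retry a Salesforce request once if the token is rejected.
// Both callbacks are hypothetical:
//   getToken(forceRefresh) resolves to an access token, skipping the cache
//     when forceRefresh is true.
//   doRequest(token) performs the HTTP call and resolves to an object with
//     a numeric `status` (a fetch Response satisfies this).
async function requestWithAuthRetry(doRequest, getToken) {
  let response = await doRequest(await getToken(false));
  if (response.status === 401) {
    // The cached token was rejected: fetch a new one and retry exactly once.
    response = await doRequest(await getToken(true));
  }
  return response;
}
```

Retrying only once keeps a genuinely invalid refresh token from turning into a loop of failed requests.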
2. VAPI assistant configuration with function calling
The assistant needs explicit instructions to extract all seven required fields before calling Salesforce. Without this, you get partial leads that break CRM workflows.
const assistantConfig = {
model: {
provider: "openai",
model: "gpt-4",
messages: [{
role: "system",
content: "You are a real estate lead qualifier. Extract: firstName, lastName, phone, propertyType (house/condo/land), budget (number), location (string), timeline (immediate/1-3months/3-6months). Ask ONE question at a time. Only call create_lead when you have ALL 7 fields."
}],
tools: [{
type: "function",
function: {
name: "create_lead",
description: "Create Salesforce lead. Only call when ALL required fields collected.",
parameters: {
type: "object",
properties: {
firstName: { type: "string" },
lastName: { type: "string" },
phone: { type: "string" },
propertyType: { type: "string", enum: ["house", "condo", "land"] },
budget: { type: "number" },
location: { type: "string" },
timeline: { type: "string", enum: ["immediate", "1-3months", "3-6months"] }
},
required: ["firstName", "lastName", "phone", "propertyType", "budget", "location", "timeline"]
}
}
}]
},
voice: {
provider: "11labs",
voiceId: "21m00Tcm4TlvDq8ikWAM"
},
transcriber: {
provider: "deepgram",
model: "nova-2",
language: "en"
},
serverUrl: process.env.WEBHOOK_URL,
serverUrlSecret: process.env.VAPI_SERVER_SECRET
};
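The prompt and the schema's required list ask the model to collect everything, but the server can still receive an incomplete or malformed payload, so it is worth validating again before calling Salesforce. The sketch below mirrors the schema above; validateLeadParams is a hypothetical helper, and the rule that budget must be positive is an added assumption.

```javascript
const PROPERTY_TYPES = ['house', 'condo', 'land'];
const TIMELINES = ['immediate', '1-3months', '3-6months'];

// Returns a list of human-readable problems; an empty list means the
// payload matches the create_lead schema. (Hypothetical helper.)
function validateLeadParams(params) {
  const errors = [];
  const p = params || {};
  for (const field of ['firstName', 'lastName', 'phone', 'location']) {
    if (typeof p[field] !== 'string' || p[field].trim() === '') {
      errors.push(`${field} is missing or empty`);
    }
  }
  if (!PROPERTY_TYPES.includes(p.propertyType)) {
    errors.push(`propertyType must be one of: ${PROPERTY_TYPES.join(', ')}`);
  }
  if (!TIMELINES.includes(p.timeline)) {
    errors.push(`timeline must be one of: ${TIMELINES.join(', ')}`);
  }
  // Assumption: a zero or negative budget is treated as invalid.
  if (typeof p.budget !== 'number' || !Number.isFinite(p.budget) || p.budget <= 0) {
    errors.push('budget must be a positive number');
  }
  return errors;
}
```

If the list is non-empty, the webhook can return a result asking the assistant to re-collect the named fields instead of writing a partial lead.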
3. Webhook handler with race condition protection
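VAPI can send multiple function-call events if the user interrupts mid-conversation. Without a guard, you create duplicate Salesforce leads. The handler below tracks in-flight calls in a Map and answers HTTP 202 to any event that arrives while the same call is still being processed.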
const processingQueue = new Map();
app.post('/webhook/vapi', async (req, res) => {
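// Verify the request signature before trusting the body.
// Note: re-serializing req.body may not byte-match what the sender signed;
// capture the raw body (e.g. via the express.json verify option) if checks fail.
const payload = JSON.stringify(req.body);
const signature = String(req.headers['x-vapi-signature'] || '');
const expectedSig = crypto
.createHmac('sha256', process.env.VAPI_SERVER_SECRET)
.update(payload)
.digest('hex');
// Constant-time comparison; timingSafeEqual throws on length mismatch, so check length first.
const sigBuf = Buffer.from(signature);
const expectedBuf = Buffer.from(expectedSig);
if (sigBuf.length !== expectedBuf.length || !crypto.timingSafeEqual(sigBuf, expectedBuf)) {
return res.status(401).json({ error: 'Invalid signature' });
}
const { message } = req.body;
// Optional chaining on message too, so a body without `message` does not throw here.
const callId = message?.call?.id;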
if (processingQueue.has(callId)) {
return res.status(202).json({ queued: true });
}
processingQueue.set(callId, true);
try {
if (message?.type === 'function-call' && message?.functionCall?.name === 'create_lead') {
const params = message.functionCall.parameters;
const token = await ensureValidToken();
const sfResponse = await fetch(
`${salesforceConfig.instanceUrl}/services/data/v58.0/sobjects/Lead`,
{
method: 'POST',
headers: {
'Authorization': `Bearer ${token}`,
'Content-Type': 'application/json'
},
body: JSON.stringify({
FirstName: params.firstName,
LastName: params.lastName,
Phone: params.phone,
Company: 'Real Estate Lead',
Property_Type__c: params.propertyType,
Budget__c: params.budget,
Location__c: params.location,
Timeline__c: params.timeline,
LeadSource: 'Voice AI'
})
}
);
if (!sfResponse.ok) {
const error = await sfResponse.json();
throw new Error(`Salesforce error: ${error[0]?.message}`);
}
const leadData = await sfResponse.json();
return res.json({
result: `Lead created successfully with ID ${leadData.id}. An agent will contact you within 24 hours.`
});
}
res.sendStatus(200);
} catch (error) {
console.error('Lead creation failed:', error);
res.json({
result: "I encountered an issue saving your information. Let me transfer you to a live agent."
});
} finally {
processingQueue.delete(callId);
}
});
app.listen(3000, () => console.log('Webhook server running on port 3000'));
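The in-flight Map above releases its entry as soon as the request finishes, so a repeated create_lead event that arrives a moment later would still create a second lead. One possible way to close that gap is to remember the result per call. This is a sketch, not the handler's actual behavior: createLeadOnce and leadsByCall are hypothetical names, and createLead stands for whatever function performs the Salesforce POST and resolves to the new lead ID.

```javascript
// callId -> Promise resolving to the created lead ID.
const leadsByCall = new Map();

// Runs createLead at most once per call ID; concurrent and later callers
// share the same promise. `createLead` is a hypothetical async function
// that performs the Salesforce POST and resolves to the lead ID.
function createLeadOnce(callId, createLead) {
  if (!leadsByCall.has(callId)) {
    const pending = createLead().catch((err) => {
      // Forget failed attempts so a later event can try again.
      leadsByCall.delete(callId);
      throw err;
    });
    leadsByCall.set(callId, pending);
  }
  return leadsByCall.get(callId);
}
```

Entries accumulate for the life of the process, so a real deployment would likely evict them when the call ends or move the lookup to a shared store if more than one server instance handles webhooks.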
4. Barge-in handling with Salesforce engagement tracking
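When prospects interrupt, you need to cancel TTS immediately and log the interrupt context in Salesforce. High interrupt counts signal engaged leads. Note that Express runs only the first matching route that sends a response, so on a single server this logic belongs inside the step 3 handler as an additional branch; it is shown as a separate handler here for readability.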
app.post('/webhook/vapi', async (req, res) => {
const { message } = req.body;
if (message.type === 'transcript' && message.transcriptType === 'partial') {
const interruptDetected = message.transcript.length > 10 && message.role === 'user';
if (interruptDetected) {
await fetch(`https://api.vapi.ai/call/${message.call.id}/control`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.VAPI_API_KEY}`,
'Content-Type': 'application/json'
},
body: JSON.stringify({
action: 'interrupt',
reason: 'user_barge_in'
})
});
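// The lead may not exist yet (interrupts can happen before create_lead fires),
// so only update Salesforce when a lead ID is present in the call metadata.
const metadata = message.call?.metadata || {};
const leadId = metadata.salesforceLeadId;
if (leadId) {
const token = await ensureValidToken();
// Salesforce rejects an update whose body contains Id, so the ID goes in the URL only.
const leadUpdate = {
Interrupt_Count__c: (metadata.interruptCount || 0) + 1,
Last_Interrupt_Text__c: message.transcript,
Engagement_Score__c: 'High'
};
await fetch(
`${salesforceConfig.instanceUrl}/services/data/v58.0/sobjects/Lead/${leadId}`,
{
method: 'PATCH',
headers: {
'Authorization': `Bearer ${token}`,
'Content-Type': 'application/json'
},
body: JSON.stringify(leadUpdate)
}
);
}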
}
}
res.sendStatus(200);
});
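Partial transcripts arrive several times per utterance, so the handler above would fire an interrupt request and a Salesforce update for each one. A simple cooldown per call keeps that to one recorded interrupt per burst. This is a sketch under assumptions: shouldRecordInterrupt is a hypothetical helper, and the 2000 ms window is an illustrative value to tune, not a VAPI recommendation.

```javascript
// callId -> timestamp (ms) of the last interrupt that was recorded.
const lastInterruptAt = new Map();

// Returns true at most once per cooldown window for a given call.
// `now` is injectable so the function can be tested without real clocks.
function shouldRecordInterrupt(callId, now = Date.now(), cooldownMs = 2000) {
  const last = lastInterruptAt.get(callId);
  if (last !== undefined && now - last < cooldownMs) {
    return false; // still inside the window opened by an earlier partial
  }
  lastInterruptAt.set(callId, now);
  return true;
}
```

In the handler, the interrupt branch would run only when interruptDetected is true and shouldRecordInterrupt(message.call.id) also returns true.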
The config
This is the complete production configuration with every required field. The transcriber.endpointing settings prevent false interrupts from background noise, and the model.temperature of 0.3 keeps responses consistent across calls.
const productionConfig = {
model: {
provider: "openai",
model: "gpt-4",
temperature: 0.3,
messages: [{
role: "system",
content: "You are a real estate lead qualifier. Extract: firstName, lastName, phone, propertyType, budget, location, timeline. Ask ONE question at a time. Only call create_lead when you have ALL 7 fields."
}],
tools: [{
type: "function",
function: {
name: "create_lead",
description: "Create Salesforce lead with all required fields",
parameters: {
type: "object",
properties: {
firstName: { type: "string" },
lastName: { type: "string" },
phone: { type: "string" },
propertyType: { type: "string", enum: ["house", "condo", "land"] },
budget: { type: "number" },
location: { type: "string" },
timeline: { type: "string", enum: ["immediate", "1-3months", "3-6months"] }
},
required: ["firstName", "lastName", "phone", "propertyType", "budget", "location", "timeline"]
}
}
}]
},
voice: {
provider: "11labs",
voiceId: "21m00Tcm4TlvDq8ikWAM",
stability: 0.5,
similarityBoost: 0.75
},
transcriber: {
provider: "deepgram",
model: "nova-2",
language: "en",
endpointing: {
minSpeechDuration: 800,
silenceDuration: 1200
}
},
serverUrl: process.env.WEBHOOK_URL,
serverUrlSecret: process.env.VAPI_SERVER_SECRET,
silenceTimeoutSeconds: 30,
maxDurationSeconds: 600,
backgroundSound: "office"
};
The endpointing.minSpeechDuration of 800ms filters out "uh-huh" acknowledgments that would otherwise trigger false interrupts. The silenceTimeoutSeconds of 30 prevents the call from hanging if the prospect steps away mid-conversation.
Smoke test
Expose your local server with ngrok, then trigger a test call. Watch for the POST /webhook/vapi 200 log line and verify the Salesforce lead appears.
# Start ngrok tunnel
ngrok http 3000
# Copy HTTPS URL (e.g., https://abc123.ngrok.io)
# Update assistant config
export WEBHOOK_URL="https://abc123.ngrok.io/webhook/vapi"
# Trigger test call
curl -X POST https://api.vapi.ai/call \
-H "Authorization: Bearer $VAPI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"assistantId": "your-assistant-id",
"customer": { "number": "+15551234567" }
}'
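Expected server logs (the request line assumes a request logger such as morgan, and the token and lead lines assume you add console.log calls at those points; the code above logs only startup and errors):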
Webhook server running on port 3000
POST /webhook/vapi 200 - 1247ms
Token refreshed, age: 0ms
Salesforce lead created: 00Q5e00000AbCdEFG
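If you see POST /webhook/vapi 200 but no Salesforce lead, check that functionCall.parameters contains all seven required fields. Salesforce returns HTTP 400 with error code REQUIRED_FIELD_MISSING when a field it requires is absent (LastName and Company on a standard Lead, plus any custom field you have marked required). Note that the handler's catch block still answers VAPI with HTTP 200, so check the server's error output as well as the status code.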
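Salesforce REST errors come back as a JSON array of objects with message, errorCode, and fields, and logging all three makes a 400 much quicker to diagnose than the first message alone. The formatter below is a minimal sketch; describeSalesforceError is a hypothetical helper name.

```javascript
// Formats a Salesforce REST error body for logging.
// Expected shape: [{ message, errorCode, fields }], but anything else
// (null, a single object) is tolerated. (Hypothetical helper.)
function describeSalesforceError(status, body) {
  const items = Array.isArray(body) ? body : [body];
  const parts = items
    .filter((item) => item && typeof item === 'object')
    .map((item) => {
      const fields = Array.isArray(item.fields) && item.fields.length > 0
        ? ` (fields: ${item.fields.join(', ')})`
        : '';
      return `${item.errorCode || 'UNKNOWN'}: ${item.message || 'no message'}${fields}`;
    });
  return `HTTP ${status}: ${parts.length > 0 ? parts.join('; ') : 'no error details'}`;
}
```

In the webhook handler this could replace the error[0]?.message lookup, for example by throwing new Error(describeSalesforceError(sfResponse.status, error)).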
To verify the lead in Salesforce:
curl -X GET \
"${SF_INSTANCE_URL}/services/data/v58.0/sobjects/Lead/00Q5e00000AbCdEFG" \
-H "Authorization: Bearer ${SF_ACCESS_TOKEN}"
Expected response includes Property_Type__c, Budget__c, Location__c, Timeline__c, and LeadSource: "Voice AI".
From an actual deployment
14:23:41.203 – Inbound call from +1-555-867-5309. VAPI assistant answers: "Hi, I'm here to help you find your next property. What type of home are you looking for?"
14:23:44.891 – Caller responds: "I need a condo in downtown Seattle, budget around 450,000."
14:23:45.127 – Partial transcript received: { role: 'user', transcript: 'I need a condo in downtown', transcriptType: 'partial' }
14:23:46.450 – Full transcript: { role: 'user', transcript: 'I need a condo in downtown Seattle budget around 450000' }
14:23:46.680 – GPT-4 extracts: propertyType: "condo", location: "downtown Seattle", budget: 450000. Still missing: firstName, lastName, phone, timeline.
14:23:47.200 – Agent asks: "Great, what's your name?"
14:23:49.340 – Caller: "Sarah Chen."
14:23:49.580 – Agent: "And what's the best phone number to reach you?"
14:23:51.720 – Caller: "555-867-5309."
14:23:52.100 – Agent: "When are you looking to move?"
14:23:54.890 – Caller interrupts mid-question: "Wait, actually I need this within the next month."
14:23:54.903 – Interrupt detected. TTS cancelled. Interrupt_Count__c incremented to 1.
14:23:55.127 – Agent pivots: "Got it, within the next month. Let me create your profile."
14:23:55.450 – All fields collected: firstName: "Sarah", lastName: "Chen", phone: "555-867-5309", propertyType: "condo", budget: 450000, location: "downtown Seattle", timeline: "immediate".
14:23:55.680 – Function call triggered: create_lead with all parameters.
14:23:55.891 – Webhook received at server. Token age: 12 minutes (valid).
14:23:56.340 – Salesforce API call: POST /services/data/v58.0/sobjects/Lead
14:23:56.720 – Salesforce response: HTTP 201 Created, lead ID 00Q5e00000XyZ123
14:23:56.903 – Server responds to VAPI: "Lead created successfully with ID 00Q5e00000XyZ123. An agent will contact you within 24 hours."
14:23:57.200 – Agent confirms to caller: "Perfect, Sarah. I've logged your information and an agent will reach out within 24 hours to discuss condos in downtown Seattle."
The interrupt at 14:23:54.890 was critical: without barge-in handling, the agent would have continued asking about timeline while the caller was already answering. The Interrupt_Count__c: 1 and Engagement_Score__c: "High" flags tell sales reps this is a hot lead worth prioritizing.
Where to go next
VAPI Function Calling Reference – Complete schema for tool definitions, parameter validation, and error handling patterns. Essential for debugging why create_lead isn't firing.
Salesforce REST API Lead Object – Field-level security requirements, required vs. optional fields, custom field naming conventions. Read this if you're getting HTTP 400 errors.
OAuth 2.0 Refresh Token Flow – Salesforce's official guide to token refresh. Explains why tokens expire after 2 hours and how to implement automatic refresh without user re-authentication.
VAPI Webhook Event Types – Full list of webhook payloads: function-call, transcript, speech-update, call-ended. Use this to implement advanced features like sentiment analysis or call recording storage.
Deepgram Endpointing Settings – How minSpeechDuration and silenceDuration affect barge-in detection. Tune these to reduce false interrupts from background noise or filler words.
Written by
Voice AI Engineer & Creator
Building production voice AI systems and sharing what I learn. Focused on VAPI, LLM integrations, and real-time communication. Documenting the challenges most tutorials skip.



