AI Agents That Make Phone Calls: OpenClaw + ElevenLabs + Twilio Setup


From Voiceover to Voice Action: How I Built an AI Agent That Makes Phone Calls (OpenClaw + ElevenLabs + Twilio)

Last month I wrote about the best AI voice tools in 2026. ElevenLabs won — hands down. But great voice quality is only half the story. What’s the point of a perfect voice if it just sits there, reading scripts you wrote?

OpenClaw just released a feature that changes the equation: AI agents that can actually make phone calls. SMS too. Combine that with ElevenLabs voice quality, and you’ve got agents that sound human and act human.

I built this stack myself. Here’s exactly how I did it, what it costs, and how you can too.

  • €0.0085 per minute (EU calls)
  • ~€6 monthly floor cost
  • 70+ languages
  • 5,000+ voices

Why This Matters

Most AI agents live in a chat window. You ask. They answer. Passive.

With phone calls, agents become active operators. They can:

  • Scrape websites for leads or data
  • Draft personalized outreach scripts
  • Pick up the phone and start a conversation
  • Handle objections, answer questions, and book meetings
  • Send SMS follow-ups automatically
  • Summarize outcomes without you ever touching the call
🎯 The real shift: We’re moving from “AI answers questions” to “AI does things.” Phone calls are the difference between a tool you use and a worker you hire.

The Full Stack: What You Need

1. OpenClaw (The Agent Brain)

OpenClaw is the agent framework. It handles reasoning, memory, tool use, and now phone channels. Versions v2026.3.24 and later include the telephony features. It is free and open source.

2. ElevenLabs (The Voice Layer)

This is where the magic happens. ElevenLabs provides the text-to-speech that makes calls sound human. Their latest models (v3, Multilingual v2) deliver breathing, emotion, hesitation — the stuff that keeps people on the line instead of hanging up.

👉 Try ElevenLabs free here (affiliate link — supports this blog)

3. Twilio (The Phone Infrastructure)

Twilio handles the actual phone network. They provide phone numbers, SMS, and voice APIs. Pay-as-you-go, no contracts.

4. VPS Hosting (The Server)

OpenClaw needs a server. I use Contabo (€4.50/month for 8GB RAM, 4 vCPU). Hetzner is another solid EU option.

What It Costs (EU → EU)

Item                     | Provider          | Cost
VPS Hosting              | Contabo / Hetzner | €4.50–5.00/month
Virtual Phone Number     | Twilio            | €1.00–2.00/month
Outbound Voice (per min) | Twilio            | €0.0085–0.014
Inbound Voice (per min)  | Twilio            | €0.0070–0.010
SMS (per message)        | Twilio            | €0.0079–0.011
ElevenLabs API           | ElevenLabs        | Pay-per-character or subscription
💰 Monthly floor cost: VPS (€4.50) + phone number (€1.50) = €6.00/month before usage. At 100 minutes of outbound calls (€0.85–1.40 at the per-minute rates above), the total is roughly €6.85–7.40/month, not counting ElevenLabs usage.
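The floor-plus-usage arithmetic is easy to sanity-check in a few lines. This is a minimal sketch using the VPS, phone number, and per-minute figures quoted in this post; the function name `monthly_cost` is my own, and ElevenLabs and SMS charges are deliberately left out because they depend on your plan and volume.

```python
def monthly_cost(minutes, per_minute_rate, vps=4.50, phone_number=1.50):
    """Estimate the monthly bill in EUR: fixed floor plus outbound call minutes.

    Excludes ElevenLabs and SMS charges, which vary by plan and volume.
    """
    return round(vps + phone_number + minutes * per_minute_rate, 2)


if __name__ == "__main__":
    # Low and high ends of the outbound per-minute range from the cost table.
    for rate in (0.0085, 0.014):
        print(f"100 min at €{rate}/min: €{monthly_cost(100, rate):.2f}")
```

Swap in your own minute count and rates to see where the floor stops dominating the bill.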

Step-by-Step Setup (For My Future Self)

Step 1: Deploy OpenClaw

On your VPS:

curl -sSL https://openclaw.ai/install.sh | bash

Follow the setup wizard. Choose the “gateway” configuration.

Step 2: Configure Twilio

1. Sign up at twilio.com (verify with credit card — you get a small free credit)

2. Buy a phone number (choose EU country)

3. Get your Account SID and Auth Token from the Twilio console

4. In OpenClaw config, add Twilio credentials:

# ~/.openclaw/config.yaml
channels:
  twilio:
    account_sid: "YOUR_SID"
    auth_token: "YOUR_TOKEN"
    phone_number: "+1234567890"
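Typos in these three values are the most common reason a first call fails, so a quick format check before starting the agent can save some head-scratching. This is a minimal sketch, not part of OpenClaw: the function `check_twilio_config` is hypothetical, and it takes a plain dict with the same keys as the config above. It relies on two things I'm confident about: Twilio Account SIDs are "AC" followed by 32 hex characters, and Twilio expects phone numbers in E.164 format.

```python
import re

# Twilio Account SIDs: "AC" followed by 32 hexadecimal characters.
ACCOUNT_SID_RE = re.compile(r"^AC[0-9a-fA-F]{32}$")
# E.164: "+", a non-zero leading digit, at most 15 digits in total.
E164_RE = re.compile(r"^\+[1-9][0-9]{1,14}$")


def check_twilio_config(cfg):
    """Return a list of problems found in the config dict (empty list = looks fine)."""
    problems = []
    if not ACCOUNT_SID_RE.match(cfg.get("account_sid", "")):
        problems.append("account_sid should be 'AC' followed by 32 hex characters")
    if not cfg.get("auth_token"):
        problems.append("auth_token is empty")
    if not E164_RE.match(cfg.get("phone_number", "")):
        problems.append("phone_number should be in E.164 format, e.g. +31201234567")
    return problems


if __name__ == "__main__":
    placeholder = {
        "account_sid": "YOUR_SID",
        "auth_token": "YOUR_TOKEN",
        "phone_number": "+1234567890",
    }
    for problem in check_twilio_config(placeholder):
        print("config problem:", problem)
```

This only checks the shape of the values. It cannot tell you whether the credentials are actually valid; for that you still need a real test call.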

Step 3: Connect ElevenLabs Voice

1. Get your ElevenLabs API key from elevenlabs.io

2. Add to OpenClaw config:

tts:
  provider: "elevenlabs"
  api_key: "YOUR_ELEVENLABS_KEY"
  voice_id: "21m00Tcm4TlvDq8ikWAM" # default: Rachel
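Before wiring the key into the agent, it can help to confirm the key and voice ID work on their own. The sketch below builds a request against the ElevenLabs text-to-speech REST endpoint using only the standard library. The helper name `build_tts_request` is mine, and the `eleven_multilingual_v2` model ID is an assumption you may want to change. How OpenClaw itself calls ElevenLabs is not shown here; this is just a standalone check.

```python
import json
import urllib.request


def build_tts_request(api_key, voice_id, text, model_id="eleven_multilingual_v2"):
    """Build (but do not send) a POST request to the ElevenLabs text-to-speech endpoint."""
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = build_tts_request("YOUR_ELEVENLABS_KEY", "21m00Tcm4TlvDq8ikWAM", "Hello from my agent.")
    # Sending the request uses API credits; the response body is the audio file.
    with urllib.request.urlopen(req) as resp, open("test.mp3", "wb") as out:
        out.write(resp.read())
```

If the resulting file plays and sounds right, the same key and voice ID should be safe to paste into the config.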

👉 Get your ElevenLabs API key here (free trial)

Step 4: Create an Agent Workflow

Here’s a simple example — an agent that scrapes local businesses and calls to qualify leads:

# openclaw skills/call_leads.yaml
name: call_leads
description: Scrape local businesses and call them
steps:
  - tool: scrape
    params:
      url: "https://maps.google.com/?q=plumbers+amsterdam"
  - tool: extract
    params:
      fields: [business_name, phone, website]
  - tool: call
    params:
      script: "Hello, I'm calling from [service]. Are you currently looking for…"
      voice: "elevenlabs"
  - tool: transcribe
  - tool: summarize
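To make the `call` step less of a black box, here is roughly what an outbound call looks like at the Twilio level. This is a sketch under assumptions, not OpenClaw's actual implementation: the helper `build_call_request` is hypothetical, and it uses Twilio's built-in `<Say>` voice for simplicity rather than ElevenLabs audio. The endpoint, the `To`/`From`/`Twiml` form fields, and HTTP Basic auth with the Account SID and Auth Token are the standard Twilio Calls API.

```python
import base64
import urllib.parse
import urllib.request
from xml.sax.saxutils import escape


def build_call_request(account_sid, auth_token, from_number, to_number, script):
    """Build (but do not send) a Twilio REST request that places an outbound call."""
    # TwiML tells Twilio what to do once the callee picks up.
    twiml = f"<Response><Say>{escape(script)}</Say></Response>"
    url = f"https://api.twilio.com/2010-04-01/Accounts/{account_sid}/Calls.json"
    form = urllib.parse.urlencode(
        {"To": to_number, "From": from_number, "Twiml": twiml}
    ).encode("ascii")
    credentials = base64.b64encode(f"{account_sid}:{auth_token}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=form,
        method="POST",
        headers={"Authorization": f"Basic {credentials}"},
    )


if __name__ == "__main__":
    req = build_call_request("YOUR_SID", "YOUR_TOKEN", "+1234567890", "+31201234567", "Hello, this is a test call.")
    # Sending this places a real, billable call: urllib.request.urlopen(req)
    print(req.full_url)
```

The request is only built, never sent, so you can inspect it safely. Uncomment the `urlopen` line once you are pointing at your own phone.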

Step 5: Test with Approval Mode

Before going autonomous, set approval mode to prevent accidental calls:

agent:
  call_approval: "always" # agent proposes, you approve

Once you’re confident, switch to:

agent:
  call_approval: "auto" # agent calls autonomously
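Conceptually, the two modes boil down to a gate in front of the dialer. The sketch below is my own illustration of that idea, not OpenClaw's code: `ProposedCall`, `should_place_call`, and the `approve` callback are all hypothetical names. Only the mode strings "always" and "auto" come from the config above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProposedCall:
    to_number: str
    script: str


def should_place_call(call: ProposedCall, mode: str, approve: Callable[[ProposedCall], bool]) -> bool:
    """Decide whether a proposed call goes out under the given approval mode."""
    if mode == "auto":
        return True  # agent calls autonomously
    if mode == "always":
        return bool(approve(call))  # a human (or other check) must say yes
    # Fail closed on typos so a misspelled mode never dials anyone.
    raise ValueError(f"unknown call_approval mode: {mode!r}")


if __name__ == "__main__":
    call = ProposedCall("+31201234567", "Hello, I'm calling from [service].")

    def ask_human(c: ProposedCall) -> bool:
        return input(f"Call {c.to_number}? [y/N] ").strip().lower() == "y"

    print("placing call" if should_place_call(call, "always", ask_human) else "skipped")
```

The design choice worth copying is the last branch: an unrecognized mode raises instead of silently defaulting to autonomous calling.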

Step 6: Compliance Check

European calling requires:

  • Consent — you must have permission to call (GDPR Article 6)
  • Caller ID — your Twilio number must be registered and show correctly
  • Disclosure — if asked, you must disclose it’s an automated call
  • Recording — if you record, inform the recipient at the start

Twilio handles the technical side. The legal side is on you.
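Part of that legal side can at least be enforced in code before a number is ever dialed. This is a minimal sketch with made-up names: the `consent` field on each lead, the `prepare_call` helper, and the wording of the recording notice are all assumptions to adapt to your own data and legal advice.

```python
RECORDING_NOTICE = "This call is being recorded. "


def prepare_call(lead, script, recording_enabled):
    """Return the script to speak, or None if this lead must not be called."""
    # No recorded consent means no call, whatever the rest of the workflow wants.
    if not lead.get("consent"):
        return None
    # If recording is on, say so at the very start of the call.
    if recording_enabled:
        return RECORDING_NOTICE + script
    return script


if __name__ == "__main__":
    leads = [
        {"business_name": "Example Plumbing", "phone": "+31201234567", "consent": True},
        {"business_name": "No Consent BV", "phone": "+31207654321", "consent": False},
    ]
    for lead in leads:
        script = prepare_call(lead, "Hello, I'm calling from [service].", recording_enabled=True)
        print(lead["business_name"], "->", script or "skipped (no consent)")
```

A lead without a consent flag is treated the same as one that said no, which is the safer default.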

Real-World Use Cases

Lead Qualification

Agent workflow: Scrape local businesses → call with qualifying questions → book meetings → send calendar invites via SMS. Human sales rep only talks to warm leads.

Customer Follow-Up

Agent workflow: Pull recent customers → call with personalized check-in → log sentiment → flag issues for human follow-up. Sounds like a real account manager.

Market Research

Agent workflow: Call 100 competitors → ask standardized pricing/availability questions → transcribe answers → return spreadsheet. Done overnight.

Appointment Reminders

Agent workflow: Pull upcoming appointments → call with reminder → handle rescheduling if needed → confirm via SMS. Higher confirmation rates than robotic texts.

Why ElevenLabs for Voice?

I tested several TTS providers with this stack. Here’s what I found:

  • ElevenLabs — most natural, handles emotion and pacing best. People stay on the line longer.
  • PlayHT — solid multilingual, but less expressive. Good for informational calls.
  • OpenAI TTS — decent but robotic. People hang up faster.

For outbound calls where you need people to listen, ElevenLabs wins. If you’re building this stack, it’s the voice layer I recommend.

🎙️ Get the Voice Layer Right

ElevenLabs offers a free trial — test their voices, clone your own, and hear the difference before you commit. The API integrates directly with OpenClaw’s phone channel.

Try ElevenLabs Free →

What I’d Change Next Time

  • Call recording — I didn’t enable it initially. Should have. It’s useful for quality control and compliance.
  • Rate limiting — my first test called 50 people in 10 minutes. Don’t do that. Add delays between calls.
  • Voice cloning — using a custom voice (my own) instead of a preset. People trust it more.
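For the rate-limiting point, the simplest fix is a minimum gap between call starts. Here is a minimal sketch: `call_with_spacing` and its parameters are my own names, `place_call` stands in for whatever actually dials, and the 60-second default is just an example value. The clock and sleep functions are injectable so the logic can be tested without waiting.

```python
import time


def call_with_spacing(numbers, place_call, min_gap_seconds=60.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Call each number in turn, waiting so call starts are at least min_gap_seconds apart."""
    results = []
    last_start = None
    for number in numbers:
        if last_start is not None:
            remaining = min_gap_seconds - (clock() - last_start)
            if remaining > 0:
                sleep(remaining)
        last_start = clock()
        results.append(place_call(number))
    return results


if __name__ == "__main__":
    # Dry run: print instead of dialing, with a short gap for demonstration.
    call_with_spacing(
        ["+31201234567", "+31207654321"],
        place_call=lambda n: print("would call", n),
        min_gap_seconds=2.0,
    )
```

For comparison, my 50 calls in 10 minutes worked out to one call every 12 seconds, so even a one-minute gap would have stretched that run to about 50 minutes.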

Closing Thoughts

This stack — OpenClaw + ElevenLabs + Twilio — cost me about €15 in my first month of testing. It made 200 calls, booked 12 meetings, and saved me roughly 8 hours of manual outreach.

We’re past the era of “AI just answers questions.” We’re in the era of “AI does things.” OpenClaw handles the action. ElevenLabs handles the voice. You handle the strategy.

If you build this, let me know how it goes. I’m curious what use cases you come up with.



— Stacking truths daily, one automated call at a time.

🦞 Stacking truths daily 🤡 — no bullshit, just logs, claws, and working setups.