AI that calls the Real World

👋 Howdy, GTM Engineer!

I attended one of the best AI-in-GTM events in NYC last night, led by Chris Balestras.

GTM leaders from Clay, Cursor, Kleiner Perkins, and Sensei shared how they evaluate startups to determine:

  1. Likelihood of breakout success

  2. How much the Founder values / focuses on their sales org

  3. How the product will fare long term in the market

Check out the strategies shared in my latest LinkedIn post, and take a look at Chris’ newsletter: GTMBA

This week I’m sharing a workflow that gathers data in a way that Clay cannot…

A voice agent that calls hundreds or thousands of SMBs, holds a conversation with each one, and programmatically brings back the intelligence it gathers.

This is useful for gathering pricing details that Medspas and similar local businesses don’t share on their websites and that you can only get by calling.

The download this week

GEAR Workflow: How to build a Voice Agent that calls local businesses for research.

New in GTM tech: Chainlit.io, a framework for building AI applications or your own ChatGPT-style app.

📌 GTME jobs of the week: Primary Ventures, Harness, Jack & Jill, Goblins, IronCircle

📚 What I’m reading:

  • Systems of Records lose ground

  • Long Live System of Records

  • System of Action: Hero Tools

  • Data Teams Aren’t Dying. They’re Finally Becoming What We Hoped They’d Be

Happy Automating!

GEAR: Building a Voice Agent that Actually Calls for You

GEAR (‘GTM Engineers Accelerating Revenue’) is Claymation’s series that highlights GTM Engineers and the creative workflows and automations they’re building for modern GTM. Reply to this email if you’re interested and tell me a bit about what you’ve cooked up.

This week: Ashish Datta, Partner at SetFive Consulting

🎯 Objective: Unlocking "Offline" Pricing Data at Scale

We live in a digital-first world, yet a massive amount of high-value competitive intelligence—specifically pricing in industries like aesthetics and healthcare—remains locked behind a phone line.

Medspas frequently withhold pricing from their websites for several reasons:

  • Dynamic Pricing: Rates fluctuate based on inventory or seasonality.

  • Sales Psychology: They want to convert a caller, not just inform a browser.

  • Competitive Opacity: Keeping rivals in the dark.

Traditionally, gathering this data requires a BDR to sit on a phone for hours, navigating IVR trees and manually transcribing numbers. This is unscalable.

The Solution: An automated voice agent workflow that sources targets via Clay, calls them using the OpenAI Realtime API, navigates phone trees, and sends structured pricing data back into Clay or a database.

The Tech Stack

👇 The Step-by-Step Build

Step 1: Building the "Hit List" in Clay

Before you can dial, you need verified targets. We needed to find Medspas in specific high-volume metros—Boston, Dallas, and Miami—to test the system.

Instead of buying stale lists or scraping manually, we used Clay’s Find Local Businesses source (powered by Google Maps).

  1. Input: Categories (Medspa, Aesthetics, Laser Clinic) + Location (Boston, Dallas, Miami).

  2. Enrichment: Clay instantly pulled the Business Name, Formatted Phone Number, and Website.

  3. Export: We generated a clean CSV of verified local businesses in seconds, ready to be fed into our dialing infrastructure. Clay took what used to be a day of manual research and turned it into a 5-minute workflow.
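Before the export reaches a dialer, the phone numbers usually need a cleanup pass. Here is a minimal sketch of that step, assuming US numbers and hypothetical column names (`Business Name`, `Phone`, `Website`); your actual Clay export headers may differ. It normalizes numbers to E.164 and drops duplicates and rows with no dialable number.

```python
import csv
import io
import re
from typing import Dict, List, Optional


def to_e164_us(raw: str) -> Optional[str]:
    """Normalize a US phone number to E.164 (+1XXXXXXXXXX); return None if it is not dialable."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # strip the leading country code
    if len(digits) != 10:
        return None
    return "+1" + digits


def load_targets(csv_text: str) -> List[Dict[str, str]]:
    """Read exported rows, skip rows without a valid number, and de-duplicate by phone."""
    seen = set()
    targets = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        phone = to_e164_us(row.get("Phone") or "")
        if phone is None or phone in seen:
            continue
        seen.add(phone)
        targets.append({
            "name": (row.get("Business Name") or "").strip(),
            "phone": phone,
            "website": (row.get("Website") or "").strip(),
        })
    return targets


# Made-up sample rows for illustration only.
SAMPLE = """Business Name,Phone,Website
Glow Medspa,(617) 555-0101,https://example.com/glow
Glow Medspa (dup),617-555-0101,https://example.com/glow
No Phone Spa,,https://example.com/nophone
"""

if __name__ == "__main__":
    for target in load_targets(SAMPLE):
        print(target)
```

De-duplicating by phone number rather than by name matters here: the same business often appears under slightly different names, and you only want to call it once.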

Google Maps in Clay

Step 2: Eliminating Latency with OpenAI Realtime API

OpenAI Realtime API Voice 2 Voice

The biggest hurdle in AI voice agents has historically been latency. The old stack looked like this: Speech-to-Text -> LLM Processing -> Text-to-Speech. This resulted in awkward pauses that immediately signaled to the recipient that they were talking to a bot.

We leveraged the OpenAI Realtime API to solve this. Because it handles native audio-to-audio processing, we achieved:

  • Human-level Latency: No awkward waiting periods.

  • Interruptibility: The AI stops talking instantly if the human interjects, just like a real conversation.

  • Function Calling: Triggering backend actions mid-call.
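To make the three capabilities above concrete, here is a rough sketch of the kind of `session.update` event a client might send after opening a Realtime API connection. The field names follow the beta version of the API as I understand it (newer versions nest the audio settings differently), so check the current docs before relying on them. The `record_price` tool, its parameters, and the prompt are hypothetical and not taken from Ashish's build.

```python
import json


def build_session_update(instructions: str) -> dict:
    """Build a session.update event (beta-style field names; an assumption, verify against the docs)."""
    return {
        "type": "session.update",
        "session": {
            "instructions": instructions,
            "voice": "alloy",
            # G.711 mu-law matches what telephony providers typically stream.
            "input_audio_format": "g711_ulaw",
            "output_audio_format": "g711_ulaw",
            # Server-side voice activity detection is what enables interruptions.
            "turn_detection": {"type": "server_vad"},
            # Hypothetical tool the model can call mid-conversation.
            "tools": [
                {
                    "type": "function",
                    "name": "record_price",
                    "description": "Save a price the business quoted during the call.",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "treatment": {"type": "string"},
                            "price_usd": {"type": "number"},
                        },
                        "required": ["treatment", "price_usd"],
                    },
                }
            ],
            "tool_choice": "auto",
        },
    }


if __name__ == "__main__":
    event = build_session_update("You are a prospective customer asking about pricing.")
    print(json.dumps(event, indent=2))
```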

Step 3: Bridging the Audio via Twilio

To get the AI on the actual telephone network, we utilized Twilio Media Streams. This allows us to pipe the raw audio stream directly from the phone call into OpenAI.

The architecture ensures that the Medspa receptionist hears a responsive potential customer, capable of handling unexpected conversational pivots, rather than a rigid script.
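The bridge itself is mostly message translation between two WebSockets. The sketch below shows only that translation layer, with no networking, under a few assumptions: Twilio sends base64 mu-law audio in `media` frames, the OpenAI session is configured for the same audio format so payloads pass through unchanged, and the audio delta event is named `response.audio.delta` (or `response.output_audio.delta` in newer API versions). The function names are my own.

```python
import json
from typing import Optional

# Event names differ between Realtime API versions; accepting both is an assumption.
AUDIO_DELTA_EVENTS = {"response.audio.delta", "response.output_audio.delta"}


def twilio_to_openai(twilio_message: str) -> Optional[str]:
    """Turn a Twilio Media Streams frame into a Realtime API audio-append event."""
    msg = json.loads(twilio_message)
    if msg.get("event") != "media":
        return None  # "connected", "start", "stop" and "mark" frames carry no audio
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": msg["media"]["payload"],  # already base64-encoded
    })


def openai_to_twilio(openai_event: str, stream_sid: str) -> Optional[str]:
    """Turn a Realtime API event into a frame to send back over the Twilio stream."""
    evt = json.loads(openai_event)
    if evt.get("type") in AUDIO_DELTA_EVENTS:
        return json.dumps({
            "event": "media",
            "streamSid": stream_sid,
            "media": {"payload": evt["delta"]},
        })
    if evt.get("type") == "input_audio_buffer.speech_started":
        # The human started talking: ask Twilio to drop any queued agent audio.
        return json.dumps({"event": "clear", "streamSid": stream_sid})
    return None
```

In a real server, each function would sit inside a loop reading from one socket and writing to the other.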

Step 4: The "Press 1" Problem (DTMF Simulation)

A major friction point in calling local businesses is the Interactive Voice Response (IVR) system (e.g., "Press 1 for appointments"). These systems listen for Dual-Tone Multi-Frequency (DTMF) signals, not voice commands.

We built a layer on top of Twilio’s callback architecture that allows our AI Agent to:

  1. Listen to the menu options.

  2. Decide which path leads to a human (e.g., "Front Desk" or "Appointments").

  3. Simulate the physical button press (DTMF tone) to bypass the menu and connect the call.
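The three steps above can be sketched roughly as follows. This is not Ashish's implementation: in his build the AI agent makes the decision, while here a simple keyword heuristic stands in for it. The TwiML uses Twilio's `<Play digits="...">` attribute to send the tone (a leading `w` adds a half-second pause) and then reconnects the media stream. The keyword list, function names, and stream URL are all hypothetical.

```python
import re
from typing import Dict, Optional
from xml.sax.saxutils import quoteattr

# Labels most likely to reach a human, in priority order (an assumption).
PREFERRED_KEYWORDS = ("front desk", "appointment", "operator", "representative")

_PRESS_FOR = re.compile(r"press\s+(\d)\s+(?:for|to)\s+([^.,;]+)")
_FOR_PRESS = re.compile(r"for\s+([^.,;]+),\s*press\s+(\d)")


def parse_menu(transcript: str) -> Dict[str, str]:
    """Extract digit -> label pairs from phrases like 'press 1 for X' or 'for X, press 1'."""
    text = transcript.lower()
    menu: Dict[str, str] = {}
    for digit, label in _PRESS_FOR.findall(text):
        menu.setdefault(digit, label.strip())
    for label, digit in _FOR_PRESS.findall(text):
        menu.setdefault(digit, label.strip())
    return menu


def choose_digit(menu: Dict[str, str]) -> Optional[str]:
    """Pick the option whose label best matches a path to a human, if any."""
    for keyword in PREFERRED_KEYWORDS:
        for digit, label in menu.items():
            if keyword in label:
                return digit
    return None


def build_dtmf_twiml(digit: str, stream_url: str) -> str:
    """Build TwiML that plays one DTMF tone, then reconnects the audio stream."""
    if len(digit) != 1 or digit not in "0123456789*#":
        raise ValueError("digit must be a single DTMF character")
    return (
        '<Response><Play digits="w' + digit + '"/>'
        "<Connect><Stream url=" + quoteattr(stream_url) + "/></Connect></Response>"
    )
```

With the `twilio` Python library, TwiML like this can be applied to a live call with `client.calls(call_sid).update(twiml=...)`, which is one way to get the "button press" into the call.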

Step 5: Converting Chatter to JSON

Getting a human on the line is only half the battle. The goal is data extraction.

Using system prompts designed for structured output, the AI engages in natural conversation to uncover pricing (e.g., "How much is a first-time Botox unit?"). Even if the pricing is mentioned casually or amidst small talk, the system captures it.

The Output: The agent takes the free-flowing conversation and returns clean, structured data (JSON) containing pricing, availability, and specific conditions (first-time customer deals, minimums, etc.) that can be piped back into your CRM or database.
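Whatever the model returns, it is worth validating before it touches your CRM. Below is a minimal sketch of that check, assuming a hypothetical output shape with a top-level `quotes` list; the field names (`treatment`, `price_usd`, `unit`, `conditions`) are illustrative and not the actual schema from this build.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class PriceQuote:
    treatment: str
    price_usd: float
    unit: str = "unit"
    conditions: List[str] = field(default_factory=list)


def parse_quotes(raw_json: str) -> List[PriceQuote]:
    """Parse the agent's JSON output, keeping only well-formed quotes with a positive price."""
    data = json.loads(raw_json)
    quotes = []
    for item in data.get("quotes", []):
        price = item.get("price_usd")
        is_number = isinstance(price, (int, float)) and not isinstance(price, bool)
        if not is_number or price <= 0 or not item.get("treatment"):
            continue  # skip entries the model filled in loosely
        quotes.append(PriceQuote(
            treatment=str(item["treatment"]),
            price_usd=float(price),
            unit=str(item.get("unit", "unit")),
            conditions=[str(c) for c in item.get("conditions", [])],
        ))
    return quotes


# Made-up example of what the agent might return after a call.
EXAMPLE_OUTPUT = json.dumps({
    "quotes": [
        {"treatment": "Botox", "price_usd": 12, "unit": "unit",
         "conditions": ["first-time customers", "20 unit minimum"]},
        {"treatment": "Filler", "price_usd": "call back later"},
    ]
})

if __name__ == "__main__":
    for quote in parse_quotes(EXAMPLE_OUTPUT):
        print(quote)
```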

Step 6: Handling the "No Answer" Scenario

Small businesses are busy. Often, they don’t pick up. To ensure high success rates without harassing the business with repeated calls, we implemented a full communication loop:

  • Voicemail Detection: The AI recognizes when it has hit an inbox.

  • Message Drop: It leaves a natural, context-aware message requesting the specific information needed.

  • Inbound Handling: When the business calls back, the system picks up and seamlessly resumes the objective to gather the data.
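One way to keep that loop from turning into repeated calls is a small decision function that picks the next step for each target. The sketch below is an assumption about how such logic could look. It uses outcome labels modeled on Twilio's answering machine detection values (`human`, `machine_start`, `machine_end_beep`, and so on), even though in this build the AI itself recognizes voicemail, and the attempt cap is an arbitrary example.

```python
MAX_ATTEMPTS = 2  # hypothetical cap so a business is never called repeatedly


def next_action(answered_by: str, attempts: int, voicemail_left: bool) -> str:
    """Decide what to do after a call attempt.

    answered_by: outcome label, e.g. "human", "machine_end_beep", "unknown".
    attempts: number of outbound attempts made so far, including this one.
    voicemail_left: whether a message was already left on an earlier attempt.
    """
    if answered_by == "human":
        return "converse"
    if answered_by.startswith("machine"):
        # Leave one message at most, then wait for the business to call back.
        return "wait_for_callback" if voicemail_left else "leave_voicemail"
    if attempts < MAX_ATTEMPTS:
        return "retry_later"
    return "give_up"


if __name__ == "__main__":
    print(next_action("machine_end_beep", attempts=1, voicemail_left=False))
```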

💡 You can also send this data back to Clay via a Webhook so that you can perform further analysis, enrichment, or data transformation on the data collected.
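Sending a record to a webhook is a plain JSON POST. Here is a small sketch using only the Python standard library; the webhook URL and the record fields are placeholders, so substitute the URL your own Clay table gives you.

```python
import json
import urllib.request


def build_webhook_request(webhook_url: str, record: dict) -> urllib.request.Request:
    """Build a JSON POST request carrying one call result."""
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder URL and made-up record for illustration.
    request = build_webhook_request(
        "https://example.com/your-webhook",
        {"phone": "+16175550101", "treatment": "Botox", "price_usd": 12.0},
    )
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request, timeout=10)
```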

🚀 Results & Demo

By combining Clay's ability to instantly map local markets with OpenAI's conversational speed, we transformed a manual, high-friction research task into a scalable GTM engine.

Want to hear it in action?

Listen to the Demo at voice2data.setfive.com

If you're interested in implementing voice-based data gathering for your GTM motion, reach out to Ashish: [email protected]

New in GTM Tech: Chainlit

Chainlit is a framework for building your own AI applications that look and feel like ChatGPT. It’s being used by teams at Google, NVIDIA, Microsoft and smaller companies like BuzzFeed & Backmarket.

A customer showed me their app in action and it was pretty slick! It’s like they made their own ChatGPT-style app for their sales team. It’s growing in popularity and has over 50,000 monthly developers building on it.

GTM Engineer Jobs of the Week

Primary Ventures - Associates for GTM Engineering Team

Primary, a leading seed-stage VC firm in New York, is hiring 2 new Associates for their GTM team. Primary’s GTM team works side by side with early-stage founders to help them build GTM systems and workflows, run growth experiments, and drive pipeline and revenue. This role is great for an early-career SDR, Sales, or GTM professional who would love the exposure to a VC firm and wants to hone their growth skills. In this role you will learn pipeline-building tools and techniques as well as technical skills. You must be based in NYC as we are 3-4 days per week in office. Contact [email protected] to apply.

  • Comp: Unknown

  • Location: Bengaluru, India

  • Company: Just raised a $240M Series E round of financing. The company provides a developer tooling platform for DevOps and automation to help engineering teams ship code faster.

  • Comp: Unknown

  • Location: London, UK

  • Company: Jack & Jill AI is a recruitment platform that utilizes specialized AI personas to connect job seekers with employment opportunities and assist companies in hiring talent more efficiently. The service features Jack, who acts as a career coach and recruiter for candidates, and Jill, who sources and screens candidates for employers at a fraction of the cost of traditional agencies.

  • Comp: $120k - $140k OTE

  • Location: Columbia, MD

  • Company: Jack & Jill AI is a recruitment platform that utilizes specialized AI personas to connect job seekers with employment opportunities and assist companies in hiring talent more efficiently. The service features Jack, who acts as a career coach and recruiter for candidates, and Jill, who sources and screens candidates for employers at a fraction of the cost of traditional agencies.

  • Comp: ~$150k to $250k Base + Equity

  • Location: Brooklyn, NY (will pay for relocation)

  • Company: Goblins is building the AI tutor students actually treat like a friend (2x MoM growth, 82% retention, 4x viral coefficient). Backed by the Gates Foundation, we are supply-constrained—closing massive school districts in 6 weeks vs the industry standard of 18 months. We need a Head of Growth Eng to pour gasoline on the fire—using code, not ads, to engineer the automated systems that will scale us to 1 Million Students.

What I’m Reading: System of Record vs. System of Action

But also:

Bonus: State of GTM Engineering Report

I’m putting together the first State of GTM Engineering report with my friends Garrett Wolfe, author of Garrett’s Growth Substack, and Maja Voje, author of The GTM Strategist.

AI-native CRM

“When I first opened Attio, I instantly got the feeling this was the next generation of CRM.”
— Margaret Shen, Head of GTM at Modal

Attio is the AI-native CRM for modern teams. With automatic enrichment, call intelligence, AI agents, flexible workflows and more, Attio works for any business and only takes minutes to set up.

Join industry leaders like Granola, Taskrabbit, Flatfile and more.
