CASE STUDY

SCAI

AI-Powered Voice Calling Agent

Python, Flask, AWS, Docker, AI/ML, 11 Labs

The Problem

Businesses needed automated, natural-sounding voice interactions to handle high-volume customer calls across multiple languages. Traditional IVR systems felt robotic and couldn't handle nuanced conversations in Hindi and Mexican Spanish, leading to poor customer satisfaction and high abandonment rates.

The challenge was to build a real-time conversational AI pipeline that transcribes the caller's speech to text, interprets intent, generates an intelligent response, and converts that response back to natural-sounding speech, all within a 2-second latency window.

System Architecture

Voice Input → STT Service → Conversational AI → TTS Engine → Voice Output
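
The flow above can be sketched as a chain of three stages, each timed against the 2-second budget. This is a minimal illustration under stated assumptions, not the production code: `run_turn`, `TurnResult`, and the `fake_*` functions are hypothetical names standing in for the real STT, conversational AI, and TTS calls.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict

LATENCY_BUDGET_S = 2.0  # end-to-end target stated in the case study


@dataclass
class TurnResult:
    """Output audio for one conversational turn plus per-stage timings."""
    audio: bytes
    timings: Dict[str, float] = field(default_factory=dict)

    @property
    def total(self) -> float:
        return sum(self.timings.values())

    @property
    def within_budget(self) -> bool:
        return self.total < LATENCY_BUDGET_S


def run_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    generate_reply: Callable[[str], str],
    synthesize: Callable[[str], bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> TurnResult:
    """Run STT -> conversational AI -> TTS and record how long each stage took."""
    timings: Dict[str, float] = {}

    def timed(name, fn, arg):
        start = clock()
        out = fn(arg)
        timings[name] = clock() - start
        return out

    text = timed("stt", transcribe, audio_in)
    reply = timed("ai", generate_reply, text)
    audio_out = timed("tts", synthesize, reply)
    return TurnResult(audio=audio_out, timings=timings)


# Hypothetical stand-ins so the sketch runs without any external service.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_ai(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    result = run_turn(b"hola", fake_stt, fake_ai, fake_tts)
    print(result.timings, result.within_budget)
```

Recording a timing per stage, rather than only the total, shows which service is consuming the budget when a turn runs slow.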

My Contributions

Backend API Development

  • Built Flask microservices powering the voice pipeline
  • Designed RESTful APIs for call management and analytics
  • Achieved sub-2-second end-to-end response latency
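
As a rough sketch of what a call-management endpoint in such a Flask service might look like: the routes, field names, in-memory store, and language tags below are all assumptions made for illustration, not the project's actual API.

```python
from uuid import uuid4

from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory store for illustration only; a real service would use a database.
CALLS = {}

# Assumed language tags for Hindi and Mexican Spanish.
SUPPORTED_LANGUAGES = {"hi-IN", "es-MX"}


@app.route("/calls", methods=["POST"])
def create_call():
    """Register a new call and return its record."""
    payload = request.get_json(silent=True) or {}
    language = payload.get("language")
    if language not in SUPPORTED_LANGUAGES:
        return jsonify(error="unsupported language"), 400
    call_id = uuid4().hex
    CALLS[call_id] = {"id": call_id, "language": language, "status": "queued"}
    return jsonify(CALLS[call_id]), 201


@app.route("/calls/<call_id>", methods=["GET"])
def get_call(call_id):
    """Look up a call by id."""
    call = CALLS.get(call_id)
    if call is None:
        return jsonify(error="call not found"), 404
    return jsonify(call), 200
```

Flask's built-in test client can exercise the routes without starting a server, for example `app.test_client().post("/calls", json={"language": "hi-IN"})`.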

AI Pipeline Integration

  • Owned end-to-end STT, TTS, and conversational AI workflows
  • Integrated AWS, Google, and 11 Labs speech services
  • Achieved ~90-95% accuracy across Hindi and Mexican Spanish
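
One common way to keep several speech vendors interchangeable is to hide them behind a shared function signature and try them in order. The source does not say whether fallback was used, so treat this as an assumed pattern: `transcribe_with_fallback`, `AllProvidersFailed`, and the provider callables are hypothetical and do not correspond to any vendor SDK.

```python
from typing import Callable, List, Sequence, Tuple

# A provider is any callable taking (audio, language) and returning text.
Transcriber = Callable[[bytes, str], str]


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider raised an error."""


def transcribe_with_fallback(
    audio: bytes,
    language: str,
    providers: Sequence[Tuple[str, Transcriber]],
) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, transcript) from the first success."""
    errors: List[str] = []
    for name, transcribe in providers:
        try:
            return name, transcribe(audio, language)
        except Exception as exc:  # broad on purpose: any vendor error triggers fallback
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical providers used only to demonstrate the control flow.
def flaky_provider(audio: bytes, language: str) -> str:
    raise TimeoutError("upstream timed out")


def stable_provider(audio: bytes, language: str) -> str:
    return f"[{language}] {audio.decode('utf-8')}"


if __name__ == "__main__":
    chain = [("primary", flaky_provider), ("secondary", stable_provider)]
    print(transcribe_with_fallback(b"namaste", "hi-IN", chain))
```

Returning the provider name along with the transcript makes it possible to track accuracy and failure rates per vendor and per language.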

Infrastructure & Deployment

  • Deployed on AWS EC2 with Docker containerization
  • Configured Nginx for load balancing and SSL termination
  • Diagnosed and resolved issues in the live production environment

Key Metrics

<2
Second Latency
~90-95
% Accuracy
2
Languages Supported

Tech Stack Deep Dive

Python & Flask

Core microservice framework for all API endpoints and business logic

AWS (EC2, S3)

Cloud infrastructure for compute, storage, and speech services

Docker

Containerized deployments for consistent environments across dev and prod

AI/ML Services

STT, TTS, and conversational AI via AWS, Google, and 11 Labs APIs