Amazon Polly

Cloud Text-to-Speech

Amazon Polly is a cloud TTS service with neural voices tightly integrated into the AWS ecosystem.

✓ Pros
  • Seamless AWS IAM and S3 integration
  • Speech Marks (metadata) for lip-sync and highlighting
  • Pay-as-you-go pricing with a 12-month free tier
  • Low-latency streaming synthesis
✗ Cons
  • Smaller voice catalog than Google Cloud TTS
  • Neural voices limited to specific languages
  • Less natural prosody compared to newer deep-learning rivals
Free tier ✓
Pricing model usage
Price (Standard) varies USD
Features
ssml, neural tts
Languages en, ja
Voices 80
API ✓ Available
Pricing Plans
  Free Tier         $0             5M standard chars/mo for 12 months
  Standard voices   $4/1M chars    After free tier
  Neural voices     $16/1M chars   After free tier
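As a rough illustration of how the listed rates combine, the sketch below estimates a monthly bill from character counts. It uses only the figures in the plan table above. The function name and the `in_free_tier` flag are hypothetical, and the sketch assumes the free allowance applies to standard characters only, as the listing states. Actual AWS billing may differ.

```python
# Rates as shown in the listing above (USD per 1M characters).
STANDARD_RATE_PER_MILLION = 4.00
NEURAL_RATE_PER_MILLION = 16.00
# Free allowance as listed: 5M standard characters per month (first 12 months).
FREE_STANDARD_CHARS_PER_MONTH = 5_000_000


def estimate_monthly_cost(standard_chars, neural_chars, in_free_tier=True):
    """Estimate a monthly cost in USD (hypothetical helper, not an AWS API).

    Assumption: the free allowance offsets standard characters only.
    """
    if standard_chars < 0 or neural_chars < 0:
        raise ValueError("character counts must be non-negative")

    free_chars = FREE_STANDARD_CHARS_PER_MONTH if in_free_tier else 0
    billable_standard = max(0, standard_chars - free_chars)

    cost = (
        billable_standard / 1_000_000 * STANDARD_RATE_PER_MILLION
        + neural_chars / 1_000_000 * NEURAL_RATE_PER_MILLION
    )
    return round(cost, 2)
```

For example, 8M standard and 1M neural characters inside the free-tier window leave 3M billable standard characters, so the estimate is 3 × $4 + 1 × $16 = $28.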
Platforms
api
Integrations AWS Lambda, Amazon Lex, S3, Amazon Connect, SDK (Python, JS, Java)
Homepage https://aws.amazon.com/polly/

AI Commentary

Amazon Polly is the natural TTS choice for AWS-native architectures, particularly those using Amazon Lex chatbots or Amazon Connect contact centers. Speech Marks (timestamped metadata for words and visemes) enable lip-sync animations and karaoke-style highlighting. Voice naturalness is adequate for utility applications but falls behind Google's Neural2 voices and ElevenLabs for expressive or creative content.
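To make the Speech Marks use case concrete, here is a minimal sketch of consuming that metadata for karaoke-style highlighting. Polly returns Speech Marks as newline-delimited JSON objects. The sample payload below is invented for illustration rather than real service output, and the helper names are hypothetical.

```python
import bisect
import json

# Invented sample in the newline-delimited JSON shape of Speech Marks.
# "time" is milliseconds from the start of the audio; for word marks,
# "start"/"end" are offsets into the input text.
SAMPLE_MARKS = """\
{"time":6,"type":"word","start":0,"end":5,"value":"Hello"}
{"time":6,"type":"viseme","value":"k"}
{"time":120,"type":"viseme","value":"E"}
{"time":370,"type":"word","start":6,"end":11,"value":"world"}
{"time":370,"type":"viseme","value":"u"}
{"time":600,"type":"viseme","value":"sil"}
"""


def parse_speech_marks(raw):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def marks_of_type(marks, mark_type):
    """Keep only marks of one type (e.g. "word" or "viseme"), sorted by time."""
    return sorted((m for m in marks if m["type"] == mark_type),
                  key=lambda m: m["time"])


def active_word_at(word_marks, playback_ms):
    """Return the word being spoken at playback_ms, or None before the first word."""
    times = [m["time"] for m in word_marks]
    index = bisect.bisect_right(times, playback_ms) - 1
    return word_marks[index]["value"] if index >= 0 else None


marks = parse_speech_marks(SAMPLE_MARKS)
words = marks_of_type(marks, "word")
visemes = marks_of_type(marks, "viseme")
```

A player would call `active_word_at(words, current_ms)` on each timer tick to decide which word to highlight, and could walk `visemes` the same way to drive mouth shapes for lip-sync.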