Speech AI
by Brainiall
Pronunciation scoring, Speech-to-Text, Text-to-Speech, and Whisper STT Pro. 99 languages (Whisper STT Pro), 12 English voices (TTS).
## Complete Speech AI Suite — Four Tools, One API
### 1. Pronunciation Assessment
Score English pronunciation at the phoneme, word, and sentence levels on a 0-100 scale. Accuracy exceeds human expert inter-annotator agreement at both the phone and sentence levels. The 17MB model runs with a median inference latency under 300ms.
| Metric | Value |
|--------|-------|
| Phone-level accuracy | Exceeds human agreement by +3.5% |
| Sentence-level accuracy | Exceeds human agreement by +3.6% |
| Inference latency | p50 = 257ms, p95 = 423ms |
| Model footprint | 17MB (edge / mobile / browser ready) |
### 2. Speech-to-Text (STT)
Convert spoken English to text with word-level timestamps and per-word confidence scores. CTC decoding with Viterbi forced alignment for precise timing.
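As a rough illustration of how per-word confidence scores and timestamps might be used, the sketch below flags words that fall under a confidence threshold. The response shape (`words`, `word`, `start`, `end`, `confidence`) and the sample values are assumptions made up for this example; the listing does not document the actual response schema.

```python
# Hypothetical response shape and illustrative values. The listing does not
# document the real schema, so every field name here is an assumption.
sample_response = {
    "text": "the quick brown fox",
    "words": [
        {"word": "the", "start": 0.00, "end": 0.12, "confidence": 0.98},
        {"word": "quick", "start": 0.12, "end": 0.30, "confidence": 0.91},
        {"word": "brown", "start": 0.30, "end": 0.55, "confidence": 0.42},
        {"word": "fox", "start": 0.55, "end": 0.80, "confidence": 0.88},
    ],
}


def low_confidence_words(response: dict, threshold: float = 0.6) -> list:
    """Return (word, start, end) for each word scored below the threshold."""
    return [
        (w["word"], w["start"], w["end"])
        for w in response["words"]
        if w["confidence"] < threshold
    ]


flagged = low_confidence_words(sample_response)
for word, start, end in flagged:
    print(f"review '{word}' at {start:.2f}s to {end:.2f}s")
```

A threshold of 0.6 is arbitrary; a suitable value would depend on how the service calibrates its scores.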
### 3. Text-to-Speech (TTS)
Generate natural speech from text with 12 English voices (American and British, male and female). Speed control from 0.5x to 2.0x. 24kHz WAV output.
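The speed range and output format above can be checked on the client side. The following is a minimal sketch, assuming a JSON request body with `text`, `voice`, and `speed` fields; those field names and the `voice` identifier are hypothetical, since the listing does not specify the request schema. Only the 0.5x to 2.0x range and the 24kHz sample rate come from the listing.

```python
import io
import wave

MIN_SPEED, MAX_SPEED = 0.5, 2.0  # speed range stated in the listing
EXPECTED_RATE = 24_000           # 24kHz WAV output, per the listing


def tts_payload(text: str, voice: str, speed: float = 1.0) -> dict:
    """Build a request body; the field names are assumed, not documented."""
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    return {"text": text, "voice": voice, "speed": speed}


def wav_sample_rate(wav_bytes: bytes) -> int:
    """Read the sample rate from the header of an in-memory WAV file."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.getframerate()
```

Calling `wav_sample_rate` on the bytes returned by the synthesis endpoint and comparing the result with `EXPECTED_RATE` is a cheap sanity check before handing audio to a player.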
### 4. Whisper STT Pro
Production-grade speech recognition powered by Whisper Large V3 Turbo (809M parameters). Supports 99 languages with word-level timestamps. Optional speaker diarization identifies who said what. GPU-accelerated with sub-second latency.
---
### API Endpoints
| Service | Method | Endpoint | Description |
|---------|--------|----------|-------------|
| Pronunciation | POST | /assess/base64 | Score pronunciation from audio |
| STT | POST | /transcribe/base64 | Transcribe audio to text |
| TTS | POST | /synthesize | Generate speech from text |
| TTS | GET | /voices | List available voices |
| Whisper | POST | /whisper/transcribe/base64 | Transcribe audio (99 languages) |
| Whisper | GET | /whisper/health | Service health check |
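The `/base64` suffix on the POST endpoints suggests that audio is sent as a base64-encoded string. Here is a minimal sketch of building such a request with the Python standard library. The base URL, the bearer-token header, and the `audio` and `text` field names are all placeholders assumed for illustration; the listing gives only the paths and methods.

```python
import base64
import json
import urllib.request

# Placeholders: the listing does not give a base URL, auth scheme, or
# request schema, so these values are assumptions.
BASE_URL = "https://api.example.com"
API_KEY = "YOUR_API_KEY"


def build_request(path: str, audio: bytes, **fields) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying base64-encoded audio."""
    body = {
        "audio": base64.b64encode(audio).decode("ascii"),  # assumed field name
        **fields,
    }
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
        },
        method="POST",
    )


# Two dummy bytes stand in for real audio data.
req = build_request("/assess/base64", b"\x00\x01", text="hello world")
# To send it: urllib.request.urlopen(req)
```

The same helper could target `/transcribe/base64` or `/whisper/transcribe/base64` by changing the path, assuming those endpoints accept a similar body.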
### Pricing Tiers
| Tier | Requests/Month | Price |
|------|---------------|-------|
| Free | 1,000 | USD 0 |
| Basic | 25,000 | USD 29/mo |
| Pro | 250,000 | USD 79/mo |
| Enterprise | Unlimited | USD 199/mo |
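To pick a tier from an expected monthly volume, the table can be encoded directly. This sketch assumes that requests beyond a tier's cap are not available as paid overage, which the listing does not state either way.

```python
# Tiers copied from the pricing table: (name, monthly request cap, USD/month).
# A cap of None means unlimited.
TIERS = [
    ("Free", 1_000, 0),
    ("Basic", 25_000, 29),
    ("Pro", 250_000, 79),
    ("Enterprise", None, 199),
]


def cheapest_tier(requests_per_month: int) -> str:
    """Return the lowest-priced tier whose cap covers the given volume."""
    if requests_per_month < 0:
        raise ValueError("requests_per_month must be non-negative")
    for name, cap, _price in TIERS:  # TIERS is ordered by ascending price
        if cap is None or requests_per_month <= cap:
            return name
    raise RuntimeError("unreachable: the last tier is unlimited")


print(cheapest_tier(40_000))
```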
*Built by Brainiall. Four speech tools, one API.*