
Speech AI

by Brainiall

Pronunciation scoring + Speech-to-Text + Text-to-Speech + Whisper STT Pro. 99 languages, 12 voices.

## Complete Speech AI Suite: Four Tools, One API

### 1. Pronunciation Assessment

Score English pronunciation at phoneme, word, and sentence levels (0-100). Exceeds human expert inter-annotator agreement. 17MB model with sub-300ms inference.

| Metric | Value |
|--------|-------|
| Phone-level accuracy | Exceeds human agreement by +3.5% |
| Sentence-level accuracy | Exceeds human agreement by +3.6% |
| Inference latency | p50 = 257ms, p95 = 423ms |
| Model footprint | 17MB (edge / mobile / browser ready) |

### 2. Speech-to-Text (STT)

Convert spoken English to text with word-level timestamps and per-word confidence scores. CTC decoding with Viterbi forced alignment for precise timing.

### 3. Text-to-Speech (TTS)

Generate natural speech from text with 12 English voices (American and British, male and female). Speed control from 0.5x to 2.0x. 24kHz WAV output.

### 4. Whisper STT Pro

Production-grade speech recognition powered by Whisper Large V3 Turbo (809M parameters). Supports 99 languages with word-level timestamps. Optional speaker diarization identifies who said what. GPU-accelerated with sub-second latency.

---

### API Endpoints

| Service | Method | Endpoint | Description |
|---------|--------|----------|-------------|
| Pronunciation | POST | /assess/base64 | Score pronunciation from audio |
| STT | POST | /transcribe/base64 | Transcribe audio to text |
| TTS | POST | /synthesize | Generate speech from text |
| TTS | GET | /voices | List available voices |
| Whisper | POST | /whisper/transcribe/base64 | Transcribe audio (99 languages) |
| Whisper | GET | /whisper/health | Service health check |

### Pricing Tiers

| Tier | Requests/Month | Price |
|------|---------------|-------|
| Free | 1,000 | USD 0 |
| Basic | 25,000 | USD 29/mo |
| Pro | 250,000 | USD 79/mo |
| Enterprise | Unlimited | USD 199/mo |

*Built by Brainiall. Four speech tools, one API.*
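
### Example: Building a Request (illustrative sketch)

The endpoint table above can be sketched as a small Python helper that builds, but does not send, an HTTP request for each service. Only the methods and paths come from the listing. The base URL, the bearer-token header, the JSON body, and the field names `audio` and `text` are assumptions for illustration, so check the provider's API reference for the actual request schema before relying on any of them.

```python
import base64
import json
import urllib.request

# Methods and paths are copied from the endpoint table in the listing.
ENDPOINTS = {
    "pronunciation": ("POST", "/assess/base64"),
    "stt": ("POST", "/transcribe/base64"),
    "tts": ("POST", "/synthesize"),
    "voices": ("GET", "/voices"),
    "whisper": ("POST", "/whisper/transcribe/base64"),
    "whisper_health": ("GET", "/whisper/health"),
}

# Hypothetical placeholder: the listing does not state the API host.
BASE_URL = "https://api.example.com"


def encode_audio(audio_bytes):
    """Base64-encode raw audio bytes, as the /base64 endpoint names suggest."""
    return base64.b64encode(audio_bytes).decode("ascii")


def build_request(service, payload=None, api_key="YOUR_API_KEY"):
    """Build (without sending) a urllib Request for one of the listed services."""
    method, path = ENDPOINTS[service]
    # Assumption: bearer-token auth. The listing does not describe authentication.
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if method == "POST":
        # Assumption: POST endpoints accept a JSON body.
        data = json.dumps(payload or {}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# Hypothetical field names ("audio", "text"); the real schema may differ.
request = build_request(
    "pronunciation",
    {"audio": encode_audio(b"placeholder audio bytes"), "text": "hello world"},
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes it easy to inspect the method, URL, and body before spending any of a tier's monthly request quota.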
