Speech-to-text training and evaluation platform
Measure, compare, and improve transcription accuracy across 40+ languages using industry-standard evaluation metrics. Our platform provides transparent benchmarks to help you choose the best speech recognition model for your needs.
Live Precision Data
Click on any language row to view detailed model comparisons and error analysis.
Learn More
A beginner-friendly guide to the evaluation metrics we use.
Word Error Rate (WER) is the percentage of words that were incorrectly transcribed. It's calculated by comparing the transcribed text to the reference text and counting substitutions, insertions, and deletions at the word level.
WER = (Substitutions + Insertions + Deletions) / Total Words × 100
Lower WER is better. A WER of 5% means roughly 5 errors for every 100 words in the reference text.
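To make the formula concrete, here is a minimal Python sketch of the calculation. It assumes a plain Levenshtein (edit-distance) alignment over whitespace-separated words and skips the text normalization (lowercasing, punctuation handling) that a production evaluation pipeline would typically apply.

```python
def edit_distance(ref, hyp):
    """Minimum number of substitutions, insertions, and deletions needed
    to turn the reference sequence into the hypothesis sequence."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return d[len(ref)][len(hyp)]

def wer(reference: str, hypothesis: str) -> float:
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    return edit_distance(ref_words, hyp_words) / len(ref_words) * 100

# 1 substitution ("cat" -> "bat") + 1 deletion ("big") over 5 reference words = 40.0
print(wer("the big cat sat here", "the bat sat here"))
```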
Character Error Rate (CER) is similar to WER, but measures errors at the character level instead of the word level. This metric is particularly useful for languages without clear word boundaries or for detecting minor spelling errors.
CER = (Character Substitutions + Insertions + Deletions) / Total Characters × 100
Lower CER is better. CER is typically lower than WER since a single word error might only affect a few characters.
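CER is the same calculation applied to character sequences instead of word sequences. The snippet below is a sketch that reuses the edit_distance helper from the WER example above.

```python
def cer(reference: str, hypothesis: str) -> float:
    # Sketch only: reuses the edit_distance helper defined in the WER example
    # above, applied to character lists (spaces included) instead of word lists.
    return edit_distance(list(reference), list(hypothesis)) / len(reference) * 100

# 4 character edits over 11 reference characters ≈ 36.4
print(cer("the big cat", "the bat"))
```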
Accuracy is the complement of WER, representing the percentage of words correctly transcribed. It provides an intuitive measure of transcription quality.
Accuracy = 100% - WER
Higher accuracy is better. An accuracy of 95% means 95 out of every 100 words are correct.
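Using the wer function from the sketch above, the conversion is a single subtraction:

```python
# Sketch only: converts the WER from the earlier example into word accuracy.
accuracy = 100 - wer("the big cat sat here", "the bat sat here")
print(accuracy)  # 60.0 (a 40% WER corresponds to 60% accuracy)
```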
After initial transcription, we apply GPT-4.1-mini to enhance the text by fixing common transcription errors, improving punctuation, and correcting obvious mistakes. The "improved" score reflects accuracy after this AI enhancement step.
Post-processed with GPT-4.1-mini text enhancement
The improved score shows how much AI post-processing can enhance raw transcription quality.
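The exact prompts and pipeline we use aren't shown here, but the sketch below illustrates what such a post-processing step can look like with the OpenAI Python SDK. The prompt and parameters are illustrative assumptions, not our production configuration.

```python
# A sketch of LLM post-processing using the OpenAI Python SDK. The prompt,
# temperature, and wiring here are illustrative assumptions only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def enhance_transcript(raw_transcript: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4.1-mini",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": "Fix obvious transcription errors and punctuation in the "
                           "following speech-to-text output. Do not add, remove, or "
                           "reorder content.",
            },
            {"role": "user", "content": raw_transcript},
        ],
    )
    return response.choices[0].message.content

print(enhance_transcript("the quick brown fox jump over the lazy dog"))
```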
Substitution: a word was replaced with a different word. Example: "cat" transcribed as "bat".
Insertion: an extra word was added that wasn't in the original audio. Example: "the cat" transcribed as "the big cat".
Deletion: a word from the original audio was missing in the transcription. Example: "the big cat" transcribed as "the cat".
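For intuition, the sketch below counts these three error types for the examples above using Python's difflib. Note that difflib's alignment is heuristic and may not always match the minimal-edit alignment used when scoring WER, so treat the breakdown as illustrative.

```python
import difflib

def count_errors(reference: str, hypothesis: str) -> dict:
    """Illustrative breakdown of substitutions, insertions, and deletions."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    counts = {"substitutions": 0, "insertions": 0, "deletions": 0}
    matcher = difflib.SequenceMatcher(None, ref_words, hyp_words)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "replace":
            # An unequal span counts as substitutions plus leftover insertions/deletions.
            counts["substitutions"] += min(i2 - i1, j2 - j1)
            counts["insertions"] += max(0, (j2 - j1) - (i2 - i1))
            counts["deletions"] += max(0, (i2 - i1) - (j2 - j1))
        elif op == "insert":
            counts["insertions"] += j2 - j1
        elif op == "delete":
            counts["deletions"] += i2 - i1
    return counts

print(count_errors("the cat", "the bat"))      # one substitution: "cat" -> "bat"
print(count_errors("the cat", "the big cat"))  # one insertion: extra "big"
print(count_errors("the big cat", "the cat"))  # one deletion: missing "big"
```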
Get access to our benchmarking platform and evaluate your speech recognition models against industry standards. Contact us to learn more about our enterprise evaluation solutions.