LitteraWorks Benchmarks

Speech-to-text training and evaluation platform

Measure, compare, and improve transcription accuracy across 40+ languages using industry-standard evaluation metrics. Our platform provides transparent benchmarks to help you choose the best speech recognition model for your needs.

Live Precision Data

Precision by Language

Click on any language row to view detailed model comparisons and error analysis.

Understanding the Metrics

A beginner-friendly guide to the evaluation metrics we use.

WER (Word Error Rate)

The percentage of words that were incorrectly transcribed. It's calculated by comparing the transcribed text to the reference text and counting substitutions, insertions, and deletions at the word level.

WER = (Substitutions + Insertions + Deletions) / Total Words × 100

Lower WER is better. A WER of 5% means roughly 5 errors for every 100 words in the reference. Note that because insertions are counted, WER can exceed 100% in extreme cases.
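The formula above can be computed with a standard word-level Levenshtein (edit-distance) alignment. The sketch below is illustrative only and is not the platform's own implementation:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance over reference length, as a percentage."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = minimum edits to turn the first i reference words
    # into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i                      # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j                      # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1   # substitution cost
            dp[i][j] = min(dp[i - 1][j] + 1,              # deletion
                           dp[i][j - 1] + 1,              # insertion
                           dp[i - 1][j - 1] + cost)       # match / substitution
    return dp[len(ref)][len(hyp)] / len(ref) * 100

print(wer("the big cat sat", "the cat sat"))  # → 25.0 (one deletion out of four words)
```

Production systems typically also normalize casing and punctuation before scoring, since "Cat." vs "cat" would otherwise count as a substitution.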

CER (Character Error Rate)

Similar to WER, but measures errors at the character level instead of the word level. This metric is particularly useful for languages without clear word boundaries or for detecting minor spelling errors.

CER = (Char Substitutions + Insertions + Deletions) / Total Characters × 100

Lower CER is better. CER is typically lower than WER since a single word error might only affect a few characters.
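CER uses the same edit-distance alignment as WER, just over characters instead of words. Again, this is an illustrative sketch, not the platform's code:

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character-level edit distance over reference length, in percent."""
    ref, hyp = list(reference), list(hypothesis)
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i                      # deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j                      # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,
                           dp[i][j - 1] + 1,
                           dp[i - 1][j - 1] + cost)
    return dp[len(ref)][len(hyp)] / len(ref) * 100

print(cer("kitten", "sitting"))  # → 50.0 (3 character edits over 6 reference characters)
```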

Accuracy

The complement of WER (100% minus WER), representing the percentage of words correctly transcribed. It provides an intuitive measure of transcription quality.

Accuracy = 100% − WER

Higher accuracy is better. An accuracy of 95% means 95 out of every 100 words are correct.
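Since accuracy is simply the complement of WER, the conversion is a one-liner; the helper name below is illustrative:

```python
def accuracy_from_wer(wer_percent: float) -> float:
    """Convert a WER percentage to an accuracy percentage (clamped at 0 for WER > 100%)."""
    return max(0.0, 100.0 - wer_percent)

print(accuracy_from_wer(5.0))  # → 95.0
```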

"Improved" Score

After initial transcription, we apply GPT-4.1-mini to enhance the text by fixing common transcription errors, improving punctuation, and correcting obvious mistakes. The "improved" score reflects accuracy after this AI enhancement step.

Post-processed with GPT-4.1-mini text enhancement

The improved score shows how much AI post-processing can enhance raw transcription quality.

Error Types Explained

Substitutions (S)

A word was replaced with a different word. Example: "cat" transcribed as "bat".

Insertions (I)

An extra word was added that wasn't in the original audio. Example: "the cat" transcribed as "the big cat".

Deletions (D)

A word from the original audio was missing in the transcription. Example: "the big cat" transcribed as "the cat".
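The three error types above can be counted by tracing back through the same word-level alignment used for WER. This backtrace sketch is illustrative only; it assumes ties are resolved in a fixed order (match, then substitution, insertion, deletion):

```python
def error_counts(reference: str, hypothesis: str) -> dict:
    """Count word-level substitutions (S), insertions (I), and deletions (D)."""
    ref, hyp = reference.split(), hypothesis.split()
    n, m = len(ref), len(hyp)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,
                           dp[i][j - 1] + 1,
                           dp[i - 1][j - 1] + cost)
    # Walk back from the bottom-right corner, classifying each step.
    s = ins = dels = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and ref[i - 1] == hyp[j - 1] and dp[i][j] == dp[i - 1][j - 1]:
            i, j = i - 1, j - 1                      # exact match, no error
        elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
            s += 1; i, j = i - 1, j - 1              # substitution
        elif j > 0 and dp[i][j] == dp[i][j - 1] + 1:
            ins += 1; j -= 1                         # insertion
        else:
            dels += 1; i -= 1                        # deletion
    return {"S": s, "I": ins, "D": dels}

print(error_counts("the big cat", "the cat"))  # → {'S': 0, 'I': 0, 'D': 1}
```

Running it on the three examples above reproduces them: "the cat" vs "the bat" gives one substitution, "the cat" vs "the big cat" one insertion, and "the big cat" vs "the cat" one deletion.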

Want to evaluate your own models?

Get access to our benchmarking platform and evaluate your speech recognition models against industry standards. Contact us to learn more about our enterprise evaluation solutions.