Tables are sortable: click any column header. Green marks the best WER on the dataset; red marks hallucination (WER above 100% means the model produced more words than the reference contains). Model names link to their Hugging Face pages. Dataset titles link to their HF datasets.
Comparative speech-to-text benchmarks for Lithuanian: 61 model configurations tested across 4 datasets (FLEURS Lithuanian (lt_lt), VoxPopuli lt test, Common Voice 22 — Lithuanian, Common Voice 25 — Lithuanian). Reported metrics: word error rate (WER), character error rate (CER), real-time factor (RTFx: seconds of audio processed per second of wall time), latency, and peak GPU memory, covering Whisper, Parakeet, Gemma, and more.
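For readers reproducing these numbers, the error metrics reduce to a Levenshtein edit distance over words (WER) or characters (CER), and RTFx is audio duration divided by processing time. A minimal sketch under those assumed definitions (the benchmark harness itself may use a library such as `jiwer`, possibly with different text normalization):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (one-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution
        prev = cur
    return prev[-1]

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edits / reference word count. Can exceed 1.0."""
    ref, hyp = reference.split(), hypothesis.split()
    return edit_distance(ref, hyp) / len(ref)

def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edits / reference character count."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)

def rtfx(audio_seconds: float, wall_seconds: float) -> float:
    """Real-time factor: >1 means faster than real time."""
    return audio_seconds / wall_seconds
```

Because WER is normalized by reference length while insertions are counted as edits, heavy hallucination pushes it past 100%, which is what the red rows in the tables indicate.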
**FLEURS Lithuanian (lt_lt)**

| Model | Backend | n | WER | CER | RTFx mean | RTFx p50 | Lat mean (ms) | Lat p90 (ms) | GPU peak (MB) | Wall (s) |
|---|---|---|---|---|---|---|---|---|---|---|
| parakeet-tdt-lt+europarl5gram | nemo | 986 | 15.59% | 4.86% | 32.7 | 32.7 | 312 | 461 | 612 | 317 |
| parakeet-tdt-lt | nemo | 986 | 18.49% | 4.60% | 132.5 | 132.5 | 71 | 80 | 286 | 78 |
| parakeet-tdt-0.6b-v3 | nemo | 986 | 22.16% | 5.78% | 132.0 | 132.0 | 73 | 83 | 280 | 80 |
| whisper-large-v3+beam5 | transformers | 986 | 23.08% | 5.80% | 5.1 | 4.5 | 2268 | 3310 | 0 | 2245 |
| whisper-large-v3-turbo | transformers | 986 | 23.90% | 5.94% | 20.3 | 20.3 | 530 | 714 | 374 | 533 |
| fw-large-v3 | faster-whisper | 986 | 24.18% | 6.11% | 10.2 | 10.2 | 1057 | 1491 | 232 | 1051 |
| fw-large-v3-turbo | faster-whisper | 986 | 24.43% | 6.08% | 34.8 | 34.8 | 305 | 367 | 334 | 309 |
| whisper-large-v3 | transformers | 986 | 24.48% | 6.14% | 5726.5 | 5.8 | 1767 | 2561 | 0 | 1752 |
| whisper-large-v2 | transformers | 986 | 28.02% | 7.34% | 4.2 | 4.2 | 2531 | 3914 | 1552 | 2504 |
| gemma-4-E4B-it-lt-asr | transformers | 986 | 33.56% | 15.16% | 1.4 | 1.4 | 7866 | 11167 | 153 | 7766 |
| gemma-4-E4B-it | transformers | 986 | 38.95% | 12.95% | 2.2 | 2.2 | 4892 | 7263 | 188 | 4835 |
| whisper-medium | transformers | 986 | 40.77% | 10.49% | 5.3 | 5.3 | 1927 | 2812 | 1014 | 1918 |
| gemma-4-E2B-it | transformers | 30 | 47.57% | 15.21% | 3.0 | 2.9 | 3631 | 5428 | 0 | 111 |
| whisper-small | transformers | 986 | 65.12% | 18.43% | 9.5 | 9.5 | 1078 | 1576 | 380 | 1072 |
| whisper-base | transformers | 986 | 93.83% | 33.39% | 15.3 | 15.3 | 703 | 989 | 164 | 702 |
| whisper-tiny | transformers | 986 | 123.26% | 56.39% | 19.1 | 19.1 | 672 | 881 | 178 | 675 |

**VoxPopuli lt test**

| Model | Backend | n | WER | CER | RTFx mean | RTFx p50 | Lat mean (ms) | Lat p90 (ms) | GPU peak (MB) | Wall (s) |
|---|---|---|---|---|---|---|---|---|---|---|
| parakeet-tdt-lt | nemo | 42 | 27.56% | 18.36% | 130.6 | 130.6 | 87 | 87 | 286 | 6 |
| parakeet-tdt-lt+europarl5gram | nemo | 42 | 28.41% | 21.24% | 29.5 | 29.5 | 330 | 576 | 612 | 16 |
| fw-large-v3 | faster-whisper | 42 | 28.90% | 19.71% | 7.5 | 7.2 | 1405 | 1942 | 416 | 60 |
| parakeet-tdt-0.6b-v3 | nemo | 42 | 29.88% | 17.90% | 120.9 | 120.9 | 104 | 91 | 280 | 8 |
| fw-large-v3-turbo | faster-whisper | 42 | 30.37% | 18.51% | 17.4 | 17.1 | 648 | 699 | 288 | 28 |
| whisper-large-v2 | transformers | 42 | 33.05% | 20.32% | 2.9 | 2.9 | 3573 | 5367 | 0 | 151 |
| gemma-4-E4B-it-lt-asr | transformers | 42 | 39.63% | 22.07% | 1.4 | 1.4 | 7505 | 12008 | 153 | 317 |
| gemma-4-E4B-it-lt-asr-lm | transformers | 42 | 40.12% | 23.26% | 0.1 | 0.1 | 55962 | 81118 | 1424 | 2352 |
| whisper-medium | transformers | 42 | 41.95% | 20.72% | 4.3 | 4.2 | 2404 | 3554 | 0 | 102 |
| gemma-4-E4B-it | transformers | 42 | 46.34% | 24.28% | 2.0 | 2.0 | 5135 | 7986 | 188 | 219 |
| gemma-4-E2B-it | transformers | 30 | 50.83% | 25.48% | 2.9 | 2.9 | 3634 | 5407 | 0 | 110 |
| whisper-large-v3 | transformers | 42 | 53.05% | 35.05% | 6.7 | 5.6 | 2059 | 3013 | 0 | 91 |
| whisper-small | transformers | 42 | 57.80% | 25.11% | 9.2 | 9.0 | 1149 | 1698 | 0 | 49 |
| whisper-large-v3+beam5 | transformers | 42 | 60.73% | 54.59% | 4.0 | 4.1 | 3464 | 4980 | 0 | 148 |
| whisper-base | transformers | 42 | 80.98% | 33.62% | 16.8 | 16.3 | 630 | 898 | 0 | 27 |
| whisper-large-v3-turbo | transformers | 42 | 84.15% | 40.49% | 14.3 | 13.1 | 824 | 1162 | 0 | 36 |
| whisper-tiny | transformers | 42 | 105.61% | 52.96% | 20.8 | 21.4 | 610 | 927 | 0 | 27 |

**Common Voice 22 — Lithuanian**

| Model | Backend | n | WER | CER | RTFx mean | RTFx p50 | Lat mean (ms) | Lat p90 (ms) | GPU peak (MB) | Wall (s) |
|---|---|---|---|---|---|---|---|---|---|---|
| parakeet-tdt-lt-beamlm | nemo | 300 | 9.39% | 2.28% | 40.0 | 38.0 | 178 | 235 | 0 | 992 |
| parakeet-tdt-lt | nemo | 300 | 13.77% | 2.82% | 75.1 | 72.5 | 69 | 73 | 0 | 407 |
| parakeet-tdt-0.6b-v3 | nemo | 300 | 16.24% | 4.10% | 40.0 | 38.0 | 69 | 73 | 0 | 406 |
| whisper-large-v3 | transformers | 300 | 28.18% | 6.48% | 3.6 | 3.5 | 1581 | 2151 | 0 | 476 |
| fw-large-v3 | faster-whisper | 300 | 28.47% | 6.60% | 7.0 | 6.7 | 788 | 969 | 256 | 239 |
| fw-large-v3-turbo | faster-whisper | 300 | 32.55% | 8.14% | 13.0 | 12.5 | 424 | 462 | 288 | 129 |
| whisper-large-v3-turbo | transformers | 300 | 33.82% | 8.63% | 12.9 | 12.5 | 426 | 527 | 0 | 130 |
| whisper-large-v2 | transformers | 300 | 37.65% | 9.27% | 3.5 | 3.4 | 1619 | 2190 | 0 | 489 |
| whisper-medium | transformers | 300 | 50.12% | 12.97% | 5.5 | 5.3 | 1034 | 1423 | 0 | 312 |
| gemma-4-E2B-it | transformers | 30 | 63.28% | 23.19% | 3.9 | 3.8 | 1341 | 1860 | 0 | 42 |
| gemma-4-E4B-it | transformers | 30 | 63.84% | 21.85% | 0.2 | 0.2 | 29437 | 43359 | 5452 | 885 |
| whisper-small | transformers | 300 | 72.31% | 20.35% | 11.7 | 11.3 | 490 | 686 | 0 | 149 |
| whisper-base | transformers | 300 | 90.92% | 29.86% | 22.7 | 22.0 | 252 | 357 | 0 | 78 |
| whisper-tiny | transformers | 300 | 109.38% | 47.44% | 28.7 | 27.7 | 236 | 279 | 0 | 73 |

**Common Voice 25 — Lithuanian**

| Model | Backend | n | WER | CER | RTFx mean | RTFx p50 | Lat mean (ms) | Lat p90 (ms) | GPU peak (MB) | Wall (s) |
|---|---|---|---|---|---|---|---|---|---|---|
| parakeet-tdt-lt+europarl5gram | nemo | 5644 | 8.93% | 2.06% | 29.6 | 29.6 | 167 | 234 | 612 | 982 |
| parakeet-tdt-lt | nemo | 5644 | 13.45% | 2.73% | 71.7 | 71.7 | 67 | 74 | 286 | 414 |
| parakeet-tdt-0.6b-v3 | nemo | 5644 | 16.64% | 4.32% | 71.7 | 71.7 | 67 | 74 | 280 | 430 |
| gemma-4-E4B-it-lt-asr | transformers | 5644 | 27.58% | 8.79% | 1.6 | 1.6 | 3341 | 4875 | 153 | 18896 |
| fw-large-v3 | faster-whisper | 5644 | 28.61% | 6.39% | 10.5 | 10.5 | 524 | 687 | 232 | 2998 |
| whisper-large-v3 | transformers | 5644 | 29.88% | 6.84% | 41216.9 | 6.7 | 766 | 1135 | 0 | 4364 |
| fw-large-v3-turbo | faster-whisper | 5644 | 32.15% | 7.48% | 26.8 | 26.8 | 201 | 226 | 334 | 1171 |
| whisper-large-v3-turbo | transformers | 5644 | 32.96% | 7.84% | 21.7 | 21.7 | 254 | 331 | 374 | 1486 |
| whisper-large-v2 | transformers | 5644 | 35.77% | 8.45% | 5.0 | 5.0 | 1117 | 1598 | 1552 | 6347 |
| gemma-4-E4B-it | transformers | 5644 | 49.63% | 14.94% | 2.5 | 2.5 | 2072 | 3104 | 188 | 11788 |
| whisper-medium | transformers | 5644 | 51.26% | 12.93% | 6.2 | 6.2 | 833 | 1233 | 1014 | 4791 |
| whisper-small | transformers | 5644 | 73.57% | 20.63% | 10.8 | 10.8 | 479 | 711 | 72 | 2744 |
| whisper-base | transformers | 5644 | 94.98% | 35.09% | 17.9 | 17.9 | 305 | 435 | 0 | 1765 |
| whisper-tiny | transformers | 5644 | 117.63% | 50.80% | 22.6 | 22.6 | 277 | 350 | 0 | 1617 |