saattrupdan/kblab-voxrex-wav2vec2-large-cv8-da

@@ -1,127 +1,26 @@
 ---
 license: cc0-1.0
-tags:
-- generated_from_trainer
 datasets:
-- common_voice
 model-index:
 - name: kblab-voxrex-wav2vec2-large-cv8-da
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# kblab-voxrex-wav2vec2-large-cv8-da
-This model is a fine-tuned version of [KBLab/wav2vec2-large-voxrex](https://huggingface.co/KBLab/wav2vec2-large-voxrex) on the common_voice dataset.
-It achieves the following results on the evaluation set:
-- Loss: 329.0055
-- Wer: 0.3768
 ## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 4e-05
-- train_batch_size: 4
-- eval_batch_size: 8
-- seed: 4242
-- gradient_accumulation_steps: 8
-- total_train_batch_size: 32
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 500
-- num_epochs: 500
-- mixed_precision_training: Native AMP
-### Training results
-| Training Loss | Epoch  | Step  | Validation Loss | Wer    |
-|:-------------:|:------:|:-----:|:---------------:|:------:|
-| 767.4506      | 5.55   | 300   | 1359.1575       | 1.0    |
-| 576.4063      | 11.11  | 600   | 1265.8390       | 1.0    |
-| 519.8654      | 16.66  | 900   | 1039.6812       | 1.0    |
-| 313.227       | 22.22  | 1200  | 551.1237        | 0.8551 |
-| 241.8147      | 27.77  | 1500  | 421.1035        | 0.7096 |
-| 203.6478      | 33.33  | 1800  | 359.2018        | 0.6291 |
-| 169.5277      | 38.88  | 2100  | 328.7173        | 0.5931 |
-| 149.7277      | 44.44  | 2400  | 312.2329        | 0.5593 |
-| 134.0794      | 49.99  | 2700  | 298.8540        | 0.5364 |
-| 124.439       | 55.55  | 3000  | 295.4873        | 0.5169 |
-| 114.4032      | 61.11  | 3300  | 287.1676        | 0.5050 |
-| 103.9973      | 66.66  | 3600  | 280.2365        | 0.4967 |
-| 96.152        | 72.22  | 3900  | 279.2440        | 0.4857 |
-| 89.5619       | 77.77  | 4200  | 279.0049        | 0.4739 |
-| 89.8041       | 83.33  | 4500  | 276.0360        | 0.4616 |
-| 78.6993       | 88.88  | 4800  | 278.6253        | 0.4539 |
-| 74.2165       | 94.44  | 5100  | 276.4348        | 0.4488 |
-| 69.5902       | 99.99  | 5400  | 276.1476        | 0.4417 |
-| 67.8592       | 105.55 | 5700  | 275.3440        | 0.4341 |
-| 64.1541       | 111.11 | 6000  | 278.0880        | 0.4363 |
-| 60.7204       | 116.66 | 6300  | 281.5571        | 0.4374 |
-| 56.6715       | 122.22 | 6600  | 282.7102        | 0.4306 |
-| 55.7875       | 127.77 | 6900  | 279.3789        | 0.4228 |
-| 54.5305       | 133.33 | 7200  | 283.6728        | 0.4208 |
-| 51.4744       | 138.88 | 7500  | 282.4348        | 0.4227 |
-| 47.1217       | 144.44 | 7800  | 287.4393        | 0.4123 |
-| 48.4808       | 149.99 | 8100  | 286.8406        | 0.4126 |
-| 46.415        | 155.55 | 8400  | 290.3094        | 0.4144 |
-| 43.29         | 161.11 | 8700  | 291.6872        | 0.4144 |
-| 42.7431       | 166.66 | 9000  | 297.7512        | 0.4210 |
-| 41.8859       | 172.22 | 9300  | 296.6982        | 0.4085 |
-| 41.2126       | 177.77 | 9600  | 294.0860        | 0.4123 |
-| 40.8457       | 183.33 | 9900  | 298.7288        | 0.4058 |
-| 36.6865       | 188.88 | 10200 | 305.0593        | 0.4036 |
-| 34.1681       | 194.44 | 10500 | 304.9405        | 0.4112 |
-| 34.4368       | 199.99 | 10800 | 303.7193        | 0.4023 |
-| 35.3407       | 205.55 | 11100 | 295.9553        | 0.3975 |
-| 34.0598       | 211.11 | 11400 | 300.0461        | 0.4012 |
-| 33.4694       | 216.66 | 11700 | 307.4055        | 0.3942 |
-| 32.2768       | 222.22 | 12000 | 307.5330        | 0.3926 |
-| 34.4758       | 227.77 | 12300 | 307.9725        | 0.4003 |
-| 30.5966       | 233.33 | 12600 | 311.4758        | 0.3950 |
-| 29.2803       | 238.88 | 12900 | 308.0916        | 0.3933 |
-| 28.6945       | 244.44 | 13200 | 307.3855        | 0.3921 |
-| 29.8094       | 249.99 | 13500 | 317.3207        | 0.3920 |
-| 29.7135       | 255.55 | 13800 | 310.4784        | 0.3925 |
-| 28.7815       | 261.11 | 14100 | 315.4926        | 0.3942 |
-| 27.1585       | 266.66 | 14400 | 321.6101        | 0.3972 |
-| 26.9533       | 272.22 | 14700 | 314.2688        | 0.3918 |
-| 26.8752       | 277.77 | 15000 | 321.5280        | 0.3941 |
-| 26.7076       | 283.33 | 15300 | 323.5451        | 0.3912 |
-| 25.8936       | 288.88 | 15600 | 326.1316        | 0.3889 |
-| 25.6714       | 294.44 | 15900 | 324.0426        | 0.3905 |
-| 25.0952       | 299.99 | 16200 | 322.3788        | 0.3870 |
-| 23.5694       | 305.55 | 16500 | 323.4653        | 0.3828 |
-| 24.6763       | 311.11 | 16800 | 328.5225        | 0.3831 |
-| 25.1798       | 316.66 | 17100 | 320.6808        | 0.3868 |
-| 23.6551       | 322.22 | 17400 | 325.5733        | 0.3842 |
-| 23.118        | 327.77 | 17700 | 327.0573        | 0.3820 |
-| 23.178        | 333.33 | 18000 | 322.2932        | 0.3723 |
-| 22.3727       | 338.88 | 18300 | 332.8637        | 0.3783 |
-| 22.8178       | 344.44 | 18600 | 333.7156        | 0.3854 |
-| 22.3476       | 349.99 | 18900 | 326.8071        | 0.3766 |
-| 21.6792       | 355.55 | 19200 | 329.8040        | 0.3793 |
-| 23.5751       | 361.11 | 19500 | 329.0055        | 0.3768 |
-### Framework versions
-- Transformers 4.16.2
-- Pytorch 1.10.2+cu102
-- Datasets 1.18.3
-- Tokenizers 0.11.0

 ---
+language:
+- da
 license: cc0-1.0
 datasets:
+- common_voice_8_0
 model-index:
 - name: kblab-voxrex-wav2vec2-large-cv8-da
   results: []
 ---
+# KBLab-VoxRex-Wav2vec2-large-CV8-da
 ## Model description
+This model is a fine-tuned version of the Swedish acoustic model [KBLab/wav2vec2-large-voxrex](https://huggingface.co/KBLab/wav2vec2-large-voxrex), trained on the Danish part of [Common Voice 8.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_8_0), which contains about 6 hours of crowdsourced, read-aloud Danish speech.
+## Performance
+The model achieves the following word error rates (WER, in %; lower is better):
+| **Dataset** | **WER without LM** | **WER with 5-gram LM** |
+| :---:   | ---: | ---: |
+| [Danish part of Common Voice 8.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_8_0/viewer/da/train) | 37.63 | xx.xx |
+| [Alvenir test set](https://huggingface.co/datasets/Alvenir/alvenir_asr_da_eval) | 35.75 | xx.xx |
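
The card does not say how the WER figures were computed, so the following is only a minimal sketch of the usual corpus-level definition: the total word-level edit distance (substitutions, deletions, insertions) divided by the total number of reference words, expressed in percent. The function names `edit_distance` and `word_error_rate` and the Danish example sentences are hypothetical, and any text normalization (casing, punctuation) applied in the actual evaluation is not known and is not reproduced here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (here: lists of words)."""
    # prev[j] holds the distance between the first i-1 items of ref
    # and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Corpus-level WER in percent (assumed definition, see lead-in)."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        hyp_words = hyp.split()
        total_edits += edit_distance(ref_words, hyp_words)
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return 100.0 * total_edits / total_words


# Hypothetical transcriptions, for illustration only.
references = ["jeg hedder dan", "det er en god dag"]
hypotheses = ["jeg hedder den", "det er god dag"]
print(word_error_rate(references, hypotheses))  # 2 edits / 8 words -> 25.0
```

Summing edits and reference words over the whole test set before dividing weights long utterances more than short ones; averaging per-utterance WER instead would generally give a different number.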