naver/multilingual-distilwhisper-3k

	---
	license: mit
	datasets:
	- mozilla-foundation/common_voice_13_0
	language:
	- ca
	- bg
	- cs
	- fi
	- gl
	- hi
	- hu
	- pl
	- ro
	- sk
	- ta
	- th
	tags:
	- automatic-speech-recognition
	inference: false
	pipeline_tag: automatic-speech-recognition
	---

	## About

	Multilingual DistilWhisper improves ASR performance in target languages by adding lightweight conditional language-specific routing (CLSR) modules on top of whisper-small.
	These modules are trained with a combination of cross-entropy (ASR) and knowledge distillation losses, with whisper-large-v2 as the teacher.
	More details are in the ICASSP 2024 paper: https://arxiv.org/abs/2311.01070
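
The combination of cross-entropy and knowledge distillation losses described above can be illustrated with a minimal pure-Python sketch for a single output token. It is not the training code from the repository: the use of KL divergence as the distillation term, the `kd_weight` parameter, and all function names are assumptions made for illustration. The paper and the repository define the actual objective and weighting.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def cross_entropy(student_logits, target_index):
    """ASR term: negative log-likelihood of the reference token."""
    return -log_softmax(student_logits)[target_index]


def kl_divergence(teacher_logits, student_logits):
    """Distillation term (assumed here): KL(teacher || student)."""
    t_log = log_softmax(teacher_logits)
    s_log = log_softmax(student_logits)
    return sum(math.exp(t) * (t - s) for t, s in zip(t_log, s_log))


def combined_loss(student_logits, teacher_logits, target_index, kd_weight=1.0):
    """Weighted sum of the two terms; kd_weight is a hypothetical knob."""
    ce = cross_entropy(student_logits, target_index)
    kd = kl_divergence(teacher_logits, student_logits)
    return ce + kd_weight * kd


# Toy logits over a 3-token vocabulary (made-up numbers).
student = [2.0, 0.5, -1.0]   # e.g. whisper-small with CLSR modules
teacher = [3.0, 0.0, -2.0]   # e.g. whisper-large-v2
loss = combined_loss(student, teacher, target_index=0, kd_weight=0.5)
print(f"combined loss: {loss:.4f}")
```

In this sketch the distillation term vanishes when the student matches the teacher's distribution exactly, so the combined loss reduces to plain cross-entropy.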

	## Inference

	Code for training and inference is available at https://github.com/naver/multilingual-distilwhisper

	## Citation
	```
	@inproceedings{ferraz2024distilwhisper,
	title={Multilingual DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts},
	author={Ferraz, Thomas Palmeira and Boito, Marcely Zanon and Brun, Caroline and Nikoulina, Vassilina},
	booktitle={ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
	year={2024},
	organization={IEEE}
	}
	```