MUSTAR / Rigel-rvc-base-pretrained-model




	## Rigel Pretrained Model
	Base and fine-tuned models.

	### Dataset

	* Size: Approximately 2000 hours of speech and vocals.
	* Languages:
	  * Arabic: ~70 hours
	  * Chinese (Mandarin): ~70 hours
	  * English: ~800 hours
	  * French: ~42 hours
	  * German: ~35 hours
	  * Hindi: ~30 hours
	  * Indonesian: ~53 hours
	  * Japanese: ~140 hours
	  * Korean: ~80 hours
	  * Portuguese: ~40 hours
	  * Russian: ~188 hours
	  * Singing (all languages): ~190 hours
	  * Spanish: ~200 hours
	  * Tagalog: ~30 hours
	  * Common language: Unknown amount
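
As a quick sanity check on the stated total, the per-category figures above can be summed. This is a minimal sketch: the name `hours_by_category` is only illustrative, and the "Common language" entry is left out because its amount is unknown.

```python
# Approximate hours per category, copied from the list above.
# "Common language" is omitted because its amount is not stated.
hours_by_category = {
    "Arabic": 70,
    "Chinese (Mandarin)": 70,
    "English": 800,
    "French": 42,
    "German": 35,
    "Hindi": 30,
    "Indonesian": 53,
    "Japanese": 140,
    "Korean": 80,
    "Portuguese": 40,
    "Russian": 188,
    "Singing (all languages)": 190,
    "Spanish": 200,
    "Tagalog": 30,
}

total_hours = sum(hours_by_category.values())
english_share = hours_by_category["English"] / total_hours

print(f"Listed total: ~{total_hours} hours")
print(f"English share of listed data: {english_share:.1%}")
```

The listed categories add up to roughly 1968 hours, which is consistent with the "approximately 2000 hours" figure once the unquantified entry is taken into account.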

	### Sampling Frequency

	* 32 kHz (Done)
	* 40 kHz (Retraining)
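
When preparing audio for use with these weights, it can help to know whether a file is already at the model's sampling rate. The following is a minimal sketch using Python's standard `wave` module, assuming uncompressed PCM WAV input. The helper `matches_target_rate` and the constant `TARGET_RATE_HZ` are illustrative names and are not part of this repository.

```python
import wave

# Sampling rate marked "Done" above; change if a different variant is used.
TARGET_RATE_HZ = 32_000


def matches_target_rate(wav_path: str, target_hz: int = TARGET_RATE_HZ) -> bool:
    """Return True if the PCM WAV file at `wav_path` has the target sample rate."""
    with wave.open(wav_path, "rb") as wav_file:
        return wav_file.getframerate() == target_hz
```

Files that do not match would need resampling with an audio tool of your choice before use; this sketch only reports the mismatch.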

	### Models

	#### Base Model

	* Data: Approximately 2000 hours of low- to mid-quality data.
	* Steps: 3,890,220
	* Batch size: 40
	* Precision: FP32
	* Sampling rate: 32 kHz

	#### Fine-Tuned Model

	* Data: 102 hours of high-quality data.
	* Steps: 2,854,856
	* Batch size: 20
	* Precision: FP32
	* Sampling rate: 32 kHz
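
The step and batch figures above give a rough sense of how many training samples each stage processed. This is a back-of-the-envelope sketch that assumes each step consumes exactly one batch of the listed size (no gradient accumulation, and the batch figure being the total rather than per-GPU). The card does not confirm either assumption, and `samples_seen` is an illustrative helper.

```python
def samples_seen(steps: int, batch_size: int) -> int:
    """Samples processed, ASSUMING one batch of `batch_size` per step."""
    return steps * batch_size


# Figures copied from the Base Model and Fine-Tuned Model lists above.
base_samples = samples_seen(3_890_220, 40)
fine_tuned_samples = samples_seen(2_854_856, 20)

print(f"Base model:       {base_samples:,} samples")
print(f"Fine-tuned model: {fine_tuned_samples:,} samples")
```

Under these assumptions the base stage saw about 155.6 million samples and the fine-tuning stage about 57.1 million.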

	### Hardware Used

	* CPU: AMD EPYC 9754
	* RAM: 256GB
	* GPUs:
	  * 1 x H100
	  * 4 x L40S
	  * 1 x RTX 4080
	  * 1 x RTX 4070 Ti

	### Expected Release Date

	* July 22nd

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/65041c19e88eb2d0d521d46c/NfsOJxAzRbllBDCDjFC5e.png)