How much GPU memory do I need to finetune large-v3?

#150 opened by lanejohn

I am trying to finetune large-v3 using the transformers fine-tuning script:

torchrun --nproc_per_node 3 whisper_transformer_001.py \
  --model_name_or_path="/home/lane/ai/models/openai/whisper/largeV3" \
  --dataset_name="mozilla-foundation/common_voice_2_0" \
  --dataset_config_name="zh-CN" \
  --language="Chinese" \
  --task="transcribe" \
  --train_split_name="train+validation" \
  --eval_split_name="test" \
  --max_steps="400" \
  --output_dir="/home/lane/ai/models/openai/whisper/largeV3-Chinese" \
  --per_device_train_batch_size="1" \
  --per_device_eval_batch_size="1" \
  --logging_steps="5" \
  --learning_rate="1e-5" \
  --warmup_steps="40" \
  --eval_strategy="steps" \
  --eval_steps="100" \
  --save_strategy="steps" \
  --save_steps="100" \
  --generation_max_length="95" \
  --preprocessing_num_workers="8" \
  --max_duration_in_seconds="30" \
  --text_column_name="sentence" \
  --freeze_feature_encoder="False" \
  --gradient_checkpointing \
  --fp16 \
  --overwrite_output_dir \
  --do_train \
  --do_eval \
  --predict_with_generate
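For reference, here is a minimal sketch of the memory-relevant part of that configuration, assuming whisper_transformer_001.py forwards these flags into the standard Seq2SeqTrainingArguments the way the official speech-recognition example script does (the values simply mirror the flags above):

```python
# Minimal sketch: the launch flags above expressed as Seq2SeqTrainingArguments.
# Assumes the script passes them unchanged to the Hugging Face Trainer.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="/home/lane/ai/models/openai/whisper/largeV3-Chinese",
    per_device_train_batch_size=1,   # smallest possible micro-batch per GPU
    per_device_eval_batch_size=1,
    gradient_checkpointing=True,     # recompute activations to save memory
    fp16=True,                       # mixed-precision training
    learning_rate=1e-5,
    warmup_steps=40,
    max_steps=400,
    eval_strategy="steps",
    eval_steps=100,
    save_strategy="steps",
    save_steps=100,
    logging_steps=5,
    generation_max_length=95,
    predict_with_generate=True,
    overwrite_output_dir=True,
)
```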

I have 3 GPUs with 22 GB of memory each, but every time I run this I hit CUDA out of memory.
Has anyone finetuned large-v3? How did you do it, and with how much GPU memory? Thanks.
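For a rough sense of why this runs out of memory, here is a back-of-the-envelope sketch. It assumes ~1.55B parameters for large-v3, full finetuning with AdamW, and mixed precision; with plain DDP each of the 3 ranks holds a complete copy of the training state, so the estimate applies per GPU:

```python
# Back-of-the-envelope per-GPU memory for full finetuning of Whisper
# large-v3 with AdamW under mixed precision (fp16/amp).
# Assumption: ~1.55e9 parameters; plain DDP keeps a full replica of the
# training state on every rank, so this is the cost per GPU.

params = 1.55e9          # approx. parameter count of whisper-large-v3

bytes_per_param = (
    4    # fp32 weights (amp keeps a master copy in fp32)
    + 4  # fp32 gradients
    + 4  # AdamW first moment (exp_avg)
    + 4  # AdamW second moment (exp_avg_sq)
)

state_gb = params * bytes_per_param / 1024**3
print(f"weights + grads + optimizer states: ~{state_gb:.0f} GB per GPU")
# -> ~23 GB before activations, the CUDA context, and temporary buffers,
#    which already exceeds a 22 GB card; activations push it further over.
```

This estimate also lines up with the ~32 GB peak reported below once activations and eval-time generation are included.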

I was able to finetune the large-v3 model on an A100. Max GPU memory consumption was 32 GB with per_device_train_batch_size=2.

Thanks a lot for sharing.