How to convert a model (whisper-small) to a .pt file (small.pt)?

#146
by lihenan1996 - opened

Friends, I fine-tuned a model (whisper-fine), but I ran into a problem: I cannot convert the model (whisper-fine) into a .pt file (whisper_fine.pt). When I use torch.save and torch.load, the resulting .pt file cannot be loaded with the whisper.load_model function.
Is there a way to make this work?

After fine-tuning, save the model's weights (the state_dict):
torch.save(model.state_dict(), 'whisper_fine.pt')
The point is to save the weights in a form you can later load back into a model with the matching architecture.
When loading, first initialize a model with the same architecture as your fine-tuned model, e.g. whisper.load_model('base') with the appropriate model size ('tiny', 'base', 'small', ...), and then load the weights:
model.load_state_dict(torch.load('whisper_fine.pt'))
The whisper.load_model function is meant for loading the pretrained checkpoints provided by OpenAI. If you try to load your fine-tuned .pt file with it directly, it may not work, because it expects specific file paths and a specific checkpoint format. Instead, load the model manually as described above. I hope this helps.
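For reference, here is a minimal sketch of the load step described above. It assumes whisper_fine.pt was produced with torch.save(model.state_dict(), ...) from a model that uses openai-whisper's parameter names; the file names are placeholders:

```python
import torch
import whisper

# Rebuild the architecture first; pick the size that matches the fine-tuned model
# ('tiny', 'base', 'small', ...). This loads the original OpenAI weights.
model = whisper.load_model("small")

# Overwrite them with the fine-tuned weights.
# This only succeeds if the keys in whisper_fine.pt match openai-whisper's
# parameter names; a checkpoint saved from a Hugging Face Whisper model uses
# different key names and will raise "Missing key(s)" errors (see below).
state_dict = torch.load("whisper_fine.pt", map_location="cpu")
model.load_state_dict(state_dict)

# Use it like any other whisper model.
result = model.transcribe("audio.wav")
print(result["text"])
```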

Hi, I'm here for a consultation. I tried to fine-tune large-v3 but failed because I ran out of GPU memory. How did you fine-tune it, and how much GPU memory did you use? I used three GPUs with 22 GB of memory each.

Unfortunately this does not seem to work. The HF model and the original model seem to differ in their shape.

RuntimeError: Error(s) in loading state_dict for Whisper:
        Missing key(s) in state_dict: <Lots of keys>
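
This error is expected if the checkpoint was saved from a Hugging Face WhisperForConditionalGeneration model: its parameter names (e.g. model.encoder.layers.0.self_attn.k_proj.weight) differ from openai-whisper's (e.g. encoder.blocks.0.attn.key.weight), so load_state_dict cannot match them without renaming. A rough way to inspect the mismatch (whisper_fine.pt is a placeholder for your checkpoint, assumed to hold a plain state_dict):

```python
import torch
import whisper

# Keys the openai-whisper architecture expects
expected_keys = set(whisper.load_model("small").state_dict().keys())

# Keys actually stored in the fine-tuned checkpoint
checkpoint = torch.load("whisper_fine.pt", map_location="cpu")
saved_keys = set(checkpoint.keys())

print("sample expected keys:", sorted(expected_keys)[:3])
print("sample saved keys:   ", sorted(saved_keys)[:3])
print("missing:", len(expected_keys - saved_keys),
      "| unexpected:", len(saved_keys - expected_keys))
```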
