How can I use the model to perform multi-GPU inference?

#39
by weijie210 - opened

Hi, thanks for the work in quantizing these models.
Could you explain how to deploy the model for batch inference across multiple GPUs, similar to DDP mode in PyTorch Lightning?
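A common DDP-style pattern is to launch one process per GPU (e.g. with `torchrun`), have each rank shard the input data, and run the model on its own device. Below is a minimal sketch of that idea; the `torch.nn.Linear` stands in for your actual model load (e.g. `AutoModelForCausalLM.from_pretrained(...)`, which is an assumption about your setup), and `shard_for_rank` is a hypothetical helper name:

```python
import os
import torch

def shard_for_rank(data, rank, world_size):
    # Round-robin sharding: each rank takes every world_size-th sample,
    # so the full dataset is covered exactly once across all processes.
    return data[rank::world_size]

def main():
    # torchrun sets RANK and WORLD_SIZE; fall back to a single process.
    rank = int(os.environ.get("RANK", 0))
    world_size = int(os.environ.get("WORLD_SIZE", 1))
    device = torch.device(f"cuda:{rank}" if torch.cuda.is_available() else "cpu")

    # Placeholder model -- replace with loading your quantized model
    # and moving it to `device` (one full model copy per GPU).
    model = torch.nn.Linear(4, 2).to(device).eval()

    data = [torch.randn(4) for _ in range(8)]
    local = shard_for_rank(data, rank, world_size)
    with torch.no_grad():
        outputs = [model(x.to(device)).cpu() for x in local]
    print(f"rank {rank}: processed {len(outputs)} samples")

if __name__ == "__main__":
    main()
```

Launched as `torchrun --nproc_per_node=<num_gpus> script.py`, each GPU runs inference on its own shard independently; since this is inference only, no gradient sync is needed, so you can skip wrapping the model in `DistributedDataParallel` and just gather the per-rank outputs afterwards.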
