The template format and a few questions.

#1
by tinybiggames - opened
  • Is the template still the same?
    I'm currently using:
    <|user|>\n<|image_1|>\n%s<|end|>\n<|assistant|>\n for images
    and
    <|user|>\n%s<|end|>\n<|assistant|>\n for text.
    Every response I get has a </s> at the end. I'm thinking my template may need to be tweaked? Otherwise I can always just snip it off the end, as it seems to be consistent.

  • I've noticed the 128k versions of Phi-3 tend to generate shorter output than the 4k versions.

  • Is it possible to implement a model loading progress callback so that I can display a loading percentage of some sort?

  • Otherwise, working great for me. I avg about 60+ tokens/sec on my RTX 3060.

Note: I'm only using the C API, as I call this from Delphi.


Microsoft org

Is the template still the same?
I'm currently using:
<|user|>\n<|image_1|>\n%s<|end|>\n<|assistant|>\n for images
and
<|user|>\n%s<|end|>\n<|assistant|>\n for text.
Every response I get has a </s> at the end. I'm thinking my template may need to be tweaked? Otherwise I can always just snip it off the end, as it seems to be consistent.

Here is the official chat template you can use. The </s> is an extra EOS token id for the Phi-3 vision model because there is an eos_token_id = 2 entry in config.json and 2 = </s> in tokenizer_config.json. You can snip it off at the end as you suggested.
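For anyone wiring this up from C, here is a minimal sketch of the two steps discussed above: filling in the prompt templates quoted in the question, and snipping a trailing </s> from the generated text. The function names phi3_build_prompt and phi3_strip_trailing_eos are hypothetical helpers written for this illustration; they are not part of the ONNX Runtime GenAI C API, and the sketch assumes the response has already been decoded into a NUL-terminated string.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of any library): writes the Phi-3 vision
 * prompt for `user_text` into `out`, using the templates quoted in this
 * thread. Returns 0 on success, -1 if the prompt did not fit in `out`.
 */
int phi3_build_prompt(char *out, size_t out_size,
                      const char *user_text, int has_image)
{
    int n;

    if (has_image) {
        n = snprintf(out, out_size,
                     "<|user|>\n<|image_1|>\n%s<|end|>\n<|assistant|>\n",
                     user_text);
    } else {
        n = snprintf(out, out_size,
                     "<|user|>\n%s<|end|>\n<|assistant|>\n",
                     user_text);
    }

    /* snprintf returns the length it wanted to write; detect truncation. */
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/*
 * Hypothetical helper: removes one trailing "</s>" from `text` in place.
 * Text that does not end with "</s>" is left unchanged.
 */
void phi3_strip_trailing_eos(char *text)
{
    static const char eos[] = "</s>";
    const size_t eos_len = sizeof eos - 1;
    const size_t len = strlen(text);

    if (len >= eos_len && memcmp(text + len - eos_len, eos, eos_len) == 0) {
        text[len - eos_len] = '\0';
    }
}
```

Typical use would be to call phi3_build_prompt with a buffer large enough for the question plus the template, pass the result to the tokenizer, and run phi3_strip_trailing_eos on the decoded response before displaying it. Only an exact trailing </s> is removed, so ordinary text is never altered.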

I've noticed the 128k versions of Phi-3 tend to generate shorter output than the 4k versions.

You can swap out the 128K text model that has been uploaded with the 4K text model if you want. Here are the instructions for how the text model is created.

Is it possible to implement a model loading progress callback so that I can display a loading percentage of some sort?

We can add a progress callback. Can you open an issue in the ONNX Runtime GenAI repo so we can track this?

Otherwise, working great for me. I avg about 60+ tokens/sec on my RTX 3060.

Great to hear!

Ok, thanks!

tinybiggames changed discussion status to closed
