feat: add eos_token_id to generation_config.json (needed by vllm infer)

#12
No description provided.
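For context, this PR adds an eos_token_id entry to generation_config.json so vLLM can detect end-of-sequence during inference. A sketch of the resulting file shape (the token ID below is a placeholder; the actual value is model-specific and is not shown in this thread):

{
  "eos_token_id": 92542
}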

Could you please share a Python script to serve InternVL2-8B with vLLM using the OpenAI chat completions API?

You can replace /model with a Hugging Face model ID (e.g. OpenGVLab/InternVL2-8B):

vllm serve /model --port 8000 \
  --trust-remote-code \
  --served-model-name internvl2-internlm2 \
  --enable-chunked-prefill False # required for now; otherwise inference will fail
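
To answer the question above, here is a minimal Python client sketch. It assumes the vllm serve command shown here is running on localhost:8000; the model name must match --served-model-name, and the image URL is a placeholder you should replace with a real, reachable image.

from openai import OpenAI

# Point the OpenAI client at vLLM's OpenAI-compatible endpoint.
# vLLM does not validate the API key by default, so any string works.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="internvl2-internlm2",  # must match --served-model-name above
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            # Placeholder URL for illustration only.
            {"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}},
        ],
    }],
    max_tokens=256,
)
print(response.choices[0].message.content)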

Thanks for your efforts and time!

czczup changed pull request status to merged
