Trelis/Llama-2-7b-chat-hf-function-calling-GPTQ

RonanMcGovern committed on Aug 12, 2023

Commit 48c33a5 • 1 parent: c3d9785

add link to 13B QPTQ

Files changed (1)

README.md +3 -3

@@ -2,7 +2,7 @@
 language:
 - en
 pipeline_tag: text-generation
-inference: true
+inference: false
 tags:
 - facebook
 - meta
@@ -22,7 +22,7 @@ tags:
 Available models:
 - fLlama-7B ([bitsandbytes NF4](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling)), ([GGML](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-GGML)), ([GPTQ](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-GPTQ)) - free
-- fLlama-13B ([bitsandbytes NF4](https://huggingface.co/Trelis/Llama-2-13b-chat-hf-function-calling)) - paid
+- fLlama-13B ([bitsandbytes NF4](https://huggingface.co/Trelis/Llama-2-13b-chat-hf-function-calling)), ([GPTQ](https://huggingface.co/Trelis/Llama-2-13b-chat-hf-function-calling-GPTQ)) - paid
 ## Inference with Google Colab and HuggingFace 🤗
@@ -41,7 +41,7 @@ To run this you'll need to install llamaccp from ggerganov on github.
 ```
   ./server -m fLlama-2-7b-chat.ggmlv3.q3_K_M.bin -ngl 32 -c 2048
   ```
-  which will allow you to run a chatbot in your browser. The -ngl offloads layers to the Mac's GPU and gets very good token generation speed.
+which will allow you to run a chatbot in your browser. The -ngl offloads layers to the Mac's GPU and gets very good token generation speed.
 ## Licensing and Usage