BramVanroy committed on commit 738aa0e (parent: 56e810a): Update README.md

README.md CHANGED
@@ -21,53 +21,30 @@ datasets:
 <em>A conversational model for Dutch, aligned through AI feedback.</em>
 </div>
 
-This is a
-
-### LM Studio
-
-You can use this model in [LM Studio](https://lmstudio.ai/), an easy-to-use interface to locally run optimized models. Simply search for `BramVanroy/GEITje-7B-ultra-GGUF`, and download the available file.
-
-### Ollama
-
-The model is available on `ollama` and can be easily run as follows:
-
-```shell
-ollama run bramvanroy/geitje-7b-ultra-gguf
-```
-
-```shell
-ollama create geitje-7b-ultra-gguf -f ./Modelfile
-ollama run geitje-7b-ultra-gguf
-```
-
-Download initial model (probably a huggingface-cli alternative exists, too...)
-
-```python
-from huggingface_hub import snapshot_download
-model_id = "BramVanroy/GEITje-7B-ultra"
-snapshot_download(repo_id=model_id, local_dir="geitje-ultra-hf", local_dir_use_symlinks=False)
-```
-
-```shell
-# Convert to GGML format
-python convert.py build/geitje-ultra-hf/
-bin/quantize geitje-ultra-hf/ggml-model-f32.gguf geitje-ultra-hf/GEITje-7B-ultra-Q5_K_M.gguf Q5_K_M
-```
+This is a GGUF version of [BramVanroy/GEITje-7B-ultra](https://huggingface.co/BramVanroy/GEITje-7B-ultra), a powerful Dutch chatbot, which is ultimately a Mistral-based model that was further pretrained on Dutch and then treated with supervised finetuning and DPO alignment. For more information on the model, data, licensing, and usage, see the main model's README.
+
+Available quantization types and expected performance differences compared to the base `f16`; higher perplexity = worse (from llama.cpp):
+
+```
+Q3_K_M : 3.07G, +0.2496 ppl @ LLaMA-v1-7B
+Q4_K_M : 3.80G, +0.0532 ppl @ LLaMA-v1-7B
+Q5_K_M : 4.45G, +0.0122 ppl @ LLaMA-v1-7B
+Q6_K   : 5.15G, +0.0008 ppl @ LLaMA-v1-7B
+Q8_0   : 6.70G, +0.0004 ppl @ LLaMA-v1-7B
+F16    : 13.00G @ 7B
+```
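To make the trade-off in that table concrete, here is a small hypothetical helper (not part of this repo) that picks the quant with the lowest perplexity penalty that still fits a given size budget. The sizes and perplexity deltas are the llama.cpp reference numbers quoted above for LLaMA-v1-7B; actual file sizes for this model may differ slightly.

```python
# Hypothetical helper: choose a quantization type from the table above.
# Sizes (GB) and perplexity deltas are llama.cpp's LLaMA-v1-7B reference
# numbers as quoted in this README; F16 is the lossless baseline (+0.0 ppl).
QUANTS = {
    "Q3_K_M": (3.07, 0.2496),
    "Q4_K_M": (3.80, 0.0532),
    "Q5_K_M": (4.45, 0.0122),
    "Q6_K":   (5.15, 0.0008),
    "Q8_0":   (6.70, 0.0004),
    "F16":    (13.00, 0.0),
}

def best_quant(max_size_gb: float):
    """Return the fitting quant with the lowest perplexity delta, or None."""
    fitting = [(ppl, size, name)
               for name, (size, ppl) in QUANTS.items()
               if size <= max_size_gb]
    if not fitting:
        return None  # nothing fits in the budget
    return min(fitting)[2]  # sort by ppl delta first

print(best_quant(5.0))   # Q5_K_M: best quality under 5 GB
print(best_quant(8.0))   # Q8_0
print(best_quant(2.0))   # None
```

Note that model weights are only part of memory use at runtime (the KV cache adds more), so leave some headroom below your actual RAM/VRAM.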
+
+Also available on [ollama](https://ollama.com/bramvanroy/geitje-7b-ultra).
+
+Quants were made with release [`b2777`](https://github.com/ggerganov/llama.cpp/releases/tag/b2777) of llama.cpp.
+
+## Usage
+
+### LM Studio
+
+You can use this model in [LM Studio](https://lmstudio.ai/), an easy-to-use interface to locally run optimized models. Simply search for `BramVanroy/GEITje-7B-ultra-GGUF`, and download the available file.
+
+### Ollama
+
+The model is available on [`ollama`](https://ollama.com/bramvanroy/geitje-7b-ultra).
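Besides the `ollama run` CLI, a locally running ollama server can also be called programmatically over its REST API (by default on `http://localhost:11434`). The sketch below only builds and prints the JSON payload for a chat request; it does not contact a server. The model name is an assumption based on the ollama page linked above.

```python
import json

# Sketch: build a chat request body for ollama's REST API (POST /api/chat).
# Assumes a local ollama server on http://localhost:11434 and that the model
# was pulled as "bramvanroy/geitje-7b-ultra" (see the ollama link above).
payload = {
    "model": "bramvanroy/geitje-7b-ultra",
    "messages": [
        {"role": "user", "content": "Wat is de hoofdstad van Nederland?"}
    ],
    "stream": False,  # return one complete response instead of a token stream
}

body = json.dumps(payload)
print(body)

# To actually send it (requires a running server and e.g. the `requests` package):
#   requests.post("http://localhost:11434/api/chat", data=body)
```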