
Llamacpp Quantizations of gemma-2-2b

Using llama.cpp release b3583 for quantization.

Original model: https://huggingface.co/google/gemma-2-2b
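
These GGUF files follow llama.cpp's usual convert-and-quantize workflow. A minimal sketch, assuming the original google/gemma-2-2b checkpoint has already been downloaded to ./gemma-2-2b (paths and the chosen quant are illustrative):

# convert the HF checkpoint to a full-precision GGUF, then quantize it
python convert_hf_to_gguf.py ./gemma-2-2b --outtype f32 --outfile gemma-2-2b.FP32.gguf
./llama-quantize gemma-2-2b.FP32.gguf gemma-2-2b-Q4_K_M.gguf Q4_K_M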

Download a file (not the whole branch) from below:

| Filename | Quant type | File Size | Perplexity (wikitext-2-raw-v1 test) |
| -------- | ---------- | --------- | ----------------------------------- |
| gemma-2-2b.FP32.gguf | FP32 | 10.50GB | 8.9236 +/- 0.06373 |
| gemma-2-2b-Q8_0.gguf | Q8_0 | 2.78GB | 8.9299 +/- 0.06377 |
| gemma-2-2b-Q6_K.gguf | Q6_K | 2.15GB | 8.9570 +/- 0.06404 |
| gemma-2-2b-Q5_K_M.gguf | Q5_K_M | 1.92GB | 9.0061 +/- 0.06461 |
| gemma-2-2b-Q5_K_S.gguf | Q5_K_S | 1.88GB | 9.0096 +/- 0.06451 |
| gemma-2-2b-Q4_K_M.gguf | Q4_K_M | 1.71GB | 9.2260 +/- 0.06643 |
| gemma-2-2b-Q4_K_S.gguf | Q4_K_S | 1.64GB | 9.3116 +/- 0.06726 |
| gemma-2-2b-Q3_K_L.gguf | Q3_K_L | 1.55GB | 9.5683 +/- 0.06909 |
| gemma-2-2b-Q3_K_M.gguf | Q3_K_M | 1.46GB | 9.7759 +/- 0.07120 |
| gemma-2-2b-Q3_K_S.gguf | Q3_K_S | 1.36GB | 10.8067 +/- 0.08032 |
| gemma-2-2b-Q2_K.gguf | Q2_K | 1.23GB | 13.8994 +/- 0.10723 |
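
Perplexity can be measured with llama.cpp's llama-perplexity tool. A sketch of the kind of command involved (the dataset path is illustrative; the exact settings used here are those in the Reproducibility section below):

# evaluate perplexity of a quantized file on the wikitext-2-raw test split
./llama-perplexity -m gemma-2-2b-Q4_K_M.gguf -f wikitext-2-raw/wiki.test.raw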

Benchmark Results

| Benchmark | Quant type | Score |
| --------- | ---------- | ----- |
| WinoGrande (0-shot) | Q8_0 | 68.3504 +/- 1.3072 |
| WinoGrande (0-shot) | Q4_K_M | 67.5612 +/- 1.3157 |
| WinoGrande (0-shot) | Q3_K_M | 65.9037 +/- 1.3323 |
| WinoGrande (0-shot) | Q3_K_S | 66.6930 +/- 1.3246 |
| WinoGrande (0-shot) | Q2_K | 63.2991 +/- 1.3546 |
| HellaSwag (0-shot) | Q8_0 | 71.25074686 |
| HellaSwag (0-shot) | Q4_K_M | 69.95618403 |
| HellaSwag (0-shot) | Q3_K_M | 68.00438160 |
| HellaSwag (0-shot) | Q3_K_S | 69.95618403 |
| HellaSwag (0-shot) | Q2_K | 59.38060147 |
| MMLU (0-shot) | Q8_0 | 35.5943 +/- 1.2173 |
| MMLU (0-shot) | Q4_K_M | 35.5943 +/- 1.2173 |
| MMLU (0-shot) | Q3_K_M | 35.2067 +/- 1.2143 |
| MMLU (0-shot) | Q3_K_S | 33.9147 +/- 1.2037 |
| MMLU (0-shot) | Q2_K | 33.0749 +/- 1.1962 |
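
llama.cpp's llama-perplexity tool also implements these tasks. A sketch of typical invocations (dataset file names are illustrative; whether this exact setup was used here is documented in the Reproducibility section):

# WinoGrande, HellaSwag and MMLU-style multiple choice via llama-perplexity
./llama-perplexity -m gemma-2-2b-Q8_0.gguf --winogrande -f winogrande-debiased-eval.csv
./llama-perplexity -m gemma-2-2b-Q8_0.gguf --hellaswag -f hellaswag_val_full.txt
./llama-perplexity -m gemma-2-2b-Q8_0.gguf --multiple-choice -f mmlu-validation.bin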

Downloading using huggingface-cli

First, make sure you have huggingface-cli installed:

pip install -U "huggingface_hub[cli]"

Then, you can target the specific file you want:

huggingface-cli download fedric95/gemma-2-2b-GGUF --include "gemma-2-2b-Q4_K_M.gguf" --local-dir ./

If the model is bigger than 50GB, it will have been split into multiple files. In order to download them all to a local folder, run:

huggingface-cli download fedric95/gemma-2-2b-GGUF --include "gemma-2-2b-Q8_0.gguf/*" --local-dir gemma-2-2b-Q8_0

You can either specify a new local-dir (gemma-2-2b-Q8_0) or download them all in place (./).
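
Once downloaded, a file can be run directly with llama.cpp. A minimal sketch (prompt and token count are illustrative):

# quick generation test with the downloaded quant
./llama-cli -m gemma-2-2b-Q4_K_M.gguf -p "Why is the sky blue?" -n 128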

Reproducibility

The procedure used to produce and evaluate these quantizations is described here:

https://github.com/ggerganov/llama.cpp/discussions/9020#discussioncomment-10335638
