How to merge all 8 split GGUF files


Hi! I have downloaded:
Llama-3.1-405B-Instruct-Q2_K-00001-of-00008.gguf
Llama-3.1-405B-Instruct-Q2_K-00002-of-00008.gguf
Llama-3.1-405B-Instruct-Q2_K-00003-of-00008.gguf
Llama-3.1-405B-Instruct-Q2_K-00004-of-00008.gguf
Llama-3.1-405B-Instruct.Q2_K-00005-of-00008.gguf
Llama-3.1-405B-Instruct.Q2_K-00006-of-00008.gguf
Llama-3.1-405B-Instruct.Q2_K-00007-of-00008.gguf
Llama-3.1-405B-Instruct.Q2_K-00008-of-00008.gguf

How can I merge them and run inference with them? Thanks.

Many modern tools support multi-part models, so there is no need to merge the files.

If you are using an older version, update your tools to the latest release and try to load the model again.

Some tools use different terminology, such as "split files" or "sharded GGUF", but it is the same thing.
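
For example, with a recent llama.cpp build you can point llama-cli directly at the first shard and it will pick up the remaining parts on its own (the prompt here is only a placeholder):
llama-cli -m Llama-3.1-405B-Instruct-Q2_K-00001-of-00008.gguf -p "Hello"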

If your tools do not support multi-part models yet, you can use the gguf-split utility that is included in llama.cpp:
gguf-split --merge input output
More info here: https://github.com/ggerganov/llama.cpp/discussions/6404

Hi,
I tried this command but it does not work:
llama-gguf-split --merge Llama-3.1-405B-Instruct-Q2_K-00001-of-00008.gguf Llama-3.1-405B-Instruct-Q2_K-00002-of-00008.gguf Llama-3.1-405B-Instruct-Q2_K-00003-of-00008.gguf Llama-3.1-405B-Instruct-Q2_K-00004-of-00008.gguf Llama-3.1-405B-Instruct-Q2_K-00005-of-00008.gguf Llama-3.1-405B-Instruct-Q2_K-00006-of-00008.gguf Llama-3.1-405B-Instruct-Q2_K-00007-of-00008.gguf Llama-3.1-405B-Instruct-Q2_K-00008-of-00008.gguf ../merge.gguf

It also modified (truncated) this file:
0 Aug 3 16:08 Llama-3.1-405B-Instruct-Q2_K-00002-of-00008.gguf

I don't see any binary called gguf-split.

I see the source code here: /Volumes/T9/llama.cpp/llama.cpp/examples/gguf-split, but I cannot compile it on macOS.

Any tip is appreciated.
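
For reference, one way to build the tool from a llama.cpp checkout on macOS (assuming CMake is installed, e.g. via Homebrew) would be roughly:
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
# the merge tool should then be at build/bin/llama-gguf-split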

I have a similar issue:

gguf_merge: reading metadata models/Llama-3.1-405B-Instruct.Q4_0.gguf/Llama-3.1-405B-Instruct.Q4_0-00001-of-00012.gguf done
gguf_merge: reading metadata models/Llama-3.1-405B-Instruct.Q4_0.gguf/Llama-3.1-405B-Instruct.Q4_0-00002-of-00012.gguf ...gguf_init_from_file: invalid magic characters 'mf,a'

gguf_merge: failed to load input GGUF from models/Llama-3.1-405B-Instruct.Q4_0.gguf/Llama-3.1-405B-Instruct.Q4_0-00001-of-00012.gguf

Hello, in the merge command you should only pass the first file.
For example: llama-gguf-split --merge Llama-3.1-405B-Instruct-Q2_K-00001-of-00008.gguf ../merge.gguf

The tool detects the other files by itself.

@Mobbyxx
Thank you, I tried it. It looks like part 5 has an issue; I removed it and downloaded it again (via git lfs) as well:

llama-gguf-split --merge Llama-3.1-405B-Instruct-Q2_K-00001-of-00008.gguf ../merge.gguf
gguf_merge: Llama-3.1-405B-Instruct-Q2_K-00001-of-00008.gguf -> ../merge.gguf
gguf_merge: reading metadata Llama-3.1-405B-Instruct-Q2_K-00001-of-00008.gguf done
gguf_merge: reading metadata Llama-3.1-405B-Instruct-Q2_K-00002-of-00008.gguf done
gguf_merge: reading metadata Llama-3.1-405B-Instruct-Q2_K-00003-of-00008.gguf done
gguf_merge: reading metadata Llama-3.1-405B-Instruct-Q2_K-00004-of-00008.gguf done
gguf_merge: reading metadata Llama-3.1-405B-Instruct-Q2_K-00005-of-00008.gguf ...gguf_init_from_file: failed to open 'Llama-3.1-405B-Instruct-Q2_K-00005-of-00008.gguf': 'No such file or directory'

gguf_merge: failed to load input GGUF from Llama-3.1-405B-Instruct-Q2_K-00001-of-00008.gguf

Can you or someone else test it?
Thanks again.

Hello, it looks like Llama-3.1-405B-Instruct-Q2_K-00005-of-00008.gguf doesn't exist. Can you confirm you downloaded every file? I tested it and it worked fine.

It exists, and I removed it and downloaded it a few times:
ls -ltr
total 295323296
-rw-r--r-- 1 majid staff 19985474624 Jul 27 17:46 Llama-3.1-405B-Instruct-Q2_K-00004-of-00008.gguf
-rw-r--r-- 1 majid staff 19840771040 Jul 27 18:08 Llama-3.1-405B-Instruct.Q2_K-00007-of-00008.gguf
-rw-r--r-- 1 majid staff 19823882720 Jul 27 18:21 Llama-3.1-405B-Instruct-Q2_K-00001-of-00008.gguf
-rw-r--r-- 1 majid staff 19787752384 Jul 27 20:17 Llama-3.1-405B-Instruct-Q2_K-00003-of-00008.gguf
-rw-r--r-- 1 majid staff 19787752384 Jul 27 20:21 Llama-3.1-405B-Instruct.Q2_K-00006-of-00008.gguf
-rw-r--r-- 1 majid staff 12351280160 Jul 27 20:22 Llama-3.1-405B-Instruct.Q2_K-00008-of-00008.gguf
-rw-r--r-- 1 majid staff 19840770880 Aug 8 00:38 Llama-3.1-405B-Instruct-Q2_K-00002-of-00008.gguf
-rw-r--r-- 1 majid staff 19787817984 Sep 3 12:35 Llama-3.1-405B-Instruct.Q2_K-00005-of-00008.gguf

I wonder if there is a hash file to double-check against.
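
For reference, the file page of each LFS file on the Hub typically shows its SHA256, so one way to verify a shard locally would be:
shasum -a 256 Llama-3.1-405B-Instruct.Q2_K-00005-of-00008.gguf
# compare the output against the SHA256 listed on that file's page in the repository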

But when I download from this repo, everything works fine:
https://huggingface.co/bullerwins/Meta-Llama-3.1-405B-Instruct-GGUF/tree/main

Thanks again for the reply.

Can you please download all the files again? I see that the other files are from Jul 27; maybe the files were updated.
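
If it helps, one way to re-download all Q2_K shards in one go (assuming the huggingface_hub CLI is installed; the --include pattern is a guess at the repo layout) might be:
huggingface-cli download leafspark/Meta-Llama-3.1-405B-Instruct-GGUF --include "*Q2_K*-of-00008.gguf" --local-dir .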

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="leafspark/Meta-Llama-3.1-405B-Instruct-GGUF",
    filename="Llama-3.1-405B-Instruct.Q2_K.gguf/Llama-3.1-405B-Instruct-Q2_K-00001-of-00008.gguf",
)

llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
)

Download part 1, then change the number in the filename to part 2 and so on, ignoring the error messages. After downloading part 8, change the number back to 1 and run it, and it will work.

Path of the downloaded parts on Colab:
/root/.cache/huggingface/hub/models--bartowski--DeepSeek-V2.5-GGUF/blobs
