---
base_model:
- cstr/llama3.1-8b-spaetzle-v85
- cstr/llama3.1-8b-spaetzle-v86
- cstr/llama3.1-8b-spaetzle-v74
tags:
- merge
- mergekit
- lazymergekit
- cstr/llama3.1-8b-spaetzle-v85
- cstr/llama3.1-8b-spaetzle-v86
- cstr/llama3.1-8b-spaetzle-v74
license: llama3
language:
- en
- de
---
# llama3.1-8b-spaetzle-v90
These are q4_k_m quants, made with llama.cpp b3472, of [cstr/llama3.1-8b-spaetzle-v90](https://huggingface.co/cstr/llama3.1-8b-spaetzle-v90), which is a progressive merge of merges.
EQ-Bench v2_de: 69.93 (171/171).
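For reference, a minimal sketch of how q4_k_m quants like these are typically produced, assuming a llama.cpp checkout around build b3472 that provides `convert_hf_to_gguf.py` and the `llama-quantize` binary (paths and file names below are placeholders, not the actual ones used here):

```python
# Sketch: convert a merged HF model to GGUF, then quantize to q4_k_m.
# Assumes a built llama.cpp checkout (~b3472); all paths are placeholders.
import subprocess

HF_MODEL_DIR = "./llama3.1-8b-spaetzle-v90"      # local snapshot of the merged model
F16_GGUF = "llama3.1-8b-spaetzle-v90-f16.gguf"
Q4_GGUF = "llama3.1-8b-spaetzle-v90-q4_k_m.gguf"

# 1) HF safetensors -> GGUF (f16)
subprocess.run(
    ["python", "llama.cpp/convert_hf_to_gguf.py", HF_MODEL_DIR,
     "--outtype", "f16", "--outfile", F16_GGUF],
    check=True,
)

# 2) f16 GGUF -> q4_k_m GGUF
subprocess.run(
    ["llama.cpp/llama-quantize", F16_GGUF, Q4_GGUF, "q4_k_m"],
    check=True,
)
```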
The merge tree involves the following models:
- NousResearch/Hermes-3-Llama-3.1-8B
- Undi95/Meta-Llama-3.1-8B-Claude
- Dampfinchen/Llama-3.1-8B-Ultra-Instruct
- VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct
- akjindal53244/Llama-3.1-Storm-8B
- nbeerbower/llama3.1-gutenberg-8B
- Undi95/Meta-Llama-3.1-8B-Claude
- DiscoResearch/Llama3-DiscoLeo-Instruct-8B-v0.1
- nbeerbower/llama-3-wissenschaft-8B-v2
- Azure99/blossom-v5-llama3-8b
- VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct
- princeton-nlp/Llama-3-Instruct-8B-SimPO
- Locutusque/llama-3-neural-chat-v1-8b
- Locutusque/Llama-3-Orca-1.0-8B
- DiscoResearch/Llama3_DiscoLM_German_8b_v0.1_experimental
- seedboxai/Llama-3-Kafka-8B-v0.2
- VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct
- nbeerbower/llama-3-wissenschaft-8B-v2
- mlabonne/Daredevil-8B-abliterated-dpomix
A number of steps were involved, among them slerp-merging only the middle layers to compensate for tokenizer / chat template differences; an illustration follows below.
## 🧩 Configuration
The final merge step used the following configuration:
```yaml
models:
  - model: cstr/llama3.1-8b-spaetzle-v59
    # no parameters necessary for base model
  - model: cstr/llama3.1-8b-spaetzle-v85
    parameters:
      density: 0.65
      weight: 0.3
  - model: cstr/llama3.1-8b-spaetzle-v86
    parameters:
      density: 0.65
      weight: 0.3
  - model: cstr/llama3.1-8b-spaetzle-v74
    parameters:
      density: 0.65
      weight: 0.3
merge_method: dare_ties
base_model: cstr/llama3.1-8b-spaetzle-v59
parameters:
  int8_mask: true
dtype: bfloat16
random_seed: 0
tokenizer_source: base
```
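As a rough guide, a configuration like the one above can be run with the `mergekit-yaml` CLI or through mergekit's Python entry points. The sketch below takes the Python route and assumes a recent mergekit where `MergeConfiguration`, `MergeOptions`, and `run_merge` are importable; the config file name and output path are placeholders.

```python
# Sketch: run a mergekit config like the one above from Python.
# Assumes a recent mergekit install; config/output paths are placeholders.
import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("spaetzle-v90.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./llama3.1-8b-spaetzle-v90",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),
        copy_tokenizer=True,   # copy a tokenizer into the output directory
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```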
One of the earlier steps, for example, was this slerp merge over the middle layers:
```yaml
models:
  - model: NousResearch/Hermes-3-Llama-3.1-8B
merge_method: slerp
base_model: cstr/llama3.1-8b-spaetzle-v74
parameters:
  t:
    - value: [0, 0, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0, 0]
dtype: float16
```
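To make the per-layer `t` curve above concrete, here is a minimal, self-contained sketch of spherical linear interpolation between two weight tensors with a layer-dependent interpolation factor. This is only an illustration of the idea, not mergekit's actual implementation, and the tensor shapes are arbitrary stand-ins.

```python
# Sketch: slerp of two weight tensors with a per-layer factor t, mirroring the
# t curve in the config above. Illustrative only; mergekit handles far more detail.
import numpy as np

def slerp(a: np.ndarray, b: np.ndarray, t: float, eps: float = 1e-8) -> np.ndarray:
    """Spherically interpolate between flattened tensors a and b."""
    a_flat, b_flat = a.ravel(), b.ravel()
    a_n = a_flat / (np.linalg.norm(a_flat) + eps)
    b_n = b_flat / (np.linalg.norm(b_flat) + eps)
    dot = np.clip(np.dot(a_n, b_n), -1.0, 1.0)
    omega = np.arccos(dot)
    if omega < eps:  # nearly parallel: fall back to linear interpolation
        return (1.0 - t) * a + t * b
    so = np.sin(omega)
    mixed = (np.sin((1.0 - t) * omega) / so) * a_flat + (np.sin(t * omega) / so) * b_flat
    return mixed.reshape(a.shape)

# t = 0 at both ends leaves the base model's outer layers untouched,
# while the middle layers are blended more strongly (up to 0.7).
t_curve = [0, 0, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0, 0]

base_layers = [np.random.randn(4, 4) for _ in t_curve]   # stand-ins for per-layer weights
other_layers = [np.random.randn(4, 4) for _ in t_curve]
merged_layers = [slerp(a, b, t) for a, b, t in zip(base_layers, other_layers, t_curve)]
```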
## 💻 Usage
Use with the llama3 chat template, as usual. The q4_k_m quants here are made from [cstr/llama3.1-8b-spaetzle-v90](https://huggingface.co/cstr/llama3.1-8b-spaetzle-v90).
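As a quick example, the sketch below loads one of the quants with llama-cpp-python and chats using the llama3 template; the GGUF file name is a placeholder, so check the repository's file list for the actual name.

```python
# Sketch: chat with a q4_k_m quant via llama-cpp-python.
# The model_path filename is a placeholder; point it at the GGUF you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="llama3.1-8b-spaetzle-v90.Q4_K_M.gguf",  # placeholder filename
    n_ctx=8192,
    chat_format="llama-3",  # the GGUF's embedded chat template may also be picked up automatically
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain briefly what a model merge is."},
]

response = llm.create_chat_completion(messages=messages, max_tokens=256)
print(response["choices"][0]["message"]["content"])
```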