---
tags:
- merge
- mergekit
- lazymergekit
- mistralai/Mistral-7B-Instruct-v0.2
- beowolx/CodeNinja-1.0-OpenChat-7B
base_model:
- mistralai/Mistral-7B-Instruct-v0.2
- beowolx/CodeNinja-1.0-OpenChat-7B
license: apache-2.0
---

# Hugo-7B-slerp

Hugo-7B-slerp is a merge of the following models using mergekit:
* [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
* [beowolx/CodeNinja-1.0-OpenChat-7B](https://huggingface.co/beowolx/CodeNinja-1.0-OpenChat-7B)

## 🧩 Configuration

```yaml
slices:
  - sources:
      - model: mistralai/Mistral-7B-Instruct-v0.2
        layer_range: [0, 32]
      - model: beowolx/CodeNinja-1.0-OpenChat-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: mistralai/Mistral-7B-Instruct-v0.2
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```

## 📈 Performance

| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
| --- | --- | --- | --- | --- | --- | --- | --- |
| [paulilioaica/Hugo-7B-slerp](#) | **67.07** | **64.51** | 84.77 | **62.54** | 57.13 | **80.03** | **53.45** |
| [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) | 65.71 | 63.14 | 84.88 | 60.78 | 68.26 | 77.19 | 40.03 |
| [beowolx/CodeNinja-1.0-OpenChat-7B](https://huggingface.co/beowolx/CodeNinja-1.0-OpenChat-7B) | 67.4 | 63.48 | 83.65 | 63.77 | 47.16 | 79.79 | 66.57 |

Bold values mark the benchmarks where the merge outperforms the base model, mistralai/Mistral-7B-Instruct-v0.2.
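
As a quick sanity check on the table, the short script below recomputes each Average and lists the benchmarks where the merge scores higher than the base model. It assumes the Average column is the unweighted mean of the six benchmark scores, which the card does not state explicitly; the dictionary keys are shortened model names used only for illustration.

```python
# Scores copied from the performance table, in this benchmark order.
benchmarks = ["ARC", "HellaSwag", "MMLU", "TruthfulQA", "Winogrande", "GSM8K"]
scores = {
    "Hugo-7B-slerp": [64.51, 84.77, 62.54, 57.13, 80.03, 53.45],
    "Mistral-7B-Instruct-v0.2": [63.14, 84.88, 60.78, 68.26, 77.19, 40.03],
    "CodeNinja-1.0-OpenChat-7B": [63.48, 83.65, 63.77, 47.16, 79.79, 66.57],
}

# Assumption: "Average" is the plain mean of the six scores.
averages = {name: round(sum(vals) / len(vals), 2) for name, vals in scores.items()}

# Benchmarks where the merge beats the base model (Mistral-7B-Instruct-v0.2).
wins = [
    bench
    for bench, merged, base in zip(
        benchmarks, scores["Hugo-7B-slerp"], scores["Mistral-7B-Instruct-v0.2"]
    )
    if merged > base
]

print(averages)
print(wins)
```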
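## 💻 Usage

```python
# Notebook syntax; in a plain shell, run the pip command without the leading "!".
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "paulilioaica/Hugo-7B-slerp"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Render the chat messages into a single prompt string using the tokenizer's chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in half precision and let accelerate place it on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample up to 256 new tokens and print the generated text.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```

## 🛈 More on mergekit

See the [mergekit](https://huggingface.co/blog/mlabonne/merge-models) guide on the Hugging Face blog.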