
Quantized GGUF version of Hermes-2-Pro-Llama-3-8B-Llama3-8B-Chinese-Chat-slerp-merge

This model was quantized with the llama-quantize tool from llama.cpp.
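For reference, a typical llama.cpp quantization workflow looks like the sketch below. The paths and the chosen quantization type are illustrative assumptions, not the exact commands used to produce this repository's files:

```shell
# Convert the merged Hugging Face checkpoint to a GGUF file (F16).
# convert_hf_to_gguf.py ships with the llama.cpp repository;
# ./merged-model is a placeholder path to the merged checkpoint.
python convert_hf_to_gguf.py ./merged-model --outfile merged-f16.gguf --outtype f16

# Quantize the F16 GGUF down to 4-bit (Q4_K_M shown as one example;
# this repo also provides 2- through 8-bit variants).
./llama-quantize merged-f16.gguf merged-Q4_K_M.gguf Q4_K_M
```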

Hermes-2-Pro-Llama-3-8B-Llama3-8B-Chinese-Chat-slerp-merge

Hermes-2-Pro-Llama-3-8B-Llama3-8B-Chinese-Chat-slerp-merge is a merge of NousResearch/Hermes-2-Pro-Llama-3-8B and shenzhi-wang/Llama3-8B-Chinese-Chat, produced with mergekit using the configuration below.

🧩 Merge Configuration

```yaml
slices:
  - sources:
      - model: NousResearch/Hermes-2-Pro-Llama-3-8B
        layer_range: [0, 31]
      - model: shenzhi-wang/Llama3-8B-Chinese-Chat
        layer_range: [0, 31]
merge_method: slerp
base_model: NousResearch/Hermes-2-Pro-Llama-3-8B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: float16
```
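The `t` values above set the interpolation factor per layer group (with separate schedules for attention and MLP weights). Spherical linear interpolation (slerp) blends two weight tensors along a great circle rather than a straight line, which tends to preserve weight norms better than plain averaging. A minimal illustrative sketch of the interpolation itself, not mergekit's internal implementation:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors."""
    v0n = v0 / (np.linalg.norm(v0) + eps)
    v1n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < eps:  # nearly parallel vectors: fall back to linear interpolation
        return (1 - t) * v0 + t * v1
    s = np.sin(theta)
    return (np.sin((1 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1

# t=0 returns the first model's weights, t=1 the second's,
# and intermediate t follows the arc between them.
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
print(slerp(0.5, a, b))  # -> [0.70710678 0.70710678]
```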

Model Features

This merged model combines the robust generative capabilities of NousResearch/Hermes-2-Pro-Llama-3-8B with the bilingual fine-tuning of shenzhi-wang/Llama3-8B-Chinese-Chat, yielding a versatile model for a variety of text generation tasks. Drawing on the strengths of both parent models, Hermes-2-Pro-Llama-3-8B-Llama3-8B-Chinese-Chat-slerp-merge offers enhanced context understanding, nuanced text generation, and improved performance across diverse NLP tasks, including multilingual use and structured outputs.

Evaluation Results

Hermes-2-Pro-Llama-3-8B

  • Scored 90% on a function-calling evaluation.
  • Scored 84% on a structured JSON output evaluation.

Llama3-8B-Chinese-Chat

  • Significant improvements in roleplay, function calling, and math capabilities compared to previous versions.
  • Achieved high performance in both Chinese and English tasks, surpassing ChatGPT in certain benchmarks.

Limitations

While the merged model inherits the strengths of both parent models, it may also carry over some limitations and biases. For instance, the model may exhibit inconsistencies in responses when handling complex queries or when generating content that requires deep contextual understanding. Additionally, the model's performance may vary based on the language used, with potential biases present in the training data affecting the quality of outputs in less represented languages or dialects. Users should remain aware of these limitations when deploying the model in real-world applications.

Model Details

  • Format: GGUF
  • Model size: 7.81B params
  • Architecture: llama
  • Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
