Edit model card
  e88 88e                               d8     
 d888 888b  8888 8888  ,"Y88b 888 8e   d88     
C8888 8888D 8888 8888 "8" 888 888 88b d88888   
 Y888 888P  Y888 888P ,ee 888 888 888  888     
  "88 88"    "88 88"  "88 888 888 888  888     
      b                                        
      8b,                                      
 
  e88'Y88                  d8           888    
 d888  'Y  ,"Y88b 888,8,  d88    ,e e,  888    
C8888     "8" 888 888 "  d88888 d88 88b 888    
 Y888  ,d ,ee 888 888     888   888   , 888    
  "88,d88 "88 888 888     888    "YeeP" 888    
                                               
PROUDLY PRESENTS         

Mistral-Small-NovusKyver-iMat-GGUF

Quantization Note: For smaller sizes (i.e. Q3/IQ3 and below) a repetition penalty of 1.05-1.15 is recommended.

Quantized with love from fp32.

Original model author: envoid

  • Importance Matrix calculated using groups_merged.txt
  • 105 chunks
  • n_ctx=512
  • Calculation uses fp32 precision model weights

Original model README here and below:

Warning this model can be a bit unpredictable regarding adult content.

Mistral-Small-NovusKyver started out as mistralai/Mistral-Small-Instruct-2409

I ran a fairly strong LoRA on it using a private raw-text dataset. The results were 'overcooked' so I did a 50/50 SLERP merge back onto the original model and this is the result of that merge.

Use Cases:

The output is definitely more interesting. As an assistant with a system message such as "You will provide the user with interesting and thought provoking responses."

It also has interesting output in some RP scenarios with the following caveats:

-Any mention of consent or NSFW in the prompt will cause it to run away with producing adult content.

-It tends to run away with its turn when using a highly structured prompt while giving short, boring responses, with a relatively minimalist prompt.

-Creative writing by instruct.

It utilises the Mistral Instruct prompt format.

Downloads last month
7
GGUF
Model size
22.2B params
Architecture
llama

1-bit

2-bit

3-bit

4-bit

5-bit

6-bit

8-bit

Inference Examples
Inference API (serverless) is not available, repository is disabled.

Model tree for Quant-Cartel/Mistral-Small-NovusKyver-iMat-GGUF

Quantized
this model