
Sampler:
Likes a low temperature due to the MoE architecture. I use 0.3 personally.
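As a sketch, a low-temperature sampling setup along these lines (the 0.3 temperature is from this card; the other parameter names and values are assumptions following common llama.cpp/transformers defaults, so adapt them to your backend):

```python
# Hypothetical sampler settings for this MoE merge; a low temperature
# (~0.3) sharpens the distribution and keeps the combined expert
# outputs coherent.
sampler_settings = {
    "temperature": 0.3,    # recommended by the card author
    "top_p": 0.9,          # assumption: a typical nucleus-sampling default
    "repeat_penalty": 1.1, # assumption: mild anti-repetition
}

def scale_logits(logits, temperature):
    """Divide logits by temperature before softmax; lower values
    sharpen the distribution toward the most likely tokens."""
    return [l / temperature for l in logits]
```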

Llama-3.1-Celestial-Stone-2x8B-DPO (BF16)

  • DPO Trained, Mixture of Experts (14B).


  • 2x Experts working together per token, Gutenberg novelwriting finetuning.

The first expert is an Instruct 405B distillation/RP vector merge (Supernova-Lite, Niitama1.1, Storm).

The second expert is an ERP/Reddit data merge (Celeste1.5, Stheno3.4, Storm).
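For intuition, "2x experts working together per token" means a gate assigns both experts a mixing weight at every token. A toy top-2 routing sketch (pure Python, not the model's actual implementation; real MoE models route inside every MoE layer):

```python
import math

def top2_route(gate_logits):
    """Toy top-2 gating: softmax the expert scores, keep the two
    largest, and renormalize their weights to sum to 1."""
    m = max(gate_logits)                       # for numerical stability
    exps = [math.exp(g - m) for g in gate_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the two largest weights (with a 2x8B merge there are only two,
    # so both experts are always active).
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
    norm = sum(probs[i] for i in top2)
    return [(i, probs[i] / norm) for i in top2]
```

With only two experts in total, the gate never drops an expert; it only decides how much each one contributes per token.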


The base model is Sao10k/L3.1-Stheno-3.4 with the Sunfall LoRA 0.6.1 applied, to help it understand SillyTavern prompts and storywriting better.


The resulting merge was then finetuned on jondurbin/gutenberg-dpo-v0.1.


List of llama.cpp repos

  • Thanks mradermacher (GGUF):

  • Thanks QuantFactory (GGUF):

  • Thanks Triangle104 (GGUF):

  • Other


L3.1-Celestial-Stone-2x8B, finetuned on an Nvidia A100. (See the base model card for additional details.)

0.5 epoch completed of the jondurbin/gutenberg-dpo-v0.1 dataset with learning_rate=8e-6.
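In TRL-style terms, that run roughly corresponds to a config like the following (a sketch of the stated hyperparameters, not the author's actual training script; the key names mirror common `transformers`/TRL argument conventions):

```python
# Hypothetical DPO hyperparameters matching the card's description.
dpo_config = {
    "dataset": "jondurbin/gutenberg-dpo-v0.1",
    "num_train_epochs": 0.5,   # half an epoch, per the card
    "learning_rate": 8e-6,     # low LR -> smoother, less pronounced effect
    "bf16": True,              # model weights are stored in BF16
}
```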

The result seems pretty good even with half an epoch and a low learning rate; the effect is smoother and less pronounced, but it's probably not optimal.

Outputs are more compliant and verbose, and less sloppy and safety-aligned.


Prompt Template:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{output}<|eot_id|>
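The template above can be assembled programmatically; a minimal helper (the header tokens are exactly those of the Llama-3 chat format shown above):

```python
def build_prompt(system_prompt: str, user_input: str) -> str:
    """Assemble a Llama-3 style prompt up to the assistant header,
    leaving the model to generate the assistant turn."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )
```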

Sometimes has false refusals, but swiping or "uncensored" prompts work around them. I have no idea why this happens, honestly: none of the base models exhibit this behavior, it seems to be a random emergence, and neither extra abliteration nor the gating method has any impact. But it's still pretty good, imo.

For llama.cpp/LMStudio/etc., make sure "num_experts_used = 2".

Model size: 13.7B params · Tensor type: BF16 (Safetensors)
