
Sampler:
Likes a low temperature due to the MoE architecture. I use 0.3 personally.
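As a sketch, a low-temperature sampling setup along these lines (the 0.3 temperature is from this card; the other parameter names and values are assumptions following common llama.cpp/transformers defaults, so adapt them to your backend):

```python
# Hypothetical sampler settings for this MoE merge; a low temperature
# (~0.3) sharpens the distribution and keeps the combined expert
# outputs coherent.
sampler_settings = {
    "temperature": 0.3,    # recommended by the card author
    "top_p": 0.9,          # assumption: a typical nucleus-sampling default
    "repeat_penalty": 1.1, # assumption: mild anti-repetition
}

def scale_logits(logits, temperature):
    """Divide logits by temperature before softmax; lower values
    sharpen the distribution toward the most likely tokens."""
    return [l / temperature for l in logits]
```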

Llama-3.1-Celestial-Stone-2x8B-DPO (BF16)

  • DPO Trained, Mixture of Experts (14B).


  • 2x Experts working together per token, Gutenberg novelwriting finetuning.

The first expert is an Instruct 405B distillation/RP vector merge (Supernova-Lite, Niitama1.1, Storm).

The second expert is an ERP/Reddit data merge (Celeste1.5, Stheno3.4, Storm).
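For intuition, "2x experts working together per token" means a gate assigns both experts a mixing weight at every token. A toy top-2 routing sketch (pure Python, not the model's actual implementation; real MoE models route inside every MoE layer):

```python
import math

def top2_route(gate_logits):
    """Toy top-2 gating: softmax the expert scores, keep the two
    largest, and renormalize their weights to sum to 1."""
    m = max(gate_logits)                       # for numerical stability
    exps = [math.exp(g - m) for g in gate_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the two largest weights (with a 2x8B merge there are only two,
    # so both experts are always active).
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
    norm = sum(probs[i] for i in top2)
    return [(i, probs[i] / norm) for i in top2]
```

With only two experts in total, the gate never drops an expert; it only decides how much each one contributes per token.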


The base model is Sao10k/L3.1-Stheno-3.4 with the Sunfall LoRA 0.6.1 applied, to help it understand SillyTavern prompts and storywriting better.


The resulting merge was then finetuned on jondurbin/gutenberg-dpo-v0.1.


List of llama.cpp repos

  • Thanks mradermacher (GGUF):

  • Thanks QuantFactory (GGUF):

  • Thanks Triangle104 (GGUF):

  • Other


L3.1-Celestial-Stone-2x8B, finetuned on an Nvidia A100. (See the base model card for additional details.)

0.5 epoch completed of the jondurbin/gutenberg-dpo-v0.1 dataset with learning_rate=8e-6.
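In TRL-style terms, that run roughly corresponds to a config like the following (a sketch of the stated hyperparameters, not the author's actual training script; the key names mirror common `transformers`/TRL argument conventions):

```python
# Hypothetical DPO hyperparameters matching the card's description.
dpo_config = {
    "dataset": "jondurbin/gutenberg-dpo-v0.1",
    "num_train_epochs": 0.5,   # half an epoch, per the card
    "learning_rate": 8e-6,     # low LR -> smoother, less pronounced effect
    "bf16": True,              # model weights are stored in BF16
}
```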

The result seems pretty good even with half an epoch and a low learning rate; the effect is smoother and less pronounced, but it's probably not optimal.

Outputs are more compliant and verbose, and less sloppy and safety-aligned.


Prompt Template:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{output}<|eot_id|>
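The template above can be assembled programmatically; a minimal helper (the header tokens are exactly those of the Llama-3 chat format shown above):

```python
def build_prompt(system_prompt: str, user_input: str) -> str:
    """Assemble a Llama-3 style prompt up to the assistant header,
    leaving the model to generate the assistant turn."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )
```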

Sometimes has false refusals, but swiping or "uncensored" prompts work around them. I have no idea why this happens, honestly: none of the base models exhibit this behavior, it seems to be a random emergence, and neither extra abliteration nor the gating method has any impact. But it's still pretty good, imo.

For llama.cpp/LMStudio/etc., make sure "num_experts_used = 2".

Model size: 13.7B params · Tensor type: BF16 (Safetensors)
