
Model Card for Llama-3-8B-Instruct-SkillMix

This model was supervised fine-tuned (SFT) from meta-llama/Meta-Llama-3-8B on data generated by the Seed-Dataset-Agnostic (SDA) version of the Instruct-SkillMix pipeline.

Training Details

We used 4,000 examples from Instruct-SkillMix-SDA (k=2); the data is available at PrincetonPLI/Instruct-SkillMix-SDA.
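As a rough sketch, the data can be pulled from the Hugging Face Hub with the datasets library; the split name and subsampling below are illustrative assumptions, not the released recipe:

```python
from datasets import load_dataset

# Illustrative only: the exact config/split names in the
# PrincetonPLI/Instruct-SkillMix-SDA repo may differ.
ds = load_dataset("PrincetonPLI/Instruct-SkillMix-SDA", split="train")
train_ds = ds.shuffle(seed=42).select(range(4000))  # 4,000 examples, as above
```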

  • Learning Rate: 2e-5
    • Linear Warmup Ratio: 0.03
    • Decay: cosine decay to 0
  • Batch Size: 128
  • Epochs: 7 / 15
  • Optimizer: AdamW
  • Sequence Length: 1024
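
These settings map roughly onto Hugging Face TrainingArguments as in the sketch below; the output path, precision, and per-device/accumulation split of the 128 batch size are assumptions rather than part of the released recipe.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-3-8b-instruct-skillmix",  # placeholder path
    learning_rate=2e-5,
    warmup_ratio=0.03,               # linear warmup
    lr_scheduler_type="cosine",      # cosine decay to 0
    per_device_train_batch_size=16,  # 16 x 8 accumulation steps = 128 effective
    gradient_accumulation_steps=8,
    num_train_epochs=15,             # the card lists "Epochs: 7 / 15"
    optim="adamw_torch",
    bf16=True,                       # assumed precision
)
# The 1024-token sequence length is enforced at tokenization time,
# e.g. via max_seq_length in TRL's SFTTrainer.
```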

Evaluation Details

We provide the generation configurations used for each evaluation.

AlpacaEval

  • model_kwargs:
    • torch_dtype: 'bfloat16'
    • max_new_tokens: 2048
  • temperature: 0.9
  • top_p: 1.0
  • do_sample: True
  • stop_token_ids:
    • 128001
    • 128009
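
As a minimal sketch, the AlpacaEval settings above translate into a transformers generate call as follows; the prompt is illustrative, and the MT-Bench and WildBench configurations below plug in the same way.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PrincetonPLI/Llama-3-8B-Instruct-SkillMix"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Give three tips for staying focused."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.9,
    top_p=1.0,
    eos_token_id=[128001, 128009],  # <|end_of_text|>, <|eot_id|>
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```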

MT-Bench

  • model_kwargs:
    • torch_dtype: 'bfloat16'
    • max_new_tokens: 1024
  • temperature: 0.7
  • stop_token_ids:
    • 128001
    • 128009

WildBench

  • model_kwargs:
    • torch_dtype: 'bfloat16'
    • max_new_tokens: 4096
  • temperature: 0.9
  • top_p: 1.0
  • do_sample: True
  • stop_token_ids:
    • 128001
    • 128009

Citation

Paper: Instruct-SkillMix (https://arxiv.org/abs/2408.14774)

@misc{kaur2024instructskillmixpowerfulpipelinellm,
      title={Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning}, 
      author={Simran Kaur and Simon Park and Anirudh Goyal and Sanjeev Arora},
      year={2024},
      eprint={2408.14774},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2408.14774}, 
}

Contact

Simran Kaur, Princeton University

Simon Park, Princeton University

{skaur, juhyunp} 'at' princeton 'dot' edu
