
Model Card for Llama-3-8B-Instruct-SkillMix

This model was supervised fine-tuned (SFT) from meta-llama/Meta-Llama-3-8B on data generated by the Seed-Dataset-Agnostic (SDA) version of the Instruct-SkillMix pipeline.

Training Details

We used 4,000 examples from Instruct-SkillMix-SDA (k=2); the data is available at PrincetonPLI/Instruct-SkillMix-SDA.
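As a rough sketch, the data can be pulled from the Hugging Face Hub with the datasets library; the split name and subsampling below are illustrative assumptions, not the released recipe:

```python
from datasets import load_dataset

# Illustrative only: the exact config/split names in the
# PrincetonPLI/Instruct-SkillMix-SDA repo may differ.
ds = load_dataset("PrincetonPLI/Instruct-SkillMix-SDA", split="train")
train_ds = ds.shuffle(seed=42).select(range(4000))  # 4,000 examples, as above
```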

  • Learning Rate: 2e-5
    • Linear Warmup Ratio: 0.03
    • Decay: cosine decay to 0
  • Batch Size: 128
  • Epochs: 7 / 15
  • Optimizer: AdamW
  • Sequence Length: 1024
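
These settings map roughly onto Hugging Face TrainingArguments as in the sketch below; the output path, precision, and per-device/accumulation split of the 128 batch size are assumptions rather than part of the released recipe.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-3-8b-instruct-skillmix",  # placeholder path
    learning_rate=2e-5,
    warmup_ratio=0.03,               # linear warmup
    lr_scheduler_type="cosine",      # cosine decay to 0
    per_device_train_batch_size=16,  # 16 x 8 accumulation steps = 128 effective
    gradient_accumulation_steps=8,
    num_train_epochs=15,             # the card lists "Epochs: 7 / 15"
    optim="adamw_torch",
    bf16=True,                       # assumed precision
)
# The 1024-token sequence length is enforced at tokenization time,
# e.g. via max_seq_length in TRL's SFTTrainer.
```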

Evaluation Details

We provide the generation configurations used for each evaluation.

AlpacaEval

  • model_kwargs:
    • torch_dtype: 'bfloat16'
    • max_new_tokens: 2048
  • temperature: 0.9
  • top_p: 1.0
  • do_sample: True
  • stop_token_ids:
    • 128001
    • 128009
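
As a minimal sketch, the AlpacaEval settings above translate into a transformers generate call as follows; the prompt is illustrative, and the MT-Bench and WildBench configurations below plug in the same way.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PrincetonPLI/Llama-3-8B-Instruct-SkillMix"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Give three tips for staying focused."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.9,
    top_p=1.0,
    eos_token_id=[128001, 128009],  # <|end_of_text|>, <|eot_id|>
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```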

MT-Bench

  • model_kwargs:
    • torch_dtype: 'bfloat16'
    • max_new_tokens: 1024
  • temperature: 0.7
  • stop_token_ids:
    • 128001
    • 128009

WildBench

  • model_kwargs:
    • torch_dtype: 'bfloat16'
    • max_new_tokens: 4096
  • temperature: 0.9
  • top_p: 1.0
  • do_sample: True
  • stop_token_ids:
    • 128001
    • 128009

Citation

Paper: Instruct-SkillMix (https://arxiv.org/abs/2408.14774)

@misc{kaur2024instructskillmixpowerfulpipelinellm,
      title={Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning}, 
      author={Simran Kaur and Simon Park and Anirudh Goyal and Sanjeev Arora},
      year={2024},
      eprint={2408.14774},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2408.14774}, 
}

Contact

Simran Kaur, Princeton University

Simon Park, Princeton University

{skaur, juhyunp} 'at' princeton 'dot' edu
