---
base_model: meta-llama/Meta-Llama-3-8B-Instruct
library_name: transformers
license: llama3
tags:
- axolotl
- generated_from_trainer
- spectrum finetuning
- Deepspeed MultiGPU
- autoquant
- gptq
model-index:
- name: Llama-3-8B-spectrum-25
  results: []
---

# Llama-3-8B-spectrum-25

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the [yuvraj17/finetune_alpaca_1K](https://huggingface.co/datasets/yuvraj17/finetune_alpaca_1K) dataset.
It achieves the following results on the evaluation set:
- Loss: 1.2791

## Spectrum Fine-tuning:

I used the **Spectrum Fine-tuning** method described in [Eric Hartford et al., 2024](https://arxiv.org/abs/2406.06623), which selectively targets the ***t%*** of model layers with the highest **Signal-to-Noise Ratio (SNR)**. By training only the most information-dense layers, this approach reduces the compute and memory needed for fine-tuning.

**The key goal of Spectrum Fine-tuning is:** *minimize the memory footprint and accelerate LLM training without sacrificing performance.*

This model uses a 25% layer selection, which keeps the computational overhead of fine-tuning low.

## Training:

- Trained on **2x A40s (48GB VRAM each)** for over 1 hour using **Axolotl**.
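
The layer selection described under Spectrum Fine-tuning can be sketched as a simple ranking step. This is a minimal illustration, not the Spectrum implementation: the function name `select_top_layers`, the layer names, and the SNR scores are all hypothetical, and the scores are assumed to be precomputed. It also ranks every layer in a single flat list, which is a simplification.

```python
import math


def select_top_layers(snr_by_layer, top_percent):
    """Return the names of the top `top_percent`% of layers, ranked by SNR."""
    if not 0 < top_percent <= 100:
        raise ValueError("top_percent must be in (0, 100]")
    # Round down, but always keep at least one layer.
    keep = max(1, math.floor(len(snr_by_layer) * top_percent / 100))
    ranked = sorted(snr_by_layer, key=snr_by_layer.get, reverse=True)
    return ranked[:keep]


# Made-up SNR scores for eight hypothetical layers (illustration only).
example_snr = {
    "layers.0.mlp.down_proj": 0.41,
    "layers.1.mlp.down_proj": 1.87,
    "layers.2.mlp.down_proj": 0.93,
    "layers.3.mlp.down_proj": 2.35,
    "layers.4.mlp.down_proj": 0.66,
    "layers.5.mlp.down_proj": 1.12,
    "layers.6.mlp.down_proj": 0.28,
    "layers.7.mlp.down_proj": 1.54,
}

# With t = 25, two of the eight layers are trained and the rest stay frozen.
trainable = select_top_layers(example_snr, top_percent=25)
frozen = [name for name in example_snr if name not in trainable]
```

In this sketch, `trainable` holds the layers that would be updated during fine-tuning and `frozen` holds the layers whose weights would be left untouched.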
### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 2

![Train/loss Curve Image](https://cdn-uploads.huggingface.co/production/uploads/66137d95e8d2cda230ddcea6/eSBh0SmeGYYUfx9pKgMIv.png)

![eval/loss Curve Image](https://cdn-uploads.huggingface.co/production/uploads/66137d95e8d2cda230ddcea6/xNslkLH1pKot7tzWtIiu9.png)

### Framework versions

- Axolotl 0.4.1
- Transformers 4.44.2
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
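
The total batch sizes listed under Training hyperparameters follow from the per-device values. As a quick sanity check, the arithmetic can be written out as below; the helper name `effective_batch_size` is hypothetical and not part of any library.

```python
def effective_batch_size(per_device_batch, num_devices, grad_accum_steps=1):
    """Samples contributing to one step across all devices and accumulation steps."""
    return per_device_batch * num_devices * grad_accum_steps


# Training: 4 per device x 2 GPUs x 4 gradient accumulation steps.
total_train = effective_batch_size(4, 2, 4)

# Evaluation: 4 per device x 2 GPUs, no gradient accumulation.
total_eval = effective_batch_size(4, 2)
```

These reproduce the reported `total_train_batch_size` of 32 and `total_eval_batch_size` of 8.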