
language: en
tags:
- text-generation
- causal-lm
- fine-tuning
- unsupervised

Model Name: olabs-ai/reflection_model

Model Description

olabs-ai/reflection_model is a language model fine-tuned from Meta-Llama-3.1-8B-Instruct using LoRA (Low-Rank Adaptation). It is designed for text generation and can be used in applications such as conversational agents and content creation.

Model Details

  • Base Model: Meta-Llama-3.1-8B-Instruct
  • Fine-Tuning Method: LoRA
  • Architecture: LlamaForCausalLM
  • Number of Parameters: 8 billion (base model)
  • Training Data: [Details about the training data used for fine-tuning, if available]
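
As a rough illustration of why LoRA is a lightweight fine-tuning method, the sketch below compares the number of trainable parameters when a whole weight matrix is updated against a low-rank update made of two small matrices. The 4096 x 4096 shape corresponds to a query projection in an 8B Llama model. The rank of 16 and the function names are assumptions chosen for illustration, since the adapter settings used for this model are not stated here.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole weight matrix W (d_out x d_in) is updated."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a LoRA update B @ A, with A (rank x d_in) and B (d_out x rank)."""
    return rank * d_in + d_out * rank


# Illustrative values: a 4096 x 4096 projection and a hypothetical rank of 16.
d_in, d_out, rank = 4096, 4096, 16
full = full_param_count(d_in, d_out)
lora = lora_param_count(d_in, d_out, rank)

print(f"full fine-tuning: {full:,} parameters")
print(f"LoRA (rank {rank}): {lora:,} parameters ({100 * lora / full:.2f}% of full)")
```

With these assumed values the low-rank update trains under one percent of the parameters in that matrix, while the base weights stay frozen.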


To use this model, install the transformers and unsloth libraries, then load the model and tokenizer as follows:

# Import unsloth first so its patches are applied before transformers is used
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Placeholder: a local directory or Hub repository that contains the LoRA
# adapter config and weights (for example olabs-ai/reflection_model, assuming
# the adapter is published under that name)
adapter_path = "path_to_your_adapter"

# Use FastLanguageModel to load the base model with the LoRA adapter applied.
# The original call was truncated; the arguments below follow common Unsloth
# usage and are assumptions, not confirmed settings for this model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=adapter_path,
    max_seq_length=2048,  # assumed maximum sequence length
    dtype=None,           # let Unsloth choose a dtype for the GPU
    load_in_4bit=True,    # assumed; set to False to load unquantized weights
)

# Set inference mode for LoRA
FastLanguageModel.for_inference(model)

# Prepare inputs
custom_prompt = "What is a famous tall tower in Paris?"
inputs = tokenizer([custom_prompt], return_tensors="pt").to("cuda")

# Stream tokens to standard output as they are generated
text_streamer = TextStreamer(tokenizer)

# Generate outputs
outputs = model.generate(**inputs, streamer=text_streamer, max_new_tokens=1000)