
Llama-3-2B-Base

A slimmed-down, third-party adaptation of the Llama 3 model with only 2 billion parameters.

Important: This project is not affiliated with Meta.


Overview

Llama-3-2B-Base is a reduced version of the popular Llama 3 models, specifically designed to bring the power of LLMs (Large Language Models) to environments with limited computational resources. This model offers a balance between performance and resource usage, serving as an efficient alternative for users who cannot leverage the larger, resource-intensive versions from Meta.

Model Developers

This version has been developed independently and is not associated with Meta.

Input/Output

  • Input: Text only.
  • Output: Text and code only.

Model Architecture

Llama-3-2B is an auto-regressive language model using an optimized transformer architecture. It employs supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to enhance alignment with human preferences for helpfulness and safety.
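The auto-regressive loop described above can be sketched in miniature. The `next_token_logits` function below is a toy stand-in for the real transformer (an assumption for illustration only), but the key mechanism is the same: each generated token is fed back as input for the next step.

```python
# Toy sketch of auto-regressive decoding. `next_token_logits` is a stand-in
# for the real transformer: it scores the next token given the tokens so far.
def next_token_logits(tokens):
    # Trivial fake "model": always prefers (last token + 1) mod 5.
    vocab = 5
    scores = [0.0] * vocab
    scores[(tokens[-1] + 1) % vocab] = 1.0
    return scores

def greedy_generate(prompt_tokens, max_new_tokens, eos_id=None):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = next_token_logits(tokens)
        next_id = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_id)  # feed the prediction back in: auto-regression
        if next_id == eos_id:
            break
    return tokens

print(greedy_generate([0], 4))  # [0, 1, 2, 3, 4]
```

Real decoding adds sampling (temperature, top-p) on top of this loop, as the pipeline example below shows.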

Intended Use

  • Use Cases: Suitable for both commercial and research use in English, capable of assistant-like chat and a variety of natural language generation tasks.
  • Out-of-Scope: Any use that violates applicable laws or regulations (including trade compliance laws), or the Acceptable Use Policy.

Usage Instructions

Use with transformers

You can leverage the transformers library to run inference.

Transformers Pipeline

import transformers
import torch

model_id = "andrijdavid/Llama-3-2B-Base"

# Load the weights in bfloat16 and place them automatically across available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Chat-style input. Note that as a base (non-instruct) model, Llama-3-2B-Base may
# follow the system prompt less reliably than an instruction-tuned model would.
messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Stop generation at either the standard EOS token or the end-of-turn token.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# The pipeline returns the full conversation; print only the newly generated message.
print(outputs[0]["generated_text"][-1])

Hardware and Software Considerations

Llama-3-2B is designed to run efficiently on mid-tier hardware, significantly lowering the entry barrier for using advanced language models.
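As a rough, back-of-the-envelope sanity check (assumed figures, not measured benchmarks), the weight memory of a 2-billion-parameter model can be estimated from the parameter count and the width of the data type:

```python
# Rough estimate of weight memory for a 2B-parameter model at different precisions.
# Real usage adds activations, the KV cache, and framework overhead on top.
def weight_memory_gib(n_params, bytes_per_param):
    return n_params * bytes_per_param / 1024**3

n_params = 2e9
print(f"bfloat16: {weight_memory_gib(n_params, 2):.1f} GiB")  # ~3.7 GiB
print(f"float32:  {weight_memory_gib(n_params, 4):.1f} GiB")  # ~7.5 GiB
```

At bfloat16 the weights alone fit comfortably on a consumer GPU with 8 GB of memory, which is what makes the model practical on mid-tier hardware.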

Ethical Considerations and Limitations

Llama-3-2B, like any LLM, is susceptible to generating biased or inappropriate outputs. Developers must evaluate and fine-tune the model to ensure safety and suitability for their specific use cases.