
The Big Picture (Brainproject.ai)

The human brain is an intricate puzzle that we're continually striving to decode. My aim is to replicate its complexity, functionality, and depth in a digital realm. In other words, we're exploring the convergence of neuroscience and artificial intelligence to glean insights into the mind's intricate workings and harness that knowledge into digital counterparts.

Mixture of Experts

Chameleon-Llama-70B doesn't work alone. It's part of the Mixture of Experts framework. Within this structure, various models, each with their distinct competencies, collaborate. This synergy allows for a richer, more holistic approach to understanding and replicating brain functions.

Chameleon-Llama-70B

Chameleon enhances Llama-70B with a natural language planner module that dynamically composes reasoning chains from various tools (a minimal sketch of the loop follows the list below):

  • Module Inventory: Vision models, knowledge modules, web search, Python functions, etc.
  • Natural Language Planner: Generates programs indicating a sequence of modules to execute.
  • Tool Execution: Selected modules process inputs sequentially, caching context.
  • Adaptability: Planner synthesizes custom programs for diverse tasks.
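The following Python sketch illustrates the plan-then-execute loop described above. The module names, the planner prompt, and the llm_generate callable are illustrative placeholders only; they are not part of the released model's API.

# Illustrative sketch of the plan-then-execute loop; all names are placeholders.
MODULE_INVENTORY = {
    "image_captioner": lambda ctx: {**ctx, "caption": "..."},   # vision model
    "knowledge_lookup": lambda ctx: {**ctx, "facts": "..."},    # knowledge module
    "web_search": lambda ctx: {**ctx, "snippets": "..."},       # web search
    "python_solver": lambda ctx: {**ctx, "answer": "..."},      # Python function
}

def plan(question, llm_generate):
    """Ask the natural language planner for an ordered list of module names."""
    prompt = (
        "Available modules: " + ", ".join(MODULE_INVENTORY) + "\n"
        "Question: " + question + "\n"
        "List, in order, the modules needed to answer it, comma-separated:"
    )
    raw = llm_generate(prompt)
    return [name.strip() for name in raw.split(",") if name.strip() in MODULE_INVENTORY]

def execute(question, program):
    """Run the selected modules sequentially, caching intermediate context."""
    context = {"question": question}   # shared context passed between modules
    for module_name in program:
        context = MODULE_INVENTORY[module_name](context)
    return context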

Model Description

  • Developed by: Priyanshu Pareek
  • Model type: Fine-tuned Llama with Chameleon
  • License: wtfpl
  • Finetuned from model: Llama-2-70B

Uses

Direct Use

The model is intended for out-of-the-box use, without additional fine-tuning or integration into larger systems.

Recommendations

It's essential to approach the Chameleon-Llama-70B (and models like it) with an informed perspective. Recognize that while it holds immense potential, there are inherent risks, biases, and limitations. More data and insights are necessary to offer detailed recommendations.

How to Get Started with the Model

Want to take Chameleon-Llama-70B for a spin?

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model from the published checkpoint.
tokenizer = AutoTokenizer.from_pretrained("path-to-Chameleon-Llama-70B")
model = AutoModelForCausalLM.from_pretrained("path-to-Chameleon-Llama-70B")

# Tokenize a prompt and generate a completion.
input_text = "Your text here"
encoded_input = tokenizer(input_text, return_tensors="pt")
output = model.generate(**encoded_input, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Replace "path-to-Chameleon-Llama-70B" with the correct path or URL for the pre-trained model.

Training Details

Training Data

The model was trained on a combination of the original Llama datasets and data drawn from real-time sources such as news outlets, web pages, and other live feeds.

Training Procedure

Preprocessing

Data from real-time sources were preprocessed to ensure a uniform format and to filter out any irrelevant or sensitive information.
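As an illustration of this step, here is a minimal sketch of a record-level filter. The actual pipeline, field names, and patterns used for Chameleon-Llama-70B are not published, so everything below is an assumption.

# Illustrative preprocessing pass in the spirit described above;
# field names and regex patterns are assumptions, not the real pipeline.
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def preprocess(record):
    """Normalize a raw record to a uniform {source, text} format and drop
    entries that are empty or contain obvious personal information."""
    text = (record.get("text") or "").strip()
    if not text:
        return None                      # drop irrelevant (empty) entries
    if EMAIL_RE.search(text) or PHONE_RE.search(text):
        return None                      # drop potentially sensitive entries
    return {"source": record.get("source", "unknown"), "text": " ".join(text.split())}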

Training Hyperparameters

  • Training regime: fp16 mixed precision
  • Batch size: 64
  • Learning rate: 3e-4
  • Optimizer: AdamW
  • Training epochs: 4
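For reference, these settings map roughly onto a Hugging Face TrainingArguments configuration as sketched below. This is a reconstruction for illustration, not the actual training script; the output directory and the per-device/accumulation split of the batch size are assumptions.

# Hypothetical reconstruction of the listed hyperparameters with transformers.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="chameleon-llama-70b-finetune",  # placeholder path
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,   # effective batch size of 64
    learning_rate=3e-4,
    num_train_epochs=4,
    fp16=True,                       # fp16 mixed precision
    optim="adamw_torch",             # AdamW optimizer
)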