---
library_name: transformers
license: apache-2.0
language:
- en
---
This is a test release of the DPO version of the [MemGPT](https://github.com/cpacker/MemGPT) language model.
# Model Description
This repository contains a MoE (Mixture of Experts) model built from [Mistral 7B Instruct](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2), routing 2 experts per token. The model is specifically designed for function calling in MemGPT and demonstrates performance comparable to GPT-4 when working with MemGPT.
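As a rough illustration, the routing settings described above can be read from the model configuration. The sketch below assumes a Mixtral-style MoE config and uses the repo id from this model card; both are assumptions and may need adjusting for this checkpoint.
```python
# Minimal sketch: inspecting the MoE routing settings from the model config.
# Assumes a Mixtral-style configuration (field names may differ) and that the
# repo id below matches this model card.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("starsnatched/MemGPT-DPO-MoE-test")
print(config.num_experts_per_tok)  # experts routed per token (2 for this model)
print(config.num_local_experts)    # total experts available in each MoE layer
```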
# Key Features
* Function calling
* Dedicated to working with MemGPT
* Supports medium-length context, up to 8,192 tokens per sequence
# Prompt Format
This model uses the **ChatML** prompt format:
```
<|im_start|>system
{system_instruction}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
{assistant_response}<|im_end|>
```
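If you are building prompts programmatically rather than writing the template by hand, the snippet below is one way to produce a ChatML prompt with the `transformers` tokenizer. It assumes the tokenizer in this repo ships a ChatML chat template, which is worth verifying before relying on it.
```python
# Minimal sketch: building a ChatML prompt via the tokenizer's chat template.
# Assumes the tokenizer bundled with this repo defines a ChatML template;
# the repo id is taken from this model card and may need adjusting.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("starsnatched/MemGPT-DPO-MoE-test")

messages = [
    {"role": "system", "content": "You are MemGPT, an assistant with long-term memory."},
    {"role": "user", "content": "Please remember that my favourite colour is blue."},
]

# tokenize=False returns the formatted string so it can be inspected;
# add_generation_prompt=True appends the opening assistant tag.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```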
# Usage
This model is designed to run on multiple backends, such as [oobabooga's text-generation-webui](https://github.com/oobabooga/text-generation-webui).
Simply install your preferred backend, then load this model.
Then configure MemGPT using `memgpt configure` and chat with MemGPT via the `memgpt run` command!
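For a quick sanity check outside of MemGPT, the model can also be loaded directly with the `transformers` library. This is a minimal sketch, not the official MemGPT workflow; the repo id and generation settings are assumptions, so adjust them to your setup.
```python
# Minimal sketch: loading the model directly with `transformers` for a quick test.
# The repo id below is assumed from this model card; adjust it if needed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "starsnatched/MemGPT-DPO-MoE-test"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build a ChatML-formatted prompt and generate a short reply.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nHello! Who are you?<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```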
# Model Details
* Developed by: @starsnatched
* Model type: This repo contains a language model based on the transformer decoder architecture.
* Language: English
* Contact: For any questions, concerns, or comments about this model, please contact me on Discord, @starsnatched.
# Training Infrastructure
* Hardware: The model in this repo was trained on 2x A100 80GB GPUs.
# Intended Use
The model is designed to be used as the base model for MemGPT agents.
# Limitations and Risks
The model may exhibit unreliable, unsafe, or biased behaviours. Please double-check any outputs this model produces.