---
license: cc-by-sa-4.0
datasets:
- izumi-lab/llm-japanese-dataset-vanilla
language:
- ja
tags:
- gpt_neox
- japanese
- causal-lm
---

This repo contains a low-rank adapter for [CALM](https://huggingface.co/cyberagent/open-calm-7b), fitted on a dataset specially extracted from [llm-japanese-dataset](https://github.com/masanorihirano/llm-japanese-dataset).

You can try the model at https://huggingface.co/spaces/izumi-lab/stormy-7b-10ep

This version of the weights was trained with the following hyperparameters:

- Epochs: 10
- Batch size: 128
- Cutoff length: 300
- Learning rate: 3e-4
- LoRA _r_: 4
- LoRA target modules: query_key_value

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and tokenizer in half precision.
base_model = "cyberagent/open-calm-7b"
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Attach the low-rank adapter weights from this repo.
model = PeftModel.from_pretrained(
    model,
    "izumi-lab/stormy-7b-10ep",
    torch_dtype=torch.float16,
)
```

For the latest information, please visit [llm.msuzuki.me](https://llm.msuzuki.me).

## Details

- Japanese Paper:
- English Paper:
- Website: [llm.msuzuki.me](https://llm.msuzuki.me)

Citation: TBD

For inquiries such as joint research, data provision, or other types of support, please email izumi-llm@socsim.org.
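For reference, the LoRA hyperparameters listed above map onto a PEFT `LoraConfig` roughly as sketched below. Note that `lora_alpha` and `lora_dropout` are not stated in this card, so the values here are assumptions rather than the actual training configuration:

```python
from peft import LoraConfig

# Hedged reconstruction of the adapter configuration implied by the
# hyperparameters above; lora_alpha and lora_dropout are assumptions,
# as they are not documented in this card.
config = LoraConfig(
    r=4,
    target_modules=["query_key_value"],
    lora_alpha=16,      # assumption
    lora_dropout=0.05,  # assumption
    bias="none",
    task_type="CAUSAL_LM",
)
```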
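Once the adapter is attached as in the loading snippet above, a quick generation pass can serve as a smoke test. The sketch below assumes the `model` and `tokenizer` objects from that snippet; the prompt and sampling parameters are illustrative, not part of the training recipe:

```python
import torch

# Assumes `model` and `tokenizer` from the loading snippet above.
# The fp16 weights make a GPU strongly preferable for inference.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
model.eval()

inputs = tokenizer("日本で一番高い山は何ですか？", return_tensors="pt").to(device)
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id,  # silences the missing-pad-token warning
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))
```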