Skepsun/baichuan-2-llama-7b-ppo


Skepsun committed on Sep 18, 2023

Commit 95be25b • 1 Parent(s): 789e018

Update README.md

Files changed (1)

README.md +2 -2

README.md CHANGED

@@ -10,9 +10,9 @@ library_name: peft
 This repository holds the result of the PPO step (starting from the model after (sft)[https://huggingface.co/Skepsun/baichuan-2-llama-7b-sft]), trained on the [hh_rlhf_cn](https://huggingface.co/datasets/dikw/hh_rlhf_cn) dataset.
-![training loss](link "https://huggingface.co/Skepsun/baichuan-2-llama-7b-ppo/resolve/main/training_loss.png")
-![training reward](link "https://huggingface.co/Skepsun/baichuan-2-llama-7b-ppo/resolve/main/training_reward.png")
+![training loss](https://huggingface.co/Skepsun/baichuan-2-llama-7b-ppo/resolve/main/training_loss.png)
+![training reward](https://huggingface.co/Skepsun/baichuan-2-llama-7b-ppo/resolve/main/training_reward.png)
 ## Usage
 To use the model, run the inference script of the training framework mentioned above, setting the base model to the sft model, checkpoint_dir to this repository's path, and the prompt template to vicuna.
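
The Usage note names three settings: the sft model as base, this repository as the checkpoint, and the vicuna prompt template. The sketch below shows one way those pieces might fit together outside the training framework's own inference script. It is not the author's script. It assumes that the sft repository loads as a full causal LM through `transformers`, that this repository is a PEFT adapter (as the `library_name: peft` metadata suggests), and that the template follows the commonly used Vicuna v1.1 wording. The helper names `build_vicuna_prompt`, `load_ppo_model`, and `generate` are hypothetical.

```python
# Minimal sketch, not the training framework's inference script.
# Assumptions: the sft repo is a full causal LM, the ppo repo is a PEFT adapter,
# and the prompt wording matches the common Vicuna v1.1 template.

SFT_MODEL_ID = "Skepsun/baichuan-2-llama-7b-sft"    # base model named in the README
PPO_ADAPTER_ID = "Skepsun/baichuan-2-llama-7b-ppo"  # this repository (checkpoint_dir)

# Assumed system prompt (Vicuna v1.1 style); the framework's exact text may differ.
VICUNA_SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_vicuna_prompt(query, history=None):
    """Format a conversation as 'USER: ... ASSISTANT: ...' turns after the system text."""
    parts = [VICUNA_SYSTEM]
    for user_turn, assistant_turn in (history or []):
        parts.append(f"USER: {user_turn} ASSISTANT: {assistant_turn}")
    # The final turn ends with an open 'ASSISTANT:' for the model to complete.
    parts.append(f"USER: {query} ASSISTANT:")
    return " ".join(parts)


def load_ppo_model(base_id=SFT_MODEL_ID, adapter_id=PPO_ADAPTER_ID):
    """Load the sft base model and attach the PPO adapter weights on top of it."""
    # Imported lazily so the prompt helper above works without these packages.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id, use_fast=False)
    base = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base, adapter_id)
    return tokenizer, model.eval()


def generate(query, max_new_tokens=256):
    """Generate one reply for `query`; downloads both repositories on first use."""
    tokenizer, model = load_ppo_model()
    inputs = tokenizer(build_vicuna_prompt(query), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    # Only prints the formatted prompt; call generate(...) to run the model.
    print(build_vicuna_prompt("Hello"))
```

Calling `generate("...")` is the only step that needs the model weights. If the framework's vicuna template differs from the wording assumed here, the training-time template should take precedence, since a PPO-tuned model is sensitive to the prompt format it was trained with.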