mistral-sft4epoch-spin-v

This model is a fine-tuned version of AmberYifan/mistral-safe-sft-full. The training dataset is not recorded in this card. It achieves the following results on the evaluation set (these match the final logged evaluation at step 1500):

Loss: 0.2284
Rewards/real: 10.1344
Rewards/generated: -5.3158
Rewards/accuracies: 1.0
Rewards/margins: 15.4503
Logps/generated: -131.8755
Logps/real: -111.3366
Logits/generated: -2.7694
Logits/real: -2.7499
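
As a quick sanity check on the metrics above, Rewards/margins should be close to Rewards/real minus Rewards/generated. The sketch below assumes the margin is logged as the mean per-example difference, so it can differ from the difference of the reported means by rounding in the last digit. The variable names are illustrative and not part of the training code.

```python
# Final evaluation metrics copied from the list above.
rewards_real = 10.1344
rewards_generated = -5.3158
reported_margin = 15.4503

# ASSUMPTION: margin = mean(real - generated), which equals the difference
# of the two reported means up to rounding of the logged values.
computed_margin = rewards_real - rewards_generated

print(f"computed margin: {computed_margin:.4f}")
print(f"reported margin: {reported_margin:.4f}")

# The two values should agree to within rounding of the 4-decimal logs.
margin_is_consistent = abs(computed_margin - reported_margin) < 1e-3
print(f"consistent: {margin_is_consistent}")
```

A Rewards/accuracies value of 1.0 fits the same picture: it would mean every evaluation example had a higher reward for the real response than for the generated one.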

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-07
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 1
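
The batch-size and scheduler settings above can be illustrated with a small sketch. It assumes no gradient accumulation (8 per device x 4 devices = 32 matches the reported total) and an estimated total of 1563 optimizer steps, inferred from the epoch column of the training results (step 100 at epoch 0.0640). The function `linear_schedule_lr` is a hypothetical helper that mirrors the usual shape of a linear schedule with warmup; it is not taken from the training code.

```python
import math

# Hyperparameters copied from the list above.
train_batch_size = 8      # per device
num_devices = 4
learning_rate = 5e-07
warmup_ratio = 0.1

# Effective batch size, assuming gradient accumulation of 1.
total_train_batch_size = train_batch_size * num_devices

# ASSUMPTION: total step count estimated from the results table, not reported.
estimated_total_steps = 1563
warmup_steps = math.ceil(warmup_ratio * estimated_total_steps)


def linear_schedule_lr(step, total_steps=estimated_total_steps,
                       warmup=warmup_steps, peak_lr=learning_rate):
    """Learning rate at a given step: linear warmup, then linear decay to 0."""
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup)


print(f"effective batch size: {total_train_batch_size}")
print(f"warmup steps (estimated): {warmup_steps}")
for step in (0, warmup_steps, 800, estimated_total_steps):
    print(f"step {step:>4}: lr = {linear_schedule_lr(step):.3e}")
```

Under these assumptions the learning rate peaks at 5e-07 after roughly the first tenth of training and then decays linearly to zero at the end of the single epoch.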

Training results

Training Loss	Epoch	Step	Validation Loss	Rewards/real	Rewards/generated	Rewards/accuracies	Rewards/margins	Logps/generated	Logps/real	Logits/generated	Logits/real
0.278	0.0640	100	0.2703	8.6366	-3.4251	0.9922	12.0617	-112.9675	-126.3148	-2.9055	-2.8963
0.2283	0.1280	200	0.2438	9.5699	-4.6271	0.9922	14.1970	-124.9880	-116.9817	-2.8308	-2.8192
0.2284	0.1919	300	0.2384	9.7849	-5.0781	0.9922	14.8630	-129.4981	-114.8321	-2.8396	-2.8204
0.2154	0.2559	400	0.2361	9.8971	-4.8914	0.9922	14.7885	-127.6311	-113.7101	-2.8303	-2.8085
0.2368	0.3199	500	0.2351	9.9762	-5.0488	0.9922	15.0249	-129.2045	-112.9195	-2.8228	-2.8083
0.2065	0.3839	600	0.2346	10.0426	-4.9610	0.9922	15.0035	-128.3267	-112.2554	-2.8204	-2.8086
0.2244	0.4479	700	0.2317	10.0417	-5.1299	1.0	15.1716	-130.0162	-112.2640	-2.8203	-2.8076
0.2161	0.5118	800	0.2297	10.0737	-5.0565	1.0	15.1303	-129.2824	-111.9440	-2.8437	-2.8337
0.2127	0.5758	900	0.2302	10.0913	-5.0905	1.0	15.1818	-129.6217	-111.7683	-2.8251	-2.8150
0.2017	0.6398	1000	0.2298	10.1245	-5.2627	1.0	15.3872	-131.3441	-111.4362	-2.7955	-2.7831
0.2152	0.7038	1100	0.2297	10.0889	-5.3503	1.0	15.4392	-132.2204	-111.7925	-2.7790	-2.7609
0.2074	0.7678	1200	0.2298	10.1143	-5.3204	1.0	15.4346	-131.9209	-111.5385	-2.7919	-2.7734
0.2107	0.8317	1300	0.2287	10.1349	-5.3137	1.0	15.4486	-131.8539	-111.3324	-2.7734	-2.7524
0.1947	0.8957	1400	0.2288	10.1265	-5.3252	1.0	15.4517	-131.9686	-111.4160	-2.7803	-2.7613
0.2056	0.9597	1500	0.2284	10.1344	-5.3158	1.0	15.4503	-131.8755	-111.3366	-2.7694	-2.7499
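
The table above can be summarized programmatically. The sketch below copies the step, epoch, and validation-loss columns into a list, finds the evaluation with the lowest validation loss, and estimates the number of optimizer steps per epoch from the last row. The dataset-size figure is only a rough estimate that assumes an effective batch size of 32 and no dropped batches; it is not reported in the card.

```python
# (step, epoch, validation_loss) copied from the training results table.
rows = [
    (100, 0.0640, 0.2703),
    (200, 0.1280, 0.2438),
    (300, 0.1919, 0.2384),
    (400, 0.2559, 0.2361),
    (500, 0.3199, 0.2351),
    (600, 0.3839, 0.2346),
    (700, 0.4479, 0.2317),
    (800, 0.5118, 0.2297),
    (900, 0.5758, 0.2302),
    (1000, 0.6398, 0.2298),
    (1100, 0.7038, 0.2297),
    (1200, 0.7678, 0.2298),
    (1300, 0.8317, 0.2287),
    (1400, 0.8957, 0.2288),
    (1500, 0.9597, 0.2284),
]

# Evaluation step with the lowest validation loss.
best_step, best_epoch, best_loss = min(rows, key=lambda r: r[2])

# Estimate steps per epoch from the last logged row (step / epoch fraction).
last_step, last_epoch, _ = rows[-1]
steps_per_epoch = round(last_step / last_epoch)

# ASSUMPTION: effective batch size of 32, so examples ~= steps * 32.
approx_train_examples = steps_per_epoch * 32

print(f"best validation loss {best_loss} at step {best_step}")
print(f"estimated steps per epoch: {steps_per_epoch}")
print(f"approximate training examples: {approx_train_examples}")
```

The validation loss falls quickly over the first 300 steps and changes by less than 0.002 after step 800, so the final evaluation is also the best one in the log.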

Framework versions

Transformers 4.43.3
PyTorch 2.2.2+cu121
Datasets 2.20.0
Tokenizers 0.19.1
