collapse_gemma-2-2b_hs2_replace_iter5_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.6417	0.0316	5	1.3077	259488
1.2388	0.0632	10	1.2367	506760
0.8793	0.0947	15	1.2729	765824
0.6917	0.1263	20	1.4184	1019280
0.4293	0.1579	25	1.5732	1271416
0.2959	0.1895	30	1.6498	1515792
0.2466	0.2211	35	1.8461	1771464
0.1581	0.2527	40	1.9813	2020528
0.0765	0.2842	45	2.1203	2277040
0.0524	0.3158	50	2.2317	2527976
0.0539	0.3474	55	2.3208	2784520
0.0348	0.3790	60	2.3321	3041696
0.0366	0.4106	65	2.3404	3300560
0.037	0.4422	70	2.3494	3547576
0.0324	0.4737	75	2.3125	3810280
0.0307	0.5053	80	2.2571	4069032
0.0281	0.5369	85	2.2877	4323872
0.0289	0.5685	90	2.3183	4583064
0.0304	0.6001	95	2.3403	4844432
0.0272	0.6317	100	2.3549	5101512
0.0276	0.6632	105	2.3650	5358960
0.0306	0.6948	110	2.3604	5616864
0.0282	0.7264	115	2.3438	5877496
0.0288	0.7580	120	2.3419	6129360
0.0281	0.7896	125	2.3471	6382240
0.0305	0.8212	130	2.3799	6635400
0.0295	0.8527	135	2.3850	6889824
0.0294	0.8843	140	2.3463	7146448
0.0258	0.9159	145	2.3439	7407840
0.0272	0.9475	150	2.3552	7662520
0.0263	0.9791	155	2.3712	7923432