kaizerBox committed on
Commit 6c066c3
1 Parent(s): ffb52ae

retnet-summarization

README.md CHANGED
@@ -1,4 +1,5 @@
  ---
+ base_model: kaizerBox/retnet-summarization
  tags:
  - generated_from_trainer
  datasets:
@@ -13,9 +14,9 @@ should probably proofread and complete it, then remove this comment. -->
 
  # retnet-summarization
 
- This model is a fine-tuned version of [](https://huggingface.co/) on the xsum dataset.
+ This model is a fine-tuned version of [kaizerBox/retnet-summarization](https://huggingface.co/kaizerBox/retnet-summarization) on the xsum dataset.
  It achieves the following results on the evaluation set:
- - Loss: 3.8453
+ - Loss: 3.5369
 
  ## Model description
 
@@ -35,11 +36,11 @@ More information needed
 
  The following hyperparameters were used during training:
  - learning_rate: 0.0006
- - train_batch_size: 2
- - eval_batch_size: 2
+ - train_batch_size: 4
+ - eval_batch_size: 4
  - seed: 42
  - gradient_accumulation_steps: 4
- - total_train_batch_size: 8
+ - total_train_batch_size: 16
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_steps: 10
@@ -50,7 +51,7 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:-----:|:---------------:|
- | 4.3722 | 1.0 | 12500 | 3.8453 |
+ | 3.8204 | 1.0 | 11525 | 3.5369 |
 
 
  ### Framework versions
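
The `total_train_batch_size` values in the README.md hunk above (8 before, 16 after) are consistent with the per-device batch size multiplied by `gradient_accumulation_steps`. A minimal sketch of that arithmetic follows. It assumes a single training device, which this commit does not state, and the function name `effective_batch_size` is hypothetical rather than anything from the training code.

```python
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer step.

    num_devices=1 is an assumption; the commit does not record the device count.
    """
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the old and new sides of the README.md diff.
old_total = effective_batch_size(per_device_batch_size=2, gradient_accumulation_steps=4)
new_total = effective_batch_size(per_device_batch_size=4, gradient_accumulation_steps=4)
print(old_total, new_total)
```

Under that single-device assumption, the results match the reported totals on both sides of the diff.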
config.json CHANGED
@@ -1,4 +1,5 @@
  {
+ "_name_or_path": "kaizerBox/retnet-summarization",
  "activation_dropout": 0.0,
  "activation_fn": "swish",
  "architectures": [
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:621100b7d82b0c756b370ff4e4ad210f296a81bc20b0903f644049236b779c35
+ oid sha256:a41442c27bf6db723cfe05a19d82cdb0d30fecf265fb84941eaee9a4b6303bf1
  size 282181632
runs/Nov05_18-25-27_8ad2684ce4a5/events.out.tfevents.1699208727.8ad2684ce4a5.288.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0b218e96bf256f97537ca7888e882b375f5a4547b523ea62d9dbd1816784cd2a
+ size 5355
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:6758a222c5a8fb23ed432a5ecdcba3b77e388b262c786457940802f927879730
+ oid sha256:10be0b4f194b73d381f41e5bb8505f054bdf8254997b9d3c7da91c5f92748c5f
  size 4600
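
The binary files in this commit (model.safetensors, the tfevents log, training_args.bin) appear in the diff as Git LFS pointer files with three fields: `version`, `oid`, and `size`. As a rough illustration, the sketch below parses the old and new model.safetensors pointers copied from the diff and shows that the content hash changed while the byte size stayed the same. The helper `parse_lfs_pointer` is a hypothetical name, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its fields (illustrative helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the model.safetensors hunk of this commit.
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:621100b7d82b0c756b370ff4e4ad210f296a81bc20b0903f644049236b779c35
size 282181632
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:a41442c27bf6db723cfe05a19d82cdb0d30fecf265fb84941eaee9a4b6303bf1
size 282181632
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)
weights_changed = old["digest"] != new["digest"]
same_size = old["size"] == new["size"]
print(weights_changed, same_size)
```

An unchanged size with a different digest is what one would expect when weights are retrained without altering the architecture, though the diff alone does not confirm that.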