SicariusSicariiStuff committed ca3e1e4 (parent: 2c764d3): Update README.md

README.md CHANGED
@@ -19,7 +19,7 @@ As of **June 11, 2024**, I've finally **started training** the model! The traini
 
 # Additional info:
 <details>
-<summary>As of <b>June 13, 2024</b>, I've observed that even after two days of continuous training, the model is <b>still resistant to learning certain aspects</b>.</summary> For example, some of the validation data still shows a loss over
+<summary>As of <b>June 13, 2024</b>, I've observed that even after two days of continuous training, the model is <b>still resistant to learning certain aspects</b>.</summary> For example, some of the validation data still shows a loss of over <b>2.3</b>, whereas other parts show a loss of <b>0.3</b> or lower. This is after the model was initially abliterated.
 
 These observations underscore the critical importance of fine-tuning for alignment. Given the current pace, training will likely extend beyond a week. However, the end result should be **interesting**. If the additional datasets focused on logic and common sense are effective, we should achieve a model that is **nearly completely unaligned**, while still retaining its core 'intelligence.'
 <img src="https://i.imgur.com/b6unKyS.png" alt="LLAMA-3_Unaligned_Training" style="width: 60%; min-width: 600px; display: block; margin: auto;">
@@ -27,7 +27,7 @@ These observations underscore the critical importance of fine-tuning for alignme
 
 </details>
 
 <details>
-<summary
+<summary><b>June 18, 2024 Update</b>: After extensive testing of the intermediate checkpoints, significant progress has been made.</summary> The model is slowly (and I mean really slowly) unlearning its alignment. By significantly lowering the learning rate, I was able to clearly observe deep behavioral changes. This process is taking longer than anticipated, but it will be worth it. Estimated time to completion: 4 more days. I'm pleased to report that in several tests, the model not only maintained its intelligence but actually showed a slight improvement, especially in terms of common sense. An intermediate checkpoint of this model was used to create [invisietch/EtherealRainbow-v0.3-rc7](https://huggingface.co/invisietch/EtherealRainbow-v0.3-rc7-8B-GGUF), with promising results. Currently, it seems I'm on the right track. I hope this model will serve as a solid foundation for further merges, whether for role-playing (RP) or for uncensoring. This approach also saves on actual fine-tuning, thereby reducing our carbon footprint: the merge process takes just a few minutes of CPU time instead of days of GPU work.
 
 Cheers,
 Sicarius
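The lopsided validation numbers in the June 13 note (a loss above 2.3 on some slices of the validation data, 0.3 or lower on others) are the kind of thing you can measure per subset. Below is a minimal sketch of per-subset validation loss for a causal LM using transformers; the checkpoint name and subset texts are placeholders, not the author's actual setup.

```python
# Minimal sketch: per-subset validation loss for a causal LM.
# The checkpoint and the texts below are placeholders, not the
# actual model or data used for LLAMA-3_Unaligned.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Meta-Llama-3-8B"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Group validation texts by source so "resistant" slices stand out.
subsets = {
    "unalignment_data": ["placeholder validation text 1", "placeholder text 2"],
    "logic_and_common_sense": ["placeholder validation text 3"],
}

for name, texts in subsets.items():
    losses = []
    for text in texts:
        enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=2048)
        with torch.no_grad():
            # With labels == input_ids, the model returns the mean
            # next-token cross-entropy over the sequence.
            out = model(**enc, labels=enc["input_ids"])
        losses.append(out.loss.item())
    print(f"{name}: mean loss = {sum(losses) / len(losses):.3f}")
```

A persistent gap like 2.3 versus 0.3 between subsets suggests one part of the mix is fighting the model's existing weights, which matches the "resistant to learning certain aspects" observation above.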
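On the June 18 point that a merge costs minutes of CPU time rather than days of GPU work: the exact recipe behind EtherealRainbow isn't given here, but as an illustration of why merges are cheap, here is a minimal sketch of a simple linear (weight-averaging) merge of two same-architecture checkpoints. The repo names and the 50/50 ratio are assumptions.

```python
# Minimal sketch of a linear merge: interpolate the weights of two
# same-architecture checkpoints. Repo names and the 50/50 ratio are
# placeholders; the recipe actually used for EtherealRainbow is not
# specified in this README.
from transformers import AutoModelForCausalLM

model_a = AutoModelForCausalLM.from_pretrained("org/model-a")  # placeholder
model_b = AutoModelForCausalLM.from_pretrained("org/model-b")  # placeholder

state_b = model_b.state_dict()
merged = {}
for key, tensor_a in model_a.state_dict().items():
    # Element-wise interpolation is plain tensor arithmetic: no gradients
    # and no forward passes, so it runs comfortably on CPU.
    merged[key] = 0.5 * tensor_a + 0.5 * state_b[key]

model_a.load_state_dict(merged)
model_a.save_pretrained("merged-model")  # write the merged weights to disk
```

In practice, community merges like this usually go through mergekit, which implements linear averaging along with more elaborate schemes such as SLERP and TIES.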