SicariusSicariiStuff committed on
Commit bf948f4 • 1 Parent(s): 3377fc8

Update README.md

  1. README.md +4 -3
README.md CHANGED
@@ -235,7 +235,8 @@ These observations underscore the critical importance of fine-tuning for alignme
235
  </details>
236
 
237
  <details>
238
- <summary><b>June 18, 2024 Update</b>, After extensive testing of the intermediate checkpoints, significant progress has been made.</summary> The model is slowly, and I mean really slowly, unlearning its alignment. By significantly lowering the learning rate, I was able to visibly observe deep behavioral changes. This process is taking longer than anticipated, but it's going to be worth it. Estimated time to completion: 4 more days. I'm pleased to report that in several tests, the model not only maintained its intelligence but actually showed a slight improvement, especially in terms of common sense. An intermediate checkpoint of this model was used to create invisietch/EtherealRainbow-v0.3-rc7, with promising results. Currently, it seems like I'm on the right track. I hope this model will serve as a solid foundation for further merges, whether for role-playing (RP) or for uncensoring. This approach also allows us to save on actual fine-tuning, thereby reducing our carbon footprint. The merge process takes just a few minutes of CPU time, instead of days of GPU work.
 
239
 
240
  Cheers,
241
 
@@ -243,7 +244,8 @@ Sicarius
243
  </details>
244
 
245
  <details>
246
- <summary><b>June 20, 2024 Update</b>, Unaligning was partially successful, and the results are decent, but <b>I am not</b> fully satisfied. I decided to bite the bullet, and do a <b>full finetune</b>, god have mercy on my GPUs. I am also releasing the intermediate checkpoint of this model.</summary>
 
247
  It's been a long ride, and I want to do it right, but the model would simply refuse some requests, with (almost) complete disregard for parts of the training data. Of course, one would argue that some easy prompt engineering will get around it, but the point was to make an unaligned model out of the box. Another point is that I could simply use a faster learning rate on more epochs, which would also work (I've tried that before), but the result would be an overcooked model and therefore a dumber one. So I decided to bite the bullet and do a full proper fine-tuning. This is going to be a serious pain in the ass, but I might as well try to do it right. Since I am releasing the intermediate checkpoint of this model under https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha, I might as well take the time and add some features I haven't seen in other models. In short, besides the normal goodies of logic, some theory of mind, and uncensored content along with general NLP tasks, I will TRY to add a massive dataset (that does not yet exist) of story writing, and a new, completely organic and original Roleplay dataset. LimaRP is awesome, but maybe, just maybe... things are finally carefully extricated from LimaRP, the same sentences will leave its entwined body under the stars towards something new, something fresh. This is going to take some serious effort and some time. Any support will be appreciated, even if it's just some feedback. My electricity bill is gonna be huge this month LOL.
248
 
249
  Cheers,
@@ -251,7 +253,6 @@ Cheers,
251
  Sicarius
252
  </details>
253
 
254
- <b>I'll make an announcement in the coming days, stay tuned.</b>
255
 
256
  ## Intermediate checkpoint of this model:
257
 
 
235
  </details>
236
 
237
  <details>
238
+ <summary><b>June 18th, 2024 Update</b></summary>
239
+ After extensive testing of the intermediate checkpoints, significant progress has been made. The model is slowly, and I mean really slowly, unlearning its alignment. By significantly lowering the learning rate, I was able to visibly observe deep behavioral changes. This process is taking longer than anticipated, but it's going to be worth it. Estimated time to completion: 4 more days. I'm pleased to report that in several tests, the model not only maintained its intelligence but actually showed a slight improvement, especially in terms of common sense. An intermediate checkpoint of this model was used to create invisietch/EtherealRainbow-v0.3-rc7, with promising results. Currently, it seems like I'm on the right track. I hope this model will serve as a solid foundation for further merges, whether for role-playing (RP) or for uncensoring. This approach also allows us to save on actual fine-tuning, thereby reducing our carbon footprint. The merge process takes just a few minutes of CPU time, instead of days of GPU work.
240
 
241
  Cheers,
242
 
 
244
  </details>
245
 
246
  <details>
247
+ <summary><b>June 20th, 2024</b></summary>
248
+ Unaligning was partially successful, and the results are decent, but <b>I am not</b> fully satisfied. I decided to bite the bullet, and do a <b>full finetune</b>, god have mercy on my GPUs. I am also releasing the intermediate checkpoint of this model.
249
  It's been a long ride, and I want to do it right, but the model would simply refuse some requests, with (almost) complete disregard for parts of the training data. Of course, one would argue that some easy prompt engineering will get around it, but the point was to make an unaligned model out of the box. Another point is that I could simply use a faster learning rate on more epochs, which would also work (I've tried that before), but the result would be an overcooked model and therefore a dumber one. So I decided to bite the bullet and do a full proper fine-tuning. This is going to be a serious pain in the ass, but I might as well try to do it right. Since I am releasing the intermediate checkpoint of this model under https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha, I might as well take the time and add some features I haven't seen in other models. In short, besides the normal goodies of logic, some theory of mind, and uncensored content along with general NLP tasks, I will TRY to add a massive dataset (that does not yet exist) of story writing, and a new, completely organic and original Roleplay dataset. LimaRP is awesome, but maybe, just maybe... things are finally carefully extricated from LimaRP, the same sentences will leave its entwined body under the stars towards something new, something fresh. This is going to take some serious effort and some time. Any support will be appreciated, even if it's just some feedback. My electricity bill is gonna be huge this month LOL.
250
 
251
  Cheers,
 
253
  Sicarius
254
  </details>
255
 
 
256
 
257
  ## Intermediate checkpoint of this model:
258