SicariusSicariiStuff committed on
Commit
ce0c707
1 Parent(s): 47889c7

Update README.md

Files changed (1)
  1. README.md +15 -0
README.md CHANGED
@@ -15,6 +15,21 @@ language:
 
 
 
 # Current status:
+
+ <details>
+ <summary><b>July 26th, 2024, moving on to LLAMA3.1</b></summary>
+ One step forward, one step backward. Many issues were solved, but a few new ones were encountered. As I already updated in my ["blog"](https://huggingface.co/SicariusSicariiStuff/Blog_And_Updates#july-26th-2024), I originally wanted to finetune Gradient's 0.25M/1M/4M LLAMA3 8B model, but almost at the same time I concluded that the model is really not that great at even 8k context, Zuck the CHAD dropped LLAMA 3.1.
+
+ LLAMA 3.1 has a 128k context, which probably means that in practice it will be somewhat coherent at 32k context, as a guesstimate. I've also heard from several people who have done some early tests that the new LLAMA 3.1 8B is even better than the new Mistral Nemo 12B. IDK if that's true, but overall LLAMA 3.1 does seem to be a much better version of the "regular" LLAMA 3.
+
+ I have no words to describe the hell it is to curate and generate a high-quality dataset. I'd even go as far as to estimate that 99% of the models out there are either finetunes of the same medium-quality (at best) datasets, or merges. Almost no one is crazy enough to create something completely new: someone starts such a project, and after 100 entries he sees "hmmm, I have only 10k more to go", ditches the whole thing, does another merge instead, and calls it a day. Not me.
+
+ A lot of progress has been made, and I hope to have a BETA version to share in the very near future. It will probably be ~1%-1.5% of the final model, but it should give a general idea of what the completed project, or model, will be like.
+
+ Stay tuned.
+
+ </details>
+
 <details>
 <summary><b>July 5th, 2024</b></summary>
 I'm amazed with the recent advancements I've made with the unalignment of LLAMA-3_8B. The results are incredibly impressive and far exceed my expectations. It's truly remarkable how much progress I have made with the model.