beomi committed on
Commit 4767228
1 Parent(s): b37673f

Update README.md

Files changed (1):
  1. README.md +29 -10

README.md CHANGED
@@ -15,10 +15,25 @@ license_name: llama3
  license_link: LICENSE
  ---

  ## Model Details

  **Llama-3-Open-Ko-8B**

  **Meta Llama-3**

@@ -54,7 +69,7 @@ Meta developed and released the Meta Llama 3 family of large language models (LL
  <tr>
  <td rowspan="2" >Llama-3-Open-Ko
  </td>
- <td rowspan="2" >Open-Solar-Ko Dataset
  </td>
  <td>8B
  </td>
@@ -62,19 +77,21 @@ Meta developed and released the Meta Llama 3 family of large language models (LL
  </td>
  <td>Yes
  </td>
- <td rowspan="2" >9B+
  </td>
  <td>Jun, 2023
  </td>
  </tr>
  </table>

- **Model Release Date** Preview, Not yet Release.

  **Status** This is a static model trained on an offline dataset.

- **License** A custom commercial license is available at: [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license)

  ## Intended Use

@@ -122,20 +139,22 @@ Please see the Responsible Use Guide available at [http://llama.meta.com/respons

  **Llama-3-Open-Ko**

- TBD

  **Original Llama-3**

  ```
  @article{llama3modelcard,
-
  title={Llama 3 Model Card},
-
  author={AI@Meta},
-
  year={2024},
-
  url = {https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
-
  }
  ```
 
  license_link: LICENSE
  ---

+ > Update @ 2024.04.24: Release Llama-3-Open-Ko-8B model & [Llama-3-Open-Ko-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-Open-Ko-8B-Instruct-preview)
+
  ## Model Details

  **Llama-3-Open-Ko-8B**

+ The Llama-3-Open-Ko-8B model is a continued-pretrained language model based on Llama-3-8B.
+
+ This model is trained entirely on publicly available resources, with 60GB+ of deduplicated texts.
+
+ With the new Llama-3 tokenizer, pretraining used 17.7B+ tokens, slightly more than with the previous Korean tokenizer (the Llama-2-Ko tokenizer).
+
+ Training was done on a TPU v5e-256, with the warm support of Google's TRC program.
+
+ **Note for [Llama-3-Open-Ko-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-Open-Ko-8B-Instruct-preview)**
+
+ Applying the idea from the [Chat Vector paper](https://arxiv.org/abs/2310.04799), I released an instruction model named [Llama-3-Open-Ko-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-Open-Ko-8B-Instruct-preview).
+
+ It is NOT finetuned with any Korean instruction set (hence `preview`), but it should be a great starting point for creating new Chat/Instruct models.
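The Chat Vector approach referenced above amounts to simple weight arithmetic: take the delta between an English instruct checkpoint and its base, and add it onto the Korean continued-pretrained base. A minimal sketch with toy NumPy arrays standing in for real state dicts (all names and values here are illustrative, not actual Llama-3 parameters):

```python
# Chat Vector idea (arXiv:2310.04799), sketched with toy weights:
#   ko_instruct = ko_base + (en_instruct - en_base)
import numpy as np

def add_chat_vector(base_en, instruct_en, base_ko):
    """Add the English instruction-tuning delta onto the Korean base weights."""
    return {name: base_ko[name] + (instruct_en[name] - base_en[name])
            for name in base_ko}

# Toy 2x2 "state dicts" standing in for full model checkpoints.
base_en     = {"w": np.array([[1.0, 0.0], [0.0, 1.0]])}
instruct_en = {"w": np.array([[1.5, 0.2], [0.1, 1.5]])}  # after instruction tuning
base_ko     = {"w": np.array([[2.0, 0.0], [0.0, 2.0]])}  # after Korean continued pretraining

ko_instruct = add_chat_vector(base_en, instruct_en, base_ko)
print(ko_instruct["w"])  # [[2.5 0.2]
                         #  [0.1 2.5]]
```

On real checkpoints the same loop would run over every parameter tensor; the three models must share a tokenizer and parameter shapes, which holds here since both use the Llama-3 tokenizer.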

  **Meta Llama-3**
 
 
69
  <tr>
70
  <td rowspan="2" >Llama-3-Open-Ko
71
  </td>
72
+ <td rowspan="2" >Same as *Open-Solar-Ko Dataset
73
  </td>
74
  <td>8B
75
  </td>
 
  </td>
  <td>Yes
  </td>
+ <td rowspan="2" >17.7B+
  </td>
  <td>Jun, 2023
  </td>
  </tr>
  </table>

+ *You can find the dataset list here: https://huggingface.co/beomi/OPEN-SOLAR-KO-10.7B/tree/main/corpus
+
+ **Model Release Date** 2024.04.24.

  **Status** This is a static model trained on an offline dataset.

+ **License** Llama3 License: [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license)

  ## Intended Use
 
 

  **Llama-3-Open-Ko**

+ ```
+ @article{llama3openko,
+ title={Llama-3-Open-Ko},
+ author={L, Junbum},
+ year={2024},
+ url={https://huggingface.co/beomi/Llama-3-Open-Ko-8B}
+ }
+ ```

  **Original Llama-3**

  ```
  @article{llama3modelcard,
  title={Llama 3 Model Card},
  author={AI@Meta},
  year={2024},
  url = {https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
  }
  ```