Metin committed on
Commit f3be38d
1 Parent(s): e3e140b

Update README.md

Files changed (1)
  1. README.md +59 -0
README.md CHANGED
@@ -1,3 +1,62 @@
---
license: cc-by-nc-4.0
+ language:
+ - tr
---
+
+ # Model Card for Metin/gemma-2b-tr
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+ gemma-2b fine-tuned for Turkish text generation.
+
+ ## Model Details
+
+ ### Model Description
+
+ <!-- Provide a longer summary of what this model is. -->
+
+ - **Language(s) (NLP):** Turkish, English
+ - **License:** Creative Commons Attribution Non Commercial 4.0 (chosen due to the use of restricted/gated datasets)
+ - **Finetuned from model:** gemma-2b (https://huggingface.co/google/gemma-2b)
+
+
+ ## Uses
+
+ The model is specifically designed for Turkish text generation. It is not suitable for instruction-following or question-answering tasks.
+
+ ## How to Get Started with the Model
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ tokenizer = AutoTokenizer.from_pretrained("Metin/gemma-2b-tr")
+ model = AutoModelForCausalLM.from_pretrained("Metin/gemma-2b-tr")
+
+ system_prompt = "You are a helpful assistant. Always reply in Turkish."
+ instruction = "Bugün sinemaya gidemedim çünkü"
+ prompt = f"{system_prompt} [INST] {instruction} [/INST]"
+
+ # Tokenize the prompt; the returned encoding holds input_ids and attention_mask.
+ inputs = tokenizer(prompt, return_tensors="pt")
+
+ outputs = model.generate(**inputs)
+ print(tokenizer.decode(outputs[0]))
+ ```
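+
+ For open-ended generation it usually helps to pass explicit generation settings rather than relying on the library defaults. The snippet below continues from the example above; the specific values are illustrative assumptions, not settings recommended for this model:
+
+ ```python
+ # Continuing from the example above: sample a longer continuation.
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=128,  # cap on newly generated tokens
+     do_sample=True,      # sample instead of greedy decoding
+     temperature=0.7,     # illustrative value, not tuned for this model
+     top_p=0.9,           # nucleus sampling threshold
+ )
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```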
+
+ ## Training Details
+
+ ### Training Data
+
+ - Dataset size: ~190 million tokens (~100K documents)
+ - Dataset content: web crawl data
+
+ ### Training Procedure
+
+ #### Training Hyperparameters
+
+ - **Adapter:** QLoRA
+ - **Epochs:** 1
+ - **Context length:** 1024
+ - **LoRA Rank:** 32
+ - **LoRA Alpha:** 32
+ - **LoRA Dropout:** 0.05
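+
+ The training script itself is not included in this card. Below is a minimal, hypothetical sketch of how the hyperparameters listed above could be expressed with `peft` and `bitsandbytes`; the 4-bit quantization settings and the `target_modules` list are assumptions, not values confirmed for this model:
+
+ ```python
+ # Hypothetical QLoRA setup matching the listed hyperparameters.
+ # Quantization settings and target_modules are assumptions.
+ import torch
+ from transformers import AutoModelForCausalLM, BitsAndBytesConfig
+ from peft import LoraConfig, get_peft_model
+
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,                      # QLoRA: 4-bit base weights (assumed)
+     bnb_4bit_quant_type="nf4",              # assumed quantization type
+     bnb_4bit_compute_dtype=torch.bfloat16,  # assumed compute dtype
+ )
+
+ base_model = AutoModelForCausalLM.from_pretrained(
+     "google/gemma-2b", quantization_config=bnb_config
+ )
+
+ lora_config = LoraConfig(
+     r=32,                # LoRA Rank (from the card)
+     lora_alpha=32,       # LoRA Alpha (from the card)
+     lora_dropout=0.05,   # LoRA Dropout (from the card)
+     task_type="CAUSAL_LM",
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
+ )
+
+ model = get_peft_model(base_model, lora_config)
+ model.print_trainable_parameters()
+ ```
+
+ Training for a single epoch with a 1024-token context would then be handled by the trainer configuration (for example `num_train_epochs=1` and a 1024-token sequence length); those trainer settings are likewise assumptions.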