**Train-Test Set:** "teknofest_train_final.csv"

**Model:** "dbmdz/bert-base-turkish-128k-uncased"

**Preprocessing**
- A special token (#) is inserted before each uppercase character, after which the text is lowercased
- Punctuation marks are removed
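
The two preprocessing steps can be sketched roughly as follows. This is an illustration, not the original pipeline: the function name `preprocess`, the order of the steps (punctuation is stripped first so that the `#` markers survive), and the Turkish-aware handling of `I`/`İ` are all assumptions.

```python
import re

# Assumption: Turkish casing rules are wanted (I -> ı, İ -> i); Python's
# default str.lower() would map "I" to "i" instead.
_TR_LOWER = str.maketrans({"I": "ı", "İ": "i"})


def preprocess(text: str) -> str:
    """Hypothetical helper mirroring the two preprocessing steps listed above."""
    # Remove punctuation first, otherwise the "#" markers added below
    # would be stripped as well.
    text = re.sub(r"[^\w\s]", "", text)
    # Prefix every uppercase character with the special token "#".
    marked = "".join("#" + ch if ch.isupper() else ch for ch in text)
    # Lowercase, applying the Turkish mapping before the generic one.
    return marked.translate(_TR_LOWER).lower()


print(preprocess("Merhaba, Dünya!"))  # -> #merhaba #dünya
```

Marking uppercase characters before lowercasing keeps the casing information available to an uncased model.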
  
## Tokenizer Parameters
```
max_length=64
padding=True
truncation=True
```
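
A minimal sketch of how these settings might be passed to a Hugging Face tokenizer. The helper names `load_tokenizer` and `encode` are hypothetical, and the sketch assumes the `transformers` package is installed and the model is loaded through `AutoTokenizer`.

```python
MODEL_NAME = "dbmdz/bert-base-turkish-128k-uncased"

# Tokenizer settings copied from the block above.
TOKENIZER_KWARGS = {"max_length": 64, "padding": True, "truncation": True}


def load_tokenizer(model_name: str = MODEL_NAME):
    """Load the tokenizer from the Hub (transformers is imported lazily)."""
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(model_name)


def encode(tokenizer, texts):
    """Tokenize a batch of already-preprocessed strings."""
    return tokenizer(list(texts), **TOKENIZER_KWARGS)
```

Calling `encode(load_tokenizer(), ["#merhaba #dünya"])` returns the usual `input_ids` and `attention_mask` fields. With `padding=True` a batch is padded to its longest sequence rather than to `max_length`, while `truncation=True` cuts anything longer than 64 tokens.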

## Training Parameters
- **Epoch:** 3
- **Learning Rate:** 7e-5
- **Batch-Size:** 64
- **Tokenizer Length:** 64
- **Loss:** BCE
- **Online Hard Example Mining:** On
- **Class-Weighting:** On (^0.3)
- **Early Stopping:** Off
- **Stratified Batch Sampling:** On
- **Gradient Accumulation:** Off
- **LR Scheduler:** Cosine-with-Warmup
- **Warmup Ratio:** 0.1
- **Weight Decay:** 0.01
- **LLRD:** 0.95
- **Label Smoothing:** 0.05
- **Gradient Clipping:** 1.0
- **MLM Pre-Training:** Off
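
Three of the settings above are numeric recipes that are easier to read as code. The sketch below is one plausible interpretation, not the original training script. It assumes that `^0.3` means inverse class frequency raised to the power 0.3, that LLRD multiplies the learning rate by 0.95 per layer going down a 12-layer encoder, and that the schedule is a linear warmup followed by a half-cosine decay. All function names are hypothetical.

```python
import math


def class_weights(counts, power=0.3):
    """Inverse-frequency weights softened by an exponent (assumed meaning of ^0.3)."""
    total = sum(counts.values())
    num_classes = len(counts)
    return {label: (total / (num_classes * n)) ** power for label, n in counts.items()}


def llrd_learning_rates(base_lr=7e-5, decay=0.95, num_layers=12):
    """Per-depth learning rates: index 0 is the embeddings, the last is the top layer."""
    return [base_lr * decay ** (num_layers - depth) for depth in range(num_layers + 1)]


def cosine_with_warmup(step, total_steps, warmup_ratio=0.1):
    """Learning-rate multiplier: linear warmup, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))


# Class counts taken from the support column of the CV report in this card.
supports = {"INSULT": 2393, "OTHER": 3528, "PROFANITY": 2376, "RACIST": 2033, "SEXIST": 2081}
for label, weight in class_weights(supports).items():
    print(f"{label:>10}: {weight:.3f}")
```

The small exponent keeps the weights close to 1, so rarer classes such as RACIST are up-weighted only mildly relative to OTHER.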


## CV10 Results
```
              precision    recall  f1-score   support

      INSULT     0.9172    0.9260    0.9216      2393
       OTHER     0.9681    0.9646    0.9663      3528
   PROFANITY     0.9627    0.9571    0.9599      2376
      RACIST     0.9684    0.9651    0.9667      2033
      SEXIST     0.9618    0.9668    0.9643      2081

    accuracy                         0.9562     12411
   macro avg     0.9557    0.9559    0.9558     12411
weighted avg     0.9563    0.9562    0.9562     12411
```
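
As a sanity check, the macro and weighted F1 averages can be recomputed from the per-class rows of the report above. The sketch uses only the rounded values printed in the table, so agreement is expected only up to the fourth decimal. The names `REPORT`, `macro_f1` and `weighted_f1` are illustrative.

```python
# Per-class (f1-score, support) pairs copied from the report above.
REPORT = {
    "INSULT": (0.9216, 2393),
    "OTHER": (0.9663, 3528),
    "PROFANITY": (0.9599, 2376),
    "RACIST": (0.9667, 2033),
    "SEXIST": (0.9643, 2081),
}


def macro_f1(report):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for f1, _ in report.values()) / len(report)


def weighted_f1(report):
    """Mean of the per-class F1 scores weighted by class support."""
    total = sum(n for _, n in report.values())
    return sum(f1 * n for f1, n in report.values()) / total


print(f"macro F1:    {macro_f1(REPORT):.4f}")     # 0.9558
print(f"weighted F1: {weighted_f1(REPORT):.4f}")  # 0.9562
```

Both values match the `macro avg` and `weighted avg` rows, and the supports sum to the reported total of 12411.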