Cloned transformer
Created by stacking layers and trained on Habr+Rulm.

| dataset | rugpt 760m large | AlexWortega/ruClonedGPT_1.4B |
| --- | --- | --- |
| xnliru | 0.34 | 0.36 |
| xwinograd | 0.65 | 0.68 |
| danetqa | 0.62 | 0.65 |
| muserc | 0.72 | 0.74 |
| parus | 0.584 | 0.61 |
| rcb | 0.417 | 0.45 |
| rucos | 0.21 | 0.25 |
| russe | 0.647 | 0.66 |
| ruterra | 0.654 | 0.67 |
| rwsd | 0.636 | 0.339 |
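
The card says only that the model was created by stacking layers; it does not say how. As a rough illustration of the general idea, and not the actual procedure used for this model, the sketch below grows a toy layer list by appending deep copies of its blocks. The `stack_layers` helper, the `times` parameter, and the dict-based "layers" are all hypothetical stand-ins for real transformer blocks.

```python
import copy


def stack_layers(layers, times=2):
    """Return a deeper layer list made of `times` consecutive deep copies of `layers`.

    Assumption: the whole block sequence is repeated end to end. The model card
    does not specify the actual stacking order.
    """
    stacked = []
    for _ in range(times):
        # Deep-copy so the duplicated layers can later be trained independently.
        stacked.extend(copy.deepcopy(layer) for layer in layers)
    return stacked


# Toy stand-in for a small model: each "layer" is just a dict of weights.
base_layers = [{"name": f"block_{i}", "weights": [0.1 * i, 0.2 * i]} for i in range(3)]
deep_layers = stack_layers(base_layers, times=2)

print(len(base_layers), "->", len(deep_layers))  # 3 -> 6
```

The copies start with identical weights, so a stacked model of this kind would normally need further training, which matches the card's note that the model was trained on Habr+Rulm.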