metadata

license: apache-2.0
base_model: Helsinki-NLP/opus-mt-ja-pl
tags:
  - generated_from_trainer
datasets:
  - tatoeba
metrics:
  - bleu
model-index:
  - name: opus_model
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: tatoeba
          type: tatoeba
          config: ja-pl
          split: train
          args: ja-pl
        metrics:
          - name: Bleu
            type: bleu
            value: 34.4952

opus_model

This model is a fine-tuned version of Helsinki-NLP/opus-mt-ja-pl on the tatoeba dataset. It achieves the following results on the evaluation set:

Loss: 1.1164
Bleu: 34.4952
Gen Len: 9.442
Meteor: 0.5692
Chrf: 53.728

Model description

A Japanese-to-Polish translation model: Helsinki-NLP/opus-mt-ja-pl fine-tuned on tatoeba sentence pairs and on pop-culture texts (visual novels, manga, and RPGs).
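
A minimal inference sketch with the Transformers library is given here for orientation. The card does not state the published repository id, so `MODEL_ID` is a hypothetical placeholder, and the helper names `batched` and `translate`, the batch size, and the `max_new_tokens` limit are illustrative assumptions rather than the author's setup.

```python
from typing import Iterator, List

# Hypothetical placeholder: replace with the actual repository id or a local path.
MODEL_ID = "your-username/opus_model"


def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate(sentences: List[str], model_id: str = MODEL_ID,
              batch_size: int = 8) -> List[str]:
    """Translate Japanese sentences to Polish with a fine-tuned Marian checkpoint."""
    # Imported lazily so that `batched` can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    outputs: List[str] = []
    for batch in batched(sentences, batch_size):
        # Pad to the longest sentence in the batch and return PyTorch tensors.
        inputs = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        generated = model.generate(**inputs, max_new_tokens=64)  # assumed limit
        outputs.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return outputs
```

Calling `translate(["ここで何をしている？"])` returns a list with one Polish string. The model is reloaded on every call, which keeps the sketch short; a longer-running script would load it once and reuse it.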

Intended uses & limitations

More information needed

Training and evaluation data

Trained in a Kaggle notebook on a P100 GPU.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 8
mixed_precision_training: Native AMP
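
As a rough sketch, the values above map onto `Seq2SeqTrainingArguments` fields as follows. The training script is not part of this card, so `output_dir`, `predict_with_generate`, and the helper name `build_training_args` are assumptions, as is reading `train_batch_size` as a per-device value on a single GPU.

```python
# Hyperparameters copied from the list above, keyed by the corresponding
# Seq2SeqTrainingArguments field names.
TRAINING_KWARGS = {
    "learning_rate": 3e-5,
    "per_device_train_batch_size": 8,  # assumes a single GPU
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 8,
    "fp16": True,  # "Native AMP" mixed precision; requires a CUDA device
}


def build_training_args(output_dir: str = "opus_model"):
    """Build Seq2SeqTrainingArguments from the values above (assumed setup)."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,       # assumption: not stated in the card
        predict_with_generate=True,  # assumption: lets BLEU be scored on generated text
        **TRAINING_KWARGS,
    )
```

The beta and epsilon values are also the Trainer defaults, so listing them explicitly only documents the configuration.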

Examples

Japanese	Original translation	DeepL	Opus-mt-ja-pl-pop_v2
今ちょっとやることがあってね	Mam teraz coś do zrobienia.	Mam teraz kilka rzeczy do zrobienia.	Mam teraz kilka spraw do załatwienia.
なぜッあの少女を助けてやらなかったのだ！	Czemu jej nie pomogłeś!	Dlaczego nie pomogłeś tej dziewczynie?	Dlaczego jej nie pomogłeś?!
ここで何をしている？	Czego tu szukacie?	Co ty tu robisz?	Co tu robisz?
あんたの協力が要る	Potrzebujemy cię.	Potrzebuję twojej pomocy.	Potrzebuję twojej pomocy.
こたえはなに？	A jaka jest właściwie odpowiedź?	Jaka jest odpowiedź?	Co to jest?
一人で寝んのが怖くなったんか？	Boisz się spać sama?	Boisz się spać samotnie?	Boisz się spać sama?

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len	Meteor	Chrf
2.5658	1.0	56681	1.6196	21.6767	9.2915	0.4586	43.4725
2.3419	2.0	113362	1.4667	25.4469	9.3688	0.4916	46.3391
2.23	3.0	170043	1.3715	27.166	9.4895	0.5089	48.2252
2.1139	4.0	226724	1.2833	28.9288	9.4581	0.5244	49.4667
1.9825	5.0	283405	1.2170	31.3751	9.3229	0.5358	51.0005
1.8982	6.0	340086	1.1660	32.9805	9.4976	0.5563	52.5487
1.8198	7.0	396767	1.1305	34.0223	9.4436	0.5665	53.2912
1.7592	8.0	453448	1.1164	34.4952	9.442	0.5692	53.728
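
The step column also allows a quick estimate of the training set size. This is a back-of-the-envelope sketch that assumes a single GPU and no gradient accumulation (neither is stated explicitly), so that one optimizer step consumes one batch of 8 examples.

```python
# (epoch, cumulative step, BLEU) copied from the table above.
ROWS = [
    (1, 56681, 21.6767),
    (2, 113362, 25.4469),
    (3, 170043, 27.166),
    (4, 226724, 28.9288),
    (5, 283405, 31.3751),
    (6, 340086, 32.9805),
    (7, 396767, 34.0223),
    (8, 453448, 34.4952),
]
TRAIN_BATCH_SIZE = 8  # from the hyperparameter list

steps_per_epoch = ROWS[0][1]
# Sanity check: every epoch takes the same number of optimizer steps.
assert all(step == epoch * steps_per_epoch for epoch, step, _ in ROWS)

# Upper bound on examples per epoch: the last batch may be only partly filled.
max_examples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE

# Absolute BLEU improvement between the first and the last epoch.
bleu_gain = ROWS[-1][2] - ROWS[0][2]

print(steps_per_epoch, max_examples_per_epoch, round(bleu_gain, 2))
```

Under these assumptions the training set holds at most 453,448 sentence pairs, and BLEU rises by roughly 12.8 points over the eight epochs.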

Framework versions

Transformers 4.42.3
Pytorch 2.1.2
Datasets 2.20.0
Tokenizers 0.19.1