
distilbert-base-zh-cased

We are sharing smaller versions of distilbert-base-multilingual-cased that handle a custom number of languages.

Our versions produce exactly the same representations as the original model, which preserves the original accuracy.

For more information, please see our paper: Load What You Need: Smaller Versions of Multilingual BERT.

How to use

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("Geotrend/distilbert-base-zh-cased")
model = AutoModel.from_pretrained("Geotrend/distilbert-base-zh-cased")
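Building on the snippet above, here is a minimal sketch of running the model on a Chinese sentence and extracting its hidden states (the example sentence is illustrative; DistilBERT's hidden size is 768):

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("Geotrend/distilbert-base-zh-cased")
model = AutoModel.from_pretrained("Geotrend/distilbert-base-zh-cased")

# Tokenize a Chinese sentence and run a forward pass without gradients
inputs = tokenizer("你好，世界", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state has shape (batch_size, sequence_length, hidden_size)
embeddings = outputs.last_hidden_state
print(embeddings.shape)
```

A common follow-up is to mean-pool `embeddings` over the sequence dimension to get a single sentence vector.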

To generate other smaller versions of multilingual transformers, please visit our GitHub repo.

How to cite

@inproceedings{smallermdistilbert,
  title={Load What You Need: Smaller Versions of Multilingual BERT},
  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
  booktitle={SustaiNLP / EMNLP},
  year={2020}
}

Contact

Please contact [email protected] for any questions, feedback, or requests.

Model size: 53.5M parameters (F32, Safetensors)
