youval committed
Commit
cac8030
1 Parent(s): 90e0cd0

Update model card (#2)


- update model card (b8b2c8413b97696c6fcee90976ca780d83da9fb8)

Files changed (1)
  1. README.md +113 -108
README.md CHANGED
@@ -1,109 +1,114 @@
- ---
- pipeline_tag: sentence-similarity
- tags:
- - feature-extraction
- - sentence-similarity
- language:
- - de
- - en
- - es
- - fr
- ---
-
- # Model Card for `vectorizer-v1-S-multilingual`
-
- This model is a vectorizer developed by Sinequa. It produces an embedding vector given a passage or a query. The
- passage vectors are stored in our vector index and the query vector is used at query time to look up relevant passages
- in the index.
-
- Model name: `vectorizer-v1-S-multilingual`
-
- ## Supported Languages
-
- The model was trained and tested in the following languages:
-
- - English
- - French
- - German
- - Spanish
-
- ## Scores
-
- | Metric                 | Value |
- |:-----------------------|------:|
- | Relevance (Recall@100) | 0.448 |
-
- Note that the relevance score is computed as an average over 14 retrieval datasets (see
- [details below](#evaluation-metrics)).
-
- ## Inference Times
-
- | GPU        | Batch size 1 (at query time) | Batch size 32 (at indexing) |
- |:-----------|-----------------------------:|----------------------------:|
- | NVIDIA A10 |                         2 ms |                       14 ms |
- | NVIDIA T4  |                         4 ms |                       51 ms |
-
- The inference times only measure the time the model takes to process a single batch; they do not include pre- or
- post-processing steps like tokenization.
-
- ## Requirements
-
- - Minimal Sinequa version: 11.10.0
- - GPU memory usage: 580 MiB
-
- Note that GPU memory usage only includes how much GPU memory the actual model consumes on an NVIDIA T4 GPU with a batch
- size of 32. It does not include the fixed amount of memory that is consumed by the ONNX Runtime upon initialization, which
- can be around 0.5 to 1 GiB depending on the GPU used.
-
- ## Model Details
-
- ### Overview
-
- - Number of parameters: 39 million
- - Base language model: Homegrown Sinequa BERT-Small ([Paper](https://arxiv.org/abs/1908.08962)) pretrained in the four
- supported languages
- - Insensitive to casing and accents
- - Training procedure: Query-passage pairs using in-batch negatives
-
- ### Training Data
-
- - Natural Questions
- ([Paper](https://research.google/pubs/pub47761/),
- [Official Page](https://github.com/google-research-datasets/natural-questions))
- - Original English dataset
- - Translated datasets for the other three supported languages
-
- ### Evaluation Metrics
-
- To determine the relevance score, we averaged the results that we obtained when evaluating on the datasets of the
- [BEIR benchmark](https://github.com/beir-cellar/beir). Note that all these datasets are in English.
-
- | Dataset           | Recall@100 |
- |:------------------|-----------:|
- | Average           |      0.448 |
- |                   |            |
- | Arguana           |      0.835 |
- | CLIMATE-FEVER     |      0.350 |
- | DBPedia Entity    |      0.287 |
- | FEVER             |      0.645 |
- | FiQA-2018         |      0.305 |
- | HotpotQA          |      0.396 |
- | MS MARCO          |      0.533 |
- | NFCorpus          |      0.162 |
- | NQ                |      0.701 |
- | Quora             |      0.947 |
- | SCIDOCS           |      0.194 |
- | SciFact           |      0.580 |
- | TREC-COVID        |      0.051 |
- | Webis-Touche-2020 |      0.289 |
-
- We evaluated the model on the datasets of the [MIRACL benchmark](https://github.com/project-miracl/miracl) to test its
- multilingual capabilities. Note that not all training languages are part of the benchmark, so we only report the metrics
- for the languages it covers.
-
- | Language | Recall@100 |
- |:---------|-----------:|
- | French   |      0.583 |
- | German   |      0.524 |
+ ---
+ pipeline_tag: sentence-similarity
+ tags:
+ - feature-extraction
+ - sentence-similarity
+ language:
+ - de
+ - en
+ - es
+ - fr
+ ---
+
+ # Model Card for `vectorizer-v1-S-multilingual`
+
+ This model is a vectorizer developed by Sinequa. It produces an embedding vector given a passage or a query. The passage vectors are stored in our vector index, and the query vector is used at query time to look up relevant passages in the index.
+
+ Model name: `vectorizer-v1-S-multilingual`
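+
+ For illustration, the sketch below mimics that lookup with plain NumPy: pre-computed passage vectors stand in for the index, and a query vector is ranked against them. It assumes cosine similarity and random stand-in embeddings; the real encoding and index live inside the Sinequa platform, so none of the names below are product API.
+
+ ```python
+ import numpy as np
+
+ def cosine_top_k(query_vec, passage_vecs, k=3):
+     """Rank stored passage vectors against a single query vector."""
+     # Normalize so that the dot product equals cosine similarity.
+     q = query_vec / np.linalg.norm(query_vec)
+     p = passage_vecs / np.linalg.norm(passage_vecs, axis=1, keepdims=True)
+     scores = p @ q
+     top = np.argsort(-scores)[:k]
+     return [(int(i), float(scores[i])) for i in top]
+
+ # Stand-ins: passages are encoded once at indexing time; at query time
+ # only the query is encoded and compared against the stored vectors.
+ passage_vecs = np.random.rand(1000, 512).astype(np.float32)
+ query_vec = np.random.rand(512).astype(np.float32)
+ print(cosine_top_k(query_vec, passage_vecs))
+ ```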
+
+ ## Supported Languages
+
+ The model was trained and tested in the following languages:
+
+ - English
+ - French
+ - German
+ - Spanish
+
+ ## Scores
+
+ | Metric                 | Value |
+ |:-----------------------|------:|
+ | Relevance (Recall@100) | 0.448 |
+
+ Note that the relevance score is computed as an average over 14 retrieval datasets (see
+ [details below](#evaluation-metrics)).
+
+ ## Inference Times
+
+ | GPU        | Quantization type | Batch size 1 | Batch size 32 |
+ |:-----------|:------------------|-------------:|--------------:|
+ | NVIDIA A10 | FP16              |         1 ms |          5 ms |
+ | NVIDIA A10 | FP32              |         3 ms |         14 ms |
+ | NVIDIA T4  | FP16              |         1 ms |         12 ms |
+ | NVIDIA T4  | FP32              |         2 ms |         52 ms |
+ | NVIDIA L4  | FP16              |         1 ms |          5 ms |
+ | NVIDIA L4  | FP32              |         2 ms |         18 ms |
+
+ ## GPU Memory Usage
+
+ | Quantization type | Memory  |
+ |:------------------|--------:|
+ | FP16              | 300 MiB |
+ | FP32              | 600 MiB |
+
+ Note that GPU memory usage only includes how much GPU memory the actual model consumes on an NVIDIA T4 GPU with a batch
+ size of 32. It does not include the fixed amount of memory that is consumed by the ONNX Runtime upon initialization, which
+ can be around 0.5 to 1 GiB depending on the GPU used.
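+
+ In other words, the figures above are incremental measurements: a before/after snapshot around model execution, so the runtime's fixed overhead is excluded. A minimal sketch of that methodology with `pynvml` follows; `run_batch` is a hypothetical placeholder for one batch-size-32 inference call, not a Sinequa API.
+
+ ```python
+ import pynvml
+
+ pynvml.nvmlInit()
+ handle = pynvml.nvmlDeviceGetHandleByIndex(0)
+
+ def used_mib():
+     """Currently used memory on GPU 0, in MiB."""
+     return pynvml.nvmlDeviceGetMemoryInfo(handle).used / 2**20
+
+ def run_batch():
+     pass  # hypothetical stub: replace with one batch-size-32 model inference
+
+ # Snapshot taken after runtime initialization, so only the model's own
+ # footprint shows up in the difference below.
+ before = used_mib()
+ run_batch()
+ print(f"Model memory footprint: {used_mib() - before:.0f} MiB")
+ ```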
+
+ ## Requirements
+
+ - Minimal Sinequa version: 11.10.0
+ - Minimal Sinequa version for using FP16 models and GPUs with CUDA compute capability of 8.9+ (like NVIDIA L4): 11.11.0
+ - [CUDA compute capability](https://developer.nvidia.com/cuda-gpus): above 5.0 (above 6.0 for FP16 use)
+
+ ## Model Details
+
+ ### Overview
+
+ - Number of parameters: 39 million
+ - Base language model: Homegrown Sinequa BERT-Small ([Paper](https://arxiv.org/abs/1908.08962)) pretrained in the four
+ supported languages
+ - Insensitive to casing and accents
+ - Training procedure: Query-passage pairs using in-batch negatives (sketched below)
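+
+ For readers unfamiliar with the objective, here is a generic PyTorch sketch of an in-batch-negatives contrastive loss: each query's paired passage sits at the same batch index, and every other passage in the batch serves as a negative. The normalization, temperature, and dimensions are illustrative assumptions, not Sinequa's actual training recipe.
+
+ ```python
+ import torch
+ import torch.nn.functional as F
+
+ def in_batch_negatives_loss(query_emb, passage_emb, temperature=0.05):
+     """Cross-entropy over query-passage similarities; positives on the diagonal."""
+     q = F.normalize(query_emb, dim=-1)
+     p = F.normalize(passage_emb, dim=-1)
+     logits = q @ p.T / temperature       # (batch, batch) similarity matrix
+     labels = torch.arange(q.size(0))     # row i's positive passage is column i
+     return F.cross_entropy(logits, labels)
+
+ # Toy batch of 8 query-passage pairs with 512-dim embeddings.
+ loss = in_batch_negatives_loss(torch.randn(8, 512), torch.randn(8, 512))
+ ```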
+
+ ### Training Data
+
+ - Natural Questions
+ ([Paper](https://research.google/pubs/pub47761/),
+ [Official Page](https://github.com/google-research-datasets/natural-questions))
+ - Original English dataset
+ - Translated datasets for the other three supported languages
+
+ ### Evaluation Metrics
+
+ To determine the relevance score, we averaged the results that we obtained when evaluating on the datasets of the
+ [BEIR benchmark](https://github.com/beir-cellar/beir). Note that all these datasets are in English.
+
+ | Dataset           | Recall@100 |
+ |:------------------|-----------:|
+ | Average           |      0.448 |
+ |                   |            |
+ | Arguana           |      0.835 |
+ | CLIMATE-FEVER     |      0.350 |
+ | DBPedia Entity    |      0.287 |
+ | FEVER             |      0.645 |
+ | FiQA-2018         |      0.305 |
+ | HotpotQA          |      0.396 |
+ | MS MARCO          |      0.533 |
+ | NFCorpus          |      0.162 |
+ | NQ                |      0.701 |
+ | Quora             |      0.947 |
+ | SCIDOCS           |      0.194 |
+ | SciFact           |      0.580 |
+ | TREC-COVID        |      0.051 |
+ | Webis-Touche-2020 |      0.289 |
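+
+ As a reference for how such numbers are produced, here is a simplified sketch of Recall@K and of the averaging step; the actual evaluation uses the BEIR tooling, and the unweighted mean across datasets is an assumption based on the card's wording.
+
+ ```python
+ def recall_at_k(retrieved, relevant, k=100):
+     """Fraction of relevant documents that appear in the top-k results."""
+     return len(set(retrieved[:k]) & set(relevant)) / len(relevant)
+
+ # Per dataset: mean over its queries (toy data below); the headline 0.448
+ # is then the plain mean of the 14 per-dataset Recall@100 scores.
+ queries = [(["d3", "d7", "d1"], ["d7"]), (["d2", "d9"], ["d4", "d9"])]
+ dataset_score = sum(recall_at_k(r, rel) for r, rel in queries) / len(queries)
+ print(dataset_score)  # 0.75
+ ```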
+
+ We evaluated the model on the datasets of the [MIRACL benchmark](https://github.com/project-miracl/miracl) to test its multilingual capabilities. Note that not all training languages are part of the benchmark, so we only report the metrics for the languages it covers.
+
+ | Language | Recall@100 |
+ |:---------|-----------:|
+ | French   |      0.583 |
+ | German   |      0.524 |
  | Spanish  |      0.483 |