BenjaminOcampo committed on
Commit e7060e7
1 Parent(s): 277c8ec

Add model's weights

Files changed (8)
  1. README.md +170 -87
  2. config.json +27 -0
  3. model.pt +3 -0
  4. pytorch_model.bin +3 -0
  5. special_tokens_map.json +7 -0
  6. tokenizer.json +0 -0
  7. tokenizer_config.json +13 -0
  8. vocab.txt +0 -0
README.md CHANGED
@@ -1,116 +1,199 @@
  ---
- base_model: BenjaminOcampo/model-contrastive-bert__trained-in-ishate__seed-0
- datasets:
- - ISHate
- language:
- - en
- library_name: transformers
- license: bsl-1.0
- metrics:
- - f1
- - accuracy
- tags:
- - hate-speech-detection
- - implicit-hate-speech
  ---

- This model card documents the demo paper "PEACE: Providing Explanations and
- Analysis for Combating Hate Expressions" accepted at the 27th European
- Conference on Artificial Intelligence: https://www.ecai2024.eu/calls/demos.

- # The Model
- This model is a hate speech detector fine-tuned specifically for detecting
- implicit hate speech. It is based on the paper "PEACE: Providing Explanations
- and Analysis for Combating Hate Expressions" by Greta Damo, Nicolás Benjamín
- Ocampo, Elena Cabrio, and Serena Villata, presented at the 27th European
- Conference on Artificial Intelligence.

- # Training Parameters and Experimental Info
- The model was trained on the ISHate dataset, focusing on its implicit hate
- speech portion. Training parameters included:
- - Batch size: 32
- - Weight decay: 0.01
- - Epochs: 4
- - Learning rate: 2e-5

- For detailed information on the training process, please refer to the [model's
- paper](https://aclanthology.org/2023.findings-emnlp.441/).
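
As an illustration, a minimal optimizer setup matching these hyperparameters might look as follows. This is a sketch only: AdamW and the linear warmup schedule are assumptions not stated in the card, and the paper's contrastive training loop is more involved than a plain fine-tuning loop.

```python
# Hypothetical setup mirroring the hyperparameters listed above.
# AdamW and the linear schedule are assumptions, not confirmed by the card.
from torch.optim import AdamW
from transformers import get_linear_schedule_with_warmup

BATCH_SIZE = 32      # from the card
WEIGHT_DECAY = 0.01  # from the card
EPOCHS = 4           # from the card
LR = 2e-5            # from the card

def build_optimizer(model, steps_per_epoch):
    optimizer = AdamW(model.parameters(), lr=LR, weight_decay=WEIGHT_DECAY)
    scheduler = get_linear_schedule_with_warmup(
        optimizer,
        num_warmup_steps=0,
        num_training_steps=EPOCHS * steps_per_epoch,
    )
    return optimizer, scheduler
```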

- # Usage

- First, you may need transformers version 4.30.2:

- ```
- pip install transformers==4.30.2
- ```

- This model was created using vanilla PyTorch. To load it, you must use the
- following model class:

- ```python
- import torch.nn as nn

- class ContrastiveModel(nn.Module):
-     def __init__(self, model):
-         super(ContrastiveModel, self).__init__()
-         self.model = model
-         self.embedding_dim = model.config.hidden_size
-         self.fc = nn.Linear(self.embedding_dim, self.embedding_dim)
-         self.classifier = nn.Linear(self.embedding_dim, 2)  # Binary classification layer

-     def forward(self, input_ids, attention_mask):
-         outputs = self.model(input_ids, attention_mask)
-         # Use the [CLS] token embedding as the sequence representation
-         embeddings = outputs.last_hidden_state[:, 0]
-         embeddings = self.fc(embeddings)
-         logits = self.classifier(embeddings)  # Apply the classification layer
-         return embeddings, logits
- ```

- Then, instantiate the model:

- ```python
- from transformers import AutoModel, AutoTokenizer, AutoConfig

- repo_name = "BenjaminOcampo/peace_cont_bert"

- config = AutoConfig.from_pretrained(repo_name)
- # Build an uninitialized backbone from the config; the weights are loaded next.
- contrastive_model = ContrastiveModel(AutoModel.from_config(config))
- tokenizer = AutoTokenizer.from_pretrained(repo_name)
- ```

- Finally, load the model weights as follows:

- ```python
- import torch
- from huggingface_hub import hf_hub_download

- # read_token is your Hugging Face access token (needed only for private repos).
- model_tmp_file = hf_hub_download(repo_id=repo_name, filename="model.pt", token=read_token)

- state_dict = torch.load(model_tmp_file)

- contrastive_model.load_state_dict(state_dict)
- ```

- You can make predictions as with any PyTorch model:

- ```python
- import torch

- text = "Are you sure that Islam is a peaceful religion?"
- inputs = tokenizer(text, return_tensors="pt")

- with torch.no_grad():
-     _, logits = contrastive_model(inputs["input_ids"], inputs["attention_mask"])

- probabilities = torch.softmax(logits, dim=1)
- _, predicted_labels = torch.max(probabilities, dim=1)
- ```
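
For convenience, the steps above can be wrapped in a small helper. This is a sketch: the id-to-label mapping is an assumption, since the card does not state the label order.

```python
# Hypothetical wrapper around the snippet above.
# The id-to-label mapping is an assumption, not confirmed by the card.
ID2LABEL = {0: "non-hate", 1: "hate"}

def predict(texts, model, tokenizer):
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        _, logits = model(inputs["input_ids"], inputs["attention_mask"])
    predictions = torch.argmax(logits, dim=1)
    return [ID2LABEL[int(i)] for i in predictions]

predict(["Are you sure that Islam is a peaceful religion?"], contrastive_model, tokenizer)
```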

- # Datasets
- The model was trained on the [ISHate dataset](https://huggingface.co/datasets/BenjaminOcampo/ISHate), specifically
- the training split of the dataset, which focuses on implicit hate speech.
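
The dataset should be loadable with the `datasets` library. This is a sketch: the configuration, split, and column names are assumptions to check against the dataset card.

```python
# Hypothetical loading snippet; split and column names are assumptions.
from datasets import load_dataset

ishate = load_dataset("BenjaminOcampo/ISHate", split="train")
print(ishate[0])
```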

- # Evaluation Results
- The model's performance was evaluated using standard metrics, including F1 score
- and accuracy. For comprehensive evaluation results, refer to the linked paper.
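
For illustration only, these metrics could be computed over a labeled test set as follows. This sketch reuses the hypothetical `predict` helper above, the data is placeholder, and the outputs are not the paper's reported results.

```python
# Hypothetical metric computation; not the paper's evaluation pipeline.
from sklearn.metrics import accuracy_score, f1_score

texts = ["placeholder example one", "placeholder example two"]  # placeholder inputs
gold = ["non-hate", "hate"]                                     # placeholder labels
preds = predict(texts, contrastive_model, tokenizer)

print("accuracy:", accuracy_score(gold, preds))
print("f1:", f1_score(gold, preds, pos_label="hate"))
```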

- Authors:
- - [Greta Damo](https://grexit-d.github.io/damo.greta.github.io/)
- - [Nicolás Benjamín Ocampo](https://www.nicolasbenjaminocampo.com/)
- - [Elena Cabrio](https://www-sop.inria.fr/members/Elena.Cabrio/)
- - [Serena Villata](https://webusers.i3s.unice.fr/~villata/Home.html)

  ---
+ language: en
  ---

+ # Model Card for BenjaminOcampo/model-contrastive-bert__trained-in-ishate__seed-0

+ <!-- Provide a quick summary of what the model is/does. -->

+ ## Model Details

+ ### Model Description

+ <!-- Provide a longer summary of what this model is. -->

+ - **Developed by:** BenjaminOcampo
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** en
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]

+ ### Model Sources [optional]

+ <!-- Provide the basic links for the model. -->

+ - **Repository:** https://github.com/huggingface/huggingface_hub
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]

+ ## Uses

+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

+ ### Direct Use

+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

+ [More Information Needed]

+ ### Downstream Use [optional]

+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

+ [More Information Needed]

+ ### Out-of-Scope Use

+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

+ [More Information Needed]

+ ## Bias, Risks, and Limitations

+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->

+ [More Information Needed]

+ ### Recommendations

+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

+ ### How to Get Started with the Model

+ Use the code below to get started with the model.

+ [More Information Needed]

+ ## Training Details

+ ### Training Data

+ <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

+ [More Information Needed]

+ ### Training Procedure

+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

+ #### Preprocessing [optional]

+ [More Information Needed]

+ #### Training Hyperparameters

+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

+ #### Speeds, Sizes, Times [optional]

+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

+ [More Information Needed]

+ ## Evaluation

+ <!-- This section describes the evaluation protocols and provides the results. -->

+ ### Testing Data, Factors & Metrics

+ #### Testing Data

+ <!-- This should link to a Data Card if possible. -->

+ [More Information Needed]

+ #### Factors

+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

+ [More Information Needed]

+ #### Metrics

+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->

+ [More Information Needed]

+ ### Results

+ [More Information Needed]

+ #### Summary

+ ## Model Examination [optional]

+ <!-- Relevant interpretability work for the model goes here -->

+ [More Information Needed]

+ ## Environmental Impact

+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]

+ ## Technical Specifications [optional]

+ ### Model Architecture and Objective

+ [More Information Needed]

+ ### Compute Infrastructure

+ [More Information Needed]

+ #### Hardware

+ [More Information Needed]

+ #### Software

+ [More Information Needed]

+ ## Citation [optional]

+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

+ **BibTeX:**

+ [More Information Needed]

+ **APA:**

+ [More Information Needed]

+ ## Glossary [optional]

+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

+ [More Information Needed]

+ ## More Information [optional]

+ [More Information Needed]

+ ## Model Card Authors [optional]

+ [More Information Needed]

+ ## Model Card Contact

+ [More Information Needed]

config.json ADDED
@@ -0,0 +1,27 @@
+ {
+   "_name_or_path": "BenjaminOcampo/model-contrastive-bert__trained-in-ishate__seed-0",
+   "architectures": [
+     "ContrastiveModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "problem_type": "single_label_classification",
+   "torch_dtype": "float32",
+   "transformers_version": "4.30.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
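
Two things are worth noting in this config: it describes a standard BERT-base backbone (hidden size 768, 12 layers, 12 heads), and the `architectures` field names `ContrastiveModel`, a custom class rather than a registered transformers architecture — which is why the README builds the backbone from the config and applies the weights manually. A quick check of the backbone shape (a sketch, assuming the repo id used in the README):

```python
# Instantiate only the backbone shell described by this config (random weights).
from transformers import AutoConfig, AutoModel

config = AutoConfig.from_pretrained("BenjaminOcampo/peace_cont_bert")
backbone = AutoModel.from_config(config)
print(config.hidden_size, config.num_hidden_layers, config.num_attention_heads)  # 768 12 12
```
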
model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:70144a50148a804b3666d2e3e46677cb07dc8e629b26259f5f233a59d09a449c
+ size 440370701
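
model.pt and pytorch_model.bin are Git LFS pointer files: the `oid` line is the SHA-256 of the actual ~440 MB payload stored in LFS. A downloaded checkpoint can be verified against it (a sketch, assuming the repo id used in the README and a public repo):

```python
# Verify a downloaded checkpoint against the sha256 oid in the LFS pointer above.
import hashlib
from huggingface_hub import hf_hub_download

EXPECTED = "70144a50148a804b3666d2e3e46677cb07dc8e629b26259f5f233a59d09a449c"

path = hf_hub_download(repo_id="BenjaminOcampo/peace_cont_bert", filename="model.pt")
sha = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
        sha.update(chunk)
assert sha.hexdigest() == EXPECTED, "checksum mismatch"
```
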
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5121bd30cbf09d873822dc0305e50c848ca242a4b674227f9603ab1397ea4237
+ size 440368698
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
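
This configures an uncased BertTokenizer with a 512-token limit, matching `max_position_embeddings` in config.json. A quick sanity check (a sketch, assuming the repo id used in the README):

```python
# Confirm the settings shipped in tokenizer_config.json.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BenjaminOcampo/peace_cont_bert")
print(tokenizer.model_max_length)         # 512
print(tokenizer.tokenize("Hello WORLD"))  # ['hello', 'world'] -- input is lowercased
```
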
vocab.txt ADDED
The diff for this file is too large to render. See raw diff