---
base_model: BenjaminOcampo/model-contrastive-bert__trained-in-ishate__seed-0
datasets:
- ISHate
language:
- en
library_name: transformers
license: bsl-1.0
metrics:
- f1
- accuracy
tags:
- hate-speech-detection
- implicit-hate-speech
---

This model card accompanies the demo paper "PEACE: Providing Explanations and
Analysis for Combating Hate Expressions", accepted at the 27th European
Conference on Artificial Intelligence (ECAI 2024): https://www.ecai2024.eu/calls/demos.

# The Model
This model is a hate speech detector fine-tuned specifically for implicit hate
speech. It is based on the paper "PEACE: Providing Explanations and Analysis
for Combating Hate Expressions" by Greta Damo, Nicolás Benjamín Ocampo, Elena
Cabrio, and Serena Villata, presented at the 27th European Conference on
Artificial Intelligence (ECAI 2024).

# Training Parameters and Experimental Info
The model was trained using the ISHate dataset, focusing on its implicit-hate
portion. Training used the following parameters (see the sketch after this list):
- Batch size: 32
- Weight decay: 0.01
- Epochs: 4
- Learning rate: 2e-5
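
As a rough illustration of how these hyperparameters fit together, a standard
PyTorch fine-tuning loop could look like the sketch below. This is not the
original training script: the paper's contrastive objective is omitted, the
`train_loader` (yielding tokenized batches of size 32) is hypothetical, and
`contrastive_model` refers to the `ContrastiveModel` instance defined in the
Usage section below.

```python
import torch

# Sketch only: plain cross-entropy fine-tuning with the reported hyperparameters.
# The contrastive loss term from the paper is omitted for brevity.
optimizer = torch.optim.AdamW(
    contrastive_model.parameters(),
    lr=2e-5,           # learning rate from the card
    weight_decay=0.01, # weight decay from the card
)

for epoch in range(4):  # 4 epochs
    for batch in train_loader:  # hypothetical DataLoader, batch size 32
        optimizer.zero_grad()
        _, logits = contrastive_model(batch["input_ids"], batch["attention_mask"])
        loss = torch.nn.functional.cross_entropy(logits, batch["labels"])
        loss.backward()
        optimizer.step()
```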

For detailed information on the training process, please refer to the [model's
paper](https://aclanthology.org/2023.findings-emnlp.441/).

# Usage

First, you may need `transformers` version 4.30.2:

```bash
pip install transformers==4.30.2
```

This model was created using vanilla PyTorch. To load it, you have to define the following model class.

```python
import torch.nn as nn


class ContrastiveModel(nn.Module):
    def __init__(self, model):
        super(ContrastiveModel, self).__init__()
        self.model = model  # Pretrained transformer encoder (e.g., BERT)
        self.embedding_dim = model.config.hidden_size
        self.fc = nn.Linear(self.embedding_dim, self.embedding_dim)  # Projection layer
        self.classifier = nn.Linear(self.embedding_dim, 2)  # Classification layer

    def forward(self, input_ids, attention_mask):
        outputs = self.model(input_ids, attention_mask)
        embeddings = outputs.last_hidden_state[:, 0]  # Use the CLS token embedding as the representation
        embeddings = self.fc(embeddings)
        logits = self.classifier(embeddings)  # Apply classification layer

        return embeddings, logits
```

Then, we instantiate the model as:

```python
from transformers import AutoModel, AutoTokenizer, AutoConfig

repo_name = "BenjaminOcampo/peace_cont_bert"

config = AutoConfig.from_pretrained(repo_name)
contrastive_model = ContrastiveModel(AutoModel.from_config(config))  # weights are random here; loaded below
tokenizer = AutoTokenizer.from_pretrained(repo_name)
```

Finally, download and load the trained weights (since `from_config` initializes the model randomly):

```python
import torch
from huggingface_hub import hf_hub_download

# `read_token` is your Hugging Face access token (only needed if the repo is gated).
model_tmp_file = hf_hub_download(repo_id=repo_name, filename="model.pt", token=read_token)

state_dict = torch.load(model_tmp_file, map_location="cpu")

contrastive_model.load_state_dict(state_dict)
contrastive_model.eval()  # disable dropout for inference
```

You can then make predictions as with any PyTorch model:

```python
import torch

text = "Are you sure that Islam is a peaceful religion?"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    _, logits = contrastive_model(inputs["input_ids"], inputs["attention_mask"])

probabilities = torch.softmax(logits, dim=1)
_, predicted_labels = torch.max(probabilities, dim=1)
```
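
For convenience, you can wrap this in a small helper that handles batches. This
is a minimal sketch, not part of the original release; the padding and
truncation settings are assumptions:

```python
def predict(texts):
    """Classify a list of texts; returns a tensor of predicted class indices."""
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        _, logits = contrastive_model(inputs["input_ids"], inputs["attention_mask"])
    return torch.argmax(logits, dim=1)

print(predict(["Are you sure that Islam is a peaceful religion?"]))
```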

# Datasets
The model was trained on the [ISHate dataset](https://huggingface.co/datasets/BenjaminOcampo/ISHate), specifically
the training split, which focuses on implicit hate speech.

# Evaluation Results
The model's performance was evaluated using standard metrics, including F1 score
and accuracy. For comprehensive evaluation results, refer to the linked paper.
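
To compute such metrics yourself, a sketch along these lines could work. It
reuses the `predict` helper defined above; the split and column names
(`"test"`, `"text"`, `"label"`) are assumptions, so check the dataset card for
the actual schema:

```python
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score

test = load_dataset("BenjaminOcampo/ISHate", split="test")
preds = predict(test["text"]).tolist()
print("accuracy:", accuracy_score(test["label"], preds))
print("f1:", f1_score(test["label"], preds))
```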

# Authors
- [Greta Damo](https://grexit-d.github.io/damo.greta.github.io/)
- [Nicolás Benjamín Ocampo](https://www.nicolasbenjaminocampo.com/)
- [Elena Cabrio](https://www-sop.inria.fr/members/Elena.Cabrio/)
- [Serena Villata](https://webusers.i3s.unice.fr/~villata/Home.html)