metadata
base_model: google-bert/bert-base-uncased
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:132712
  - loss:DenoisingAutoEncoderLoss
widget:
  - source_sentence: >-
      it got and spoke was amazing and me sequi a month . I the been seen a in,
      I then started, im last one 4 ive my anxiety has horrible I have had a
      read of different and part hightened, didnt stright away the only just
      taised head the last . to it all than was prior to but im just for similar
      experiences me I want or? taking time read.
    sentences:
      - >-
        In anycase it got too much and I spoke to my gp who was amazing and
        started me on evorel sequi for a 3 month trial. So I would say the first
        4 patches have been good, I definitely seen a change in myself, I then
        started the conti patches, im currently on the last one of those 4, ive
        been brilliant until 3 days ago my anxiety has come back and its
        horrible. SO I have had a read of some different forums and it seems
        that the conti part can cause hightened anxiety. now, this didnt happen
        stright away and the anxiety has only just taised it head again in the
        last 3 days. So I dont want to complain too much as it is all better
        than it was prior to HRT but I guess im just looking for people with
        similar experiences to me, and I want to know if it got better or what?
        Thanks for taking the time to read. 
      - >-
        Just out of interest too... In an untreated Type 1 experiencing symptoms
        such as hunger, what happens to blood sugar levels. Do they just rise
        and rise? For example, given a typical day of carb laden breakfast,
        lunch and dinner and snacks, would you expect the BG levels to just go
        out of control over a short time (mmol/l into the teens and beyond?
        )....
      - >-
        Anyone else have post natal anxiety or depression? I’m on the wait list
        for counselling and my gp has prescribed tablets (although I’m not going
        to take them). Just wondering if anyone else is in the same boat and how
        you are coping? I’m so up and down x 😢
  - source_sentence: >-
      I still a work I am live normally . manage symptoms in daily in there are
      days it like such a Do you?
    sentences:
      - >-
        I am certainly still a work in progress. But most days I am able to live
        my life normally. And manage the symptoms I feel. I still have to put in
        the daily work and in all honesty there are days it feels like such a
        pain lol. Do you know where your anxiety began? 
      - >-
        I've had the rabies anxiety too, then I met this dude in the crisis
        center who told me "do you know how rare rabies is? I've literally
        wrestled hundreds of wild animals as an exterminator and never once got
        rabies, and in all my years I've only seen one animal with rabies, and
        you can easily tell" So if the odds are low seeing an animal with rabies
        then the odds of rabies infected saliva in 90 degree weather is
        absolutely miniscule. 
      - >-
        Morning all, Im currently in the 3rd round of my 8 cycles of chemo for
        breast cancer. Had my last round on 29th June. I have tested positive
        for covid this morning after waking up with a sore throat. Spoke to the
        chemo line, they said dont worry, just get in touch if I get a
        temperature or feel really unwell. Has anyone else had covid while under
        going chemo? Im naturally feeling a littlw anxious and would love to
        hear from anyone who's been there. Thanks in advance xx
  - source_sentence: >-
      yourself out speculating your problem might be, most so overwhelmed right
      that it sense note of what problems are consider possibilities .'d until
      GP of It GP For, there a time have to more Do you notice problems you
      certain?
    sentences:
      - >-
        Don't freak yourself out by speculating about what your problem might
        be. At the same time, most GPs are so overwhelmed right now that it
        makes sense to keep note of what your problems are and to consider
        possibilities. I'd make a diary entry every day until you see the GP
        about the details of your symptoms. It'll help the GP. For example, is
        there a particular time of day that you have to urinate more often? Do
        you notice certain problems after you eat certain things?
      - "Hiya! I was just wondering how long does it take for an appointment to come for an urgent scan. I had my blood test result which showed ca125 of 47 and seen the GP last Friday the 30th and it's the waiting that's driving me crazy. Is it or is it not ovarian cancer??? I have no appetite anymore and I'm thinking if this is due to anxiety or progression of whatever I have? I also have this dry mouth which I am not sure if from anxiety or due to the\_blood pressure tablet I am taking which has been increased recently."
      - >-
        yma123 · Yesterday 12:08 No I didn't have any symptoms at all! CIN2
        means moderate cell changes. I know it all seems very scary but just try
        and think that things like this is exactly what this process is trying
        to pick up so they can catch things before it gets to the stage that
        it's anything serious you're welcome to message me if you have any more
        questions or if you just want to talk about things x Thanks for
        elaborating on what CIN2 means I had no idea. So is that kind of like
        cell changes that happen before it develops fully into cancer? So kind
        of you to offer for me to PM you, thanks so much I will do that. It’s
        nice to be able to chat to people that have been through similar as it’s
        such an anxious time 💐
  - source_sentence: >-
      what proposing is of to lowest Good, school have the wo be, having any,
      doctors That the outcome you think the way through is completely / mixed
      up what is proposing it ’ s something teaching how relate world are ” “
      french lessons
    sentences:
      - >-
        Surely you must know that that's BS? Or am I wrong, does the fact ICE
        cars have oil running through their veins equal the possession of a
        soul? Does that Black Edition Golf doing the rounds have soul? I'd look
        at this and surmise that it is big, cosseting, luxurious, quiet,
        effortlessly fast etc etc, and for me that sounds like pretty compelling
        transport. Can't you lot just accept that cars, and more specifically
        the enjoyment of cars, means different things to different people, and
        that you can be just as enthusiastic about electric propulsion as you
        can be about burning fossil fuels?
      - >-
        Phoebo · Today 03:35 So what you're proposing is a dumbing down of
        society to eductae the lowest common denominator? Good parents, parents
        but if school have to do the parenting then there won't be time for
        actual schooling, so we'll end up not having any engineers, doctors,
        architects etc? That's the outcome if you think it all the way through.
        The OP is completely unsure / confused / mixed up about what she is
        proposing. when asked to be clear… it’s something about teaching about
        “feelings” how we “relate to the world” “fizzy drinks are bad” and “drop
        french for lessons about domestic abuse” 😂
      - >-
        Can you speak to a counsellor about it or ask for mental health support
        at your gp? It might help you build up more coping skills around this.
        What do your parents say about it? Do they intervene? Is she abusive to
        your parents or anyone else? 
  - source_sentence: >-
      You have been sick so If questions it just - ’ being sorted, thanks Then
      assertiveness course and a . overtime isn selfish it just doing ’ s right
      for Why you?
    sentences:
      - >-
        You have been off sick so it makes sense. If anyone questions it just
        grey rock - it’s being sorted, thanks. Then do an online assertiveness
        course and ask your GP for a CBT referral. Not doing overtime isn’t
        selfish - it’s just you doing what’s right for you. Why would you do
        anything else?
      - "Science works by the accumulation of evidence.\_ Independent groups work on projects and publish results.\_ Those results are examined and tested and examined again and tested again, such that they're either confirmed or discarded and further work continues accordingly.\_ If a scientist or doctor disagrees with the 'official line' they're asked to present the data, methods and conclusions that have led to that disagreement so that it can by examined by the broader scientific and medical community.\_ \_ And yes, someone who goes on YouTube or wherever - whether doctor, scientist or layperson - and tells viewers that a vaccine alters DNA structure and destroys the immune system is either a grifter or a fruitcake. 1 minute ago, FIRETHORN1 said: ...Can you not accept that some people can hold an opposite view quite genuinely? To me, a \"conspiracy theorist\" is someone who believes what they are told, without any evidence to back it up. I can absolutely accept that someone can genuinely believe something without having any evidence at all to support that view.\_ I'm believing that right now, in fact. 1 minute ago, FIRETHORN1 said: There is no evidence whatsoever that the vaccine works That is categorically, absolutely and undeniably false, as the most cursory of research will tell you.\_ But then you don't actually want\_ to believe me, do you?"
      - >-
        n111ck said: Thats two completely different things and lets not forget
        every time the government tries to, for example, crack down on benefit
        fraud its slammed as 'unfair and cruel' in the lefty media. Where
        anything is provided 'free' by the government it will always be abused
        and subject to fraud - covid has proven that quite clearly. The cost of
        preventing fraud has to be balanced against the cost of the actual fraud
        unfortunately. Mainly because they way they do it is cruel and unfair
        and usually hits those that need the help by starting off with the
        objective to actively refuse and punish not to actually assess and help?
        Also possibly because benefit fraud is tiny compared to mistakes
        (usually in the governments favour) in decisions and the amount not
        claimed that people are eligible for?

SentenceTransformer based on google-bert/bert-base-uncased

This is a sentence-transformers model finetuned from google-bert/bert-base-uncased. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: google-bert/bert-base-uncased
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
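
The Pooling module above has only pooling_mode_mean_tokens enabled, so a sentence embedding is the average of the token vectors produced by BertModel. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name mean_pool and the toy 2-dimensional vectors are illustrative assumptions (the real model uses 768 dimensions).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1; padding (mask 0) is ignored."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [value / count for value in summed]


# Toy example (hypothetical values): three tokens of dimension 2, the last is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```

Masking matters because padded positions would otherwise pull the average toward arbitrary values.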

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("jameswright/ws-wr-questions-bert-TSDAE-v1")
# Run inference
sentences = [
    'You have been sick so If questions it just - ’ being sorted, thanks Then assertiveness course and a . overtime isn selfish it just doing ’ s right for Why you?',
    'You have been off sick so it makes sense. If anyone questions it just grey rock - it’s being sorted, thanks. Then do an online assertiveness course and ask your GP for a CBT referral. Not doing overtime isn’t selfish - it’s just you doing what’s right for you. Why would you do anything else?',
    'Science works by the accumulation of evidence.\xa0 Independent groups work on projects and publish results.\xa0 Those results are examined and tested and examined again and tested again, such that they\'re either confirmed or discarded and further work continues accordingly.\xa0 If a scientist or doctor disagrees with the \'official line\' they\'re asked to present the data, methods and conclusions that have led to that disagreement so that it can by examined by the broader scientific and medical community.\xa0 \xa0 And yes, someone who goes on YouTube or wherever - whether doctor, scientist or layperson - and tells viewers that a vaccine alters DNA structure and destroys the immune system is either a grifter or a fruitcake. 1 minute ago, FIRETHORN1 said: ...Can you not accept that some people can hold an opposite view quite genuinely? To me, a "conspiracy theorist" is someone who believes what they are told, without any evidence to back it up. I can absolutely accept that someone can genuinely believe something without having any evidence at all to support that view.\xa0 I\'m believing that right now, in fact. 1 minute ago, FIRETHORN1 said: There is no evidence whatsoever that the vaccine works That is categorically, absolutely and undeniably false, as the most cursory of research will tell you.\xa0 But then you don\'t actually want\xa0 to believe me, do you?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
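
The similarity function listed for this model is cosine similarity, so the 3 x 3 result above holds the pairwise cosine of the three embeddings. As a rough illustration of what each entry means, here is a dependency-free sketch; the helper names cosine_similarity and similarity_matrix and the 2-dimensional toy vectors are assumptions made for this example, not part of the library.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise cosine similarities, one row and one column per input vector."""
    return [[cosine_similarity(a, b) for b in vectors] for a in vectors]


# Toy stand-ins for real 768-dimensional embeddings.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in similarity_matrix(vectors):
    print([round(value, 4) for value in row])
# [1.0, 0.0, 0.7071]
# [0.0, 1.0, 0.7071]
# [0.7071, 0.7071, 1.0]
```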

Training Details

Training Dataset

Unnamed Dataset

  • Size: 132,712 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
               sentence_0       sentence_1
      type     string           string
      min      4 tokens         17 tokens
      mean     47.75 tokens     114.94 tokens
      max      460 tokens       512 tokens
  • Samples:
    Sample 1
      sentence_0: ’ Can really go to the doctors I ’ bored of the ” . Feels more like than a doctor, does sound depression, so seeing GP a first
      sentence_1: PetersRabbitt I don’t know? Can I really go to the doctors and say “hey, yes my problem is I’m bored all of the time”. Feels more like a me problem than one a doctor can help with. Yes, absolutely. It does sound like it could be depression, so seeing your GP is a good first step.
    Sample 2
      sentence_0: Ursuladevine Between 11 16, if hasn t, what has been been providing education have LakieLady Yesterday 15:34 My that dismissed offhand son the school have up for assessment . Within years referred diagnosed with PTSD,,, social anxiety and, decided his be.
      sentence_1: Ursuladevine · Yesterday 15:42 Between 11 and 16, if he hasn’t been attending school, what has he been doing? Has the LA been providing any education or have you been HE? LakieLady · Yesterday 15:34 My friend tried that, and the GP dismissed it offhand, saying that if her son was neurodivergent, the school would have picked up on it and referred him for assessment. Her DS was eight at the time. Within the next 2-3 years, he got much worse, was referred to CAMHS, diagnosed with significant MH problems (PTSD, GAD, depression, social anxiety disorder) and after a couple of years, CAMHS decided his mother might not be talking bollocks and that he might have ASD.
    Sample 3
      sentence_0: It sounds you were a child then came along realised here was he - and this to it . young, I'd imagine
      sentence_1: It sounds like you were hurt by one man when you were a child, then another came along and realised here was someone damaged he could dominate - and added his own abuse. They can sniff this out and are attracted to it. How old were you when he arrived? Very young, I'd imagine. Stepfather?
  • Loss: DenoisingAutoEncoderLoss
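
In the samples above, sentence_0 is a corrupted copy of sentence_1 with words removed, and the loss trains the model to reconstruct the original from the embedding of the corrupted text. The sketch below shows one simple way such deletion noise can be produced. It is an illustration only: the function name delete_words, the whitespace tokenization, and the 0.6 deletion ratio are assumptions, since this card does not state how the noise was generated.

```python
import random
from typing import Optional


def delete_words(text: str, del_ratio: float = 0.6,
                 rng: Optional[random.Random] = None) -> str:
    """Drop each whitespace-separated word independently with probability del_ratio."""
    rng = rng or random.Random()
    words = text.split()
    if not words:
        return text
    kept = [word for word in words if rng.random() > del_ratio]
    if not kept:
        # Never return an empty input: keep one randomly chosen word.
        kept = [rng.choice(words)]
    return " ".join(kept)


original = "I am certainly still a work in progress. But most days I am able to live my life normally."
noisy = delete_words(original, rng=random.Random(0))  # seeded only for repeatability
print(noisy)
```

Word order is preserved, which matches the look of the sentence_0 examples: a shortened, choppy version of the same text.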

Training Hyperparameters

Non-Default Hyperparameters

  • num_train_epochs: 5
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
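
Several of the values above combine into the training schedule: 132,712 samples at a per-device batch size of 8 with no gradient accumulation, 5 epochs, a learning rate of 5e-05, a linear scheduler, and no warmup. The sketch below works through that arithmetic. It assumes a single training device (the card does not say how many were used), and linear_lr is a hypothetical helper that mirrors the usual linear-decay rule rather than the trainer's own code.

```python
import math

NUM_SAMPLES = 132_712   # training set size from this card
BATCH_SIZE = 8          # per_device_train_batch_size, assuming one device
EPOCHS = 5              # num_train_epochs
BASE_LR = 5e-5          # learning_rate
WARMUP_STEPS = 0        # warmup_steps

# dataloader_drop_last is False, so a partial final batch would still count.
steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS


def linear_lr(step: int) -> float:
    """Linear warmup (none here) followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / max(1, total_steps - WARMUP_STEPS)


print(steps_per_epoch, total_steps)  # 16589 82945
for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 5e-05 and reaches zero at step 82,945.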

Training Logs

Epoch Step Training Loss
0.0301 500 4.7687
0.0603 1000 4.2523
0.0904 1500 4.1156
0.1206 2000 4.0278
0.1507 2500 3.9652
0.1808 3000 3.919
0.2110 3500 3.8629
0.2411 4000 3.7985
0.2713 4500 3.7625
0.3014 5000 3.7523
0.3315 5500 3.7316
0.3617 6000 3.6837
0.3918 6500 3.669
0.4220 7000 3.6394
0.4521 7500 3.6017
0.4822 8000 3.5693
0.5124 8500 3.5821
0.5425 9000 3.5488
0.5727 9500 3.5139
0.6028 10000 3.5119
0.6329 10500 3.4988
0.6631 11000 3.4741
0.6932 11500 3.4719
0.7234 12000 3.4501
0.7535 12500 3.4353
0.7837 13000 3.4107
0.8138 13500 3.4023
0.8439 14000 3.3902
0.8741 14500 3.3697
0.9042 15000 3.3731
0.9344 15500 3.3603
0.9645 16000 3.3284
0.9946 16500 3.3339
1.0248 17000 3.2793
1.0549 17500 3.2098
1.0851 18000 3.1994
1.1152 18500 3.1801
1.1453 19000 3.1634
1.1755 19500 3.1566
1.2056 20000 3.1205
1.2358 20500 3.1064
1.2659 21000 3.1028
1.2960 21500 3.099
1.3262 22000 3.1028
1.3563 22500 3.0653
1.3865 23000 3.044
1.4166 23500 3.0481
1.4467 24000 3.0133
1.4769 24500 2.9667
1.5070 25000 3.0226
1.5372 25500 2.991
1.5673 26000 2.9593
1.5974 26500 2.9598
1.6276 27000 2.9572
1.6577 27500 2.9579
1.6879 28000 2.9303
1.7180 28500 2.948
1.7481 29000 2.918
1.7783 29500 2.9014
1.8084 30000 2.8948
1.8386 30500 2.8916
1.8687 31000 2.8787
1.8988 31500 2.8864
1.9290 32000 2.8649
1.9591 32500 2.8419
1.9893 33000 2.8688
2.0194 33500 2.8329
2.0496 34000 2.7442
2.0797 34500 2.7501
2.1098 35000 2.7466
2.1400 35500 2.7343
2.1701 36000 2.7014
2.2003 36500 2.6891
2.2304 37000 2.6819
2.2605 37500 2.6779
2.2907 38000 2.6872
2.3208 38500 2.6758
2.3510 39000 2.6665
2.3811 39500 2.6392
2.4112 40000 2.6362
2.4414 40500 2.6038
2.4715 41000 2.5535
2.5017 41500 2.6081
2.5318 42000 2.6071
2.5619 42500 2.5571
2.5921 43000 2.5774
2.6222 43500 2.5556
2.6524 44000 2.5683
2.6825 44500 2.5317
2.7126 45000 2.5509
2.7428 45500 2.5292
2.7729 46000 2.52
2.8031 46500 2.4818
2.8332 47000 2.5258
2.8633 47500 2.482
2.8935 48000 2.5038
2.9236 48500 2.4864
2.9538 49000 2.4591
2.9839 49500 2.4887
3.0140 50000 2.4635
3.0442 50500 2.3837
3.0743 51000 2.3886
3.1045 51500 2.3836
3.1346 52000 2.38
3.1647 52500 2.3456
3.1949 53000 2.3171
3.2250 53500 2.3341
3.2552 54000 2.3228
3.2853 54500 2.3459
3.3154 55000 2.3251
3.3456 55500 2.3365
3.3757 56000 2.2838
3.4059 56500 2.3042
3.4360 57000 2.2465
3.4662 57500 2.2304
3.4963 58000 2.251
3.5264 58500 2.2727
3.5566 59000 2.2324
3.5867 59500 2.2325
3.6169 60000 2.2246
3.6470 60500 2.2287
3.6771 61000 2.2067
3.7073 61500 2.2206
3.7374 62000 2.1882
3.7676 62500 2.1889
3.7977 63000 2.1559
3.8278 63500 2.2021
3.8580 64000 2.1643
3.8881 64500 2.145
3.9183 65000 2.1707
3.9484 65500 2.1349
3.9785 66000 2.1659
4.0087 66500 2.152
4.0388 67000 2.0801
4.0690 67500 2.0729
4.0991 68000 2.0676
4.1292 68500 2.0622
4.1594 69000 2.0376
4.1895 69500 2.027
4.2197 70000 2.0227
4.2498 70500 2.0146
4.2799 71000 2.0334
4.3101 71500 2.0428
4.3402 72000 2.034
4.3704 72500 1.9907
4.4005 73000 2.0106
4.4306 73500 1.9488
4.4608 74000 1.961
4.4909 74500 1.9351
4.5211 75000 1.9875
4.5512 75500 1.9454
4.5813 76000 1.9453
4.6115 76500 1.9239
4.6416 77000 1.9664
4.6718 77500 1.906
4.7019 78000 1.9256
4.7321 78500 1.9071
4.7622 79000 1.9117
4.7923 79500 1.8817
4.8225 80000 1.9101
4.8526 80500 1.8872
4.8828 81000 1.8634
4.9129 81500 1.8791
4.9430 82000 1.8801
4.9732 82500 1.8586
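
The Epoch column in the log is the step count expressed as a fraction of one pass over the data. The short sketch below parses a few rows copied from the table and checks them against 132,712 / 8 = 16,589 steps per epoch. The single-device assumption behind that figure and the helper name parse_rows are illustrative, not taken from the training script.

```python
STEPS_PER_EPOCH = 132_712 // 8  # 16589 batches per epoch, assuming a single device

# A few rows copied from the training log: epoch, step, training loss.
log_excerpt = """\
0.0301 500 4.7687
0.9946 16500 3.3339
1.0248 17000 3.2793
4.9732 82500 1.8586
"""


def parse_rows(text: str):
    """Turn whitespace-separated log lines into (epoch, step, loss) tuples."""
    rows = []
    for line in text.splitlines():
        epoch, step, loss = line.split()
        rows.append((float(epoch), int(step), float(loss)))
    return rows


rows = parse_rows(log_excerpt)
for epoch, step, loss in rows:
    # The logged epoch is rounded to four decimals, so allow half a unit of slack.
    assert abs(step / STEPS_PER_EPOCH - epoch) < 5e-5, (epoch, step)

first_loss, last_loss = rows[0][2], rows[-1][2]
print(f"loss fell from {first_loss} to {last_loss}")  # loss fell from 4.7687 to 1.8586
```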

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.3
  • PyTorch: 2.3.0+cu121
  • Accelerate: 0.31.0
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

DenoisingAutoEncoderLoss

@inproceedings{wang-2021-TSDAE,
    title = "TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning",
    author = "Wang, Kexin and Reimers, Nils and Gurevych, Iryna",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    pages = "671--688",
    url = "https://arxiv.org/abs/2104.06979",
}