gbyuvd commited on
Commit
4c201cb
1 Parent(s): d7337ea

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +4 -0
README.md CHANGED
@@ -95,6 +95,10 @@ This prototype is a [sentence-transformers](https://www.SBERT.net) based on [Min
95
 
96
  I am planning to train this model with more epochs on current dataset, before moving on to a larger dataset with 6 million pairs generated from ChemBL34. However, this will take some time due to computational and financial constraints. A future project of mine is to develop a custom model specifically for cheminformatics to address any biases and optimization issues in repurposing an embedding model designed for NLP tasks.
97
 
 
 
 
 
98
  ### Disclaimer: For Academic Purposes Only
99
  The information and model provided is for academic purposes only. It is intended for educational and research use, and should not be used for any commercial or legal purposes. The author do not guarantee the accuracy, completeness, or reliability of the information.
100
 
 
95
 
96
  I am planning to train this model with more epochs on current dataset, before moving on to a larger dataset with 6 million pairs generated from ChemBL34. However, this will take some time due to computational and financial constraints. A future project of mine is to develop a custom model specifically for cheminformatics to address any biases and optimization issues in repurposing an embedding model designed for NLP tasks.
97
 
98
+ ### Update
99
+ This model won't be trained further on current natural products dataset nor the ChemBL34, since I've been working on pre-training a BERT-like base model that operates on SELFIES with a custom tokenizer for past two weeks. This base model was scheduled for release this week, but due to mistakes in parsing some SELFIES notations, the pre-training is halted and I am working intensely to correct these issues and continue the training. The base model will hopefully released next week. Following this, I plant to fine-tune a sentence transformer and a classifier model built on top of that base model.
100
+ The timeline for these tasks depends on the availability of compute server and my own time constraints, as I also need to finish my undergrad thesis. Thank you for checking out this model.
101
+
102
  ### Disclaimer: For Academic Purposes Only
103
  The information and model provided is for academic purposes only. It is intended for educational and research use, and should not be used for any commercial or legal purposes. The author do not guarantee the accuracy, completeness, or reliability of the information.
104