mukaj/fin-mpnet-base · Request for Insights on Fine-Tuning Methods

Hi,

Thanks for the interest. The dataset was generated in a similar fashion to what is described in this paper: https://arxiv.org/abs/2401.00368.

The seed data I used was a large set of financial documents: annual reports, earnings call transcripts, SEC filings, sustainability reports, etc. The documents were fed to the LLM page by page (after some cleaning and filtering), and it generated positive and negative retrieval queries for each passage. The model used at the time was Mixtral 8x7B. So the dataset format was just [Query, Document Passage, Pos_or_Neg].
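For illustration, the [Query, Document Passage, Pos_or_Neg] format could be represented roughly like this. This is only a sketch: the class name, the field names, the "pos"/"neg" label values, and the example rows are all hypothetical and not taken from the actual dataset.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RetrievalExample:
    """One row in the [Query, Document Passage, Pos_or_Neg] format (hypothetical names)."""
    query: str
    passage: str
    label: str  # assumed encoding: "pos" or "neg"


def split_by_label(
    rows: List[RetrievalExample],
) -> Tuple[List[RetrievalExample], List[RetrievalExample]]:
    """Separate the rows into positive and negative query/passage pairs."""
    positives = [r for r in rows if r.label == "pos"]
    negatives = [r for r in rows if r.label == "neg"]
    return positives, negatives


# Made-up rows, only to show the shape of the data.
rows = [
    RetrievalExample("What drove revenue growth this year?",
                     "Revenue grew on higher subscription sales.", "pos"),
    RetrievalExample("What is the dividend policy?",
                     "Revenue grew on higher subscription sales.", "neg"),
    RetrievalExample("How much was spent on share buybacks?",
                     "The company repurchased shares during the year.", "pos"),
]

positives, negatives = split_by_label(rows)
print(len(positives), len(negatives))
```

Keeping the label as its own column makes it easy to try the data both with and without the generated negatives.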

The fine-tuning details are pretty standard (something like 1e-5 learning rate, Lion optimizer, 10 epochs), but for the final model I ditched the negatives in the data and just used MultipleNegativesRankingLoss. I think the hard negatives generated by the LLM were perhaps not the best, as ContrastiveLoss did not do well on validation.

I kept around 5,000 query/document pairs as a validation set and used InformationRetrievalEvaluator to evaluate NDCG@10, MRR@10, etc. Improvements on the validation set correlated very well with improvements on the FiQA task. Of course, no test set from FiQA was ever downloaded or used. The model was evaluated with the MTEB library, so the self-reported scores come from its output.
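To show why the explicit negatives can be dropped, here is a minimal pure-Python sketch of the in-batch-negatives idea behind MultipleNegativesRankingLoss: each query's own passage is the positive, and every other passage in the batch serves as a negative. This is not the training code used for the model. The function names, the toy 2-d embeddings, and the scale of 20 are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_negatives_loss(
    query_embs: List[Sequence[float]],
    passage_embs: List[Sequence[float]],
    scale: float = 20.0,  # assumed similarity scale, for illustration
) -> float:
    """Mean cross-entropy where passage i is the correct 'class' for query i."""
    losses = []
    for i, q in enumerate(query_embs):
        logits = [scale * cosine(q, p) for p in passage_embs]
        # Numerically stable log-sum-exp over all passages in the batch.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of the matching passage.
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (hypothetical): query i matches passage i.
queries = [[1.0, 0.0], [0.0, 1.0]]
passages = [[1.0, 0.0], [0.0, 1.0]]

aligned_loss = in_batch_negatives_loss(queries, passages)
# Swapping the passages pairs each query with the wrong passage.
swapped_loss = in_batch_negatives_loss(queries, passages[::-1])
print(aligned_loss < swapped_loss)
```

The loss is small when each query is closest to its own passage and large otherwise, so larger batches give more (and harder) negatives for free.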

Hope this helps!