---
library_name: transformers
license: mit
language:
- en
pipeline_tag: object-detection
---

This model is a fine-tuned version of microsoft/conditional-detr-resnet-50.

You can find the details of the model in the [fashion-visual-search](https://github.com/yainage90/fashion-visual-search) repository.

This model was trained on a combination of two datasets: [modanet](https://github.com/eBay/modanet) and [fashionpedia](https://fashionpedia.github.io/home/).

The labels are ['bag', 'bottom', 'dress', 'hat', 'shoes', 'outer', 'top'].

The best score, mAP 0.7542, was achieved at epoch 96 of 100. Because the best checkpoint came so close to the end of training, there may still be a little room for performance improvement.

![sample_image](sample_image.png)
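
A minimal inference sketch for the detector described above, assuming the checkpoint loads through the standard `transformers` auto classes like its Conditional DETR base model. The function names `detect` and `format_detections`, the `checkpoint` argument, and the 0.5 threshold are illustrative assumptions, not part of this repository. Label names are read from the checkpoint's `id2label` config rather than hard-coded.

```python
def format_detections(scores, label_ids, boxes, id2label):
    """Turn parallel lists of scores, label ids, and boxes into readable dicts.

    Boxes are expected as [x_min, y_min, x_max, y_max] in pixels.
    """
    detections = []
    for score, label_id, box in zip(scores, label_ids, boxes):
        detections.append(
            {
                "label": id2label[label_id],
                "score": round(float(score), 4),
                "box": [round(float(v), 1) for v in box],
            }
        )
    return detections


def detect(image_path, checkpoint, threshold=0.5):
    """Run object detection on one image (hypothetical helper).

    `checkpoint` is an assumption: pass the Hub id or a local path of this model.
    """
    # Heavy imports are kept local so the formatting helper works without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForObjectDetection

    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = AutoModelForObjectDetection.from_pretrained(checkpoint)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)

    # Rescale predicted boxes to the original image size (height, width).
    target_sizes = torch.tensor([[image.height, image.width]])
    result = processor.post_process_object_detection(
        outputs, threshold=threshold, target_sizes=target_sizes
    )[0]

    return format_detections(
        result["scores"].tolist(),
        result["labels"].tolist(),
        result["boxes"].tolist(),
        model.config.id2label,
    )
```

Calling `detect("sample_image.png", checkpoint)` with this model's Hub id or a local path as `checkpoint` should return one dict per detected item, each with a `label`, a `score`, and a pixel `box`. Raising `threshold` trades recall for fewer low-confidence boxes.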