---
base_model: openai/clip-vit-base-patch32
tags:
- generated_from_trainer
model-index:
- name: clip-fine-tuned-satellite-20240821
  results: []
license: mit
datasets:
- blanchon/UC_Merced
metrics:
- accuracy
---

# clip-fine-tuned-satellite

This model is a fine-tuned version of [openai/clip-vit-base-patch32](https://huggingface.co/openai/clip-vit-base-patch32) on the blanchon/UC_Merced dataset.
It achieves the following results on the test set:

- Accuracy: 96.9%

For comparison, the original CLIP model achieves 58.8% accuracy.

## Model description

The model is a fine-tuned version of CLIP.
30% of the parameters were retrained, which raised accuracy from 58.8% to 96.9% after only 2 epochs.

## Intended uses & limitations

The model is intended for classifying satellite images.
It was trained on the UC_Merced dataset, which comprises 21 classes: agricultural, airplane, baseballdiamond, beach, buildings, chaparral, denseresidential, forest, freeway, golfcourse, harbor, intersection, mediumresidential, mobilehomepark, overpass, parkinglot, river, runway, sparseresidential, storagetanks, tenniscourt.

## Training and evaluation data

30% of the model's parameters were trained.
The model was evaluated on a test set of 420 images.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.4974        | 1.0   | 20   | 3.0190          |
| 1.3733        | 2.0   | 40   | 2.9588          |

### Framework versions

- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
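
## Example usage

The intended use described in this card, classifying a satellite image into one of the 21 UC_Merced classes, might look roughly like the sketch below. Several details are assumptions rather than facts from this card: `MODEL_ID` is a hypothetical placeholder for wherever the fine-tuned weights live, the prompt template `"a satellite image of a {}"` is a guess (the card does not say which prompts were used in training or evaluation), and the processor is loaded from the base model on the assumption that fine-tuning left preprocessing unchanged. The helper names `build_prompts` and `classify` are illustrative only.

```python
# The 21 UC_Merced class names listed in this card.
CLASS_NAMES = [
    "agricultural", "airplane", "baseballdiamond", "beach", "buildings",
    "chaparral", "denseresidential", "forest", "freeway", "golfcourse",
    "harbor", "intersection", "mediumresidential", "mobilehomepark",
    "overpass", "parkinglot", "river", "runway", "sparseresidential",
    "storagetanks", "tenniscourt",
]

# Hypothetical placeholder: replace with the real Hub repository id or a local path.
MODEL_ID = "your-username/clip-fine-tuned-satellite"
# Base model named in this card; assumed to share its preprocessing with the fine-tuned model.
BASE_MODEL_ID = "openai/clip-vit-base-patch32"


def build_prompts(class_names, template="a satellite image of a {}"):
    """Turn class names into text prompts (the template is an assumption)."""
    return [template.format(name) for name in class_names]


def classify(image_path, model_id=MODEL_ID, processor_id=BASE_MODEL_ID):
    """Return (predicted_class, probability) for one image file."""
    # Heavy dependencies are imported here so the helpers above stay lightweight.
    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained(model_id)
    processor = CLIPProcessor.from_pretrained(processor_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(
        text=build_prompts(CLASS_NAMES),
        images=image,
        return_tensors="pt",
        padding=True,
    )

    with torch.no_grad():
        outputs = model(**inputs)

    # logits_per_image has shape (1, 21): one similarity score per class prompt.
    probs = outputs.logits_per_image.softmax(dim=-1)[0]
    best = int(probs.argmax())
    return CLASS_NAMES[best], float(probs[best])
```

Calling `classify("some_image.png")` with a valid model id and image path would return the most likely class name and its softmax probability over the 21 prompts. If the prompts used during fine-tuning differ from the template assumed here, accuracy may be lower than the figure reported in this card.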