---
library_name: transformers
pipeline_tag: image-text-to-text
license: apache-2.0
datasets:
- joshuachou/SkinCAP
- HemanthKumarK/SKINgpt
language:
- en
tags:
- biology
- skin
- skin disease
- cancer
- medical
---
# Model Card for PaliGemma Dermatology Model

## Model Details

### Model Description

This model, based on the PaliGemma-3B architecture, has been fine-tuned for dermatology-related image and text processing tasks. The model is designed to assist in the identification of various skin conditions using a combination of image analysis and natural language processing.


- **Developed by:** Bruce_Wayne
- **Model type:** Vision-language model (image and text in, text out)
- **Finetuned from model:** https://huggingface.co/google/paligemma-3b-pt-224
- **LoRA adapters used:** Yes
- **Intended use:** Medical image analysis, specifically for dermatology

Please let me know how the model works for you: https://forms.gle/cBA6apSevTyiEbp46

Thank you.

## Uses
### Direct Use

The model can be directly used for analyzing dermatology images, providing insights into potential skin conditions.


## Bias, Risks, and Limitations

- **Skin Tone Bias:** The model may have been trained on a dataset that does not adequately represent all skin tones, potentially leading to biased results.
- **Geographic Bias:** The model's performance may vary depending on the prevalence of certain conditions in different geographic regions.

## How to Get Started with the Model

```python
import torch
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
from PIL import Image

# Use the GPU when available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the model and processor
model_id = "brucewayne0459/paligemma_derm"
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id).to(device)
model.eval()

# Load a sample image and text input
input_text = "Identify the skin condition?"
input_image_path = "path/to/image.jpg"  # Replace with your actual image path
input_image = Image.open(input_image_path).convert("RGB")

# Process the input and move the tensors to the same device as the model
inputs = processor(text=input_text, images=input_image, return_tensors="pt", padding="longest").to(device)

# Set the maximum number of new tokens to generate
max_new_tokens = 50

# Run inference
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)

# Decode the output
decoded_output = processor.decode(outputs[0], skip_special_tokens=True)
print("Model Output:", decoded_output)
```

## Training Details

### Training Data

The model was fine-tuned on dermatological images paired with disease names. The datasets listed in this card's metadata are `joshuachou/SkinCAP` and `HemanthKumarK/SKINgpt`.

### Training Procedure

The model was fine-tuned using LoRA (Low-Rank Adaptation) for more efficient training. Mixed precision (bfloat16) was used to speed up training and reduce memory usage.
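
To illustrate why LoRA makes fine-tuning cheaper, the sketch below counts trainable parameters for a single projection matrix with and without a low-rank adapter. The rank (`8`) and the matrix shape (`2048 x 2048`) are illustrative assumptions; this card does not state the LoRA rank or the target modules that were used.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters updated when the whole weight matrix is trained."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in a LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


# Assumed values for illustration only (not taken from this model card).
d_in, d_out, rank = 2048, 2048, 8

full = full_param_count(d_in, d_out)
lora = lora_param_count(d_in, d_out, rank)
print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank {rank}): {lora:,} trainable parameters ({lora / full:.2%} of full)")
```

Because the adapter size grows linearly with the rank rather than with the product of the matrix dimensions, only a small fraction of the weights needs gradients and optimizer state.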

#### Training Hyperparameters

- **Training regime:** Mixed precision (bfloat16)
- **Epochs:** 10
- **Learning rate:** 2e-5
- **Batch size:** 6
- **Gradient accumulation steps:** 4
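
With gradient accumulation, the optimizer updates once per group of accumulated batches, so the effective batch size is larger than the per-step batch size. A minimal sketch of that arithmetic follows; the helper `optimizer_steps_per_epoch` and the example dataset size are hypothetical, since the card does not state the number of training examples.

```python
import math

batch_size = 6         # per-step batch size from the list above
grad_accum_steps = 4   # gradient accumulation steps from the list above
num_devices = 1        # single GPU, as listed in the hardware section

# Number of examples that contribute to each optimizer update
effective_batch_size = batch_size * grad_accum_steps * num_devices
print("Effective batch size:", effective_batch_size)


def optimizer_steps_per_epoch(num_examples: int) -> int:
    """Optimizer updates needed to see every example once (last group may be partial)."""
    return math.ceil(num_examples / effective_batch_size)


# Hypothetical dataset size, for illustration only
print("Steps per epoch for 1,000 examples:", optimizer_steps_per_epoch(1000))
```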


## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated on a separate validation set of dermatological images and disease names, distinct from the training data.

#### Metrics
- **Validation Loss:** The loss was tracked throughout the training process to evaluate model performance.
- **Accuracy:** The primary metric for assessing model predictions.
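
The card does not say how accuracy was computed. One plausible approach for a model that generates disease names is exact-match accuracy after light text normalization, sketched below. The function names and the sample predictions are hypothetical and are not results from this model.

```python
def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not count as errors."""
    return " ".join(name.lower().split())


def exact_match_accuracy(predictions, references) -> float:
    """Fraction of predictions that equal their reference label after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return correct / len(references)


# Made-up examples for illustration only
preds = ["Psoriasis", "acne  vulgaris", "eczema"]
refs = ["psoriasis", "Acne vulgaris", "melanoma"]
accuracy = exact_match_accuracy(preds, refs)
print(f"Exact-match accuracy: {accuracy:.2f}")
```

Exact match is strict: a synonym or a more specific diagnosis counts as wrong, so a real evaluation may need a mapping between equivalent disease names.
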
### Results

The model achieved a final validation loss of approximately 0.2214, indicating reasonable performance in predicting skin conditions based on the dataset used.
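
One way to read a loss value like this is to convert it to perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual default for causal language-model training but is not confirmed by this card.

```python
import math

validation_loss = 0.2214  # final validation loss reported above

# Perplexity is the exponential of the mean cross-entropy (assumed to be in nats)
perplexity = math.exp(validation_loss)
print(f"Perplexity: {perplexity:.3f}")
```

A perplexity close to 1 means the model assigns high probability to the reference tokens on average. It does not by itself show how often the predicted disease name is correct.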

## Environmental Impact


- **Hardware Type:** 1 x L4 GPU
- **Hours used:** ~22 hours
- **Cloud Provider:** Lightning AI
- **Compute Region:** USA
- **Carbon Emitted:** 0.9 kg CO2 eq.
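
The reported figure can be sanity-checked with a rough estimate: energy is power multiplied by time, and emissions are energy multiplied by the grid's carbon intensity. The 72 W value below is the L4's rated maximum board power, used here as an assumed upper bound on GPU draw. The card does not say which tool or carbon intensity produced the 0.9 kg estimate, so the sketch only derives the intensity that the reported numbers would imply under that assumption.

```python
gpu_power_watts = 72.0    # assumption: L4 rated maximum board power, drawn for the whole run
hours_used = 22.0         # approximate training time from the list above
reported_kg_co2eq = 0.9   # emissions figure from the list above

# Energy in kilowatt-hours: watts * hours / 1000
energy_kwh = gpu_power_watts * hours_used / 1000.0

# Carbon intensity (kg CO2 eq. per kWh) implied by the reported emissions
implied_intensity = reported_kg_co2eq / energy_kwh

print(f"Estimated GPU energy: {energy_kwh:.3f} kWh")
print(f"Implied carbon intensity: {implied_intensity:.3f} kg CO2 eq. per kWh")
```

This ignores CPU, memory, and data-center overhead, so it is only an order-of-magnitude check.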

## Technical Specifications

### Model Architecture and Objective

- **Architecture:** Vision-Language model based on PaliGemma-3B
- **Objective:** To classify and diagnose dermatological conditions from images and text

### Compute Infrastructure

#### Hardware

- **GPU:** 1 x L4 GPU

## Model Card Authors

Bruce_Wayne