This Vision Transformer model is a fine-tuned version of Google's "vit-large-patch16-224" model.
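
To try the model, a minimal inference sketch is shown here. It assumes the `transformers` library is installed and uses the repository id shown on the model page (emre570/google-vit-large-finetuned). The function names `top_prediction` and `classify` and the image path are illustrative, and the label strings returned depend on the checkpoint's own configuration.

```python
def top_prediction(predictions):
    """Pick the highest-scoring entry from a list of {"label", "score"} dicts."""
    return max(predictions, key=lambda p: p["score"])


def classify(image_path, model_id="emre570/google-vit-large-finetuned"):
    """Classify one image file with the fine-tuned checkpoint."""
    # Imported here so the helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    return top_prediction(classifier(image_path))


# Usage (downloads the weights on first call; the path is a placeholder):
# print(classify("path/to/image.png"))
```

The image-classification pipeline applies the preprocessing stored with the checkpoint, so the input image does not need to be resized by hand.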

It was fine-tuned on a custom dataset as the final project of an academic study.

The aim of the project is to develop a model that achieves high consistency with a limited amount of data. The study uses a dataset consisting of breast cancer images of varying resolutions.

The dataset contains 780 MRI images in 3 classes (benign, malignant, normal), split into a train set of 624 images and a test set of 156 images.

Distributions of images:

train:

  • benign: 350
  • malignant: 168
  • normal: 106

test:

  • benign: 87
  • malignant: 42
  • normal: 27
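
The counts above work out to an 80/20 train/test split with nearly the same class balance in both splits. A small sketch to check that arithmetic (the variable names are illustrative, not taken from the original project):

```python
# Per-class image counts copied from the distribution listed above.
train_counts = {"benign": 350, "malignant": 168, "normal": 106}
test_counts = {"benign": 87, "malignant": 42, "normal": 27}

train_total = sum(train_counts.values())  # 624
test_total = sum(test_counts.values())    # 156
total = train_total + test_total          # 780

print(f"train share of all images: {train_total / total:.1%}")
for name in train_counts:
    train_frac = train_counts[name] / train_total
    test_frac = test_counts[name] / test_total
    print(f"{name:<10} train {train_frac:.1%}  test {test_frac:.1%}")
```

Benign images make up a bit over half of each split, so plain accuracy should be read with that imbalance in mind.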

Because the images vary in size, they were scaled down to the 224x224 input resolution that Google specifies for this model before being used for fine-tuning.
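
The source does not say which library or interpolation method was used for this step. As a dependency-free illustration of what resizing to a fixed 224x224 grid does, here is a nearest-neighbour sketch on a made-up grayscale image; `resize_nearest` and the sample image are hypothetical.

```python
def resize_nearest(pixels, out_h=224, out_w=224):
    """Resize a 2-D grid of pixel values with nearest-neighbour sampling."""
    in_h, in_w = len(pixels), len(pixels[0])
    return [
        [pixels[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# Hypothetical 448x672 grayscale image filled with a simple gradient.
image = [[(r + c) % 256 for c in range(672)] for r in range(448)]
resized = resize_nearest(image)
print(len(resized), len(resized[0]))
```

Resizing straight to a square like this does not preserve the aspect ratio of non-square inputs. In practice the image processor that ships with the checkpoint would normally handle resizing and normalization.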

Arguments used in fine-tuning:

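from transformers import TrainingArguments

trainArgs = TrainingArguments(
    save_strategy="epoch",             # save a checkpoint after every epoch
    evaluation_strategy="epoch",       # evaluate after every epoch (newer transformers releases call this eval_strategy)
    learning_rate=2e-5,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=4,
    num_train_epochs=40,
    weight_decay=0.01,
    load_best_model_at_end=True,       # reload the best checkpoint when training ends
    metric_for_best_model="accuracy",  # "best" means highest evaluation accuracy
    logging_dir='logs',
    remove_unused_columns=False,       # keep dataset columns that the model's forward() does not accept
)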
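
Because `metric_for_best_model="accuracy"`, the `Trainer` needs a `compute_metrics` function that returns an `"accuracy"` entry. The original function is not shown here, so the following is only a sketch of one way to write it, using plain Python and made-up example values.

```python
def compute_metrics(eval_pred):
    """Return accuracy from a (logits, labels) pair, as Trainer passes them."""
    logits, labels = eval_pred
    # Predicted class = index of the largest logit in each row.
    predictions = [max(range(len(row)), key=row.__getitem__) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return {"accuracy": correct / len(predictions)}


# Made-up logits for four images and three classes.
example_logits = [
    [2.0, 0.1, -1.0],
    [0.2, 1.5, 0.3],
    [0.0, 0.1, 3.0],
    [1.0, 0.9, 0.8],
]
example_labels = [0, 1, 2, 1]
print(compute_metrics((example_logits, example_labels)))
```

A function like this would be passed as `Trainer(..., compute_metrics=compute_metrics)`. The `Trainer` reports the value as `eval_accuracy` and uses it to choose the checkpoint reloaded by `load_best_model_at_end=True`.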
