Abstract

Despite recent advances in deep generative modeling, skin lesion classification systems remain constrained by the limited availability of large, diverse, and well-annotated clinical datasets, resulting in class imbalance between benign and malignant lesions and consequently reduced generalization performance. We introduce DermaFlux, a rectified flow-based text-to-image generative framework that synthesizes clinically grounded skin lesion images from natural language descriptions of dermatological attributes. Built upon Flux.1, DermaFlux is fine-tuned using parameter-efficient Low-Rank Adaptation (LoRA) on a large curated collection of publicly available clinical image datasets. We construct image-text pairs using synthetic textual captions generated by Llama 3.2, following established dermatological criteria including lesion asymmetry, border irregularity, and color variation. Extensive experiments demonstrate that DermaFlux generates diverse and clinically meaningful dermatology images that improve binary classification performance by up to 6% when augmenting small real-world datasets, and by up to 9% when classifiers are trained on DermaFlux-generated synthetic images rather than diffusion-based synthetic images. Our ImageNet-pretrained ViT fine-tuned with only 2,500 real images and 4,375 DermaFlux-generated samples achieves 78.04% binary classification accuracy and an AUC of 0.859, surpassing the next best dermatology model by 8%.

Curated Dataset

The mole in the image is generally symmetrical, with a relatively smooth border. However, there are some areas where the border appears to be slightly irregular, with a few small notches and bumps. The color of the mole is primarily brown, with some areas of lighter and darker shades of brown. There are also some small, darker spots scattered throughout the mole, which may be pigmentation variations or small freckles. Overall, the mole has a relatively uniform color and texture, with some minor variations.

The mole exhibits a notable degree of asymmetry, with one half being significantly larger than the other. The border of the mole is irregular and jagged, with a rough, uneven texture. The coloration of the mole is irregular, featuring a mix of dark brown and black patches scattered throughout the lesion. Additionally, there are areas of lighter skin tone visible, which may indicate a variation in pigmentation. Overall, the mole’s appearance suggests a potential malignancy, warranting further examination and diagnosis.

Given a lesion image and its label, benign (left) or malignant (right), Llama 3.2 generates a synthetic caption using the prompt: “This is an image containing a [label] lesion. Give me a description of this mole regarding its asymmetry, border irregularity, and color.”
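As a minimal sketch of how the prompt above could be filled in per image, the snippet below substitutes the class label into the quoted template. The names `PROMPT_TEMPLATE` and `build_caption_prompt` are hypothetical and chosen for illustration only; how the image and prompt are actually passed to Llama 3.2 is not shown here.

```python
# Template text copied from the prompt quoted above; {label} replaces [label].
PROMPT_TEMPLATE = (
    "This is an image containing a {label} lesion. "
    "Give me a description of this mole regarding its asymmetry, "
    "border irregularity, and color."
)

# Assumption: the label set is exactly the two classes named on this page.
VALID_LABELS = {"benign", "malignant"}


def build_caption_prompt(label: str) -> str:
    """Return the captioning prompt for one lesion image (hypothetical helper)."""
    normalized = label.strip().lower()
    if normalized not in VALID_LABELS:
        raise ValueError(f"unknown label: {label!r}")
    return PROMPT_TEMPLATE.format(label=normalized)


if __name__ == "__main__":
    print(build_caption_prompt("malignant"))
```

Validating the label before formatting keeps a mislabeled record from silently producing a prompt that contradicts the image.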

Generated Samples

Examples of generated images containing a benign skin lesion.

Examples of generated images containing a malignant skin lesion.

Experiments

Test accuracy scores for ResNeXt and ViT across varying real-to-synthetic training data ratios. Results are averaged over five independent runs (different seeds).
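The bookkeeping behind such an experiment can be sketched as follows, assuming a fixed pool of real images, a synthetic-per-real multiplier, and one training run per seed. The helper names and the `train_and_evaluate` callable are hypothetical; only the 2,500 real and 4,375 synthetic image counts come from the abstract, which correspond to 1.75 synthetic images per real image.

```python
from statistics import mean
from typing import Callable, Sequence


def synthetic_count(num_real: int, synthetic_per_real: float) -> int:
    """Number of synthetic images to add for a given real-to-synthetic ratio."""
    return round(num_real * synthetic_per_real)


def mean_over_seeds(train_and_evaluate: Callable[[int], float],
                    seeds: Sequence[int]) -> float:
    """Average a test metric over independent runs, one per seed.

    `train_and_evaluate` is a placeholder for a full training run that
    returns test accuracy; it is not part of the original work.
    """
    return mean(train_and_evaluate(seed) for seed in seeds)


if __name__ == "__main__":
    # 2,500 real images with 1.75 synthetic images per real image.
    print(synthetic_count(2500, 1.75))
```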

DermaFlux-ViT separates malignant and benign test samples more reliably than competing models.

BibTeX


@misc{galanakis2026dermafluxsyntheticskinlesion,
      title={DermaFlux: Synthetic Skin Lesion Generation with Rectified Flows for Enhanced Image Classification}, 
      author={Stathis Galanakis and Alexandros Koliousis and Stefanos Zafeiriou},
      year={2026},
      eprint={2603.16392},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2603.16392}, 
}

DermaFlux: Synthetic Skin Lesion Generation with Rectified Flows for Enhanced Image Classification

Method

DermaFlux synthesizes a skin lesion image x₁ by transporting Gaussian noise z₀ to a clean latent representation z₁, conditioned on the input caption. The Flux.1 backbone is frozen (❄️) and only the injected LoRA parameters are trained (🔥).
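The two ideas in this caption can be illustrated with a toy sketch. It assumes the standard rectified flow formulation (a straight path z_t = (1 - t) z₀ + t z₁ with target velocity z₁ - z₀) and the common LoRA parameterization W x + (α / r) B A x with the base weight W frozen. Latents are plain lists of floats, text conditioning is omitted, and all function names are hypothetical; the actual DermaFlux model applies these ideas inside the Flux.1 transformer and may differ in detail.

```python
def interpolate(z0, z1, t):
    """Point on the straight path from noise z0 (t=0) to clean latent z1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]


def target_velocity(z0, z1):
    """Velocity of the straight path, constant in t: dz_t/dt = z1 - z0."""
    return [b - a for a, b in zip(z0, z1)]


def flow_matching_loss(predict_velocity, z0, z1, t):
    """Mean squared error between predicted and straight-line velocity at z_t."""
    z_t = interpolate(z0, z1, t)
    v_pred = predict_velocity(z_t, t)
    v_true = target_velocity(z0, z1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(z0)


def euler_sample(predict_velocity, z0, num_steps=10):
    """Transport z0 towards z1 by integrating dz/dt = v(z, t) with Euler steps."""
    z = list(z0)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        v = predict_velocity(z, step * dt)
        z = [a + dt * b for a, b in zip(z, v)]
    return z


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def lora_forward(W, A, B, x, alpha=1.0):
    """Frozen base layer plus low-rank update: W x + (alpha / r) * B (A x).

    W (d_out x d_in) stays fixed; only A (r x d_in) and B (d_out x r)
    would receive gradient updates during fine-tuning.
    """
    rank = len(A)
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + (alpha / rank) * u for b, u in zip(base, update)]


if __name__ == "__main__":
    z0, z1 = [0.5, -1.0], [1.0, 0.0]
    # An oracle velocity field stands in for the trained network.
    oracle = lambda z, t: target_velocity(z0, z1)
    print(euler_sample(oracle, z0, num_steps=4))
```

With B initialized to zeros, `lora_forward` returns exactly the frozen layer's output, which is why LoRA fine-tuning starts from the pretrained model's behavior.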
