FlowEdit: Inversion-Free Text-Based Editing
Using Pre-Trained Flow Models

ICCV 2025 Best Student Paper

Technion – Israel Institute of Technology


Abstract

Editing real images using a pre-trained text-to-image (T2I) diffusion/flow model often involves inverting the image into its corresponding noise map. However, inversion by itself is typically insufficient for obtaining satisfactory results, and therefore many methods additionally intervene in the sampling process. Such methods achieve improved results but are not seamlessly transferable between model architectures. Here, we introduce FlowEdit, a text-based editing method for pre-trained T2I flow models, which is inversion-free, optimization-free and model agnostic. Our method constructs an ODE that directly maps between the source and target distributions (corresponding to the source and target text prompts) and achieves a lower transport cost than the inversion approach. This leads to state-of-the-art results, as we illustrate with Stable Diffusion 3 and FLUX.

Overview

The following video (with narration) gives visual intuition for our method.


The following figure illustrates the main idea behind our method:
(a) In inversion-based editing, the source image Z_0^src is first mapped to the noise space by solving the forward ODE conditioned on the source prompt (left path). The extracted noise is then used to solve the reverse ODE conditioned on the target prompt, yielding Z_0^tar (right path). The images at the bottom visualize this transition.
(b) We reinterpret inversion as a direct path between the source and target distributions (bottom path). The velocities computed during inversion and sampling (green and red arrows) are combined into an editing direction (orange arrow) that drives the evolution of the direct path Z_t^inv through an ODE. The resulting path is noise-free, as demonstrated by the images at the bottom.
(c) FlowEdit traverses a shorter direct path, Z_t^FE, without relying on inversion. At each timestep, we add random noise directly to Z_0^src to obtain Z_t^src, and use that direction to construct Z_t^tar from Z_t^FE (gray parallelogram). We then compute the corresponding velocities and average over multiple noise realizations (not shown in the figure) to obtain the next ODE step (orange arrow). The images at the bottom demonstrate our noise-free path.
See our paper for more details.
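The FlowEdit step described in (c) can be sketched in a few lines. This is a minimal illustrative sketch, not the official implementation: it assumes the rectified-flow convention Z_t = (1 − t)·Z_0 + t·noise used by models like Stable Diffusion 3 and FLUX, and a hypothetical `velocity(z, t, prompt)` interface standing in for the pre-trained T2I model.

```python
import numpy as np

def flowedit_step(z_fe, z0_src, velocity, t, dt, n_avg=4, rng=None):
    """One Euler step of the FlowEdit ODE (illustrative sketch).

    z_fe     -- current state Z_t^FE on the direct editing path
    z0_src   -- clean source latent Z_0^src
    velocity -- assumed model interface: velocity(z, t, prompt) -> array
    t, dt    -- current time in [0, 1] and (negative) step size
    n_avg    -- number of noise realizations to average over
    """
    rng = np.random.default_rng() if rng is None else rng
    delta = np.zeros_like(z_fe)
    for _ in range(n_avg):
        noise = rng.standard_normal(z0_src.shape)
        z_src_t = (1.0 - t) * z0_src + t * noise  # noisy source sample Z_t^src
        z_tar_t = z_fe + (z_src_t - z0_src)       # parallelogram construction of Z_t^tar
        # editing direction: target-conditioned minus source-conditioned velocity
        delta += velocity(z_tar_t, t, "target") - velocity(z_src_t, t, "source")
    return z_fe + dt * delta / n_avg
```

In this sketch, editing would start from z_fe = z0_src at a time t near 1 and repeatedly apply this step with a negative dt until t reaches 0, so the path stays noise-free throughout.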


Real Image Editing

Source prompt → target prompt:

A bicycle parked next to a red brick building → A vespa parked next to a red brick building
A rabbit sitting in a field with flowers → A puppy sitting in a field with flowers
A glass of milk → A glass of beer
A restaurant called Luna → A restaurant called Sol
A woman meditating → A wooden statue meditating
A cat wearing a crown → A cat wearing a top hat
A coconut shell filled with splashing water → A baseball shell filled with splashing water
A wolf standing on a cliff, howling → A Husky standing on a cliff, looking
A horse in the field → A pink toy horse in the field
Two penguins → Two origami penguins
Clownfish swimming in a reef → Goldfish swimming in a reef
A dog in the snow → A deer in the snow



Comparisons

Stable Diffusion 3

FLUX


Paper

FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models
Vladimir Kulikov, Matan Kleiner, Inbar Huberman-Spiegelglas, Tomer Michaeli.

Bibtex

@inproceedings{kulikov2025flowedit,
  title={Flowedit: Inversion-free text-based editing using pre-trained flow models},
  author={Kulikov, Vladimir and Kleiner, Matan and Huberman-Spiegelglas, Inbar and Michaeli, Tomer},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={19721--19730},
  year={2025}
}

Our code and dataset are available in the official GitHub repository.

A ComfyUI implementation for FLUX and HunyuanLoom, by logtd.
An LTX-Video ComfyUI implementation can be found in the official LTX-Video repository.





Acknowledgements

This webpage was originally made by Matan Kleiner with the help of Hila Manor. The code for the original template can be found here.
Icons are taken from Font Awesome or from Academicons.