Image-to-Image (img2img)
Image-to-image transforms an existing image based on a text prompt, with a denoise strength parameter controlling the degree of change – from subtle style variations to complete scene redesigns.
Explanation
The input image is partially noised – how much depends on the strength parameter – and then denoised with the text prompt as guidance. Low strength keeps the result close to the original; high strength approaches a fresh generation. img2img is a core feature of virtually all diffusion tools.
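The strength-to-noise mapping can be sketched numerically. The function below is illustrative, not a real library API: it picks a start timestep from the strength value and applies the DDPM-style forward process q(x_t | x_0) to the input image, which is where the denoising loop of an img2img pipeline would take over. The beta schedule values are toy numbers.

```python
import numpy as np

def img2img_start(image, strength, num_steps=50, seed=0):
    """Illustrative sketch: noise the input image to the timestep implied
    by `strength`; a real pipeline would then denoise from that step."""
    rng = np.random.default_rng(seed)
    # strength in [0, 1] is the fraction of the schedule that gets re-run.
    start_step = min(int(num_steps * strength), num_steps)
    if start_step == 0:
        return image, 0  # strength 0: image returned unchanged
    # Toy linear beta schedule in the spirit of DDPM (values illustrative).
    betas = np.linspace(1e-4, 0.02, num_steps)
    alpha_bar = np.cumprod(1.0 - betas)
    t = start_step - 1
    eps = rng.standard_normal(image.shape)
    # Forward process: x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps
    noised = np.sqrt(alpha_bar[t]) * image + np.sqrt(1.0 - alpha_bar[t]) * eps
    return noised, start_step
```

At low strength the start step is early and the noised image stays close to the original; at strength near 1.0 it is nearly pure noise, which is why high-strength img2img behaves almost like text-to-image.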
Marketing Relevance
Essential for marketing workflows: placing product images in different scenes, style changes, quick mockups from sketches.
Example
A product photo is transformed into ten different settings with img2img – a Christmas scene, a summer scene, an urban backdrop – without a re-shoot.
Common Pitfalls
Finding the right strength is trial and error: fine details are lost at high strength, and preserving product fidelity becomes difficult under strong transformation.
Origin & History
Pix2Pix (Isola et al., 2017) was a landmark in deep-learning-based image-to-image translation. CycleGAN (2017) enabled translation without paired training data. SDEdit (2021) introduced guided editing via stochastic differential equations. Stable Diffusion integrated img2img as a core feature (2022), and InstructPix2Pix (2023) enabled image editing from natural language instructions.
Comparisons & Differences
Image-to-Image (img2img) vs. Text-to-Image
Text-to-image starts from noise; img2img starts from an existing image and transforms it.
Image-to-Image (img2img) vs. Inpainting
img2img transforms the entire image; inpainting changes only masked areas.