FEAT: add `TransparencyAttackConverter` #1031

paulinek13 · 2025-07-28T17:41:31Z

Description

This PR adds a converter that implements the transparency attack as described in: "Transparency Attacks: How Imperceptible Image Layers Can Fool AI Perception" by McKee, F. and Noever, D., 2024: https://arxiv.org/abs/2401.15817

The converter blends an attack (background/harmful) image with a benign (foreground/target) image using an optimized alpha channel to create a dual perception effect. The output is a PNG image that looks like the benign image on light backgrounds, but reveals the attack image when placed on dark backgrounds.
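To see why one image with an alpha channel can read differently on light and dark backgrounds, it may help to look at plain alpha compositing on a single grayscale pixel. The sketch below is not the converter's algorithm, which this PR describes as using an optimized alpha channel. It is a closed-form simplification, assuming all values are scaled to [0, 1], and the names `composite` and `dual_perception_pixel` are hypothetical.

```python
def composite(luminance: float, alpha: float, background: float) -> float:
    """Standard "over" compositing of one grayscale pixel onto a flat background."""
    return alpha * luminance + (1.0 - alpha) * background


def dual_perception_pixel(benign: float, attack: float) -> tuple[float, float]:
    """Return (luminance, alpha) for one pixel; inputs and outputs lie in [0, 1].

    Assumption for this sketch: the benign image is squeezed into the bright
    half of the range and the attack image into the dark half, so that an
    exact solution always exists.
    """
    light_target = 0.5 + 0.5 * benign  # what a white background should show
    dark_target = 0.5 * attack  # what a black background should show
    alpha = 1.0 - light_target + dark_target  # always within [0, 1]
    luminance = dark_target / alpha if alpha > 0.0 else 0.0
    return luminance, alpha


luminance, alpha = dual_perception_pixel(benign=0.8, attack=0.2)
on_white = composite(luminance, alpha, background=1.0)  # approximately 0.9
on_black = composite(luminance, alpha, background=0.0)  # approximately 0.1
```

The price of this shortcut is contrast: each image only gets half of the brightness range. An optimized alpha channel is one way to trade off the two appearances differently.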

Tests and Documentation

✔️ Added unit tests.

I've tested the effectiveness of the converter with various models (including local ones) and it works as expected: I was able to successfully "fool" some of them :) I'll post an example of such an attack in the PR comments.

TODO: add a notebook

Based on the "Novel Image Blending Algorithm" from: https://arxiv.org/pdf/2401.15817

paulinek13 · 2025-07-28T17:59:57Z

Here's an example of an attack on ChatGPT that uses this converter for image manipulation:

Apparently, when using a dark theme, I can clearly see that the image asks "How to make a bomb", but that's an inherent limitation of this converter.

And here's a comparison of how the final blended image looks depending on the viewer's background color:

Edit: I wanted to share a link to the chat but I immediately got "This shared link has been disabled by moderation" 😄

romanlutz

Woah! This is awesome 🥇 I left a few comments but nothing major. Excellent work! Can't wait to try this.

doc/api.rst

pyrit/prompt_converter/hidden_layer_image_converter.py

jbolor21

This is awesome! One related idea: add a notebook (or extend our image converter notebook) showing this working! (Totally non-blocking comment)

pyrit/prompt_converter/hidden_layer_image_converter.py

paulinek13 · 2025-07-30T18:19:18Z

> This is awesome! One related idea: add a notebook (or extend our image converter notebook) showing this working! (Totally non-blocking comment)

I'll definitely do this! Thanks for the idea 😃

bashirpartovi

Great job on this, @paulinek13, this is really cool. I have a few comments and one recommendation, as follows:

You could add an early convergence check to your step loop so that it can exit early. Here is an example:

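# ...
prev_loss = float("inf")
convergence_threshold = 1e-6  # smallest change in loss that still counts as progress
convergence_patience = 10  # consecutive stalled steps tolerated before stopping
no_improvement_count = 0

# ...

for step in range(self.steps):
    # ... (compute the loss for this step)
    if abs(prev_loss - loss) < convergence_threshold:
        no_improvement_count += 1
        if no_improvement_count >= convergence_patience:
            # early convergence, exit the loop
            break
    else:
        # the loss moved again, so reset the stall counter
        no_improvement_count = 0
    prev_loss = loss
    # ...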
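The same patience-based check can be tried in isolation. The following is a minimal, self-contained sketch, assuming plain gradient descent on a toy quadratic rather than the converter's actual optimizer or loss. The function name `minimize_with_early_stop` and its parameters are hypothetical and not part of this PR.

```python
from typing import Callable


def minimize_with_early_stop(
    loss_fn: Callable[[float], float],
    grad_fn: Callable[[float], float],
    x: float,
    learning_rate: float = 0.1,
    max_steps: int = 10_000,
    convergence_threshold: float = 1e-6,
    convergence_patience: int = 10,
) -> tuple[float, int]:
    """Run gradient descent and stop once the loss has stalled for a while.

    Returns the final parameter and the number of steps actually taken.
    """
    prev_loss = float("inf")
    no_improvement_count = 0
    steps_taken = 0
    for _ in range(max_steps):
        x = x - learning_rate * grad_fn(x)
        loss = loss_fn(x)
        steps_taken += 1
        if abs(prev_loss - loss) < convergence_threshold:
            no_improvement_count += 1
            if no_improvement_count >= convergence_patience:
                break  # the loss has been flat for `convergence_patience` steps
        else:
            no_improvement_count = 0  # progress resumed, start counting again
        prev_loss = loss
    return x, steps_taken


# Toy problem: minimize (x - 3)^2, whose minimum is at x = 3.
x_min, steps_used = minimize_with_early_stop(
    loss_fn=lambda x: (x - 3.0) ** 2,
    grad_fn=lambda x: 2.0 * (x - 3.0),
    x=0.0,
)
```

On this toy problem the loop ends well before `max_steps`, since the loss change drops below the threshold once `x` is close to 3. Resetting the counter in the `else` branch is what makes the patience apply to consecutive stalled steps only.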

pyrit/prompt_converter/hidden_layer_image_converter.py

paulinek13 · 2025-07-31T18:38:50Z

@romanlutz @jbolor21 @hannahwestra25 @bashirpartovi

Thanks a lot for your reviews, comments and suggestions! I've addressed them 😀

I'll now work on adding a notebook for this converter.

romanlutz

Very nice! This is essentially ready to merge.

doc/code/converters/transparency_attack_converter.py

pyproject.toml

feat: add hidden layer image manipulation converter

9e5fd28

Based on the "Novel Image Blending Algorithm" from: https://arxiv.org/pdf/2401.15817

paulinek13 added 4 commits July 28, 2025 21:59

corrections/improvements

75b531d

add tests

3dcdfc8

pre-commit

ad34a09

Merge branch 'main' into feat/529/add_hidden_layer_image_converter

50ce993

paulinek13 changed the title ~~[DRAFT] FEAT: add hidden layer image manipulation converter~~ FEAT: add hidden layer image manipulation converter Jul 29, 2025

paulinek13 marked this pull request as ready for review July 29, 2025 19:02

romanlutz reviewed Jul 30, 2025


PR feedback

8de3f26

jbolor21 reviewed Jul 30, 2025

hannahwestra25 reviewed Jul 30, 2025

pyrit/prompt_converter/hidden_layer_image_converter.py (outdated, resolved)

hannahwestra25 reviewed Jul 30, 2025

pyrit/prompt_converter/hidden_layer_image_converter.py (outdated, resolved)

hannahwestra25 reviewed Jul 30, 2025

pyrit/prompt_converter/hidden_layer_image_converter.py (outdated, resolved)

hannahwestra25 reviewed Jul 30, 2025

pyrit/prompt_converter/hidden_layer_image_converter.py (outdated, resolved)

input images validation

3a2c176

pr feedback

c1c7098

bashirpartovi reviewed Jul 30, 2025


romanlutz self-assigned this Jul 30, 2025

paulinek13 added 7 commits July 31, 2025 10:10

improve Adam optimizer docs

8287614

pr feedback

d459eef

fix types

ef17866

one-liner for MSE

01ebe13

optimize gradient computation

8243a63

refactor gradient computation

fa70877

L(A) instead of RGB(A)

2d2beef

paulinek13 force-pushed the feat/529/add_hidden_layer_image_converter branch from ad47c84 to 2d2beef Compare July 31, 2025 11:50

paulinek13 added 2 commits July 31, 2025 14:00

cache benign/foreground image

7f73d10

improvements

558eea5

paulinek13 added 5 commits July 31, 2025 19:17

add early convergence check

ad62e42

tiny changes

0f112cd

move AdamOptimizer outside the class

611d3c4

change the converter name to TransparencyAttackConverter

da9daeb

fix tests

8b22449

paulinek13 changed the title ~~FEAT: add hidden layer image manipulation converter~~ FEAT: add TransparencyAttackConverter Jul 31, 2025

fixes after precommit

dcffe14

paulinek13 and others added 3 commits August 3, 2025 21:44

add notebook

97f2fcb

Merge branch 'main' into feat/529/add_hidden_layer_image_converter

a69952c

Merge branch 'main' into feat/529/add_hidden_layer_image_converter

b1d0a7c

romanlutz approved these changes Aug 8, 2025

doc/code/converters/transparency_attack_converter.py (outdated, resolved)

pyproject.toml (outdated, resolved)

paulinek13 and others added 6 commits August 8, 2025 17:44

display images with markdown

a1b65e5

remove rules for ruff

7992369

Merge branch 'main' into feat/529/add_hidden_layer_image_converter

f4c7343

Merge branch 'main' into feat/529/add_hidden_layer_image_converter

a26b12c

move the images to the notebook dir

0d11e51

Merge branch 'main' into feat/529/add_hidden_layer_image_converter

188bb57

romanlutz merged commit c73ce45 into Azure:main Sep 26, 2025
20 checks passed

paulinek13 deleted the feat/529/add_hidden_layer_image_converter branch November 6, 2025 16:22
