This project forked from ryanwebster90/onestep-extraction

One Step Extraction of Training Images from Diffusion Models

In this work, we have "extracted" training images from several diffusion models, similar to [1]. These are generated images that are exact copies of training set images. Our attack is more efficient than [1], and our labeling can extract images that are not exactly the same but vary in fixed spatial locations (see below). Read about it on arXiv: A Reproducible Extraction of Training Images from Diffusion Models [2].

To verify our attack, first generate some images, then download the corresponding images from LAION-2B along with our set of templates / masks, and finally check that their MSE is indeed low enough (or verify by inspection). The code below verifies our whitebox attack on SDV1:

pip install -r requirements.txt
sh verify_sdv1_wb_attack.sh
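The MSE check described above can be illustrated with a minimal, dependency-free sketch. It is not the code run by verify_sdv1_wb_attack.sh: the function names, the mask convention (1 where the generated image is expected to match the training image) and the threshold value are all assumptions made for illustration.

```python
def masked_mse(generated, reference, mask):
    """Mean squared error over the pixels where mask is truthy.

    All three arguments are equally sized 2D lists (rows of floats in [0, 1]).
    Assumption: mask == 1 marks pixels expected to match the training image.
    """
    total, count = 0.0, 0
    for g_row, r_row, m_row in zip(generated, reference, mask):
        for g, r, m in zip(g_row, r_row, m_row):
            if m:
                total += (g - r) ** 2
                count += 1
    if count == 0:
        raise ValueError("mask selects no pixels")
    return total / count


def is_verbatim(generated, reference, mask, threshold=0.01):
    # threshold is a placeholder value, not the one used by the authors
    return masked_mse(generated, reference, mask) < threshold
```

With a mask of all ones this reduces to the plain MSE between the generated and the retrieved image, which is the exact-copy case.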

Update 7.24

You may now run our WB and BB attacks versus SDV1 and SDV2. If you want to run the BB attack versus SDV2 for instance, you can run:

sh configs/run_bb_attack_30k_sdv2.sh

This code will download the file membership_attack_top30k.parquet from Hugging Face, which contains the 30k captions with the top scores of our whitebox attack on SDV1 (the corresponding "bb" attacks vs. SDV1 are not entirely BB, but are there to study the attack). A blackbox attack can be performed on a much larger set, selected through deduplication, with:

sh configs/run_wb_attack_2M.sh

Roadmap

[1] Extracting training data from diffusion models. arXiv preprint arXiv:2301.13188, 2023.

[2] A Reproducible Extraction of Training Images from Diffusion Models.

Some regurgitated prompts

Regurgitated prompts can be found in the following parquets. Each prompt is labeled 'MV', 'RV' or 'TV' in the 'overfit_type' field:

*** NOTE: the Midjourney examples only apply to v4. The new version (v5) seems to have mitigated the problem, so you must append --v 4 to the prompts. ***

Midjourney examples

prompt	type
Prince Reunites With Warner Brothers, Plans New Album --seed 2 --stylize 0 --stop 10 --v 4	Exact
Will Ferrell, John C. Reilly in Talks for Border Guards --seed 0 --stylize 0 --stop 10 --v 4	Exact
Design Art Light in Dense Fall Forest Landscape Oversized Circle Wall Art --seed 4 --q 2 --v 4	Template
Shaw Floors Spice It Up Tyler Taupe 00103_E9013 --seed 16 --stylize 0 --stop 10 --v 4	Template

Stable Diffusion V1

prompt	type
Galaxy Women's Leather Boots	Template
Rambo 5 und Rocky Spin-Off - Sylvester Stallone gibt Updates	Exact
Full body U-Zip main opening - Full body U-Zip main opening on front of bag for easy unloading when you get to camp	Exact
Mothers influence on her young hippo	Exact

DeepFloyd IF

prompt	type
Designart Green Abstract Metal Grill ContemporaryArt On Canvas - 5 Panels	Template
New York Luggage Cover - New York / S - Luggage covers	Template
Foyer painted in HABANERO	Template
Shaw Floors Value Collections Sandy Hollow Cl II Net Alpine Fern 00305_5E510	Template

Stable Diffusion V2

prompt	type
Pencil pleat curtains in collection Velvet, fabric: 704-18	Template
Skull Of A Skeleton With A Burning Cigarette - Vincent Van Gogh Wall Tapestry	Template
Shaw Floors Couture' Collection Ultimate Expression 15′ Sahara 00205_19829	Template
Sting Like A Bee By Louisa - Throw Pillow	Template

Some other files

Top 30K scores for the whitebox attack

The prompts for the 2M most duplicated images

Template Verbatims

Template verbatims for various networks: left is the generated image, middle is the retrieved image and right is the extracted mask. Template verbatims originate from images that vary in fixed spatial locations in LAION-2B, for instance the carpet color in an e-commerce image (top-left). These images are generated in a many-to-many fashion: for instance, the same prompt will generate both the top-left and bottom-right images, which come from the "Shaw Floors" prompts.
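One way to picture such a mask is to compare several near-duplicate images pixel by pixel: locations that agree across all of them belong to the fixed template, and locations that differ are the varying part. The sketch below is only an illustration under that assumption, not the mask extraction used in this repository; template_mask and var_threshold are hypothetical names and values.

```python
from statistics import pvariance


def template_mask(images, var_threshold=0.01):
    """Return a 2D mask: 1 where pixels agree across images, 0 where they vary.

    images is a list of equally sized 2D lists (rows of floats in [0, 1]).
    var_threshold is a placeholder value chosen for illustration.
    """
    height, width = len(images[0]), len(images[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Population variance of this pixel across all images
            values = [img[y][x] for img in images]
            row.append(1 if pvariance(values) <= var_threshold else 0)
        mask.append(row)
    return mask
```

In the carpet example, the pixels covering the carpet would receive 0 (the color changes between images) and the rest of the product photo would receive 1.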

Idea behind attack

Training images can be extracted from Stable Diffusion in one step. In the first row, a verbatim copy is synthesized from the caption corresponding to the image in the second-to-last column. In the second row, we present verbatim copies that are harder to detect: template verbatims. They typically represent many-to-many mappings (many captions synthesize many verbatim templates), and thus the ground truth is constructed with retrieval (rightmost column). Non-verbatims have no match, even when retrieving over the entire dataset.

Our attack exploits this fast appearance by separating the realistic images in the first two columns from the blurry one in the last column.
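To make the idea of separating realistic from blurry one-step outputs concrete, here is a deliberately simple sketch. It uses the mean absolute difference between neighboring pixels as a sharpness proxy; this is a stand-in chosen for illustration, not the score used by the whitebox or blackbox attack, and the function names and threshold are assumptions.

```python
def sharpness(image):
    """Mean absolute difference between horizontally and vertically adjacent pixels.

    image is a 2D list (rows of floats in [0, 1]). Blurry images score low.
    """
    height, width = len(image), len(image[0])
    total, count = 0.0, 0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:  # right neighbor
                total += abs(image[y][x + 1] - image[y][x])
                count += 1
            if y + 1 < height:  # bottom neighbor
                total += abs(image[y + 1][x] - image[y][x])
                count += 1
    return total / count if count else 0.0


def looks_realistic(image, threshold=0.1):
    # threshold is a placeholder; a real attack would calibrate it on data
    return sharpness(image) > threshold
```

Captions whose one-step output scores high under such a proxy would be candidates for closer inspection against the training set.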
