Commit 972d9b3

docs for recipe
1 parent 60e84ce commit 972d9b3

2 files changed

Lines changed: 21 additions & 25 deletions

docs/1start/attacks4Components.md

Lines changed: 19 additions & 23 deletions
@@ -1,20 +1,15 @@
-Four Components of TextAttack Attacks
-========================================
-
-To unify adversarial attack methods into one system, We formulate an attack as consisting of four components: a **goal function** which determines if the attack has succeeded, **constraints** defining which perturbations are valid, a **transformation** that generates potential modifications given an input, and a **search method** which traverses through the search space of possible perturbations. The attack attempts to perturb an input text such that the model output fulfills the goal function (i.e., indicating whether the attack is successful) and the perturbation adheres to the set of constraints (e.g., grammar constraint, semantic similarity constraint). A search method is used to find a sequence of transformations that produce a successful adversarial example.
-
+# Four Components of TextAttack Attacks

+To unify adversarial attack methods into one system, We formulate an attack as consisting of four components: a **goal function** which determines if the attack has succeeded, **constraints** defining which perturbations are valid, a **transformation** that generates potential modifications given an input, and a **search method** which traverses through the search space of possible perturbations. The attack attempts to perturb an input text such that the model output fulfills the goal function (i.e., indicating whether the attack is successful) and the perturbation adheres to the set of constraints (e.g., grammar constraint, semantic similarity constraint). A search method is used to find a sequence of transformations that produce a successful adversarial example.

 This modular design enables us to easily assemble attacks from the literature while re-using components that are shared across attacks. TextAttack provides clean, readable implementations of 16 adversarial attacks from the literature. For the first time, these attacks can be benchmarked, compared, and analyzed in a standardized setting.

-
 - Two examples showing four components of two SOTA attacks
-![two-categorized-attacks](/_static/imgs/intro/01-categorized-attacks.png)
-
+![two-categorized-attacks](/_static/imgs/intro/01-categorized-attacks.png)

-- You can create one new attack (in one line of code!!!) from composing members of four components we proposed, for instance:
+- You can create one new attack (in one line of code!!!) from composing members of four components we proposed, for instance:

-```bash
+```bash
 # Shows how to build an attack from components and use it on a pre-trained model on the Yelp dataset.
 textattack attack --attack-n --model bert-base-uncased-yelp --num-examples 8 \
 --goal-function untargeted-classification \
@@ -39,27 +34,20 @@ A `Transformation` takes as input an `AttackedText` and returns a list of possib

 A `SearchMethod` takes as input an initial `GoalFunctionResult` and returns a final `GoalFunctionResult` The search is given access to the `get_transformations` function, which takes as input an `AttackedText` object and outputs a list of possible transformations filtered by meeting all of the attack’s constraints. A search consists of successive calls to `get_transformations` until the search succeeds (determined using `get_goal_results`) or is exhausted.

-
-
 ### On Benchmarking Attack Recipes

-- Please read our analysis paper: Searching for a Search Method: Benchmarking Search Algorithms for Generating NLP Adversarial Examples at [EMNLP BlackBoxNLP](https://arxiv.org/abs/2009.06368).
-
-- As we emphasized in the above paper, we don't recommend to directly compare Attack Recipes out of the box.
-
-- This is due to that attack recipes in the recent literature used different ways or thresholds in setting up their constraints. Without the constraint space held constant, an increase in attack success rate could come from an improved search or a better transformation method or a less restrictive search space.
+- Please read our analysis paper: Searching for a Search Method: Benchmarking Search Algorithms for Generating NLP Adversarial Examples at [EMNLP BlackBoxNLP](https://arxiv.org/abs/2009.06368).

+- As we emphasized in the above paper, we don't recommend to directly compare Attack Recipes out of the box.

+- This is due to that attack recipes in the recent literature used different ways or thresholds in setting up their constraints. Without the constraint space held constant, an increase in attack success rate could come from an improved search or a better transformation method or a less restrictive search space.

-### Four components in Attack Recipes we have implemented
-
+### Four components in Attack Recipes we have implemented

 - TextAttack provides clean, readable implementations of 16 adversarial attacks from the literature.

 - To run an attack recipe: `textattack attack --recipe [recipe_name]`

-
-
 <table style="width:100%" border="1">
 <thead>
 <tr class="header">
@@ -224,13 +212,21 @@ A `SearchMethod` takes as input an initial `GoalFunctionResult` and returns a fi
 <td ><sub>Greedy attack with goal of changing every word in the output translation. Currently implemented as black-box with plans to change to white-box as done in paper (["Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples" (Cheng et al., 2018)](https://arxiv.org/abs/1803.01128)) </sub> </td>
 </tr>

+<tr><td style="text-align: center;" colspan="6"><strong><br>General: <br></strong></td></tr>
+
+<tr class="odd">
+<td style="text-align: left;"><code>bad-characters</code> <span class="citation" data-cites=""></span></td>
+<td style="text-align: left;"><sub>TargetedClassification, TargetedStrict, TargetedBonus, NamedEntityRecognition, LogitSum, MinimizeBleu, MaximizeLevenshtein</sub></td>
+<td style="text-align: left;"></td>
+<td style="text-align: left;"><sub>(Homoglyph, Invisible Characters, Reorderings, Deletions) Word Swap</sub></td>
+<td style="text-align: left;"><sub>DifferentialEvolution</sub></td>
+<td><sub>Uses imperceptible character-level perturbations including homoglyph substitutions, Unicode reordering, deletions, and invisibles. Based on (["Bad Characters: Imperceptible NLP Attacks" (Boucher et al., 2021)](https://arxiv.org/abs/2106.09898)).</sub></td>
+</tr>

 </tbody>
 </font>
 </table>

-
-
 - Citations

 ```
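
Reader's note on `attacks4Components.md`: the four components and the search loop that file describes (successive calls to `get_transformations` until the goal is met or the search is exhausted) can be illustrated with a minimal, library-free sketch. Everything below is a toy stand-in invented for illustration (the synonym table, `toy_model`, `within_budget`, and the breadth-first `search`); none of it is TextAttack's API, and only the name `get_transformations` is borrowed from the docs.

```python
from typing import List, Optional

# Hypothetical synonym table; a real transformation would use e.g. WordNet.
SYNONYMS = {"good": ["fine", "decent"], "great": ["okay"]}


def toy_model(text: str) -> str:
    """Toy classifier: 'positive' if a strongly positive word is present."""
    return "positive" if {"good", "great"} & set(text.split()) else "negative"


def goal_reached(original: str, candidate: str) -> bool:
    """Goal function (untargeted): succeed when the predicted label changes."""
    return toy_model(candidate) != toy_model(original)


def transformation(text: str) -> List[str]:
    """Transformation: every text reachable by one single-word synonym swap."""
    words = text.split()
    return [
        " ".join(words[:i] + [syn] + words[i + 1:])
        for i, word in enumerate(words)
        for syn in SYNONYMS.get(word, [])
    ]


def within_budget(original: str, candidate: str, max_changed: int = 2) -> bool:
    """Constraint: at most `max_changed` word positions differ from the original."""
    changed = sum(a != b for a, b in zip(original.split(), candidate.split()))
    return changed <= max_changed


def get_transformations(original: str, current: str) -> List[str]:
    """Transformations of `current`, filtered by the constraint."""
    return [c for c in transformation(current) if within_budget(original, c)]


def search(original: str) -> Optional[str]:
    """Search method (breadth-first here): expand candidates until the goal
    function is satisfied or no unseen candidates remain."""
    frontier, seen = [original], {original}
    while frontier:
        next_frontier = []
        for text in frontier:
            for cand in get_transformations(original, text):
                if cand in seen:
                    continue
                seen.add(cand)
                if goal_reached(original, cand):
                    return cand
                next_frontier.append(cand)
        frontier = next_frontier
    return None  # search exhausted without success


if __name__ == "__main__":
    print(search("a good and great movie"))
```

Swapping any one function (a different constraint, a greedy rather than breadth-first search) leaves the other three untouched, which is the modularity the documentation argues for.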

docs/3recipes/attack_recipes_cmd.md

Lines changed: 2 additions & 2 deletions
@@ -227,11 +227,11 @@ To run an attack recipe: `textattack attack --recipe [recipe_name]`

 <tr>
 <td><code>bad-characters</code> <span class="citation" data-cites=""></span></td>
-<td><sub>Targeted classification, Strict targeted classification, Named entity recognition, Logit sum, Minimize Bleu score, Maximize Levenshtein score</sub> </td>
+<td><sub>TargetedClassification, TargetedStrict, TargetedBonus, NamedEntityRecognition, LogitSum, MinimizeBleu, MaximizeLevenshtein</sub> </td>
 <td></td>
 <td><sub>(Homoglyph, Invisible Characters, Reorderings, Deletions) Word Swap</sub> </td>
 <td><sub>DifferentialEvolution</sub></td>
-<td ><sub> (["Bad Characters: Imperceptible NLP Attacks" (Boucher et al., 2021)](https://arxiv.org/abs/2106.09898)) </sub> </td>
+<td ><sub>Uses imperceptible character-level perturbations including homoglyph substitutions, Unicode reordering, deletions, and invisibles. Based on (["Bad Characters: Imperceptible NLP Attacks" (Boucher et al., 2021)](https://arxiv.org/abs/2106.09898)).</sub> </td>
 </tr>

 </tbody>
