Commit a3b10c0

Update README.md
1 parent 778d5c2 commit a3b10c0

1 file changed

Lines changed: 132 additions & 9 deletions

README.md

@@ -12,11 +12,13 @@ This code was written with security best practices in mind, has an
1212
extensive test suite, and has undergone
1313
[adversarial security review](docs/attack_review_ground_rules.md).
1414

15-
----
15+
## Getting Started
1616

1717
[Getting Started](docs/getting_started.md) includes instructions on
1818
how to get started with or without Maven.
1919

20+
## Prepackaged Policies
21+
2022
You can use
2123
[prepackaged policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20190325.1/org/owasp/html/Sanitizers.html):
2224

@@ -25,7 +27,9 @@ PolicyFactory policy = Sanitizers.FORMATTING.and(Sanitizers.LINKS);
2527
String safeHTML = policy.sanitize(untrustedHTML);
2628
```
2729

28-
or the
30+
## Crafting a policy
31+
32+
The
2933
[tests](https://github.com/OWASP/java-html-sanitizer/blob/master/src/test/java/org/owasp/html/HtmlPolicyBuilderTest.java)
3034
show how to configure your own
3135
[policy](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20190325.1/org/owasp/html/HtmlPolicyBuilder.html):
@@ -40,7 +44,9 @@ PolicyFactory policy = new HtmlPolicyBuilder()
4044
String safeHTML = policy.sanitize(untrustedHTML);
4145
```
4246

43-
or you can write
47+
## Custom Policies
48+
49+
You can write
4450
[custom policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20190325.1/org/owasp/html/ElementPolicy.html)
4551
to do things like changing `h1`s to `div`s with a certain class:
4652

@@ -49,9 +55,11 @@ PolicyFactory policy = new HtmlPolicyBuilder()
4955
.allowElements("p")
5056
.allowElements(
5157
new ElementPolicy() {
52-
public String apply(String elementName, List<String> attrs) {
58+
(String elementName, List<String> attrs) -> {
59+
// Add a class attribute.
5360
attrs.add("class");
5461
attrs.add("header-" + elementName);
62+
// Return elementName to include, null to drop.
5563
return "div";
5664
}
5765
}, "h1", "h2", "h3", "h4", "h5", "h6")
@@ -64,14 +72,129 @@ need to be explicitly whitelisted using the `allowWithoutAttributes()`
6472
method if you want them to be allowed through the filter when these
6573
elements do not include any attributes.
6674

67-
----
75+
[Attribute policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20190325.1/org/owasp/html/AttributePolicy.html) allow running custom code too. Adding an attribute policy will not water down any default policy like `style` or URL attribute checks.
76+
77+
```Java
78+
PolicyFactory policy = new HtmlPolicyBuilder()
79+
.allowElements("div", "span")
80+
.allowAttributes("data-foo")
81+
.matching(
82+
(String elementName, String attributeName, String value) -> {
83+
// Return value for the attribute or null to drop.
return value;
84+
})
85+
.onElements("div", "span")
86+
.toFactory();
87+
```
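
As a rough illustration of what the body of such an attribute policy might do, here is a minimal sketch written as a plain static method so it runs without the sanitizer library. The class name `DataFooCheck` and the allowed-value pattern are hypothetical; only the three-argument shape and the "return the value or null to drop" contract come from the example above.

```java
import java.util.regex.Pattern;

final class DataFooCheck {
  // Hypothetical rule: accept short lowercase tokens made of letters, digits, and hyphens.
  private static final Pattern SAFE_VALUE = Pattern.compile("[a-z0-9-]{1,32}");

  /** Returns the value unchanged when it matches the rule, or null to drop the attribute. */
  static String apply(String elementName, String attributeName, String value) {
    return SAFE_VALUE.matcher(value).matches() ? value : null;
  }

  public static void main(String[] args) {
    System.out.println(apply("div", "data-foo", "item-42"));
    System.out.println(apply("div", "data-foo", "<script>"));
  }
}
```

The same logic could be placed inside the lambda passed to `.matching(...)`.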
88+
89+
## Preprocessors
90+
91+
Preprocessors allow inserting text and large scale structural changes.
92+
93+
```Java
94+
PolicyFactory policy = new HtmlPolicyBuilder()
95+
// Use a preprocessor to be backwards compatible with the
96+
// <plaintext> element which
97+
.withPreprocessor(
98+
(HtmlStreamEventReceiver r) -> {
99+
// Provide user with info about links before they click.
100+
// Before: <a href="https://example.com/...">
101+
// After: (https://example.com) <a href="https://example.com/...">
102+
return new HtmlStreamEventReceiverWrapper(r) {
103+
@Override public void openTag(String elementName, List<String> attrs) {
104+
if ("a".equals(elementName)) {
105+
for (int i = 0, n = attrs.size(); i < n; i += 2) {
106+
if ("href".equals(attrs.get(i))) {
107+
String url = attrs.get(i + 1);
108+
String origin;
109+
try {
110+
URI uri = new URI(url);
111+
String scheme = uri.getScheme();
112+
String authority = uri.getRawAuthority();
113+
if (scheme == null && authority == null) {
114+
origin = null;
115+
} else {
116+
origin = (scheme != null ? scheme + ":" : "")
117+
+ (authority != null ? "//" + authority : "");
118+
}
119+
} catch (URISyntaxException ex) {
120+
origin = "about:invalid";
121+
}
122+
if (origin != null) {
123+
text(" (" + origin + ") ");
124+
}
125+
}
126+
}
127+
}
128+
super.openTag(elementName, attrs);
129+
}
130+
};
131+
})
132+
.allowElements("a")
133+
...
134+
.toFactory();
135+
136+
```
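
The origin computation inside the preprocessor above depends only on `java.net.URI`, so it can be tried in isolation. The following is a minimal sketch that lifts that logic into a standalone helper; the class name `OriginSketch` and the method name `originOf` are hypothetical and not part of the library.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class OriginSketch {
  /**
   * Returns the scheme and authority of the URL, null when it has neither
   * (for example a relative path), or "about:invalid" when it cannot be parsed.
   */
  static String originOf(String url) {
    try {
      URI uri = new URI(url);
      String scheme = uri.getScheme();
      String authority = uri.getRawAuthority();
      if (scheme == null && authority == null) {
        return null;
      }
      return (scheme != null ? scheme + ":" : "")
          + (authority != null ? "//" + authority : "");
    } catch (URISyntaxException ex) {
      return "about:invalid";
    }
  }

  public static void main(String[] args) {
    System.out.println(originOf("https://example.com/a/b?q=1")); // https://example.com
    System.out.println(originOf("/relative/path"));              // null
    System.out.println(originOf("http://exa mple.com/"));        // about:invalid
  }
}
```

Relative links produce no origin, so the preprocessor inserts no text for them, while unparseable links are labeled `about:invalid`.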
137+
138+
Preprocessing happens before a policy is applied, so it cannot affect the security
139+
of the output.
140+
141+
## Telemetry
142+
143+
When a policy rejects an element or attribute it notifies an [HtmlChangeListener](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20190325.1/org/owasp/html/HtmlChangeListener.html).
144+
145+
You can use this to keep track of policy violation trends and find out when someone
146+
is making an effort to breach your security.
147+
148+
```Java
149+
PolicyFactory myPolicyFactory = ...;
150+
// If you need to associate reports with some context, you can do so.
151+
MyContextClass myContext = ...;
152+
153+
String sanitizedHtml = myPolicyFactory.sanitize(
154+
unsanitizedHtml,
155+
new HtmlChangeListener<MyContextClass>() {
156+
@Override
157+
public void discardedTag(MyContextClass context, String elementName) {
158+
// ...
159+
}
160+
@Override
161+
public void discardedAttributes(
162+
MyContextClass context, String elementName, String... attributeNames) {
163+
// ...
164+
}
165+
},
166+
myContext);
167+
```
168+
169+
**Note**: If a string sanitizes with no change notifications, it is not the case
170+
that the input string is necessarily safe to use. Only use the output of the sanitizer.
171+
172+
The sanitizer ensures that the output is in a sub-set of HTML that commonly
173+
used HTML parsers will agree on the meaning of, but the absence of
174+
notifications does not mean that the input is in such a sub-set,
175+
only that it does not contain elements or attributes that were removed.
176+
177+
See ["Why sanitize when you can validate"](https://github.com/OWASP/java-html-sanitizer/blob/master/docs/html-validation.md) for more on this topic.
178+
179+
## Questions?
68180

69-
Subscribe to the
70-
[mailing list](http://groups.google.com/group/owasp-java-html-sanitizer-support)
71-
to be notified of known [Vulnerabilities](docs/vulnerabilities.md).
72181
If you wish to report a vulnerability, please see
73182
[AttackReviewGroundRules](docs/attack_review_ground_rules.md).
74183

75-
----
184+
Subscribe to the
185+
[mailing list](http://groups.google.com/group/owasp-java-html-sanitizer-support)
186+
to be notified of known [Vulnerabilities](docs/vulnerabilities.md) and important updates.
187+
188+
## Contributing
189+
190+
If you would like to contribute, please ping [@mvsamuel](https://twitter.com/mvsamuel) or [@manicode](https://twitter.com/manicode).
191+
192+
We welcome [issue reports](https://github.com/OWASP/java-html-sanitizer/issues) and PRs.
193+
PRs that change behavior or that add functionality should include both positive and
194+
[negative tests](https://www.guru99.com/negative-testing.html).
195+
196+
Please be aware that contributions fall under the [Apache 2.0 License](https://github.com/OWASP/java-html-sanitizer/blob/master/COPYING).
197+
198+
## Credits
76199

77200
[Thanks to everyone who has helped with criticism and code](docs/credits.md)
