Use can_ada for supported URLs by AdrianAtZyte · Pull Request #261 · scrapy/w3lib

AdrianAtZyte · 2026-06-10T09:50:12Z

Fixes #98, fixes #204, closes #221, fixes #222, closes #270.

Some of the tests in #259 would still not pass with these changes, but I suspect that may be OK (can_ada / WHATWG diverge from stdlib).

codecov · 2026-06-10T09:51:33Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.39%. Comparing base (d5877fe) to head (38f52bf).

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #261      +/-   ##
==========================================
- Coverage   98.70%   98.39%   -0.31%     
==========================================
  Files           9        9              
  Lines         848      872      +24     
  Branches      172      175       +3     
==========================================
+ Hits          837      858      +21     
- Misses          4        9       +5     
+ Partials        7        5       -2

Files with missing lines	Coverage Δ
w3lib/_url.py	`98.98% <100.00%> (+0.60%)`	⬆️

... and 1 file with indirect coverage changes


codspeed-hq · 2026-06-10T09:52:27Z

Merging this PR will improve performance by 22.42%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments, which may affect the accuracy of the results.

⚡ 6 improved benchmarks
✅ 39 untouched benchmarks

Performance Changes

	Benchmark	`BASE`	`HEAD`	Efficiency
⚡	`test_benchmark_url_cold[parse_url]`	145.9 µs	111.5 µs	+30.85%
⚡	`test_benchmark_url_cold[add_or_replace_parameter]`	746.9 µs	574 µs	+30.12%
⚡	`test_benchmark_url_cold[safe_url_string]`	1.9 ms	1.5 ms	+24.94%
⚡	`test_benchmark_url_cold[safe_download_url]`	392.5 µs	318.2 µs	+23.36%
⚡	`test_benchmark_url_cold[add_or_replace_parameters]`	328.4 µs	287.8 µs	+14.08%
⚡	`test_benchmark_url_cold[canonicalize_url]`	350.2 µs	311.4 µs	+12.45%
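For readers checking the table: the Efficiency column appears to be the relative speed-up `BASE / HEAD - 1`. That is an inference from the numbers shown here, not something stated in the report, and the helper name `efficiency_gain` below is hypothetical. A minimal sketch that reproduces the first two rows:

```python
def efficiency_gain(base_us: float, head_us: float) -> float:
    """Relative speed-up in percent, assuming the formula base / head - 1."""
    return (base_us / head_us - 1.0) * 100.0


# (BASE, HEAD) timings in microseconds, copied from the table above.
rows = {
    "parse_url": (145.9, 111.5),
    "add_or_replace_parameter": (746.9, 574.0),
}

gains = {name: round(efficiency_gain(base, head), 2) for name, (base, head) in rows.items()}
print(gains)
```

Rows whose timings are displayed with fewer significant digits (for example 1.9 ms and 1.5 ms) will not reproduce the reported percentage exactly, presumably because the report computes it from unrounded measurements.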


Comparing AdrianAtZyte:ada (38f52bf) with master (d5877fe)

wRAR · 2026-06-10T10:49:37Z

        )
        assert isinstance(safeurl, str)
-        assert safeurl == "http://www.example.com/%C2%A3?unit=%B5"
+        assert safeurl == "http://www.example.com/%C2%A3?unit=%C2%B5"


Several of the changed expectations look like they were latin1 before and are UTF-8 now, and in at least some of them a "latin1" argument is passed to the function. I wonder whether all of these are still valid behavior.

AdrianAtZyte · 2026-06-10

My understanding is that latin1 is used to decode bytes to str, and UTF-8 is then used for percent-encoding, which I think aligns with WHATWG.
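To make the difference between the old and new test expectations concrete, here is a minimal sketch using only the standard library's `urllib.parse.quote`. It does not call w3lib or can_ada, and the helper `percent_encode` is a hypothetical name used for illustration. It assumes the behavior described in the reply: the input bytes are decoded as latin1, and the resulting text is then percent-encoded with either latin1 (old expectation) or UTF-8 (new expectation).

```python
from urllib.parse import quote


def percent_encode(raw: bytes, decode_with: str, encode_with: str) -> str:
    """Decode raw bytes to str, then percent-encode that str with another codec."""
    text = raw.decode(decode_with)
    # safe="" so that no reserved characters are left unescaped.
    return quote(text, safe="", encoding=encode_with)


micro = b"\xb5"  # the micro sign as a single latin1 byte

old_expectation = percent_encode(micro, "latin1", "latin1")
new_expectation = percent_encode(micro, "latin1", "utf-8")

print(old_expectation)  # %B5
print(new_expectation)  # %C2%B5
```

In both cases the byte is interpreted as the same character; only the encoding used when writing the percent-escapes differs, which matches the `unit=%B5` to `unit=%C2%B5` change in the diff above.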

wRAR · 2026-06-10T10:58:48Z

> Some of the tests in #259 would still not pass with these changes, but I suspect that may be OK (can_ada / WHATWG diverge from stdlib).

Yes, it says that explicitly so maybe we don't want to have those tests...

Use can_ada for supported URLs

cd37a8f

AdrianAtZyte requested a review from wRAR June 10, 2026 09:50

Improve coverage

4c65f97

wRAR reviewed Jun 10, 2026

Comment thread tests/test_html.py Outdated

wRAR reviewed Jun 10, 2026

AdrianAtZyte mentioned this pull request Jun 10, 2026

Provide alternative URL benchmarks not affected by cache savings #262

Merged

Address pre-commit issue, clarify coded behavior

38aac0c

AdrianAtZyte closed this Jun 10, 2026

AdrianAtZyte reopened this Jun 10, 2026

AdrianAtZyte added 2 commits June 10, 2026 13:13

Merge remote-tracking branch 'origin/master' into ada

1204a47

Try a small performance improvement for relative URLs

38f52bf

AdrianAtZyte wants to merge 5 commits into scrapy:master from AdrianAtZyte:ada
