@@ -2539,13 +2539,10 @@ def web_scrape_sitemap(
         timeout: float | httpx.Timeout | None | NotGiven = not_given,
     ) -> BrandWebScrapeSitemapResponse:
         """
-        Crawls the sitemap of the given domain and returns all discovered page URLs.
-        Supports sitemap index files (recursive), parallel fetching with concurrency
-        control, deduplication, and filters out non-page resources (images, PDFs, etc.).
+        Crawl an entire website's sitemap and return all discovered page URLs

         Args:
-          domain: Domain name to crawl sitemaps for (e.g., 'example.com'). The domain will be
-              automatically normalized and validated.
+          domain: Domain to build a sitemap for

           max_links: Maximum number of links to return from the sitemap crawl. Defaults to 10,000.
               Minimum is 1, maximum is 100,000.
@@ -5054,13 +5051,10 @@ async def web_scrape_sitemap(
         timeout: float | httpx.Timeout | None | NotGiven = not_given,
     ) -> BrandWebScrapeSitemapResponse:
         """
-        Crawls the sitemap of the given domain and returns all discovered page URLs.
-        Supports sitemap index files (recursive), parallel fetching with concurrency
-        control, deduplication, and filters out non-page resources (images, PDFs, etc.).
+        Crawl an entire website's sitemap and return all discovered page URLs

         Args:
-          domain: Domain name to crawl sitemaps for (e.g., 'example.com'). The domain will be
-              automatically normalized and validated.
+          domain: Domain to build a sitemap for

           max_links: Maximum number of links to return from the sitemap crawl. Defaults to 10,000.
               Minimum is 1, maximum is 100,000.
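
Note for reviewers: the shortened docstring drops the earlier description of deduplication and non-page filtering, while the `max_links` bounds (default 10,000, range 1 to 100,000) are unchanged. The sketch below illustrates what those documented behaviors could look like on a plain list of URLs. It is not the SDK's or the service's implementation: `select_page_urls`, the constant names, and the extension list are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlsplit

# Bounds taken from the max_links docstring; the constant names are hypothetical.
DEFAULT_MAX_LINKS = 10_000
MIN_MAX_LINKS = 1
MAX_MAX_LINKS = 100_000

# Assumed set of "non-page" extensions; the actual list is not shown in the diff.
NON_PAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".pdf")


def select_page_urls(urls, max_links=DEFAULT_MAX_LINKS):
    """Deduplicate URLs, drop non-page resources, and cap the result size."""
    if not MIN_MAX_LINKS <= max_links <= MAX_MAX_LINKS:
        raise ValueError(
            f"max_links must be between {MIN_MAX_LINKS} and {MAX_MAX_LINKS}"
        )

    seen = set()
    pages = []
    for url in urls:
        # Compare only the path so query strings do not hide the extension.
        path = urlsplit(url).path.lower()
        if path.endswith(NON_PAGE_EXTENSIONS):
            continue
        if url in seen:
            continue
        seen.add(url)
        pages.append(url)
        if len(pages) >= max_links:
            break
    return pages
```

Passing a list that contains a duplicate page, an image, and a PDF would return only the distinct page URLs in their original order, truncated to `max_links` entries.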