POST /links

Links API

Discover every URL on a website without the overhead of content extraction. The Links endpoint crawls a domain and returns the complete link structure — faster and cheaper than a full crawl when you only need URLs.

The Recommended Workflow

Use Links to discover URLs first, then selectively scrape only what you need.

1. Discover (Links API): Get all URLs from a domain. Fast and cheap, with no content extraction overhead. → 1,024 URLs found

2. Filter (Your Code): Filter URLs programmatically. Keep only blog posts, product pages, or whatever you need. → 47 blog posts matched

3. Extract (Scrape API): Scrape only the filtered URLs. Pay for content extraction only where it matters. → 47 pages as markdown

Links vs. Crawl

Links endpoint: URL discovery with faster response times and lower cost per page. Returns no page content or metadata.

Crawl endpoint: URL discovery plus full page content, metadata extraction, and Markdown/HTML/text output formats, at a higher cost per page.

Key Capabilities

Lower Cost per URL

By skipping content extraction and format conversion, the Links endpoint uses fewer credits per page. Ideal for large-scale discovery.

Faster Response

Without rendering and extracting content, Links returns results faster. Lower latency means quicker iteration on discovery workflows.

Full Crawl Parameters

The Links endpoint accepts the same depth, limit, subdomains, and TLD controls as crawl, giving you the same knobs for scoping discovery.
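As a minimal sketch, a scoped Links request might carry parameters like the following. The parameter names limit, depth, and subdomains appear elsewhere on this page; tld is assumed from the "TLD controls" mention, so check the API reference for its exact name and semantics.

```python
# Scoping parameters for a Links request, mirroring crawl's controls.
# "tld" is an assumed name for the TLD control mentioned in the docs.
params = {
    "limit": 1000,       # stop after 1,000 URLs (0 = no limit)
    "depth": 5,          # follow links at most 5 hops from the start URL
    "subdomains": True,  # include docs.example.com, blog.example.com, ...
    "tld": False,        # don't cross to sibling TLDs (example.org, ...)
}

# The params dict would then be passed to the client, e.g.:
# links = client.links("https://example.com", params=params)
print(sorted(params))
```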

Subdomain Discovery

Enable subdomains to discover URLs across all of a site's subdomains and map the full structure of organizations with a complex web presence.

External Domain Linking

Track outbound links to external domains using external_domains. Analyze a site's link profile and relationships.
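Once external links are included in the results, a simple post-processing pass can separate a site's own URLs from outbound ones. This is a generic sketch, assuming each returned link is a plain URL string; it is not part of the API itself.

```python
from urllib.parse import urlparse

def split_links(urls, site_host):
    """Separate discovered URLs into internal and external lists."""
    internal, external = [], []
    for url in urls:
        host = urlparse(url).hostname or ""
        # Treat the site host and its subdomains as internal.
        if host == site_host or host.endswith("." + site_host):
            internal.append(url)
        else:
            external.append(url)
    return internal, external

# Hypothetical results from a links request with external_domains enabled.
urls = [
    "https://example.com/about",
    "https://blog.example.com/post-1",
    "https://github.com/example/repo",
]
internal, external = split_links(urls, "example.com")
print(len(internal), len(external))  # 2 1
```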

Streaming Output

Stream discovered URLs as they're found using JSONL content type. Build real-time pipelines that process URLs as the crawl progresses.
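A streaming consumer only needs to parse one JSON object per line as it arrives. The sketch below simulates the stream with a list of strings and assumes each record carries a "url" field; in practice you would iterate the HTTP response line by line (for example, `requests.post(..., stream=True).iter_lines()`).

```python
import json

def iter_urls(jsonl_lines):
    """Yield URLs from a JSONL stream as each line arrives."""
    for line in jsonl_lines:
        if not line.strip():
            continue
        record = json.loads(line)
        # Assumed record shape: each line is an object with a "url" field.
        yield record["url"]

# Simulated stream standing in for a real JSONL HTTP response.
stream = [
    '{"url": "https://example.com/"}',
    '{"url": "https://example.com/blog/"}',
]
print(list(iter_urls(stream)))
```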

Code Examples

Python:

from spider import Spider

client = Spider()

# Get all URLs from a website
links = client.links(
    "https://example.com",
    params={
        "limit": 0,  # No limit - discover everything
        "subdomains": True,
    }
)

for link in links:
    print(link["url"])
print(f"Total: {len(links)} URLs found")

cURL:

curl -X POST https://api.spider.cloud/links \
  -H "Authorization: Bearer $SPIDER_API_KEY" \
  -H "Content-Type: application/jsonl" \
  -d '{
    "url": "https://example.com",
    "limit": 1000,
    "depth": 5,
    "subdomains": true
  }'

JavaScript:

import Spider from "@spider-cloud/spider-client";

const client = new Spider();

// Step 1: Discover all URLs
const links = await client.links("https://example.com", {
  limit: 500,
});

// Step 2: Filter for blog posts
const blogUrls = links
  .map(l => l.url)
  .filter(url => url.includes("/blog/"));

// Step 3: Scrape only the blog posts
const content = await client.scrape(blogUrls.join(","), {
  return_format: "markdown",
});

Popular Use Cases

Sitemap Generation

Build comprehensive sitemaps by discovering every URL on a website. Find pages missing from the existing sitemap.xml.
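A discovered URL list maps directly onto the sitemap protocol's `<urlset>`/`<url>`/`<loc>` structure. This is a minimal sketch using only the standard library; real sitemaps may also want `<lastmod>` entries and the 50,000-URL-per-file limit handled.

```python
from xml.etree import ElementTree as ET

def build_sitemap(urls):
    """Render a list of discovered URLs as sitemap.xml markup."""
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for url in sorted(set(urls)):  # dedupe and keep output stable
        loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
        loc.text = url
    return ET.tostring(urlset, encoding="unicode")

xml = build_sitemap(["https://example.com/", "https://example.com/about"])
print(xml)
```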

SEO Link Auditing

Map internal link structure to identify orphan pages, broken links, and opportunities to improve site architecture.

Pre-Crawl Discovery

Discover URLs first, filter programmatically, then scrape only the pages you need. More efficient than crawling everything.

Change Detection

Periodically collect links to detect new pages, removed pages, or URL structure changes across a domain.
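Detecting changes between two runs reduces to a set difference over the URL lists. A minimal sketch, with hypothetical snapshots standing in for the results of two periodic links requests:

```python
def diff_snapshots(previous, current):
    """Compare two URL snapshots and report added and removed pages."""
    prev_set, curr_set = set(previous), set(current)
    return {
        "added": sorted(curr_set - prev_set),
        "removed": sorted(prev_set - curr_set),
    }

# Hypothetical snapshots from two periodic links runs.
last_week = ["https://example.com/", "https://example.com/old"]
today = ["https://example.com/", "https://example.com/new"]
changes = diff_snapshots(last_week, today)
print(changes)
```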
