Search API
Go from a search query to fully extracted web content in a single request. Spider searches the web, discovers relevant pages, crawls each result, and returns clean content.
One Query, Many Results
A single search request fans out across the web and converges back to structured content.
How Search + Crawl Works
You Send a Query
A plain-text search query — the same kind you'd type into a search engine. Set search_limit to control how many results to crawl.
Spider Searches the Web
Queries major search engines and collects the top result URLs. Use country_code for geo-targeted results.
Each Result Is Crawled
Every discovered page is loaded, rendered with a real browser if needed, and its content extracted in your chosen format.
Clean Content Returned
Structured content from every result page delivered as markdown, text, HTML, or raw bytes — ready for your pipeline.
Without Spider Search
# Step 1: Search API
serp = serp_api.search("query")
urls = [r.url for r in serp.results]

# Step 2: Scrape each result
pages = []
for url in urls:
    html = scraper.fetch(url)
    text = parse_and_clean(html)
    pages.append(text)

# 3 services, N+1 API calls, error handling...
With Spider Search
# One call does everything
pages = spider.search(
    "query",
    params={
        "search_limit": 5,
        "return_format": "markdown",
    }
)

# Done. 1 service, 1 call, full content.
Key Capabilities
Configurable Result Count
Set search_limit to control how many search results Spider crawls — from 1 to dozens. Balance thoroughness against cost and latency.
Deep Crawl Results
Combine search_limit with limit to crawl deeper into each discovered site instead of only scraping the result pages themselves.
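As a sketch of how the two limits compose (parameter names are from this page; `deep_crawl_params` is a hypothetical helper, and the exact depth semantics may vary):

```python
# Assumed composition: search_limit bounds how many search results are
# used as entry points, while limit bounds pages crawled per result site.
def deep_crawl_params(query: str, results: int, pages_per_site: int) -> dict:
    return {
        "search": query,
        "search_limit": results,       # top-N search results to crawl
        "limit": pages_per_site,       # max pages crawled within each site
        "return_format": "markdown",
    }

params = deep_crawl_params("observability tooling comparison", 3, 10)
# client.search(params["search"], params=params)  # assumed call shape
```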
Geo-Targeted Search
Use country_code to get localized results. See what users in different regions find for the same query.
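One way to compare regions is to fan the same query out over several country codes (`regional_queries` is a hypothetical helper; only `country_code`, `search_limit`, and `return_format` come from this page):

```python
# Build one request per region by varying country_code.
def regional_queries(query: str, countries: list[str], limit: int = 5) -> dict:
    return {
        cc: {
            "search": query,
            "search_limit": limit,
            "country_code": cc,        # geo-target the underlying search
            "return_format": "markdown",
        }
        for cc in countries
    }

batches = regional_queries("best vpn", ["us", "de", "jp"])
# for cc, params in batches.items():
#     pages = client.search(params["search"], params=params)
```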
All Output Formats
Get search-sourced content as markdown, text, HTML, or raw bytes — the same formatting and cleaning as crawl and scrape.
Metadata Enrichment
Enable metadata to get page titles, descriptions, and keywords alongside extracted content for building search indexes.
Streaming Support
Use JSONL content type to stream results as each page is crawled and processed. Start consuming data immediately.
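JSONL delivers one JSON object per line, so each crawled page can be consumed as soon as its line arrives. A minimal parsing sketch, using a simulated stream in place of a real HTTP response body (how you request JSONL from the API, e.g. via a content-type header, is an assumption here):

```python
import json
from typing import Iterable, Iterator

def iter_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed page per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:                       # skip keep-alive blank lines
            yield json.loads(line)

# Simulated stream of two crawled pages; real lines would arrive
# incrementally from the response body as each page is processed.
stream = [
    '{"url": "https://example.com/a", "content": "# Page A"}',
    '{"url": "https://example.com/b", "content": "# Page B"}',
]
for page in iter_jsonl(stream):
    print(page["url"])
```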
Code Examples
from spider import Spider
client = Spider()
# Search for a topic and get content from top results
results = client.search(
    "RAG best practices for LLMs",
    params={
        "search_limit": 5,
        "return_format": "markdown",
        "metadata": True,
    }
)

for page in results:
    print(f"{page['url']}: {page['metadata']['title']}")

curl -X POST https://api.spider.cloud/search \
-H "Authorization: Bearer $SPIDER_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"search": "cloud hosting providers",
"search_limit": 10,
"return_format": "markdown",
"country_code": "de"
}'

import Spider from "@spider-cloud/spider-client";
const client = new Spider();
const results = await client.search("cloud hosting providers", {
  search_limit: 10,
  return_format: "markdown",
  metadata: true,
});
results.forEach(r => console.log(r.url, r.metadata?.title));

Popular Use Cases
RAG with Live Web Data
Feed a user's question into the search API, retrieve relevant pages, and pass the content to an LLM for grounded, up-to-date answers. No need to maintain a static corpus.
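The glue between retrieved pages and the model can be as small as a context builder. A hedged sketch (`build_context` and the page shape `{url, content}` are illustrative, not a documented API):

```python
def build_context(pages: list[dict], max_chars: int = 4000) -> str:
    """Concatenate search-sourced pages into an LLM prompt context."""
    chunks, used = [], 0
    for page in pages:
        snippet = f"Source: {page['url']}\n{page['content']}\n"
        if used + len(snippet) > max_chars:
            break                      # stay within the model's budget
        chunks.append(snippet)
        used += len(snippet)
    return "\n".join(chunks)

# In practice, pages would come from client.search(question, params=...).
pages = [
    {"url": "https://example.com/rag", "content": "Chunk documents before embedding."},
    {"url": "https://example.com/llm", "content": "Ground answers in retrieved text."},
]
prompt_context = build_context(pages)
# answer = llm.answer(question, context=prompt_context)  # your LLM of choice
```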
Market Research
Search for competitor names, product categories, or industry terms and automatically collect the latest information from multiple sources in a single request.
Content Curation
Build automated pipelines that discover and aggregate the best articles on specific topics for newsletters, research reports, or knowledge bases.
SERP Monitoring
Track how search results change over time for keywords that matter to your business. Compare results across regions using country_code targeting.
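Once you have result URLs from two runs (two dates, or two country_code values), the comparison itself is a set diff. A minimal sketch (`serp_diff` is a hypothetical helper):

```python
def serp_diff(old_urls: list[str], new_urls: list[str]) -> dict:
    """Report which URLs entered, dropped out of, or held their place."""
    old_set, new_set = set(old_urls), set(new_urls)
    return {
        "entered": sorted(new_set - old_set),   # newly ranking pages
        "dropped": sorted(old_set - new_set),   # pages that fell out
        "stable": sorted(old_set & new_set),
    }

last_week = ["https://a.example", "https://b.example"]
today = ["https://b.example", "https://c.example"]
diff = serp_diff(last_week, today)
```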