Search API Reference v1
Last updated: 2022-04-20


The Search API and UI allow you to find archived scans of URLs on urlscan.io. This page is a reference for the fields available when querying the API. See the explanations of field types and visibility below.

Query String Syntax and General Instructions

  • Search requests (through the UI or API) are subject to your individual Search API Quotas. Make sure to use your API key.
  • The query field uses the Elasticsearch Query String syntax to search for results.
  • All queries are run in filter mode, sorted by date with the most recent scans first. There is no scoring of search results.
  • You can group and concatenate search terms with brackets ( ), AND, OR, and NOT. The default operator is AND.
  • You can concatenate terms within a group, e.g. page.domain:(foo.com OR bar.com).
  • Always use the field names of the fields you want to search. Wildcards for the field-name are not supported! Field names are case sensitive!
  • Always escape reserved characters with backslash:
    + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /
  • Limit the time-range if possible using date, e.g. date:>now-7d or date:>now-1y.
  • The date field allows relative queries like date:>now-7d, range-queries like date:[2020-01-01 TO 2020-02-01], or both combined.
  • You can only use leading wildcard searches and regular expression searches on supported fields, and only as a signed-in user.
  • Everything is indexed as lowercase, even if the Search API returns values in a case-preserving manner.
  • Regular expressions are always anchored to the beginning and end of the tokens (implicit ^ and $). Make sure to prefix/suffix with .* to match infix strings.
  • Domain fields contain the whole domain and each smaller domain component, e.g. searching the domain field for google.com will also find hits for www.google.com.
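
The escaping and time-range rules above can be sketched in a few lines of Python. This is a minimal illustration, not an official client: the helper names (escape_term, build_query, build_search_url) are hypothetical, and the endpoint URL and the size parameter are assumptions that do not appear on this page. Sending the request and attaching your API key are left out.

```python
import re
from urllib.parse import urlencode

# Reserved characters from the list above. "&&" and "||" are handled by
# escaping each "&" and "|" individually (a simplification of this sketch).
_RESERVED = re.compile(r'([+\-=&|><!(){}\[\]^"~*?:\\/])')


def escape_term(value: str) -> str:
    """Backslash-escape every reserved character in a literal search value."""
    return _RESERVED.sub(r"\\\1", value)


def build_query(clauses, since="now-7d"):
    """Join clauses with an explicit AND and limit the time range via date."""
    return " AND ".join([f"date:>{since}"] + list(clauses))


def build_search_url(query, size=100, base="https://urlscan.io/api/v1/search/"):
    # ASSUMPTION: `base` and `size` are illustrative and not taken from this page.
    return base + "?" + urlencode({"q": query, "size": size})


query = build_query(["page.domain:(foo.com OR bar.com)"])
url = build_search_url(query)
```

For example, escape_term("https://foo.com/a?b=1") yields https\:\/\/foo.com\/a\?b\=1. Only literal values should be escaped; the field name, the colon that follows it, and operators such as the brackets in page.domain:(...) stay as they are.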

Recent Changes

January 8, 2024 - System Labels, User Tags, Meta Hits

  • Additional Fields: We have added additional fields for searching:
    • meta: Contains meta information about the scan and matched searches, e.g. the IDs of Saved Searches that this item has matched.
    • labels: Contains system labels controlled by urlscan.
    • usertags: Contains user-defined tags applied by Saved Searches.

Field Type Legend

The available search fields can have different types, which dictate how they can be searched. It is important to understand the limits of each type to fully utilise our Search API.

  • — Field can be searched and is returned in the search results.
  • — Field can be searched, but the original value is not returned in the search results.
  • keyword — Field is analysed as one keyword. Search with an exact value or a trailing wildcard. Indexed as lowercase.
  • keyword RE — Field can be searched by regular expression and leading wildcard. Indexed as lowercase.
  • text — Field is analysed as text, broken into multiple tokens (e.g. split on slash in the URL). Phrase search with quotes possible. Indexed as lowercase.
  • date — Field is analysed as date, allowing range-queries and date math, e.g. date:>now-24h.
  • ip — Search by an IPv4 or IPv6 address, using either an exact IP or a subnet definition like ip:8.8.8.8\/24.
  • domain — Search by a domain or parent domain. You can search for www.foobar.com or just foobar.com, and both will find scans for www.foobar.com.
  • integer — Allows searching by exact value, range, or threshold, e.g. stats.uniqIPs:>5.
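
The matching behaviour of the keyword RE and domain types can be approximated locally. The sketch below is an analogy only: it uses Python's re module, whose syntax differs from the regular expression dialect used by the search backend, and the function names are hypothetical. Whether the bare top-level domain is also indexed is not stated on this page, so this sketch leaves it out.

```python
import re


def regex_hits(pattern: str, indexed_value: str) -> bool:
    """Anchored match, mimicking the implicit ^ and $ of regex searches.

    The value is lowercased first because everything is indexed as lowercase.
    """
    return re.fullmatch(pattern, indexed_value.lower()) is not None


def infix(pattern: str) -> str:
    """Wrap a pattern in .* on both sides so it can match inside a token."""
    return f".*{pattern}.*"


def domain_levels(hostname: str) -> list:
    """Return the hostname plus each parent domain, lowercased.

    ASSUMPTION: the bare top-level domain (e.g. "com") is excluded.
    """
    labels = hostname.lower().split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1)]


# "login" alone does not match, because the pattern must cover the whole token.
anchored = regex_hits("login", "secure-login.example.com")
wrapped = regex_hits(infix("login"), "Secure-Login.example.com")
levels = domain_levels("www.google.com")
```

Here anchored is False and wrapped is True, which is why infix patterns need the leading and trailing .* described earlier. levels contains both www.google.com and google.com, matching the way a search for the parent domain finds scans of the subdomain.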

Searchable Fields

Field Name Type Field semantics, features, & notes
apikey virtual Scans submitted using one of your API keys (value can only be me)
asn keyword Any of the AS numbers that were contacted (e.g. AS123)
asnname text Any of the AS Names that were contacted
asnname.keyword keyword Any of the AS names that were contacted (analysed as keyword)
canonical.page.url keyword Canonicalized version of the page URL
canonical.task.url keyword Canonicalized version of the task URL
country keyword ISO 3166-1 2-letter country code of any country that was contacted
date date Datetime of when the scan was performed
domain domain Any domain and subdomain that was contacted
domain.keyword keywordRE Any domain and subdomain that was contacted
filename text Any URL that was requested
filename.keyword keywordRE Any URL that was requested
files.filename textRE Filename of file downloaded by the website
files.filename.keyword keyword Filename of file downloaded by the website (analyzed as keyword)
files.filesize integer Filesize of file downloaded by the website
files.mimeDescription text MIME type description of file downloaded by the website
files.mimeType keywordRE MIME type description of file downloaded by the website
files.sha256 keyword SHA256 of file downloaded by the website
files.state keyword Download state of the file
files.url textRE URL of the file's location
hash keyword Any SHA256 hash of any HTTP response
ip ip Any IP that was contacted
page.apexDomain keyword Apex domain of the page — (New since v1.2)
page.apexDomainAgeDays integer Age of the page apex domain in days
page.asn keyword AS Number of the website
page.asnname text Name of the main AS of the website
page.asnname.keyword keyword Name of the main AS of the website (analyzed as keyword)
page.country keyword Primary IP GeoIP Country (ISO 3166-1 2-letter country code)
page.domain domain Primary Domain (Analysed as all levels of parent domains)
page.domain.keyword keywordRE Primary Domain (Analysed as keyword)
page.domainAgeDays integer Age of the page domain (hostname) in days
page.ip ip Primary IP
page.language keyword ISO-639 language code based on page text
page.mimeType keyword MIME type of the primary HTTP response — (New since v1.2)
page.ptr domain DNS PTR record of primary IP
page.redirected keyword Whether the page was redirected from task.url, can be one of same-domain, sub-domain, off-domain, https-only — (New since v1.2)
page.server text HTTP "Server" header of primary request
page.status keyword HTTP status code of primary request response
page.title text Title of the page — (New since v1.2)
page.title.keyword keywordRE Title of the page — (New since v1.2)
page.tlsAgeDays integer Age of TLS certificate when the page was scanned (in days) — (New since v1.2)
page.tlsIssuer keyword Issuer of the page TLS certificate — (New since v1.2)
page.tlsValidDays integer TLS certificate validity period in days — (New since v1.2)
page.tlsValidFrom date TLS certificate Valid-From date — (New since v1.2)
page.umbrellaRank integer Cisco Umbrella Top 1 Million rank of page domain — (New since v1.2)
page.url text URL of the primary page (after redirection)
page.url.keyword keywordRE URL of the primary page (after redirection, analysed as keyword)
server keyword Any HTTP "Server" header of subrequests
stats.dataLength integer Data size of all subresources
stats.encodedDataLength integer Transfer size of all subresources
stats.requests integer Number of subrequests
stats.uniqCountries integer Number of unique countries contacted
stats.uniqIPs integer Number of unique IPs contacted
task.apexDomain keyword Apex domain of the tasked hostname
task.domain domain Domain of the tasked URL
task.domain.keyword keywordRE Domain of the tasked URL (analysed as keyword)
task.method keyword Can be manual, api, or automatic
task.source keyword Examples: phishtank or certstream-suspicious
task.tags keyword User-defined tags supplied during scan submission
task.url text The original URL that was tasked
task.url.keyword keywordRE The original URL that was tasked (analysed as keyword)
task.uuid keyword The unique UUID of the scan
task.visibility keyword Can be one of public, unlisted, or private
team virtual Scans submitted by any of your teams (value can only be me)
user virtual Scans submitted by yourself (value can only be me)
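
As an illustration of how fields from the table combine into one query, the following sketch builds clauses for a few field types. The helper names are hypothetical, and the values passed in are assumed to be already escaped where needed (an IPv6 address, for instance, contains colons that would need a backslash).

```python
def term(field, value):
    """Exact match, e.g. on a keyword field or a virtual field such as user:me."""
    return f"{field}:{value}"


def any_of(field, values):
    """Group several alternatives for one field with OR."""
    return f"{field}:({' OR '.join(values)})"


def greater_than(field, threshold):
    """Threshold search on an integer field."""
    return f"{field}:>{threshold}"


def subnet(field, network, prefix_len):
    """Subnet search on an ip field; the slash is reserved, so it is escaped."""
    return f"{field}:{network}\\/{prefix_len}"


def all_of(*clauses):
    """Join clauses with an explicit AND (also the default operator)."""
    return " AND ".join(clauses)


example = all_of(
    any_of("page.domain", ["foo.com", "bar.com"]),
    greater_than("stats.uniqIPs", 5),
    subnet("ip", "8.8.8.8", 24),
    term("task.method", "api"),
)
own_scans = term("user", "me")
```

The resulting example string is page.domain:(foo.com OR bar.com) AND stats.uniqIPs:>5 AND ip:8.8.8.8\/24 AND task.method:api. Field names are passed through unchanged because they are case sensitive and do not support wildcards.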

urlscan Professional and Enterprise — The following fields can only be searched on the Professional, Enterprise, and Ultimate plans

brand.country keyword ISO 3166-1 2-letter country code of the brand
brand.key keyword Unique key of the brand
brand.name text Name of the brand
brand.name.keyword keyword Name of the brand (analyzed as keyword)
brand.vertical text Industry vertical of the brand, e.g. "banking"
brand.vertical.keyword keyword Industry vertical of the brand (analyzed as keyword)
content.cookieNames keyword Names of cookies set by page — (New since v1.2)
content.globalNames keyword Names of non-standard JavaScript global variables — (New since v1.2)
content.inputNames keyword Name attributes of input fields on page — (New since v1.2)
content.inputTypes keyword Type attributes of input fields on page — (New since v1.2)
content.storageNames keyword Names of items in localStorage and sessionStorage set by page — (New since v1.2)
content.technologies keyword Names of technologies detected according to Wappalyzer — (New since v1.2)
dom.hash keyword SHA256 hash of the DOM before truncation — (New since v1.2)
dom.size integer Size of the DOM before truncation — (New since v1.2)
frames.domains domain Domains of frames — (New since v1.2)
frames.length integer Number of frames — (New since v1.2)
frames.urls keywordRE URLs of frames / iFrames — (New since v1.2)
labels keyword High-level system labels applied by urlscan
links.domains domain Domains of outgoing links — (New since v1.2)
links.length integer Number of outgoing links — (New since v1.2)
links.urls keywordRE URLs of outgoing links (to different domains than page.domain) — (New since v1.2)
meta keyword IDs of matching subscriptions / saved searches
scanner.country keyword Scanner IP exit location (ISO 3166-1 2-letter country code) — (New since v1.2)
submitter.country keyword GeoIP country of the submission IP (ISO 3166-1 2-letter country code)
text.content text Visible text on the website, truncated to the first 20kB — (New since v1.2)
text.hash keyword SHA256 hash of the text before truncation — (New since v1.2)
text.size integer Size of the text content before truncation — (New since v1.2)
usertags keyword User-defined tags for Saved Searches that matched the scan
verdicts.community.malicious boolean The community verdict for a scan
verdicts.engines.malicious boolean ML malicious verdict
verdicts.engines.score integer ML score from -100 to 100
verdicts.lastVerdict date Date the latest verdict for this scan was added, only for verdicts created after the scan has finished — (New since v1.2)
verdicts.malicious boolean Whether a verdict exists for the page — (New since v1.2)
verdicts.score integer Maliciousness score of page from -100 (benign) to 100 (malicious) — (New since v1.2)
verdicts.urlscan.malicious boolean The urlscan malicious verdict
visible.brandname text Brand name the website claims to represent (urlscan Brand AI)
visible.brandname.keyword keyword Brand name the website claims to represent (analyzed as keyword)
websockets.domains domainRE Domains of WebSocket connections
websockets.length integer Number of WebSocket connections
websockets.totalBytesReceived integer Total bytes received over WebSocket connections
websockets.totalBytesSent integer Total bytes sent over WebSocket connections
websockets.urls keywordRE URLs of WebSocket connections