Schema of raw data v2.3.0

Note

The Parse.ly Data Pipeline schema is additive only. This means that columns will never be removed and only new columns will be added. Parse.ly Data Pipeline customers receive notifications about upcoming additive schema updates.

JSON format

Raw data accessed via S3 (bulk) or Kinesis (streaming) consists of lines of JSON objects, also known as JSONLines. This format is easy to parse in programming languages, cloud SQL engines, and big data tools.

This page describes the schema of these JSON records (keys and values) for interpreting raw events.
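As a minimal sketch of reading this format, the following Python snippet parses JSON Lines text into dictionaries. The function name parse_json_lines and the two abbreviated sample records are illustrative assumptions, not part of any Parse.ly library.

```python
import json


def parse_json_lines(text):
    """Parse JSON Lines text: one JSON object per non-empty line."""
    events = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        events.append(json.loads(line))
    return events


# Two abbreviated sample records (hypothetical values).
sample = (
    '{"action": "pageview", "apikey": "example.com"}\n'
    '{"action": "heartbeat", "apikey": "example.com", "engaged_time_inc": 10}\n'
)
events = parse_json_lines(sample)
print([event["action"] for event in events])
```

Reading line by line keeps memory use flat for large bulk files, since each record is a complete JSON document on its own.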

Example JSON page view record

The following example shows a page view record from a Parse.ly site with alphabetically sorted keys.

{
  "action" : "pageview",
  "apikey" : "example.com",
  "campaign_id" : "facebook",
  "channel" : "website",
  "display" : true,
  "display_avail_height" : 735,
  "display_avail_width" : 1280,
  "display_pixel_depth" : 24,
  "display_total_height" : 800,
  "display_total_width" : 1280,
  "engaged_time_inc" : null,
  "event_id" : "0xe6508eda93d5598367b18555ae9b828d",
  "extra_data" : {"subscriber_type" : "premium"},
  "flags_is_amp" : false,
  "ip_city" : "Newark",
  "ip_continent" : "NA",
  "ip_country" : "US",
  "ip_lat" : 37.5147,
  "ip_lon" : -122.0423,
  "ip_postal" : "94560",
  "ip_subdivision" : "CA",
  "ip_timezone" : "America/Los_Angeles",
  "ip_market_name" : "New York",
  "ip_market_nielsen" : "501",
  "ip_market_doubleclick" : "3",
  "metadata" : true,
  "metadata_authors" : [
    "Laura Vitto"
  ],
  "metadata_canonical_url" : "http://example.com/2020/08/07/airpods-ftw/",
  "metadata_custom_metadata" : "{\"page\" : 1,\"omnitureData\" : {\"channel\" : \"watercooler\",\"content_type\" : \"article\",\"v_buy\" : null,\"v_buy_i\" : null,\"h_pub\" : 0.0,\"h_buy\" : null,\"h_pub_buy\" : null,\"v_cur\" : 0.0,\"v_max\" : 0.0,\"v_cur_i\" : 0,\"v_max_i\" : 0,\"events\" : \"event51,event61\",\"top_channel\" : \"watercooler\",\"content_source_type\" : \"Internal - Editorial Series\",\"content_source_name\" : \"Apple iPhone 7 Event\",\"author_name\" : \"Laura Vitto\",\"age\" : \"0\",\"pub_day\" : 7,\"pub_month\" : 9,\"pub_year\" : 2020,\"pub_date\" : \"08/07/2020\",\"sourced_from\" : \"Internal\",\"isPostView\" : true,\"post_lead_type\" : \"No Lead Image\",\"topics\" : \"Apple,Gadgets,iPhone 7,Watercooler\",\"campaign\" : null,\"display_mode\" : null,\"viral_video_type\" : null,\"standalone_video_show\" : null,\"b_flag\" : false}}",
  "metadata_duration" : null,
  "metadata_full_content_word_count" : 174,
  "metadata_image_url" : "http://a.amz.mshcdn.com/media/ZgkyMDE2LzA5LzA3LzU2L0NyeFhpNjNYRUFBSnZwRS5lNDAyMy5qcGcKcAl0aHVtYgkxMjAweDYzMAplCWpwZw/156d0173/3ae/CrxXi63XEAAJvpE.jpg",
  "metadata_page_type" : "post",
  "metadata_post_id" : "http://example.com/2020/08/07/airpods-ftw/",
  "metadata_pub_date_tmsp" : 1473275118000,
  "metadata_save_date_tmsp" : 1473275204000,
  "metadata_section" : "watercooler",
  "metadata_share_urls" : null,
  "metadata_tags" : [
    "parsely_smart:entity:Breathability",
    "parsely_smart:entity:Lyocell",
    "parsely_smart:entity:Perspiration",
    "parsely_smart:entity:Textile",
    "parsely_smart:iab:Needlework",
    "sleep week 2022",
    "underscored explore",
    "underscored lifestyle"
  ],
  "metadata_data_source" : "crawl",
  "metadata_thumb_url" : "https://images.parsely.com/xY9xNBMulGDKRMzfKaUQzs7A9PA=/160x160/smart/http%3A//a.amz.mshcdn.com/media/ZgkyMDE2LzA5LzA3LzU2L0NyeFhpNjNYRUFBSnZwRS5lNDAyMy5qcGcKcAl0aHVtYgkxMjAweDYzMAplCWpwZw/156d0173/3ae/CrxXi63XEAAJvpE.jpg",
  "metadata_title" : "Everyone has the same fear about Apple's new earbuds",
  "metadata_urls" : [
    "http://example.com/2020/08/07/airpods-ftw/"
  ],
  "pageload_id" : "b510edbe-84eb-47b6-aa35-9843b5d3b579",
  "pageview_id" : "ae2badca-d81f-467d-b5fc-5d8y45f08ff6",
  "ref_category" : "internal",
  "ref_clean" : "http://example.com/",
  "ref_domain" : "example.com",
  "ref_fragment" : "",
  "ref_netloc" : "example.com",
  "ref_params" : "",
  "ref_path" : "/",
  "ref_query" : "",
  "ref_scheme" : "http",
  "referrer" : "http://example.com/",
  "session" : true,
  "session_id" : 6,
  "session_initial_referrer" : "http://example.com/",
  "session_initial_url" : "http://example.com/",
  "session_last_session_timestamp" : 1473271351611,
  "session_timestamp" : 1473277747806,
  "schema_version" : "2.3.0",
  "slot" : false,
  "sref_category" : "internal",
  "sref_clean" : "http://example.com/",
  "sref_domain" : "example.com",
  "sref_fragment" : "",
  "sref_netloc" : "example.com",
  "sref_params" : "",
  "sref_path" : "/",
  "sref_query" : "",
  "sref_scheme" : "http",
  "surl_clean" : "http://example.com/",
  "surl_domain" : "example.com",
  "surl_fragment" : "",
  "surl_netloc" : "example.com",
  "surl_params" : "",
  "surl_path" : "/",
  "surl_query" : "",
  "surl_scheme" : "http",
  "surl_utm_campaign" : "facebook_campaign",
  "surl_utm_term" : "8908",
  "surl_utm_medium" : "partners",
  "surl_utm_source" : "facebook",
  "surl_utm_content" : "sports/baseball",
  "timestamp_info" : true,
  "timestamp_info_nginx_ms" : 1473277850000,
  "timestamp_info_override_ms" : null,
  "timestamp_info_pixel_ms" : 1473277850017,
  "ts_action" : "2020-08-07 19:50:50",
  "ts_session_current" : "2020-08-07 19:49:07",
  "ts_session_last" : "2020-08-07 18:02:31",
  "ua_browser" : "Safari",
  "ua_browserversion" : "9.1.2",
  "ua_device" : "Other",
  "ua_devicebrand" : null,
  "ua_devicemodel" : null,
  "ua_devicetouchcapable" : false,
  "ua_devicetype" : "desktop",
  "ua_os" : "Mac OS X",
  "ua_osversion" : "10.10.5",
  "url" : "http://example.com/2020/08/07/airpods-ftw/#L.eZPflSGqq5",
  "url_clean" : "http://example.com/2020/08/07/airpods-ftw/",
  "url_domain" : "example.com",
  "url_fragment" : "L.eZPflSGqq5",
  "url_netloc" : "example.com",
  "url_params" : "",
  "url_path" : "/2020/08/07/airpods-ftw/",
  "url_query" : "",
  "url_scheme" : "http",
  "user_agent" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/601.7.7 (KHTML, like Gecko) Version/9.1.2 Safari/601.7.7",
  "utm_campaign" : "facebook_campaign",
  "utm_term" : "8908",
  "utm_medium" : "partners",
  "utm_source" : "facebook",
  "utm_content" : "sports/baseball",
  "version" : 1,
  "videostart_id" : "be0badca-d81e-467d-a5fc-5d7a45f08ff5",
  "visitor" : true,
  "visitor_ip" : "108.225.131.20",
  "visitor_network_id" : "None",
  "visitor_site_id" : "zp94fd56-a400-8210-4b23-zb4348207c43"
}

The values in these key-value pairs are typically strings, but may also be numbers, booleans (true / false), null, arrays, or nested objects.

Example JSON conversion record

Notice that the main differences between this record and the page view example record are the action and extra_data fields. The keys are alphabetically sorted.

{
  "action" : "conversion",
  "apikey" : "example.com",
  "campaign_id" : "facebook",
  "channel" : "website",
  "display" : true,
  "display_avail_height" : 735,
  "display_avail_width" : 1280,
  "display_pixel_depth" : 24,
  "display_total_height" : 800,
  "display_total_width" : 1280,
  "engaged_time_inc" : null,
  "event_id" : "0xe6508eda93d5598367b18555ae9b828d",
  "extra_data" :  {
    "_conversion_type" : "newsletter_signup",
    "_conversion_label" : "Weekly Email Newsletter"
  },
  "flags_is_amp" : false,
  "ip_city" : "Newark",
  "ip_continent" : "NA",
  "ip_country" : "US",
  "ip_lat" : 37.5147,
  "ip_lon" : -122.0423,
  "ip_postal" : "94560",
  "ip_subdivision" : "CA",
  "ip_timezone" : "America/Los_Angeles",
  "ip_market_name" : "New York",
  "ip_market_nielsen" : "501",
  "ip_market_doubleclick" : "3",
  "metadata" : true,
  "metadata_authors" : [
    "Laura Vitto"
  ],
  "metadata_canonical_url" : "http://example.com/2020/08/07/airpods-ftw/",
  "metadata_custom_metadata" : "{\"page\" : 1,\"omnitureData\" : {\"channel\" : \"watercooler\",\"content_type\" : \"article\",\"v_buy\" : null,\"v_buy_i\" : null,\"h_pub\" : 0.0,\"h_buy\" : null,\"h_pub_buy\" : null,\"v_cur\" : 0.0,\"v_max\" : 0.0,\"v_cur_i\" : 0,\"v_max_i\" : 0,\"events\" : \"event51,event61\",\"top_channel\" : \"watercooler\",\"content_source_type\" : \"Internal - Editorial Series\",\"content_source_name\" : \"Apple iPhone 7 Event\",\"author_name\" : \"Laura Vitto\",\"age\" : \"0\",\"pub_day\" : 7,\"pub_month\" : 9,\"pub_year\" : 2020,\"pub_date\" : \"08/07/2020\",\"sourced_from\" : \"Internal\",\"isPostView\" : true,\"post_lead_type\" : \"No Lead Image\",\"topics\" : \"Apple,Gadgets,iPhone 7,Watercooler\",\"campaign\" : null,\"display_mode\" : null,\"viral_video_type\" : null,\"standalone_video_show\" : null,\"b_flag\" : false}}",
  "metadata_duration" : null,
  "metadata_full_content_word_count" : 174,
  "metadata_image_url" : "http://a.amz.mshcdn.com/media/ZgkyMDE2LzA5LzA3LzU2L0NyeFhpNjNYRUFBSnZwRS5lNDAyMy5qcGcKcAl0aHVtYgkxMjAweDYzMAplCWpwZw/156d0173/3ae/CrxXi63XEAAJvpE.jpg",
  "metadata_page_type" : "post",
  "metadata_post_id" : "http://example.com/2020/08/07/airpods-ftw/",
  "metadata_pub_date_tmsp" : 1473275118000,
  "metadata_save_date_tmsp" : 1473275204000,
  "metadata_section" : "watercooler",
  "metadata_share_urls" : null,
  "metadata_tags" : [
    "gadgets",
    "iphone-7",
    "watercooler",
    "apple"
  ],
  "metadata_data_source" : "crawl",
  "metadata_thumb_url" : "https://images.parsely.com/xY9xNBMulGDKRMzfKaUQzs7A9PA=/160x160/smart/http%3A//a.amz.mshcdn.com/media/ZgkyMDE2LzA5LzA3LzU2L0NyeFhpNjNYRUFBSnZwRS5lNDAyMy5qcGcKcAl0aHVtYgkxMjAweDYzMAplCWpwZw/156d0173/3ae/CrxXi63XEAAJvpE.jpg",
  "metadata_title" : "Everyone has the same fear about Apple's new earbuds",
  "metadata_urls" : [
    "http://example.com/2020/08/07/airpods-ftw/"
  ],
  "pageload_id" : "b510edbe-84eb-47b6-aa35-9843b5d3b579",
  "pageview_id" : "ae2badca-d81f-467d-b5fc-5d8y45f08ff6",
  "ref_category" : "internal",
  "ref_clean" : "http://example.com/",
  "ref_domain" : "example.com",
  "ref_fragment" : "",
  "ref_netloc" : "example.com",
  "ref_params" : "",
  "ref_path" : "/",
  "ref_query" : "",
  "ref_scheme" : "http",
  "referrer" : "http://example.com/",
  "session" : true,
  "session_id" : 6,
  "session_initial_referrer" : "http://example.com/",
  "session_initial_url" : "http://example.com/",
  "session_last_session_timestamp" : 1473271351611,
  "session_timestamp" : 1473277747806,
  "schema_version" : "2.3.0",
  "slot" : false,
  "sref_category" : "internal",
  "sref_clean" : "http://example.com/",
  "sref_domain" : "example.com",
  "sref_fragment" : "",
  "sref_netloc" : "example.com",
  "sref_params" : "",
  "sref_path" : "/",
  "sref_query" : "",
  "sref_scheme" : "http",
  "surl_clean" : "http://example.com/",
  "surl_domain" : "example.com",
  "surl_fragment" : "",
  "surl_netloc" : "example.com",
  "surl_params" : "",
  "surl_path" : "/",
  "surl_query" : "",
  "surl_scheme" : "http",
  "surl_utm_campaign" : "facebook_campaign",
  "surl_utm_term" : "8908",
  "surl_utm_medium" : "partners",
  "surl_utm_source" : "facebook",
  "surl_utm_content" : "sports/baseball",
  "timestamp_info" : true,
  "timestamp_info_nginx_ms" : 1473277850000,
  "timestamp_info_override_ms" : null,
  "timestamp_info_pixel_ms" : 1473277850017,
  "ts_action" : "2020-08-07 19:50:50",
  "ts_session_current" : "2020-08-07 19:49:07",
  "ts_session_last" : "2020-08-07 18:02:31",
  "ua_browser" : "Safari",
  "ua_browserversion" : "9.1.2",
  "ua_device" : "Other",
  "ua_devicebrand" : null,
  "ua_devicemodel" : null,
  "ua_devicetouchcapable" : false,
  "ua_devicetype" : "desktop",
  "ua_os" : "Mac OS X",
  "ua_osversion" : "10.10.5",
  "url" : "http://example.com/2020/08/07/airpods-ftw/#L.eZPflSGqq5",
  "url_clean" : "http://example.com/2020/08/07/airpods-ftw/",
  "url_domain" : "example.com",
  "url_fragment" : "L.eZPflSGqq5",
  "url_netloc" : "example.com",
  "url_params" : "",
  "url_path" : "/2020/08/07/airpods-ftw/",
  "url_query" : "",
  "url_scheme" : "http",
  "user_agent" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/601.7.7 (KHTML, like Gecko) Version/9.1.2 Safari/601.7.7",
  "utm_campaign" : "facebook_campaign",
  "utm_term" : "8908",
  "utm_medium" : "partners",
  "utm_source" : "facebook",
  "utm_content" : "sports/baseball",
  "version" : 1,
  "videostart_id" : "be0badca-d81e-467d-a5fc-5d7a45f08ff5",
  "visitor" : true,
  "visitor_ip" : "108.225.131.20",
  "visitor_network_id" : "None",
  "visitor_site_id" : "zp94fd56-a400-8210-4b23-zb4348207c43"
}

For more documentation on how to set up conversions and exactly what data can be sent, please see the conversion integration documentation.

Base event fields

| name | description | example value |
|---|---|---|
| action | Event type identifier | "pageview" |
| apikey | Site identifier | "example.com" |
| channel | The channel source, such as website, fbia (Facebook Instant Article), amp, or apln-rta (Apple News Realtime) | "website" |
| referrer | Raw referring URL | https://www.facebook.com/instantarticles#v1 |
| user_agent | Raw User-Agent (UA) string | "Mozilla/5.0 (iPhone; CPU … Safari/601.1" |
| url | Raw URL on which the action occurred | http://example.com//2020/08/07/airpods-ftw#id=1 |
| visitor_site_id | Visitor first-party site identifier | "0beabdd1-7b0c-423b-9fae-660101fc8953" |
| engaged_time_inc | Engaged time in seconds; only available where action = heartbeat or vheartbeat | 10 |

These required fields come from integration with Parse.ly’s data collection infrastructure, whether that’s the JavaScript tracker, a mobile SDK for iOS or Android, or another integration.

These fields appear in every event, regardless of event type or source. Note that excluding the session_id and the visitor_ip fields is possible, though all Parse.ly integrations attempt to support these fields to the best of their ability.

On one-time historical imports

One-time imports of historical page view (or other) event data from legacy web analytics systems are possible, but require custom work on Parse.ly’s side. Equivalents for the “Base Event” fields must exist to make sense of historical data.

Timestamp fields

Parse.ly records two raw timestamps per event. One comes from Parse.ly’s data collection servers, and the other comes from Parse.ly’s client-side trackers. These are stored as numbers that represent milliseconds since the UNIX epoch, as the _ms suffix on the field names indicates. Parse.ly’s server clocks are in UTC. Pulling data spanning multiple days may be necessary for timezone conversions.

| name | description | example value |
|---|---|---|
| timestamp_info | Flag to indicate if timestamp info is available | true |
| timestamp_info_nginx_ms | The automatic server-side event timestamp | 1493598778000 |
| timestamp_info_pixel_ms | The automatic client-side event timestamp | 1493598778538 |
| timestamp_info_override_ms | A client-side override timestamp | 1493598778000 |
| ts_action | Date/time of the event. This is a formatted date/time of the timestamp_info_nginx_ms in UTC | 2020-08-07 00:32:58 |
| ts_session_current | Date/time of the current session derived from timestamp_info_pixel_ms | 2020-08-07 00:30:00 |
| ts_session_last | Date/time of the previous session | 2020-08-07 20:22:47 |

Parse.ly’s server-side timestamp is generally more reliable than the client-side timestamp. However, the client-side timestamp provides greater precision in certain scenarios.

Parse.ly’s nginx (server-side) timestamp is at second resolution, whereas Parse.ly’s pixel (client-side) timestamp is at millisecond resolution. When a pixel timestamp is within a few seconds of the corresponding nginx timestamp, it is likely more accurate since it represents when the event was sent (at millisecond resolution) rather than when the event was received (at second resolution). With Parse.ly’s standard JavaScript tracker, both nginx and pixel are always captured together, so combining them makes JavaScript tracker-based events as accurate as possible.

In mobile SDKs for iOS and Android, it is common to “batch” events if devices are offline. These are also known as “late-arriving” events. In these cases, neither the auto-generated server-side timestamp (in nginx) nor the auto-generated client-side timestamp (in pixel) can be trusted; instead, the client-side override timestamp may be a more accurate representation of reality. The mobile SDK populates these by filling a ts field in the data key-value object sent with every event.
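The guidance above can be sketched as a small selection function. This is one possible reading, not Parse.ly’s own logic: the function name best_timestamp_ms, the 5-second max_skew_ms threshold for “within a few seconds”, and the choice to always prefer a present override are assumptions made for illustration.

```python
def best_timestamp_ms(event, max_skew_ms=5000):
    """Pick a timestamp (milliseconds since the UNIX epoch) for an event.

    max_skew_ms is an assumed threshold for "within a few seconds".
    """
    override = event.get("timestamp_info_override_ms")
    nginx = event.get("timestamp_info_nginx_ms")
    pixel = event.get("timestamp_info_pixel_ms")
    if override is not None:
        # Client-supplied time, e.g. for batched events from a mobile SDK.
        return override
    if nginx is not None and pixel is not None and abs(pixel - nginx) <= max_skew_ms:
        # Pixel time is close to server time, so use its millisecond resolution.
        return pixel
    # Otherwise fall back to the server-side time when it exists.
    return nginx if nginx is not None else pixel


event = {
    "timestamp_info_nginx_ms": 1473277850000,
    "timestamp_info_override_ms": None,
    "timestamp_info_pixel_ms": 1473277850017,
}
print(best_timestamp_ms(event))  # 1473277850017
```

The threshold is worth tuning against real data: a large gap between pixel and nginx usually points to a skewed client clock, in which case the server time is the safer choice.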

On timezones

Parse.ly’s JavaScript tracker populates the client-side timestamp using new Date().getTime(), which means it is in UTC. Parse.ly’s server clocks are also in UTC, making these timestamps comparable. However, note that UNIX time itself does not embed any timezone information. It simply represents the time elapsed since a specific UTC instant in the past, the UNIX epoch. The user’s local timezone can be inferred from their IP address based on estimated geography (see the ip_timezone field). Combining these fields allows interpretation of the user’s local time.

ID field

| name | description | example value |
|---|---|---|
| event_id | Unique event identifier string | "0xe6508eda93d5598367b18555ae9b828d" |

A unique, hex-encoded ID string is generated for each event. This property can be used to deduplicate events for easier ingestion and processing.

This unique ID is generated by hashing the values of apikey, action, url, timestamp (internal, generated property), visitor_site_id, and timestamp_info_pixel_ms. To ensure that each event_id is truly unique, all events sent to Parse.ly must provide these required fields (excluding timestamp, which Parse.ly generates) at an appropriate level of cardinality and granularity.
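A minimal sketch of deduplicating on event_id during ingestion follows. The function name deduplicate and the in-memory set are assumptions for illustration; the second event_id is a made-up sample value, and a production pipeline would more likely deduplicate in the warehouse or a keyed store.

```python
def deduplicate(events):
    """Keep the first occurrence of each event_id, preserving input order."""
    seen = set()
    unique = []
    for event in events:
        event_id = event["event_id"]
        if event_id in seen:
            continue  # duplicate delivery of an event already kept
        seen.add(event_id)
        unique.append(event)
    return unique


events = [
    {"event_id": "0xe6508eda93d5598367b18555ae9b828d", "action": "pageview"},
    {"event_id": "0xe6508eda93d5598367b18555ae9b828d", "action": "pageview"},
    {"event_id": "0x1290e680a7d03b7246c0e2c321c020ab", "action": "heartbeat"},
]
print(len(deduplicate(events)))  # 2
```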

The following examples show the relationship between event_id, pageload_id, pageview_id, and videostart_id:

Identifier fields and correct usage

| name | description | example value |
|---|---|---|
| pageload_id | Unique identifier string | "b510edbe-84eb-47b6-aa35-9843b5d3b579" |
| pageview_id | Unique identifier string | "ae2badca-d81f-467d-b5fc-5d8y45f08ff6" |
| videostart_id | Unique identifier string | "be0badca-d81e-467d-a5fc-5d7a45f08ff5" |

Factors such as browser, device type, and integration age may impact the uniqueness of these identifiers. To avoid overlaps, combine these identifiers with the following fields to ensure they remain distinct across events.

pageview_unique_key = hash(pageview_id, apikey, date(ts_action), visitor_site_id, session_id)
pageload_unique_key = hash(pageload_id, apikey, date(ts_action), visitor_site_id, session_id)
videostart_unique_key = hash(videostart_id, apikey, date(ts_action), visitor_site_id, session_id)
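The formulas above do not name a specific hash function. The sketch below assumes SHA-256 over a "|"-joined string and takes date(ts_action) to be the YYYY-MM-DD prefix of ts_action; both choices, and the helper name unique_key, are illustrative assumptions. Any stable hash applied consistently would serve.

```python
import hashlib


def unique_key(identifier, event):
    """Combine an identifier with the fields listed above into one hashed key.

    SHA-256 over a "|"-joined string is an assumption; any stable hash works.
    """
    parts = [
        identifier,
        event["apikey"],
        event["ts_action"][:10],  # date(ts_action): the "YYYY-MM-DD" prefix
        event["visitor_site_id"],
        str(event["session_id"]),
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


# Values taken from the example page view record.
event = {
    "apikey": "example.com",
    "pageview_id": "ae2badca-d81f-467d-b5fc-5d8y45f08ff6",
    "session_id": 6,
    "ts_action": "2020-08-07 19:50:50",
    "visitor_site_id": "zp94fd56-a400-8210-4b23-zb4348207c43",
}
pageview_unique_key = unique_key(event["pageview_id"], event)
print(pageview_unique_key)
```

The same helper produces pageload_unique_key and videostart_unique_key when given pageload_id or videostart_id as the identifier.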

The following table shows how to match videos and page views with their corresponding engaged time:

| apikey | action | engaged_time_inc | timestamp | visitor_site_id | url | event_id | pageload_unique_key* | pageview_unique_key* | videostart_unique_key* |
|---|---|---|---|---|---|---|---|---|---|
| blog.parsely.com | pageview | | 43657.8755902778 | 9504b04c-b95e-4fff-9225-11f13b96218b | https://blog.parse.ly/post/7786/three-ways-value-engaged-time/ | 189c990e267e1503240044ca34b6e110 | 90569554 | 71317637 | |
| blog.parsely.com | heartbeat | 10 | 43657.8757060185 | 9504b04c-b95e-4fff-9225-11f13b96218b | https://blog.parse.ly/post/7786/three-ways-value-engaged-time/ | 1290e680a7d03b7246c0e2c321c020ab | 90569554 | 71317637 | |
| blog.parsely.com | videostart | | 43657.8756944444 | 9504b04c-b95e-4fff-9225-11f13b96218b | https://blog.parse.ly/post/7786/three-ways-value-engaged-time/ | f9f9fd997c48451c6ac899fd45fd10d8 | 90569554 | 71317637 | 18256298 |
| blog.parsely.com | vheartbeat | 22 | 43657.8766550926 | 9504b04c-b95e-4fff-9225-11f13b96218b | https://blog.parse.ly/post/7786/three-ways-value-engaged-time/ | ec3e85eea7df7cc5e5ddc1bb2ecc417a | 90569554 | 71317637 | 18256298 |
| blog.parsely.com | videostart | | 43657.877349537 | 9504b04c-b95e-4fff-9225-11f13b96218b | https://blog.parse.ly/post/7786/three-ways-value-engaged-time/ | d2c310ecd4ed9b70e65f51fe8799b5bd | 90569554 | 71317637 | 85818511 |
| blog.parsely.com | vheartbeat | 35 | 43657.8778240741 | 9504b04c-b95e-4fff-9225-11f13b96218b | https://blog.parse.ly/post/7786/three-ways-value-engaged-time/ | 24ec5ed54c3713c7e4a964735fa6b449 | 90569554 | 71317637 | 85818511 |

*indicates derived field

The following table shows how to match page views and engaged time for slide shows where a page reload is not triggered. This example can also apply to infinite scroll pages:

| apikey | action | engaged_time_inc | timestamp | visitor_site_id | url | event_id | pageload_unique_key* | pageview_unique_key* | videostart_unique_key* |
|---|---|---|---|---|---|---|---|---|---|
| blog.parsely.com | pageview | | 43657.8755902778 | 9504b04c-b95e-4fff-9225-11f13b96218b | https://blog.parse.ly/post/7786/three-ways-value-engaged-time/slide=1 | 189c990e267e1503240044ca34b6e110 | 90569554 | 71317637 | |
| blog.parsely.com | heartbeat | 10 | 43657.8757060185 | 9504b04c-b95e-4fff-9225-11f13b96218b | https://blog.parse.ly/post/7786/three-ways-value-engaged-time/slide=1 | 1290e680a7d03b7246c0e2c321c020ab | 90569554 | 71317637 | |
| blog.parsely.com | pageview | | 43657.8756944444 | 9504b04c-b95e-4fff-9225-11f13b96218b | https://blog.parse.ly/post/7786/three-ways-value-engaged-time/slide=2 | f9f9fd997c48451c6ac899fd45fd10d8 | 90569554 | 85818511 | |
| blog.parsely.com | heartbeat | 7 | 43657.8766550926 | 9504b04c-b95e-4fff-9225-11f13b96218b | https://blog.parse.ly/post/7786/three-ways-value-engaged-time/slide=2 | ec3e85eea7df7cc5e5ddc1bb2ecc417a | 90569554 | 85818511 | |
| blog.parsely.com | pageview | | 43657.877349537 | 9504b04c-b95e-4fff-9225-11f13b96218b | https://blog.parse.ly/post/7786/three-ways-value-engaged-time/slide=3 | d2c310ecd4ed9b70e65f51fe8799b5bd | 90569554 | 45627841 | |
| blog.parsely.com | heartbeat | 4 | 43657.8778240741 | 9504b04c-b95e-4fff-9225-11f13b96218b | https://blog.parse.ly/post/7786/three-ways-value-engaged-time/slide=3 | 24ec5ed54c3713c7e4a964735fa6b449 | 90569554 | 45627841 | |

*indicates derived field
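To illustrate the matching shown in the tables above, the following sketch totals engaged time per derived key. It assumes that each engaged_time_inc value is an increment to be summed, and the function name engaged_seconds_by_key is hypothetical. The sample rows are abbreviated from the slide show table.

```python
from collections import defaultdict


def engaged_seconds_by_key(events, key_field, heartbeat_action):
    """Sum engaged_time_inc per derived key for one heartbeat action type."""
    totals = defaultdict(int)
    for event in events:
        key = event.get(key_field)
        if event["action"] == heartbeat_action and key:
            totals[key] += event.get("engaged_time_inc") or 0
    return dict(totals)


# Rows abbreviated from the slide show table.
slide_show_events = [
    {"action": "pageview", "engaged_time_inc": None, "pageview_unique_key": "71317637"},
    {"action": "heartbeat", "engaged_time_inc": 10, "pageview_unique_key": "71317637"},
    {"action": "pageview", "engaged_time_inc": None, "pageview_unique_key": "85818511"},
    {"action": "heartbeat", "engaged_time_inc": 7, "pageview_unique_key": "85818511"},
    {"action": "pageview", "engaged_time_inc": None, "pageview_unique_key": "45627841"},
    {"action": "heartbeat", "engaged_time_inc": 4, "pageview_unique_key": "45627841"},
]
totals = engaged_seconds_by_key(slide_show_events, "pageview_unique_key", "heartbeat")
print(totals)  # {'71317637': 10, '85818511': 7, '45627841': 4}
```

For video engagement, the same function can be called with "videostart_unique_key" and "vheartbeat".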

Visitors

| name | description | example value |
|---|---|---|
| visitor | Flag to indicate if visitor info is available | true |
| visitor_site_id | Visitor first-party site identifier | "0beabdd1-7b0c-423b-9fae-660101fc8953" |
| visitor_network_id | [Deprecated] | NULL |

The visitor_site_id is set by a first-party cookie and is unique to each browser. The visitor_network_id was formerly a third-party cookie but has been removed due to privacy concerns. The field remains for backwards compatibility and will always be NULL.

Session enrichments

Parse.ly’s JavaScript tracker automatically creates useful session information for user session analysis. In particular, session_id doubles as a “number of visits” value: it is an auto-incrementing integer that starts at 1 and increases by one for every new visit by a visitor with the same visitor_site_id.

Note that these enrichments are performed client-side by Parse.ly’s JavaScript tracker; they will not apply to events received via other integrations.

The other fields stored with the session are described below:

| name | description | example value |
|---|---|---|
| session_id | Auto-incrementing session number for the visitor, starting at 1 (doubles as a "number of visits" value) | 1 |
| session_initial_referrer | The raw referring URL of the first page view event of this session | http://facebook.com |
| session_initial_url | The raw URL of the first page view event of this session | http://example.com/1234#d3d |
| session_last_session_timestamp | Timestamp of the last visit, or 0 if none | 0 |
| session_timestamp | Timestamp of the first page view event of this session | 1466214847371 |
| session | Flag to indicate if session info is available | true |

Timestamp enrichments

Based on the timestamp fields, Parse.ly creates an important field called ts_action. This field reinterprets timestamp_info_nginx_ms (Parse.ly’s server time) as a formatted date string that is highly compatible with many systems. For example, it is the same format expected by Amazon Redshift and Google BigQuery’s JSON value parsers.

  • ts_action: "2025-08-07 02:03:24"

This value is derived from epoch time 1754532204; it also lacks timezone information but can be interpreted as a UTC time. Including timezone information, as one might for the “full” ISO8601 standard, makes this string incompatible with some SQL engines, so Parse.ly uses a maximally compatible format instead.
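The conversion described above can be reproduced with the Python standard library. This sketch assumes the input is in milliseconds, as in timestamp_info_nginx_ms; the function name format_ts_action is illustrative.

```python
from datetime import datetime, timezone


def format_ts_action(epoch_ms):
    """Format milliseconds since the UNIX epoch as a UTC "YYYY-MM-DD HH:MM:SS" string."""
    moment = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# Epoch time 1754532204 seconds, expressed in milliseconds.
print(format_ts_action(1754532204000))  # 2025-08-07 02:03:24
```

Passing tz=timezone.utc matters here: without it, fromtimestamp would format the value in the local timezone of the machine running the code.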

Geo IP enrichments

Based on the visitor_ip field, Parse.ly enriches the following:

| name | description | example value |
|---|---|---|
| ip_continent | Continent from GeoIP | "NA" |
| ip_country | Country from GeoIP | "US" |
| ip_city | City from GeoIP | "New York" |
| ip_lat | Latitude from GeoIP (postal code granularity) | 40.676 |
| ip_lon | Longitude from GeoIP (postal code granularity) | -73.963 |
| ip_postal | Postal code from GeoIP | "11238" |
| ip_subdivision | Subdivision (e.g. US state) from GeoIP | "NY" |
| ip_timezone | Time zone of visitor based on GeoIP | "America/New_York" |
| ip_market_name | Nielsen DMA name (see note below) | "New York" |
| ip_market_nielsen | Nielsen DMA ID (see note below) | "501" |
| ip_market_doubleclick | Google DoubleClick DMA ID (see note below) | "3" |

On Nielsen-designated market areas (DMA)

ip_market_name, ip_market_nielsen, and ip_market_doubleclick all refer to Nielsen Designated Market Areas, which are only defined in the United States. This means these fields will only be populated for events that originate from U.S.-based IP addresses.

URL and referrer enrichments

Based on the url, referrer, session_initial_url, and session_initial_referrer fields, Parse.ly provides several enrichments. For illustration, the following examples use these values:

| field | value |
|---|---|
| url | https://www.example.com/article-1234?campaignid=1234#fragment |
| referrer | https://www.google.ca/ |
| session_initial_url | https://www.example.com/article-1234?campaignid=1234#fragment |
| session_initial_referrer | https://www.google.ca/ |

On URL parsing

Attributes added to parsed URLs, such as fragment, netloc, params, query, and scheme, adhere to RFC 1808.

| name | description | example value |
|---|---|---|
| url_clean | Cleaned url (strip query/fragment) | https://www.example.com/article-1234 |
| url_domain | url parsed domain, matched against TLD list | "example.com" |
| url_fragment | Fragment portion of url | "fragment" |
| url_netloc | Netloc portion of url | "www.example.com" |
| url_params | Params portion of url | "" |
| url_path | Path portion of url | "/article-1234" |
| url_query | Query portion of url | "campaignid=1234" |
| url_scheme | Scheme portion of url | "https" |
| ref_category | referrer category (traffic source categorization) | "search" |
| ref_clean | Cleaned referrer (strip query/fragment) | https://www.google.ca/ |
| ref_domain | referrer parsed domain, matched against TLD list | "google.ca" |
| ref_fragment | Fragment portion of referrer | "" |
| ref_netloc | Netloc portion of referrer | "www.google.ca" |
| ref_params | Params portion of referrer | "" |
| ref_path | Path portion of referrer | "/" |
| ref_query | Query portion of referrer | "" |
| ref_scheme | Scheme portion of referrer | "https" |
| surl_clean | Cleaned session_initial_url (strip query/fragment) | https://www.example.com/article-1234 |
| surl_domain | session_initial_url parsed domain, matched against TLD list | "example.com" |
| surl_fragment | Fragment portion of session_initial_url | "fragment" |
| surl_netloc | Netloc portion of session_initial_url | "www.example.com" |
| surl_params | Params portion of session_initial_url | "" |
| surl_path | Path portion of session_initial_url | "/article-1234" |
| surl_query | Query portion of session_initial_url | "campaignid=1234" |
| surl_scheme | Scheme portion of session_initial_url | "https" |
| sref_category | session_initial_referrer category (traffic source categorization) | "search" |
| sref_clean | Cleaned session_initial_referrer (strip query/fragment) | https://www.google.ca/ |
| sref_domain | session_initial_referrer parsed domain, matched against TLD list | "google.ca" |
| sref_fragment | Fragment portion of session_initial_referrer | "" |
| sref_netloc | Netloc portion of session_initial_referrer | "www.google.ca" |
| sref_params | Params portion of session_initial_referrer | "" |
| sref_path | Path portion of session_initial_referrer | "/" |
| sref_query | Query portion of session_initial_referrer | "" |
| sref_scheme | Scheme portion of session_initial_referrer | "https" |
| surl_utm_campaign | The utm_campaign specified in the session_initial_url | "subscriber_newsletter" |
| surl_utm_content | The utm_content specified in the session_initial_url | "template_a" |
| surl_utm_medium | The utm_medium specified in the session_initial_url | "email" |
| surl_utm_source | The utm_source specified in the session_initial_url | "newsletter_2020-08-07" |
| surl_utm_term | The utm_term specified in the session_initial_url | "footer" |
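Python’s urllib.parse.urlparse splits a URL into the same six components named in the table above, so it can approximate most of these enrichments. This is a sketch rather than Parse.ly’s implementation: the helper name url_enrichments is hypothetical, the way the clean URL is rebuilt is an assumption, and the _domain and _category fields are omitted because they depend on a TLD list and a traffic source categorization not described here.

```python
from urllib.parse import urlparse


def url_enrichments(url, prefix):
    """Split a URL into components named like the enrichment fields.

    The *_domain field is omitted because it needs a TLD list.
    """
    parts = urlparse(url)
    return {
        # Assumed reconstruction: scheme, netloc, and path only.
        f"{prefix}_clean": f"{parts.scheme}://{parts.netloc}{parts.path}",
        f"{prefix}_fragment": parts.fragment,
        f"{prefix}_netloc": parts.netloc,
        f"{prefix}_params": parts.params,
        f"{prefix}_path": parts.path,
        f"{prefix}_query": parts.query,
        f"{prefix}_scheme": parts.scheme,
    }


fields = url_enrichments(
    "https://www.example.com/article-1234?campaignid=1234#fragment", "url"
)
print(fields["url_clean"])  # https://www.example.com/article-1234
print(fields["url_query"])  # campaignid=1234
```

Calling the helper with the prefixes "ref", "surl", and "sref" on the corresponding raw fields yields the other three groups.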

Metadata

Whether crawled via JSON-LD or meta tags or passed directly in pixels (as is the case in Parse.ly’s video integration), metadata associated with the url field is passed along in a series of metadata_ fields:

| name | description | example value |
|---|---|---|
| metadata | Flag to indicate if metadata is available | true |
| metadata_authors | Array of authors for the post/video | ["Albert Einstein", "Richard Feynman"] |
| metadata_canonical_url | The canonical URL of a post, or in the case of videos, the video ID | http://www.example.com/article-1234 |
| metadata_pub_date_tmsp | Publish date of the post in milliseconds since the UNIX epoch | 1471392000000 |
| metadata_custom_metadata | String of optional custom metadata (for more information, see the integration docs) | "{\"internal_post_id\": \"2134\"}" |
| metadata_section | Section the post/video was published in | "Physics" |
| metadata_tags | Array of tags associated with the post/video | ["science", "physics", "quantum mechanics"] |
| metadata_title | Title of the post/video | "Thoughts on Quantum Electrodynamics" |
| metadata_image_url | URL to image for the post/video | https://www.evernote.com/l/AAFSrhKOoExCqKji3f9BS9YKfZEC-yerafgB/image.png |
| metadata_full_content_word_count | Word count of the post (irrelevant for videos) | 1562 |
| metadata_data_source | How the metadata was collected, e.g., "crawl" or "pixel" | "crawl" |
| metadata_urls | The aliased URLs that the post lives on (e.g., Google AMP, http://m., main page) that reference the metadata_canonical_url | https://m.google.com/article |
| metadata_post_id | The post ID of the article. This is the unique ID of a post when the metadata exists | 99999 |
| metadata_share_urls | The social share URLs of the post in a comma-separated list. Share links are from Facebook, LinkedIn, Pinterest, and Twitter | ["http://example.com/post","http://example.com/post","http://example.com/post","http://example.com/post","http://example.com/post"] |
| metadata_page_type | Type of page (e.g., post, section, frontpage) | "post" |
| metadata_save_date_tmsp | Save date of the post in milliseconds since the UNIX epoch | 1471392000000 |
| metadata_thumb_url | The URL of the thumbnail image for the post | https://images.example.com/imagelocation |
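Because metadata_custom_metadata is a string rather than a nested object, reading its contents takes a second decoding step. The sketch below assumes the string holds valid JSON with escaped inner quotes; the helper name parse_custom_metadata is illustrative, and returning None for empty or malformed values is a design choice made for this example.

```python
import json


def parse_custom_metadata(event):
    """Decode metadata_custom_metadata, which holds JSON serialized as a string."""
    raw = event.get("metadata_custom_metadata")
    if not raw:
        return None
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None  # leave malformed custom metadata undecoded


# One raw record line (abbreviated) with the custom metadata as an escaped string.
line = '{"metadata_custom_metadata": "{\\"internal_post_id\\": \\"2134\\"}"}'
event = json.loads(line)  # first decode: the record itself
custom = parse_custom_metadata(event)  # second decode: the embedded string
print(custom["internal_post_id"])  # 2134
```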

UA and device enrichments

Based on the user_agent field, Parse.ly enriches the following:

| name | description | example value |
|---|---|---|
| ua_browser | Browser derived from UA | "Mobile Safari" |
| ua_browserversion | Browser version derived from UA | "9.1.2" |
| ua_devicebrand | Device brand derived from UA | "Apple" |
| ua_devicemodel | Device model derived from UA | "iPhone" |
| ua_devicetouchcapable | Flag to indicate if the device is touch-capable | true |
| ua_devicetype | Device type (mobile/tablet/desktop) from UA | "mobile" |
| ua_os | Device operating system from UA | "iOS" |
| ua_osversion | Device operating system version from UA | "9.3" |

Parse.ly also provides information regarding the display of the device:

| name | description | example value |
|---|---|---|
| display | Flag to indicate if display info is available | true |
| display_avail_height | Available height of the display, in pixels (equivalent to JavaScript’s screen.availHeight property) | 877 |
| display_avail_width | Available width of the display, in pixels (equivalent to JavaScript’s screen.availWidth property) | 1436 |
| display_pixel_depth | Color resolution (in bits per pixel) | 24 |
| display_total_height | Total height of the display, in pixels | 900 |
| display_total_width | Total width of the display, in pixels | 1440 |
| slot | Flag to indicate if the slot position on the page is available | true |

UTM parameter enrichments

Based on the url field, Parse.ly enriches the following from its query parameters. Note that UTM parameters are a web-wide de facto standard for campaign tracking, first introduced by Urchin and Google Analytics. Google runs a free tool called the URL builder to build URLs with this format, but many tools will automatically add these parameters to allow for easier tracking, especially in places where HTTP referrers are not automatically set.

In this example, the article URL, http://example.com/1234, was clicked from an email newsletter. It might then have had query parameters like the following:

https://example.com/1234?utm_source=newsletter_2020-08-07&utm_medium=email&utm_term=footer&utm_content=template_a&utm_campaign=subscriber_newsletter

These parameters would be parsed as follows:

| name | description | example value |
|---|---|---|
| campaign_id | Campaign identifier or name | "subscribers_email" |
| utm_campaign | Campaign identifier or name | "subscriber_newsletter" |
| utm_content | Template or style (e.g. for A/B tests) | "template_a" |
| utm_medium | Medium campaign ran on (e.g. email, social) | "email" |
| utm_source | The specific identifier for the source content | "newsletter_2020-08-07" |
| utm_term | A keyword or term associated with the click | "footer" |

UTM parameter tracking is powerful because it allows grouping, rollup, and slice-and-dice of campaigns, which often have associated costs and can be part of an ROI calculation. It also helps tremendously with decoding “direct” traffic; e.g., in many email service providers, the above click from an email newsletter would have no HTTP referrer set, and thus UTM parameters would be the only way to understand this traffic.
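For raw URLs that have not been enriched, the same UTM values can be pulled from the query string with the Python standard library. The helper name utm_fields is illustrative, and taking the first value when a parameter repeats is an assumption of this sketch.

```python
from urllib.parse import parse_qs, urlparse

UTM_KEYS = ("utm_campaign", "utm_content", "utm_medium", "utm_source", "utm_term")


def utm_fields(url):
    """Extract the five UTM parameters from a URL's query string (None if absent)."""
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each key to a list of values; keep the first one.
    return {key: query.get(key, [None])[0] for key in UTM_KEYS}


url = (
    "https://example.com/1234?utm_source=newsletter_2020-08-07&utm_medium=email"
    "&utm_term=footer&utm_content=template_a&utm_campaign=subscriber_newsletter"
)
print(utm_fields(url))
```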

Extra data

Arbitrary key-value pairs can be passed via Parse.ly’s dynamic tracking or Parse.ly’s implementation for custom segments. Such custom data may include subscriber information or IDs for use in joining to other data sources. In these situations, key/value pairs appear as a nested JSON object in the extra_data field.

As part of an ETL process, these fields can be “flattened” into the root document format for inclusion in downstream databases storing Parse.ly raw data.

  • "action": "_scroll"
  • "extra_data": {"_y": 1430}

In this example, a custom event (_scroll) was sent to Parse.ly’s Data Pipeline with associated custom data {"_y": 1430} representing 1,430 pixels on the y-axis of scroll-depth within the browser. This kind of raw data could be used to implement scroll depth tracking.
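The flattening step mentioned above might look like the following sketch. The function name flatten_extra_data and the extra_data_ column prefix are assumptions chosen to avoid collisions with existing top-level fields; a real ETL job may use a different naming scheme.

```python
def flatten_extra_data(event, prefix="extra_data_"):
    """Return a copy of the event with extra_data keys promoted to the top level."""
    flat = {key: value for key, value in event.items() if key != "extra_data"}
    for key, value in (event.get("extra_data") or {}).items():
        flat[prefix + key] = value  # e.g. "_y" becomes "extra_data__y"
    return flat


event = {"action": "_scroll", "extra_data": {"_y": 1430}}
print(flatten_extra_data(event))  # {'action': '_scroll', 'extra_data__y': 1430}
```

Events with a missing or null extra_data field pass through unchanged apart from dropping the extra_data key.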

Other possibilities

This raw data schema is already quite rich and supports many queries not available in the Parse.ly Dashboard or APIs. Additional possibilities for storing data in raw events include:

  • subscriber identifiers, to do detailed loyalty analysis
  • more granular information about on-page or in-app activities
  • a specialized set of query parameters for social virality modeling
  • ad impression or revenue data
  • and other custom data

Next steps

Read on for Code Examples.

Or, get help from Parse.ly:

  • For existing Parse.ly customers, contact Parse.ly to discuss advanced use cases for raw data.
  • For organizations not yet Parse.ly customers, start with the basic integration or schedule a demo to learn about advanced use cases Parse.ly customers have implemented.

Looking for a previous schema version?

This documentation refers to the latest version of Parse.ly’s Data Pipeline, v2.3.0. For earlier versions (data prior to October 2019) of the Data Pipeline, see the legacy schema documentation.

Last updated: December 31, 2025