Check out our Medium blog post to learn more about how we built Data Tailor AI!

Overview

Data Tailor AI uses tracked data from Segment to help Large Language Models (LLMs) write hyper-personalized marketing emails for customers. With Data Tailor, marketers won’t need to define their own rules for audience creation, and end-customers benefit from receiving outreach that is greatly curated for what they’d want to purchase.

Preview

User journey and features

  • Customers perform actions on the app: As customers use web or mobile apps, their behaviors and actions are tracked by uploading them to Segment using the HTTP API source.
  • Events processed on Databricks: This events data is piped through Segment to Databricks (powered by AWS!) where it is cleaned, transformed, and loaded into SQL tables.
  • Prompt engineering and LLM generation: Also within Databricks, our app intelligently ingests data from those SQL tables and engineers a detailed prompt to send to the LLM (OpenAI’s GPT-4). It receives a customized email for that customer and stores it in a separate Databricks table.
  • Reverse ETL for use in Sendgrid: Periodically, these generated emails are sent from Databricks to SendGrid (email automation) through Segment’s Reverse ETL feature.
  • Emails sent to end-customers: Marketers can use the excellent tooling of SendGrid to send these customized emails out as a one-time send or part of a recurring campaign. This entire process is automated so marketers can set up a workflow to create nurture campaigns and more, powered by LLMs and AI.

Why we built Data Tailor AI

Problem: We identified a two-sided problem to solve. Marketers want to send personalized emails to customers to increase their intent to purchase, and customers want tailored outreach instead of generic email campaigns. The current way marketers do this is through the creation of ‘audiences’ by building a set of rules and conditions. This process needs marketers to figure out on their own what constitutes an ‘audience’ and still sends the same email to everyone in that audience (it’s not hyper-personalized).

Solution: Data Tailor AI takes a data-driven approach by leveraging a customer’s actions (tracked in Segment) and using the reasoning capabilities of today’s most sophisticated AI LLM models (like ChatGPT) to have the LLM generate a hyper-personalized email for customers. Marketers don’t need to manually create rules or audiences any more - each customer is treated as their own ‘audience’ and the LLM highly personalizes the email for that user, referencing actions they’ve taken through Segment.

Impact for Twilio Segment & Databricks users: Data Tailor has massive impact potential for marketing professionals to supercharge their work. Twilio Segment users benefit greatly by deeply leveraging their Segment events data in more ways than before. Databricks users benefit by ingesting data that they can put to use immediately. And though we cover marketing emails in this hackathon, Data Tailor can easily be expanded for other use cases like sending customer support messages.

How it works (technical overview)

Data Tailor is built through tight integrations between Segment, Databricks, OpenAI, and SendGrid.

Preview

Technologies used (our stack):

  • Twilio Segment: The core of Data Tailor is its deep integration with Twilio Segment. To personalize marketing emails we’ll need a lot of data on a customer and their behavior; Segment is the prime way to get various data sources into one unified system.
  • Databricks (powered by AWS): Databricks gives us the tools to perform ETL on that Segment data as well as the necessary data cleaning and prompt generation to get a generated email back from OpenAI (the data is anonymized before being sent to OpenAI).
  • AWS: All of the processing work is built on AWS (through Databricks). We can scale our clusters in real time as we grow our customer base.
  • Twilio SendGrid: Data Tailor leverages the tools that marketers are already using. SendGrid is an excellent way to automate email campaigns and we’re using this as a Destination within Segment.

How it works (step by step):

  1. Data captured in Segment: For the hackathon we set up two Sources for Segment: the HTTP API and Databricks (we’ll use Databricks as both a Source and Destination). We’ve instrumented our ecommerce sample app to track all user actions through Segment’s HTTP API.
  2. Databricks ETL on Segment data: Using Segment Connections, we’ve set it up so that all of this (raw) events data from Segment is sent to a Databricks data lake through Segment’s new partnership with Databricks. Databricks performs ETL by extracting this raw data, transforming it to fit our SQL table schemas, and loading them into those SQL tables.
  3. Prompt building in Databricks: We leveraged Databricks’ Python notebooks to create a script that ingests data from the SQL tables (from step 2) and builds a highly engineered prompt to send to the AI LLM. This prompt includes instructions for the LLM and a series of recent and relevant data about the customer we want the email for. This data is anonymized before sending to LLMs.
  4. GPT-4 generates the customized email: The whole payload from step 3 is sent to OpenAI’s Chat Completions endpoint from a new Databricks notebook, where GPT-4 sends back a customized email for that customer. These generated email are stored in Databricks in a new SQL table.
  5. Databricks sends generated emails to Segment: Since we added Databricks as a Source in Segment, the data from that SQL table is periodically sent to Segment through Reverse ETL. Segment loads generated email into SendGrid: We’ve set up SendGrid as a Destination in Segment, so the generated emails from Databricks are piped into SendGrid (through Segment’s Reverse ETL).
  6. SendGrid sends generated emails to customers: We use SendGrid’s native email scheduling features to send these emails on a recurring basis.

The best part - it can all be automated! Since Data Tailor ingests the latest user data to build a generated email, we can set this entire pipeline up to work autonomously end-to-end from capturing user’s events in Segment through to sending them customized emails periodically through SendGrid.

What’s next for Data Tailor AI

We had a blast building Data Tailor AI for this hackathon and we’ve been very excited to combine Twilio Segment, Databricks, and AWS to help marketers send highly personalized emails to their customers. There are many upgrades for Data Tailor AI:

  • Use more Segment Sources: For the hackathon we just used the HTTP API Source but in production we’d leverage all the Sources within Segment: mobile app clicks, page views, purchases, etc. The more data we have about a specific customer, the better we can personalize the generated email for them.
  • Preserve privacy through hosted LLMs: We know customer data is important to keep private and secure. There are many options for next steps on this: we can use a hosted version of an LLM like GPT-4, or one of the dozens of open source models to self-host the LLM so the data never has to leave our servers.
  • Expand to other use cases beyond marketing: We realized during the hackathon that there are many use cases for our setup of leveraging data in Segment to generate written content. A natural expansion would be for customer service - knowing all the actions a customer has taken will undoubtedly help LLMs generate personalized support replies at scale,

Thanks to the entire Twilio Segment, Databricks, and AWS teams for giving us support during the hackathon. We can’t wait to see how we can use this stack to make lives easier for more Segment users.

Built With

Share this project:

Updates