Andrew Mende’s Post

How do you run an A/B experiment when you don't have enough traffic? Since I work a lot with A/B experiments, people sometimes reach out to me for advice in tricky situations. A friend of mine, a talented product manager at a mid-size company, had an idea to significantly boost the appeal of products on his e-commerce website. I'm not at liberty to disclose the idea in detail, but let's say it involved something like a 3D model of the product (not actually a 3D model, but the analogy works well enough to keep the NDA intact).

However, simply running an A/B experiment to see whether conversion increases wasn't possible: creating these 3D models for every product in the catalog was quite expensive. If we could prove the models help increase sales, the company could easily afford it, but ordering them for the entire catalog just for an experiment was too risky. By pure chance, about half of the products (we can consider it a random half) in one particular section of the catalog already had these 3D models available for free, so they could be shown at no extra cost. But when my friend calculated the statistical power of the proposed experiment, it became clear that if we only took visitors from that one section, the detectable increase in conversion within a 3–4 week test would have to be unrealistically high. Simply put, there was not enough traffic!

So I suggested a totally different experiment design, one that would not answer the question directly but would probably provide enough evidence to move forward with more confidence. What would you suggest doing in this situation to most accurately determine whether these 3D models really make the products more appealing to customers? Write your ideas in the comments. In the next post, I'll share my recommendation and what the experiment results showed. (Spoiler alert: the results were quite surprising, which is why I'm telling this story. Still, I believe the technique I suggested was worth trying.)
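To make the power problem concrete, here is a rough sketch of that kind of power calculation with statsmodels. The traffic and baseline conversion numbers are invented for illustration, not the real figures from this case.

```python
# Rough sketch of the power calculation described above.
# Traffic and baseline conversion numbers are invented for illustration.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

visitors_per_week = 2_000        # hypothetical traffic to the catalog section
weeks = 4
n_per_arm = visitors_per_week * weeks / 2

baseline_cr = 0.03               # hypothetical baseline conversion rate
analysis = NormalIndPower()

# Smallest standardized effect detectable with 80% power at alpha = 0.05,
# then translated back into an absolute lift in conversion rate.
effect = analysis.solve_power(nobs1=n_per_arm, alpha=0.05, power=0.8,
                              ratio=1.0, alternative='two-sided')
for lift in [x / 1000 for x in range(1, 40)]:
    if proportion_effectsize(baseline_cr + lift, baseline_cr) >= effect:
        print(f"Minimum detectable lift ≈ {lift:.3f} "
              f"({lift / baseline_cr:.0%} relative)")
        break
```

With numbers of this order, the detectable lift comes out around a 40% relative improvement, which is exactly the "unrealistically high" situation described in the post.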

One of the things we use on my team is alternative metrics to conversion (micro-conversions) that correlate strongly with conversion and have higher statistical power. Later we either run a blackout of all the successful product changes at the end of the quarter or run a retest to confirm that the impact seen in the experiment was real. This has helped us reduce runtime significantly.
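As a toy illustration of why a micro-conversion helps (the rates below are made up): a proxy metric with a higher baseline rate needs far fewer visitors per arm to detect the same relative lift.

```python
# Toy comparison (made-up rates): a micro-conversion with a higher baseline
# rate needs far fewer visitors per arm to detect the same relative lift.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

analysis = NormalIndPower()
relative_lift = 0.10  # we want to detect a 10% relative improvement

for name, baseline in [("purchase conversion", 0.03),
                       ("add-to-cart (proxy)", 0.15)]:
    effect = proportion_effectsize(baseline * (1 + relative_lift), baseline)
    n = analysis.solve_power(effect_size=effect, alpha=0.05, power=0.8, ratio=1.0)
    print(f"{name}: ~{n:,.0f} visitors per arm")
```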

Can the experiment be run for 8 or even 12 weeks, for example? I get that that's a long time, but since it's not a burning idea, who cares whether it takes 12 weeks instead of 4?

Thinking out loud here, the first things that come to mind: running a few rounds of user research to get initial validation and iterate; picking the 20% of best-selling products that contribute 80% of sales to invest in the 3D models; running a paid campaign for a few weeks to temporarily boost traffic to that section for the A/B test; and, if the traffic is still not enough, using engagement or funnel metrics as proxies to increase confidence ☺️
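A quick sketch of the 80/20 selection idea mentioned above, with hypothetical product data and column names:

```python
# Sketch of the 80/20 selection; data and column names are hypothetical.
import pandas as pd

sales = pd.DataFrame({
    "product_id": ["A", "B", "C", "D", "E"],
    "revenue":    [500, 300, 120, 50, 30],
})

sales = sales.sort_values("revenue", ascending=False)
sales["cum_share"] = sales["revenue"].cumsum() / sales["revenue"].sum()

# Smallest prefix of best sellers that covers ~80% of revenue.
top = sales[sales["cum_share"].shift(fill_value=0) < 0.80]
print(top[["product_id", "revenue", "cum_share"]])
```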

Use Causal Impact or Synthetic Control! Quasi all the way :)
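For context, a minimal synthetic-control sketch on simulated data: learn non-negative weights over untreated "donor" catalog sections from the pre-period, then use the weighted combination as the counterfactual for the treated section after the change. A real analysis would typically also constrain the weights to sum to 1 and add placebo checks; this only shows the idea.

```python
# Minimal synthetic-control sketch (simulated data): fit non-negative weights
# on donor sections in the pre-period, then compare the treated section's
# post-period conversion against the weighted counterfactual.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
weeks_pre, weeks_post = 26, 8

# Weekly conversion rates: 4 untreated donor sections + 1 treated section.
donors = 0.03 + 0.005 * rng.standard_normal((weeks_pre + weeks_post, 4))
treated = (donors @ np.array([0.4, 0.3, 0.2, 0.1])
           + 0.002 * rng.standard_normal(weeks_pre + weeks_post))
treated[weeks_pre:] += 0.004  # pretend the change added a small lift

# Fit weights on the pre-period only.
weights, _ = nnls(donors[:weeks_pre], treated[:weeks_pre])
counterfactual = donors[weeks_pre:] @ weights

effect = treated[weeks_pre:] - counterfactual
print(f"Estimated weekly lift: {effect.mean():.4f}")
```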

To have a better idea, I'd like to know the actual traffic volume and the expected metric change from the hypothesis. If it's 10 people per week, I'd go talk to prospects on foot. If it's 100, I'd run a survey or a fake-door experiment with a proxy metric. If it's 1,000… I'd see whether I have a better hypothesis to focus on.

During the experiment, you could run a PPC campaign for this section of the catalog and buy as much traffic as needed to reach the required experiment power.

That “let’s rerun it” instinct is so underrated. Easy to get emotionally attached when the variant wins. Major kudos to the team for prioritizing signal over bias — product sense + data sanity is the real power combo.

Interesting case. Picking a more sensitive metric is always a solution. Did you suggest they try CUPED and winsorization?
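For readers who haven't used these, a compact sketch on simulated per-user revenue: winsorize at the 99th percentile, then apply the CUPED adjustment with each user's pre-experiment revenue as the covariate.

```python
# Compact sketch of the two variance-reduction techniques named above,
# on simulated per-user revenue data.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
pre = rng.gamma(2.0, 10.0, n)                 # pre-experiment revenue per user
post = 0.6 * pre + rng.gamma(2.0, 10.0, n)    # in-experiment revenue (correlated)

# 1) Winsorization: cap heavy-tailed outliers at the 99th percentile.
cap = np.quantile(post, 0.99)
post_w = np.minimum(post, cap)

# 2) CUPED: subtract theta * (pre - mean(pre)), where theta = cov(post, pre) / var(pre).
theta = np.cov(post_w, pre)[0, 1] / np.var(pre, ddof=1)
post_cuped = post_w - theta * (pre - pre.mean())

print(f"variance: raw={post.var():.1f}, winsorized={post_w.var():.1f}, "
      f"winsorized+CUPED={post_cuped.var():.1f}")
```

The adjusted metric has the same mean but lower variance, which is what buys the extra sensitivity.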
