
Put your intuitions to the (split) test

Emily Wang
5 min read

A common belief I’ve come across is that A/B testing is the lifeblood of growth teams. While that’s often true for Consumer Growth teams, it’s rarely true in B2B. Why?

For starters, sample sizes are really small, especially when you want to run a mid-funnel test (i.e. post sign-up, when users are in your product).

Second, most tests can’t be split by user – they need to be split by account (aka by customer) which further reduces the number of samples. 

Yet the world of Growth requires experimentation. It’s hard to objectively know what kind of pattern catches a user’s attention. What motivates them? How many steps (let alone which ones) are right to promote?

So, in B2B SaaS, we have to think differently about what constitutes an “experiment”, how to design effective ones, and how to know when to “call it”. 

Bento’s mission is to empower B2B SaaS companies with everboarding in a way that is personalized, visually native, and can keep “humans in the loop” (i.e. CS). To do so, you have to be able to iterate rapidly. Now, we’re making that even easier with the ability to run Split Tests directly in Bento. 

Setting up a split test in Bento is simple and intuitive.

We keep it simple: 

  • Split by companies, not users.
  • Split audiences evenly.
  • We track engagement and completion, and yes, you can port all your data to Amplitude (or your data tool of choice) to run fancy analyses.
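
To make the account-level split concrete, here is a minimal Python sketch. It is not how Bento assigns variations internally; the function names, the salt, and the company IDs are all hypothetical. It orders companies by a salted hash and alternates variations, so every user in a company sees the same experience and the groups stay as even as possible.

```python
import hashlib


def assign_variations(company_ids, variations=("control", "treatment"),
                      salt="onboarding-test-1"):
    """Assign each company (not each user) to one variation.

    Companies are ordered by a salted hash, then dealt out round-robin,
    so group sizes differ by at most one. The salt is a hypothetical
    per-test label that keeps different tests from sharing an ordering.
    """
    def sort_key(company_id):
        return hashlib.sha256(f"{salt}:{company_id}".encode()).hexdigest()

    ordered = sorted(set(company_ids), key=sort_key)
    return {cid: variations[i % len(variations)]
            for i, cid in enumerate(ordered)}


def variation_for_user(user, assignments):
    """A user inherits the variation of the company they belong to."""
    return assignments[user["company_id"]]


if __name__ == "__main__":
    companies = ["acme", "globex", "initech", "umbrella", "hooli", "stark"]
    assignments = assign_variations(companies)
    for company, variation in sorted(assignments.items()):
        print(company, variation)
```

Note that this sketch assumes the list of companies is fixed when the test launches. Adding companies later would change the round-robin order, so a real system would need to store assignments rather than recompute them.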

Read on to learn more about the types of tests you could run, how to build great ones, and what to look for in a winning test.

3 most common types of onboarding split tests

“Hold out” vs treatment test

By default, a split test in Bento creates a “control” group that receives no guide experience. This is the most classic way to understand the impact of creating an activation experience. Basically, does it work better than doing nothing?

How to create this test:

  • This is the easiest! Simply set “no guide” as your control group and launch your guide or flow to the other 50%.


What to watch out for:

  • In consumer, losing a few (hundred) users to a bad experience gets lost in the wash. In B2B SaaS? That might be an entire month's worth of new users! So be careful about creating a truly bad "control" experience.
  • Is the “holdout” a real holdout? For example, if you launch an in-product onboarding checklist with no Support or Success, the control cannot be “white-glove success”. A good control is to have no onboarding and no Support or Success.

Content (aka which steps and how many)

Ask a PM what a customer should do to get value, and you’ll often get back 5-10 features. Ask CS and you’ll get a different set of tasks.

In reality the “right steps” vary by customer, use-case, and user. But in general we find onboarding experiences too long and too overwhelming.

How to create this test:

  • If any of your guides exceed the recommended length (you’ll see a warning in the guide library), we suggest duplicating the guide and its step groups and removing a few steps that may not be critical
  • Then, create a test where one variation is the original (long) and the other is short.


What to watch out for:

  • Sometimes people combine steps and make one step extra long, with multiple calls-to-action. The spirit of this test is to focus on the key actions that help the customer get value.
  • Be careful of keeping steps that are easy to complete (e.g. “update your profile”) and tossing out steps that actually reveal the unique value of your product. In most cases, completing a profile doesn’t actually help a user or customer understand how your product helps them with their job-to-be-done.

Type of onboarding experience

If you’ve spent more than 5 seconds on our website, you know we’re not huge fans of (long) product tours. Ultimately activation requires, well, action. 

But are checklists the right approach for your audience? Are empty states better? Here’s how you can test.

How to create this test:

  • Set each variation to the first/main guide experience. If it’s a checklist, then one variation is the checklist guide. If it’s a chained tour, then one variation is just the first modal or tooltip.


What to watch out for:

  • By nature, tours are about clicking to the next thing, and not necessarily doing an action. The UX of having an overlay while trying to do a task is tricky.
  • If you’re going to compare, identify one or two key actions that you can use to see impact on adoption.

How long to run a split test for

This varies based on how different the results are. If you have 6 data points (customers) and one variation is at 80% completion and 90% engagement while the other is at 20% engagement, well, it’s fairly clear. In general we recommend watching session replays, talking to users, and capturing their feedback to augment the data.
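
If you want a rough sanity check on how far a handful of accounts can take you, one option (our illustration here, not something built into Bento) is Fisher's exact test, which is designed for small counts. The sketch below uses only the Python standard library, and the numbers are hypothetical: 5 accounts per variation, where 4 of 5 completed in one variation and 1 of 5 in the other.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are variations; columns are "completed" and "did not complete".
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k completions in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    observed = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum every table that is at least as unlikely as the observed one.
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= observed * (1 + 1e-9))


if __name__ == "__main__":
    # Hypothetical: variation A had 4 of 5 accounts complete, B had 1 of 5.
    p_value = fisher_exact_two_sided(4, 1, 1, 4)
    print(f"p-value: {p_value:.3f}")
```

With these made-up counts the p-value comes out at roughly 0.21, so even an 80% vs 20% gap does not clear the usual 0.05 bar at this sample size. That is exactly why session replays and user feedback carry so much weight in B2B tests.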

The other consideration is: how long can you afford to wait? 

We all operate on incomplete information, and for most companies, it’s unrealistic to wait 6 months on a growth experiment.

What metrics to look at to know if a test has won

All else equal, we suggest focusing on:

  1. High-level engagement. Most customers consider a user “engaged” once they have completed at least one step. In Bento, step completion means the user actually completed the step!
  2. Is there a clear cohort of users who complete a majority of the actions? Overall completion rates are not useful because humans behave differently. It’s more critical that this experience works really well for at least one cohort.
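
If you export your data to your own analysis tool, the two signals above can be computed with a few lines of Python. This is a minimal sketch under assumed inputs: the dictionary of steps completed per user, the user names, and the function name are hypothetical, and "majority" is taken to mean more than half of the steps.

```python
def summarize_variation(steps_completed_by_user, total_steps):
    """Summarize one variation of a split test.

    steps_completed_by_user maps a user ID to the number of guide steps
    that user completed (a hypothetical export format).
    """
    users = len(steps_completed_by_user)
    if users == 0:
        return {"engagement_rate": 0.0, "majority_cohort": []}

    # "Engaged" means the user completed at least one step.
    engaged = [u for u, n in steps_completed_by_user.items() if n >= 1]
    # The cohort of interest completed more than half of the steps.
    majority = [u for u, n in steps_completed_by_user.items()
                if n > total_steps / 2]

    return {
        "engagement_rate": len(engaged) / users,
        "majority_cohort": sorted(majority),
    }


if __name__ == "__main__":
    sample = {"ana": 0, "ben": 1, "cy": 4, "di": 5}
    print(summarize_variation(sample, total_steps=5))
```

From here you would look at what the users in the majority cohort have in common (role, plan, use-case) rather than at the overall completion rate.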

What to do when stopping a split test

When you stop a test, we do two things:

  1. We lock the analytics that are part of the Split Test. The underlying guide’s analytics will continue to evolve as engagement changes, but the Split Test’s results won’t keep shifting.
  2. We pause the guides that were part of the test. That gives you time to make tweaks before setting your “winning” variation live again.

Split testing is just one tool to create an experimentation culture. While consumer PMs have long known this, B2B PMs generally don’t have access to this kind of infrastructure or sample sizes. Bento is a powerfully simple tool that allows your organization to rapidly iterate and experiment so every customer you sign can be successful.


We’d be happy to help you design your first test, and get it launched within a week. Find a time and we’ll show you how!
