Screenshot A/B testing for indie devs without an A/B test platform
Apple's Product Page Optimization gives you proper A/B tests for screenshots, but only on the default product page and only for apps with enough traffic to power the test. Here's a poor man's playbook that works at indie scale, using Custom Product Pages, storefront splits, and a 14-day measurement window.
Apple shipped Product Page Optimization (PPO) in 2021. It’s a proper A/B test framework for App Store screenshots and icons, run by Apple, with statistical reporting baked into App Store Connect. It’s also frustrating to use as an indie dev for two reasons:
- Traffic threshold. PPO needs enough impressions per variant to declare a winner. If your app gets a few hundred impressions a day, you’ll wait weeks or months for a result.
- Default product page only. PPO tests the default product page — the one users see when they tap your app from search. You can’t use it to test screenshots aimed at a specific marketing channel.
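To get a feel for the traffic threshold, here is a rough back-of-envelope sketch. It uses the textbook sample-size approximation for a two-sided two-proportion z-test, which is not Apple's own methodology. The 3% baseline conversion rate, the 20% relative lift, and the 300 impressions a day are illustrative assumptions, not measured values.

```python
from math import ceil, sqrt
from statistics import NormalDist


def impressions_per_variant(baseline_cr, relative_lift, alpha=0.05, power=0.8):
    """Approximate impressions needed per arm (two-sided two-proportion z-test)."""
    p1 = baseline_cr
    p2 = baseline_cr * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


def days_to_finish(daily_impressions, arms, per_arm):
    """Days needed if daily impressions are split evenly across all arms."""
    return ceil(per_arm * arms / daily_impressions)


# Assumed inputs: 3% baseline conversion, hoping to detect a 20% relative lift.
needed = impressions_per_variant(baseline_cr=0.03, relative_lift=0.20)
print(needed, days_to_finish(daily_impressions=300, arms=2, per_arm=needed))
```

Plug in your own baseline and daily impressions. The smaller the lift you want to detect, the faster the required sample grows.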
There’s a workable indie playbook that gets you most of the benefit anyway. It uses Custom Product Pages, storefront splits, and a tight measurement window. Here’s how.
The toolkit
Three pieces, all native to App Store Connect:
1. Custom Product Pages (CPPs)
You can create up to 35 alternative product pages per app. Each has its own URL (?ppid=<id>), its own screenshots, its own promotional text. They don’t replace your default page — they sit alongside it. You drive traffic to them via deep links from ads, social, your website.
CPPs aren’t an A/B test framework. They’re separate, named product pages you can point traffic at. Which makes them perfect for the test we’re about to run.
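Building the link is just a matter of adding the ppid query parameter to your normal product page URL. A minimal sketch: the app URL and the ppid value below are placeholders, so copy the real ones from App Store Connect.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cpp_url(product_page_url, ppid):
    """Return the product page URL with ?ppid=<id> added, keeping other params."""
    parts = urlsplit(product_page_url)
    query = dict(parse_qsl(parts.query))
    query["ppid"] = ppid
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder values, not a real app or product page id.
default_url = "https://apps.apple.com/us/app/example-app/id0000000000"
variant_url = cpp_url(default_url, "00000000-0000-0000-0000-000000000000")
print(variant_url)
```

Keeping the default URL and the CPP URL side by side in one place makes it harder to paste the wrong link into the wrong channel.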
2. Storefront splits
The App Store ranks per storefront. If your app ranks for the same keyword in both the US storefront (en-US) and the UK storefront (en-GB), those are essentially two independent samples: different users, different competitor mix, different impression streams. You can ship a different screenshot set per storefront and compare each storefront’s conversion rate against its own baseline.
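One way to do the "own baseline" comparison is to compute each storefront's lift over its own earlier period, then look at the gap between the two lifts. This is a sketch of that arithmetic, not a statistical test. The numbers are made up, and subtracting the control storefront's lift is my assumption about how to net out seasonality that hits both storefronts.

```python
def conversion_rate(installs, impressions):
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return installs / impressions


def lift_vs_own_baseline(baseline, test):
    """Relative change in conversion; each argument is (installs, impressions)."""
    return conversion_rate(*test) / conversion_rate(*baseline) - 1


def storefront_split_effect(control_storefront, variant_storefront):
    """Variant storefront's lift minus the control storefront's lift.

    Each argument is a (baseline_period, test_period) pair.
    """
    return lift_vs_own_baseline(*variant_storefront) - lift_vs_own_baseline(
        *control_storefront
    )


# Made-up numbers: the US keeps the old screenshots, the UK gets the new set.
us = ((300, 10_000), (310, 10_000))
gb = ((60, 2_000), (75, 2_000))
print(round(storefront_split_effect(control_storefront=us, variant_storefront=gb), 3))
```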
3. The 14-day rule
App Store Connect’s analytics aggregate per-day, but day-of-week effects are large in most categories. A single week covers every weekday only once, so one odd day can swing the total. The smallest comparison window that controls for day-of-week with any redundancy is two full weeks. Anything less and you’ll be reading noise as signal.
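A quick sanity check for any window you plan to read: count how many times each weekday appears in it. This is a small sketch using only the standard library. The start date is arbitrary.

```python
from collections import Counter
from datetime import date, timedelta


def weekday_counts(start, days):
    """How often each weekday (0 = Monday) appears in the window."""
    return Counter((start + timedelta(days=i)).weekday() for i in range(days))


def is_weekday_balanced(start, days):
    """True when all seven weekdays appear the same number of times."""
    counts = weekday_counts(start, days)
    return len(counts) == 7 and len(set(counts.values())) == 1


start = date(2024, 1, 1)  # arbitrary example start date
print(is_weekday_balanced(start, 14))  # two of every weekday
print(is_weekday_balanced(start, 10))  # some weekdays counted twice, others once
```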
The playbook
Five steps. Whole loop: about three weeks per test cycle.
Step 1: pick one hypothesis
Don’t test “new screenshots” against “old screenshots.” Test one specific question. Examples:
- Does leading with the result (output of using the app) beat leading with the process (how the app works)?
- Does a screenshot of the actual UI beat a stylized marketing illustration of the same screen?
- Does the third screenshot matter? (Most users don’t scroll past three.)
If you can’t write the test as “X beats Y because…”, you won’t be able to interpret the result either.
Step 2: build two variants
Build the new screenshot set as a Custom Product Page (or as the en-GB storefront if you want a storefront split). Keep the default page as your control. The variant only changes the one thing your hypothesis is about. Same product, same color palette, same copy except for the variable under test.
This is the part most indie devs get wrong — they end up changing five things at once and then can’t tell what moved the needle.
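If you want a guard rail against that, describe each screenshot set as a small record and diff the two before you ship. This is a sketch. The field names are hypothetical, so use whatever attributes you actually vary.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ScreenshotSet:
    # Hypothetical attributes; rename these to match your own design variables.
    first_frame: str
    style: str
    palette: str
    caption_copy: str


def changed_fields(control, variant):
    """Names of the attributes that differ between control and variant."""
    return [
        f.name
        for f in fields(control)
        if getattr(control, f.name) != getattr(variant, f.name)
    ]


control = ScreenshotSet("process", "real UI", "blue", "Track habits daily")
variant = ScreenshotSet("result", "real UI", "blue", "Track habits daily")

diff = changed_fields(control, variant)
print(diff)
if len(diff) != 1:
    print("More than one variable changed; the result will be hard to read.")
```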
Step 3: drive matched traffic
For Custom Product Pages: link to the CPP URL from one channel (one tweet, one Reddit post, one ad), and link to your default page from another matched channel. Match channel volumes as best you can. This is the messy part — indie traffic is lumpy by nature, so do your best, document what you did, and move on.
For storefront splits: do nothing extra. Both storefronts get organic traffic. The mix is naturally noisier, but the sample size is real over 14 days.
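For the "document what you did" part, it helps to record how lopsided the two channels ended up. This is a tiny sketch. The 1.5x tolerance is an arbitrary assumption of mine, not a rule from App Store Connect.

```python
def traffic_imbalance(control_impressions, variant_impressions):
    """Ratio of the larger arm to the smaller one (1.0 means perfectly matched)."""
    smaller = min(control_impressions, variant_impressions)
    larger = max(control_impressions, variant_impressions)
    if smaller <= 0:
        raise ValueError("both arms need at least one impression")
    return larger / smaller


def is_roughly_matched(control_impressions, variant_impressions, max_ratio=1.5):
    """max_ratio is an assumed tolerance; tune it to your own comfort level."""
    return traffic_imbalance(control_impressions, variant_impressions) <= max_ratio


print(traffic_imbalance(1200, 900), is_roughly_matched(1200, 900))
print(traffic_imbalance(4000, 500), is_roughly_matched(4000, 500))
```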
Step 4: wait 14 days, then read
Open App Store Connect → App Analytics → filter to the product page (CPP or storefront). Compare:
- Conversion rate (impressions → install) for the variant vs the control.
- Drop-off shape: did users get further down the page on the variant?
Don’t read days 1-5. Day 1 is launch noise; days 2-5 will be skewed by whichever channel happened to push that day. Read the 14-day window as a whole.
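Reading the window "as a whole" means pooling impressions and installs across all 14 days, not averaging 14 daily rates. Here is a sketch of the difference, assuming you've exported one (impressions, installs) pair per day. The numbers are invented.

```python
def window_conversion(daily, window_days=14):
    """Pooled conversion rate over the window.

    daily: list of (impressions, installs) tuples, one per day.
    """
    if len(daily) != window_days:
        raise ValueError(f"expected {window_days} days, got {len(daily)}")
    impressions = sum(day[0] for day in daily)
    installs = sum(day[1] for day in daily)
    if impressions == 0:
        raise ValueError("no impressions in window")
    return installs / impressions


def mean_of_daily_rates(daily):
    """The naive alternative: every day counts equally, however small."""
    return sum(installs / impressions for impressions, installs in daily) / len(daily)


# Invented data: 13 quiet days plus one day where a channel spiked.
daily = [(100, 3)] * 13 + [(1000, 60)]
print(window_conversion(daily), mean_of_daily_rates(daily))
```

The pooled figure weights each day by its traffic, so a spike day counts in proportion to the impressions it actually brought in.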
Step 5: declare a result honestly
The result is one of three:
- Variant wins by 20%+ → ship the variant as the new default. (Anything under 20% on indie traffic is inside noise.)
- Control wins, or it’s a wash → keep the control. The hypothesis was wrong.
- You can’t tell → don’t ship the variant, write down what you’d need to do differently next time, and move on.
Most of your tests will land in category 3. That’s the cost of running tests at indie scale. You learn anyway — the discipline of writing down a hypothesis and reading the data is the work.
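The three outcomes can be written down as a rule before you look at the data, which keeps you honest. This is a sketch with two assumptions of mine: the 20% is read as relative lift in conversion rate, and "can't tell" is triggered by a minimum-impressions floor (1,000 per arm here, an arbitrary number).

```python
def relative_lift(control, variant):
    """control and variant are (impressions, installs) over the 14-day window."""
    control_rate = control[1] / control[0]
    variant_rate = variant[1] / variant[0]
    return variant_rate / control_rate - 1


def verdict(control, variant, min_impressions=1000, win_threshold=0.20):
    """Return 'ship_variant', 'keep_control', or 'cant_tell'.

    min_impressions is an assumed floor, not a figure from App Store Connect.
    """
    too_thin = min(control[0], variant[0]) < min_impressions
    if too_thin or control[1] == 0:
        return "cant_tell"
    if relative_lift(control, variant) >= win_threshold:
        return "ship_variant"
    return "keep_control"


print(verdict((5000, 150), (5000, 190)))
print(verdict((5000, 150), (5000, 160)))
print(verdict((400, 12), (380, 16)))
```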
Two things that don’t work
I’ve watched a lot of indie devs try variants of this and burn time. Two patterns to avoid:
Switching screenshots and watching ASC for an hour. App Store Connect dashboards lag, day-of-week effects are huge, and the first 24 hours are dominated by whatever else changed that day. Don’t trust fast feedback.
“Friends and family” surveys. Asking five people on Twitter “which screenshot is better?” measures aesthetic preference, not conversion. The people you ask are not the people who’d organically search for your app on the App Store. The two populations have different priors.
What we learned across ~12 tests
A grab-bag of patterns from running this loop across half a dozen indie apps over the last year:
- Screenshot 1 carries the conversion, almost regardless of what’s in screenshots 2-N. If you can only optimize one frame, optimize the first one.
- Captions beat no-captions about 80% of the time. The 20% where no-captions wins, the underlying screen is unusually beautiful or unusually self-explanatory.
- A screenshot showing the result of using the app beats one showing how to use it almost every time. Users want to know what they get, not what they have to do.
- App icons rarely move the needle in CPP tests, because users coming through a CPP have already decided to look. Icons matter more on the search results page, which CPPs don’t show.
Tooling
You don’t need anything beyond App Store Connect for the analytics side. For the design side — actually building two variants of a screenshot set without going insane — that’s the part Asomium handles. Native canvas templates, per-locale overlays, instant export to PNG. We use it to spin up and ship variant screenshots in a single afternoon.
Related: the subtitle case study covers the same “measure with a 14-day window, change one thing at a time” approach applied to subtitle copy instead of screenshots.
Mario
Founder, Asomium
Founder of Native First, shipping iOS and Mac apps. Building Asomium because the App Store release workflow deserves better.