The Harsh Reality of Virtual Try-On in Fashion

This is my third, and final, post where I dive into the learnings that were uniquely tied to the domain while building a consumer AI product in fashion. The first part traced our journey arc and focused on what did not work. The second stepped aside to explore AI shopping agents and the opportunities (or lack thereof) they open up. This post turns to virtual try-on, a topic I was asked about constantly and one we experimented with deeply while building Alle. I saw a lot of hype and noise around try-on in tech circles, but our lived experience was very different. I’m writing this to clear up a few myths and share the opportunities that actually held up when we tested them in practice.

Illustrated cover: a person in green looks into a mirror showing them wearing a different blue outfit

TLDR; Virtual try-on aims to help shoppers see how clothes would look on their own body using images or avatars. Its success depends on accurately showing both the garment (style, fit, color, etc.) and the user (face, body, skin, proportions, etc.). Today’s technology falls short on three fronts: realistic fit, faithful body modeling, and garment-body interaction. This leads to awkward results and low utility, which in turn lead to very weak feature retention. Because the experience does not reliably solve the visualisation problem, it also fails to reduce returns. Using try-on to lower CAC or drive engagement has not worked either, due to high technology costs and novelty that wears off quickly. Until we see breakthroughs in foundation models, virtual try-on remains more hype than true utility.

Longer Version

First, it’s worth grounding virtual try-on in the problem it is meant to solve. At its core, it is a technology solution built to solve a specific consumer problem. (After all, if it doesn’t solve a real problem, there is little reason for users to return to it repeatedly. Hold this thought, we will come back to it.) The promise of virtual try-on is simple: to help shoppers visualise how a garment would look on them in real life by simulating it on their body (either on their own uploaded image, or a digital avatar). The efficacy of this solution boils down to two core factors, reflected in how accurately it represents:

Garment’s physical characteristics (style, size & fit, colors, patterns)
User’s physical characteristics (face shape, facial features, body shape, body size, face to body proportion, skin color, skin tone, hair style, hair length)

Failing to meet any of these points meaningfully limits the solution’s ability to solve the original problem. At its current level of technological maturity, try-on falls short in three key areas:

Accurate representation of garment size and fit
Faithful modeling of the user’s body, including proportions, facial features in relation to the body, and skin characteristics
Realistic simulation of how the garment interacts with the user’s body

The current experience appears to show promise (largely driven by marketing hype), right up until users see their try-on image and feel an uncanny sense of discomfort. Fun fact: people are far more sensitive to how accurately their own face, skin, and body are represented than the people who know them ever are. Alternatively, some users may trust the result, place an order, and only then realise that the garment doesn’t fit the way the image suggested it would. Either way, the feedback loop is weak. And weak feedback loops always lead to low repeat usage.

At a high level, today’s technology is still far from a true mirror-like experience. I do believe we will eventually get there, but only with real breakthroughs at the foundation-model level. Precisely modeling a person’s body and translating garments into true-to-life size and fit is an exceptionally difficult problem, and has been an active research area for many years. Incremental workflow improvements on their own won’t be enough to close these gaps.

Another business implication of this reality is that try-on does not really reduce returns. It would be great if it did, because that would allow brands and marketplaces to either improve their margin profiles (short term) or pass those cost savings on to customers (long term). These shortcomings in utility make try-on a classic case of “form over substance,” and that reality shows up clearly in how poor the feature’s retention is. Some people like to believe (as we once did) that weak utility-driven retention can be offset by strong user-driven virality that lowers CAC, or by entertainment-driven engagement that boosts app usage frequency. Let’s unpack each of these in turn.

Can user-driven virality offset weak utility?

CAC reduction through user-driven virality depends on how many users talk about what your product can do, how many people they reach, and how many of those people are willing to go through the friction of trying it themselves. The classic way people imagine try-on virality is that users share their try-on results on social platforms, which sparks curiosity in viewers and motivates them to try the product too.
But the reality is more nuanced:
- Since try-on is a high-friction experience, initial adoption is naturally constrained. Users have to upload full-body photos, which they may not have readily available or be able to take in the moment. On top of that, image generation latency is high by consumer standards (> ~15 seconds). In our case, even after funnel and product optimisation, adoption was typically 15-25%. This limited adoption significantly shrinks the top of the funnel for sharing.
- Of the users who adopt try-on, the funnel typically looks like this:
  - Some users drop off quickly after disliking the results.
  - Most find it briefly fun while the novelty lasts, but do not have any intrinsic motivation to share.
  - A smaller group shares privately with close friends/family to show off a cool new experience, reaching ~1-3 people, of which a tiny subset will install the app.
  - A tiny fraction posts publicly on stories (Instagram/Snapchat), motivated by the social currency of discovering something new early. Even within this group, most have limited reach, and hence produce few downstream installs. You might expect more public sharing since the experience feels novel and fun, but most people are reluctant to post personal things online and are just passive consumers, in line with the classic 1:10:100 participation rule of content platforms.
So in practice, the product’s virality is very low because the funnel from usage to new downloads is so bleak. If you want to increase virality, there are only two real leverage points:
- Increase feature adoption and usage
  
  This quickly runs into hard constraints around cost and latency, which tend to move in opposite directions. For example, delivering sub-30-second try-on results can cost roughly ₹0.5-₹2 per image. Without a reliable revenue model to recover this cost, a company won’t be able to scale this feature. Consumer subscriptions are not the right revenue model either: the product is not inherently retentive, because it fails to solve the recurring visualisation problem in online fashion shopping, as discussed earlier. Even if technology costs eventually fall close to zero, adoption will still be limited by the friction of uploading personal images, both in terms of how many are required and how specific the instructions need to be.
  
  Usage is also naturally constrained by the feature’s lack of utility in solving the visualisation problem.
- Increase the reach of people who share the results
  
  Since most users are not inclined to post their personal try-ons publicly on stories/reels, organic reach is structurally limited. The main alternative is to work with creators who have large followings. But at that point, user acquisition is no longer product-led and becomes a business development motion. User acquisition now includes creator fees on top of technology spend, which can render try-on a weaker growth channel than straightforward performance marketing in terms of cost efficiency.
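
To make the shape of this funnel concrete, here is a minimal Python sketch of a rough viral coefficient (expected new installs per existing user). Only the adoption rate (15-25%) and the private reach (~1-3 people) come from the numbers above. Every other rate, and the function name `installs_per_user` itself, is a hypothetical placeholder chosen for illustration, not something we measured at Alle.

```python
def installs_per_user(
    adoption_rate: float,
    private_share_rate: float,
    private_reach: float,
    private_install_rate: float,
    public_share_rate: float,
    public_reach: float,
    public_install_rate: float,
) -> float:
    """Expected new installs generated per existing user.

    Share rates are fractions of try-on adopters; install rates are
    fractions of the people reached by a share.
    """
    private = private_share_rate * private_reach * private_install_rate
    public = public_share_rate * public_reach * public_install_rate
    return adoption_rate * (private + public)


k = installs_per_user(
    adoption_rate=0.20,         # from the post: 15-25% after optimisation
    private_share_rate=0.10,    # PLACEHOLDER: share of adopters who share privately
    private_reach=2,            # from the post: ~1-3 people per private share
    private_install_rate=0.10,  # PLACEHOLDER: recipients who install
    public_share_rate=0.01,     # PLACEHOLDER: share of adopters who post publicly
    public_reach=200,           # PLACEHOLDER: viewers per public story
    public_install_rate=0.005,  # PLACEHOLDER: viewers who install
)
print(f"installs per existing user: {k:.4f}")
```

Because the stages multiply, a small number at any one step caps the whole loop. With these placeholder inputs the coefficient stays far below 1, the level at which each user would bring in at least one more. The two leverage points above correspond to raising `adoption_rate` or raising the reach terms.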

Can entertainment-driven engagement compensate for low utility?

Another variation of try-on we experimented with was repositioning it to solve the boredom use case rather than a shopping one. The essence of this idea was that users could visualise themselves in any scenario, such as wearing celebrity outfits or red carpet looks, or appearing as characters from their favourite movies. The assumption was that this would be inherently more fun and engaging than trying on clothes to decide whether to buy them.

The underlying thesis was:

In moments of boredom, imagining yourself in a fantasy world is entertaining.
In moments of boredom, seeing how your friends have imagined themselves in a fantasy world is entertaining.

Combined, this pointed toward a social, entertainment-first product. This is the world where Instagram/TikTok dominate. But that same dominance also revealed the core vulnerability of the idea. The thesis falls apart when users implicitly compare this entertainment experience with the content on Instagram. Simply featuring yourself or your friends does not automatically make something entertaining. Entertainment is largely driven by different forms of storytelling. Additionally, static images are a weaker entertainment medium than short-form video. When we launched this experiment, we realised this quickly. While the novelty did drive more adoption and engagement than shopping-focused try-on, it was nowhere near enough to compete with the entertainment value of existing platforms. On top of that, imagining yourself in a fantasy world requires “creation”, and creation always carries more friction than “passive consumption”, especially when you add the latency of waiting for image generation.

Does this mean that try-on offers no opportunities in the consumer space?

While the promise of virtual try-on for improving shopping decisions is still distant, there are two areas where continued investment may actually make sense.

Cataloging for Brands

Try-on can dramatically reduce catalog production costs for brands. Instead of organising repeated photoshoots for every model, pose, or variant, synthetic imagery can generate large portions of a product catalog. However, this is a B2B opportunity, and several SaaS companies are already building focused solutions around this use case. Consumers would benefit indirectly from this. Instead of today’s one-size-fits-all product photos, shopping feeds could feature more inspirational, styled outfit imagery with personalised, relatable models that better reflect their body types, tones, and taste.

Consumer Try-on at the Discovery Layer

Most conversations today position try-on at the consideration stage of the shopping journey. This is the stage where users have already scrolled through hundreds of products, shortlisted a few, and are deciding between them. It is also where visualisation accuracy matters most, and where today’s tech fails hardest.

Discovery is different. Here, users browse for inspiration with weaker purchase intent. Today, this experience is driven by feeds of model photos shot by brands in studio settings. When try-on is introduced at the discovery layer, the feed can show products on the user’s own body instead. Imagine opening a shopping app and seeing different outfits featuring yourself rather than models. (It may feel uncanny at first, but over time people will get used to it.)

When we tested this at Alle, we saw a meaningful increase in product click-through rates compared to feeds that only showed model images. We also noticed an uplift in retention when these try-on images were refreshed each time a user returned to the app. The underlying insight was simple: seeing new images of yourself feels like consuming new content. From entertainment platforms like Instagram, we know that fresh content is a major driver of engagement and repeat usage, and the same dynamic was emerging in try-on powered discovery experiences.

That said, even this opportunity comes with a few important caveats:

Long-term feature retention depends on whether users find real value in try-on discovery product recommendations. The novelty of new visuals fades quickly and cannot compete with the entertainment users get from media apps. So even at the discovery layer, the only durable value try-on can offer is this: product recommendations combined with self-visualisation are meaningfully better than a simple feed of model photos.

Consider a user choosing between two feeds:
- Feed A: Try-on–led discovery with weaker product recommendations
- Feed B: Model photoshoot images with stronger product recommendations
Most users will find Feed B more valuable, since the primary job-to-be-done for a shopping app is better shopping, not better visualisation. Visualisation helps, but it is a nice-to-have capability. The real utility benchmark is how many products the user ultimately keeps (i.e. what does not get returned).

For try-on to succeed at the discovery layer, it needs to accurately represent both the garment and the user so that return rates are on par with, or better than, a model-photo feed with strong recommendations. Delivering great product recommendations also requires personalisation driven by extensive user data collection. As a result, try-on-based discovery is a weak GTM motion for a startup that is still trying to find distribution around a retentive use case.
Cost constraints significantly limit the feasibility of creating such feeds for every user. Even if you roll this out to a small subset of users, each feed can realistically include only a limited number of products in a try-on format. To make this concrete, consider a simple cost structure:
- Daily active users: 1,000
- Try-on adoption rate: 25% (already on the high side)
- Images per user: 50
  
  You need at least ~50 images because, even with an above-market conversion rate of 2%, that only translates to about one transaction. And conversion itself depends not just on volume of these images, but also on personalisation quality, product selection, pricing, delivery time, etc.
- Cost per try-on image: ₹0.25 per image
  
  Since these can be generated offline rather than in real time, this is lower than consideration-stage try-on costs.
- That puts the daily try-on cost for the feed at: 250 users × 50 images × ₹0.25 = ₹3,125
- Now assume a very optimistic transaction funnel:
  - Transaction conversion rate: 2%
  - Average order value: ₹1,000
- This yields GMV from this cohort: 250 × 0.02 × 1,000 = ₹5,000
- With a net margin of 20%, net revenue comes to: ₹5,000 × 20% = ₹1,000
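
The arithmetic above can be reproduced with a short Python sketch. It follows the same formulas as the list, including applying the 2% conversion rate to the 250 adopting users as in the GMV line. The names `FeedAssumptions` and `feed_economics` are hypothetical, and the break-even cost per image is an extra figure derived here from the same inputs rather than a number we tracked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedAssumptions:
    """Inputs taken from the cost structure in the list above."""
    daily_active_users: int = 1_000
    adoption_rate: float = 0.25
    images_per_user: int = 50
    cost_per_image_inr: float = 0.25
    conversion_rate: float = 0.02  # applied per adopting user, as in the GMV line
    average_order_value_inr: float = 1_000.0
    net_margin: float = 0.20


def feed_economics(a: FeedAssumptions) -> dict:
    """Daily cost, GMV, net revenue, and break-even image cost for the feed."""
    adopting_users = a.daily_active_users * a.adoption_rate
    images = adopting_users * a.images_per_user
    daily_cost = images * a.cost_per_image_inr
    gmv = adopting_users * a.conversion_rate * a.average_order_value_inr
    net_revenue = gmv * a.net_margin
    return {
        "adopting_users": adopting_users,
        "images_per_day": images,
        "daily_cost_inr": daily_cost,
        "gmv_inr": gmv,
        "net_revenue_inr": net_revenue,
        # Image cost at which net revenue would exactly cover generation (derived).
        "break_even_cost_per_image_inr": net_revenue / images,
    }


result = feed_economics(FeedAssumptions())
for name, value in result.items():
    print(f"{name}: {value:,.2f}")
```

Under these inputs the daily cost is ₹3,125 against ₹1,000 of net revenue, so the cost per image would need to drop from ₹0.25 to about ₹0.08 just to break even. Changing any field of `FeedAssumptions` shows how sensitive that gap is to conversion, order value, or the number of images per user.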

Even under aggressive assumptions, the revenue does not come close to covering the try-on generation costs. But as the cost of try-on approaches zero, this can evolve into a meaningful product differentiator compared to shopping experiences that do not offer such a feed. However, this product opportunity doesn’t imply a business opportunity; that only happens if a viable monetisation model exists. (Refer to the first post for lessons on business models in Consumer AI × Fashion.)

With this, I’m wrapping up this series on what I learned building at the intersection of consumer AI and fashion. I hope these reflections are useful to founders exploring similar ideas and help them avoid a few wrong turns along the way. I don’t see any of these takeaways as universal truths. They are deeply context-dependent and limited by my own ability to interpret and analyse observations. I’m always keen to hear alternative viewpoints and experiences that can challenge or refine my thinking further.