Stitch Fix: Are its data science efforts paying off?

Published 3/12/2018

This research began with the question: Are Stitch Fix’s efforts to apply data science to apparel retailing paying off? Stitch Fix is notoriously vague about key customer metrics needed to answer this question, so I attempt to shed light on them using primary customer research. 

Executive Summary

  • As Stitch Fix receives feedback on fixes from a given user, its algorithm and stylists appear to learn and provide better choices to that user, modestly increasing that user's keep rate over time (subject to a survivorship bias caveat).

  • The challenge is getting users to stick around long enough for the algorithms to learn. Two-thirds of respondents have received 4 or fewer fixes since starting, suggesting a mostly "young" population. 6 month retention is about 29%. 

  • Stitch Fix’s overall NPS appears low, and user experience ratings of new users are lackluster. Keep rates of new users are on average ~10% lower than for users who have provided 5 rounds of feedback, for example. This suggests that the onboarding questionnaires are not providing sufficient information to the algorithm to enable a personalized experience from the get-go.

  • That said, a cohort analysis shows 1 and 2 month retention rates increasing significantly over the last year, and a large percentage of users perceive their experience to be improving, so Stitch Fix is doing something right to get new users past the algorithm’s learning hump. Additionally, 1st shipment keep rates have recovered from recent lows that were likely caused by the steep learning curves of entering new markets (men, plus-sized, premium).

  • This improvement in 1st fix keep rate is a positive for the overall average keep rate, which appears from this data to be trending slightly down (though Stitch Fix says it's trending up). At a minimum, the hordes of new users Stitch Fix acquired mid to late last year, as well as the new markets entered, have created much volatility in the overall keep rate.

  • While it's possible recent new users are lower quality than early users, Stitch Fix has barely penetrated the addressable market (only ~9% of survey respondents have tried it).

  • Lastly, despite management’s claim that they don’t see seasonality, this data shows a significant uptick in keep rate during the holidays, which seems entirely intuitive. After all, even data science can’t beat centuries-old human behavior.

About this study

This study began with the questions: Is Stitch Fix effective at using data science to provide a personalized digital shopping experience for clothes? Is data science creating a better customer experience and is that showing up in keep rates and retention?

This deck contains the results of a survey conducted about Stitch Fix between January 14 – January 16, 2018.  999 people completed the survey, and of these, 92 are/were Stitch Fix users.

To be eligible, survey participants had to reside in the U.S. and speak English. All participants were asked if they had used any clothing services from a list provided. Those who checked Stitch Fix were then invited to complete the rest of the survey, though they were not explicitly told that it was a survey about Stitch Fix.

As this study is based on the results of a survey, it comes with the usual caveats: what people say they did and what they actually did can differ, the sample size is small, the sample may not be representative, etc.

Which clothing services are most popular?

More people have tried Stitch Fix than competitor services, but the vast majority haven’t tried any service.

Survey question - Which of the following clothing delivery services have you tried? (select all that apply)

Stitch Fix estimates the addressable market at $353B in 2016. Only 9% of people surveyed have tried it, suggesting much room for growth.

User experience

Rent the Runway crushes it in terms of customer satisfaction, while Stitch Fix is underdelivering.

Survey question - On a scale from 0-10, how likely are you to recommend this service to a friend or colleague? [Note: question only shown to Stitch Fix users]
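
For reference, a 0-10 "likely to recommend" question like this one is usually scored with the standard Net Promoter Score formula: the percentage of promoters (ratings of 9-10) minus the percentage of detractors (ratings of 0-6). Below is a minimal sketch of that calculation. The function name `net_promoter_score` and the example ratings are hypothetical illustrations, not the survey's data or code.

```python
def net_promoter_score(ratings):
    """Return NPS on a -100..100 scale from a list of 0-10 ratings."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)   # ratings of 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # ratings of 0 through 6
    # Passives (7 or 8) count only toward the denominator.
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical ratings for illustration only, not survey responses.
example_ratings = [10, 9, 8, 7, 6, 3, 9, 5, 10, 2]
example_nps = net_promoter_score(example_ratings)
print(example_nps)
```

With only 92 Stitch Fix respondents, a score computed this way moves by roughly one point per respondent, so small differences between services should be read loosely.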

Only long-term Stitch Fix users believe the service really understands them well. New users – not so much.

 Survey question: Consider your most recent experience with Stitch Fix and the items they sent you. On a scale from 0-10, how well do you think Stitch Fix understands you and your preferences? (0 = Not at all, 10 = Extremely well)

Overall average = 6.4, not particularly high. 

Not surprisingly, the longer you’ve been a Stitch Fix customer, the more likely you are to believe that the company really understands you (otherwise you would’ve already left).

The question is whether a low-rater would raise his/her rating after the algorithm has had more time to train on them… 

More than other services, Stitch Fix users perceive their experience to be improving over time.

Survey question - Has your overall experience with this service gotten better, worse, or stayed the same over time? When thinking about your experience, consider things like how well the items sent fit you, matched your style preferences, and fell within your expected price points. [Note: question only shown to Stitch Fix users]

Shipments received

Survey Question - Starting in 2016, please indicate which months you received a Stitch Fix shipment (select all that apply). Please approximate if you can't remember exactly. For example, if you received a shipment in May 2016, select that box. If you did not receive a shipment in June 2016, leave that box empty.

Stitch Fix heavily ramped up customer acquisition in late 2017, likely for its IPO. 2018 is on track to show even higher y-o-y user growth than 2017.

Year-on-year Jan 18 shipment growth looks quite strong. The survey data says none of the Jan 18 shipments are to 1st-time customers, so it’s probably largely retention from the acquisition binge late last year. Indeed, management indicated they eased off the gas on marketing spend in 2QFY18.

Most Stitch Fix users are still relatively new. Only 24% of respondents have received more than 5 shipments.

Mean shipments received by respondents = 4.5, Median = 4, Mode = 3

Cohort analysis and customer retention

Survey Question - Starting in 2016, please indicate which months you received a Stitch Fix shipment (select all that apply). Please approximate if you can't remember exactly. For example, if you received a shipment in May 2016, select that box. If you did not receive a shipment in June 2016, leave that box empty.

Stitch Fix has significantly improved retention, but long-term retention still needs work. 

92 users with start dates spread across 2 years is insufficient data to perform a monthly cohort analysis, so I use rolling 12 month averages. Even with the resulting smoothing effect, we can see that Stitch Fix appears to be significantly improving their retention rates, particularly in the fragile early months when the algorithm is still learning what a user likes and mistakes are more plentiful. 12 month retention seems low, but admittedly, the survey data is thinner there.

Monthly cohort analysis, active customers, 12 month rolling averages.
Note: a customer is “active” in a subsequent month if they receive a shipment, even if they don’t keep anything.
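
To make the method concrete, here is a rough sketch of how a retention curve of this kind could be computed from the "which months did you receive a shipment" answers. It is one possible reading of "12 month rolling averages" (pooling every user whose first shipment falls in a trailing 12-month window), not necessarily the exact procedure behind the chart. The function name, the integer month encoding (0 = Jan 2016), and the sample users are all assumptions for illustration.

```python
def retention_curve(shipments_by_user, window_start, window_end,
                    last_observed, max_offset):
    """Pooled retention for users whose first shipment is in the window.

    shipments_by_user: {user_id: set of month indexes with a shipment},
        each set non-empty; months are integers (assumed 0 = Jan 2016).
    Returns {offset: share of eligible users who received a shipment in
    month first + offset}. A user counts as active whether or not they
    kept anything, matching the note above.
    """
    curve = {}
    for k in range(1, max_offset + 1):
        eligible = active = 0
        for months in shipments_by_user.values():
            first = min(months)
            if not (window_start <= first <= window_end):
                continue  # started outside this rolling window
            if first + k > last_observed:
                continue  # too new to have reached offset k yet
            eligible += 1
            if first + k in months:
                active += 1
        if eligible:
            curve[k] = active / eligible
    return curve


# Hypothetical users, not survey data.
users = {"a": {0, 1, 2}, "b": {0, 2}, "c": {1, 2, 3}, "d": {11}}
curve = retention_curve(users, window_start=0, window_end=11,
                        last_observed=24, max_offset=2)
print(curve)
```

Sliding `window_start` and `window_end` forward one month at a time would give the rolling series. Excluding users who are too new to have reached a given offset matters here, because otherwise the late-2017 signups would drag down the later months artificially.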

The retention numbers from the survey are remarkably similar to those found by Second Measure up until month 8, both showing a 6 month retention of ~28-29%.

Top chart shows the survey-derived retention curve. Bottom chart shows retention derived from Second Measure data. Note the different time spans.

Starting at month 8, the survey data starts to diverge from the Second Measure data. This could be attributable to small survey sample sizes in the later months. It could also be because Second Measure only uses data from 2017, if Stitch Fix made improvements to long-term retention rates between 2016 and 2017.

Note that both data sets show a “hump” at month 5/6. Stitch Fix may have some sort of “Please come back” marketing campaign around then. 

Keep rates

Survey Question - Below are the months in 2016/2017/2018 that you indicated you received a Stitch Fix shipment. For each of these months, please tell us how many items you kept from that shipment. Please approximate if you can't recall exactly.  For example, if you indicated that you received a shipment in January 2016, but sent all 5 items back, enter a 0 under January 2016.

In addition to retention, keep rates are also key to understanding whether Stitch Fix algorithms are working.

Keep rate refers to the number of items a customer keeps from a Stitch Fix shipment, which contains 5 items. The higher the keep rate, the more money Stitch Fix makes. Keep rate is also a proxy for customer satisfaction and a driver of retention, as users who consistently get stuff they like are more likely to return. Note: the keep rate is presented as the keep number (out of 5) rather than as a percentage.

The Stitch Fix story relies heavily on their use of Big Data and proprietary algorithms to learn and predict what customers will like, thereby improving their keep rates over time.

There are a few ways we can look at keep rate to see how the algorithm is performing:

  1. Average overall keep rate by month/quarter.
    If their algorithms are working well, we should see the overall average keep rate increasing over time.

  2. Average keep rate by shipment number.
    Since their algorithm needs to train on an individual, we would expect to see the average keep rate for an individual improve over time. For example, the average keep rate on a 5th shipment should be higher than on a 1st shipment, as a user provides feedback and the engine learns.

  3. Average first shipment keep rate by month/quarter.
    Not only should the engine get better at predicting what a returning user will keep over time, it should get better at predicting what a first-time user will keep, based on the questionnaire a new user completes.
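
The three views above can be sketched in a few lines. This is a minimal illustration assuming each survey answer is reduced to a `(user_id, month_index, items_kept)` record with at most one shipment per user per month. The record layout, the function name `keep_rate_views`, and the sample records are hypothetical.

```python
from collections import defaultdict


def keep_rate_views(records):
    """records: iterable of (user_id, month_index, items_kept), kept in 0..5.

    Returns three dicts of average items kept:
      1. by calendar month index,
      2. by shipment number (1st, 2nd, ... shipment in the observed window),
      3. by calendar month index, first shipments only.
    """
    by_user = defaultdict(list)
    for user, month, kept in records:
        by_user[user].append((month, kept))

    by_month = defaultdict(list)
    by_shipment_no = defaultdict(list)
    first_by_month = defaultdict(list)
    for shipments in by_user.values():
        # Sort by month so shipment numbers follow chronological order.
        for n, (month, kept) in enumerate(sorted(shipments), start=1):
            by_month[month].append(kept)
            by_shipment_no[n].append(kept)
            if n == 1:
                first_by_month[month].append(kept)

    def summarize(groups):
        return {key: sum(v) / len(v) for key, v in sorted(groups.items())}

    return (summarize(by_month), summarize(by_shipment_no),
            summarize(first_by_month))


# Hypothetical records, not survey data.
records = [("a", 0, 1), ("a", 1, 2), ("a", 2, 3), ("b", 1, 0), ("b", 2, 2)]
overall, by_number, first_only = keep_rate_views(records)
print(overall, by_number, first_only)
```

One caveat with this approach: shipment numbers are counted from the start of the observed window, so a user who joined before January 2016 would have their first observed shipment mislabeled as shipment 1.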

Average overall keep rate by quarter: Trending slightly down

If their algorithms are working well, we should see overall average keep rate increasing over time.

The data suggests the overall keep rate is rather volatile and trending slightly down, but not dramatically and it could just be noise.

To the extent it is trending down, it could be that improvements in keep rate driven by the algorithms are being offset by lower quality customers. One often sees this with subscription businesses – early customers are better customers. But Stitch Fix’s penetration is still pretty low.

Another possibility is that improving keep rates for existing customers are being offset by lower keep rates for new customers – which suggests a shifting balance between new customers and older ones (and possibly higher churn than we’d like?). New customers have lower keep rates because the algorithms haven’t trained on them yet.

It’s important to note that Stitch Fix tells a very different story about keep rate improvement over time.

In a November 2017 IR presentation, Stitch Fix showed the keep rate steadily marching upwards. 

Source: Nov 2017 Stitch Fix IR presentation

Because Stitch Fix reports fiscal years (FY ends in July) and this study’s survey data only goes back to January 2016, the two can’t be compared cleanly. Looking at CY16 vs. CY17, though, the survey data suggests the average keep rate fell by ~15%. New customer adds in Q1FY18 (Aug-Oct 2017) could be dragging down the CY17 keep rate, but even comparing Jan 2016-Jul 2016 with Aug 2016-Jul 2017, the survey data shows a 7.3% drop in keep rate. The difference could be due to the different time windows, or it could be that the survey data is off, or both.

Survey data doesn't compare cleanly with Stitch Fix data due to time windows used

Note that there could be some survivorship bias in Stitch Fix’s graph. The new adds as a % of total active customers shrinks every year. So if new adds have a lower keep rate (at least initially), then there's less drag on the average keep rate every year, meaning it goes up. The optimistic interpretation is that the older active customers in a given year have a higher keep rate because the algorithm has trained and improved on them. The cynical interpretation is that there's survivorship bias -- the loyal customers that stick around have always had a higher keep rate, while low keep rate folks drop out. The truth has very different implications for the future growth of keep rate, and consequently for Stitch Fix’s ability to grow LTV.
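
The mix effect described above is just a weighted average, which a small sketch makes explicit. The keep rates (1.5 and 2.5 items out of 5) and the new-customer shares below are invented for illustration. They are not Stitch Fix or survey figures, and the function name is hypothetical.

```python
def blended_keep_rate(new_share, new_keep, existing_keep):
    """Average keep rate when new_share of active customers are new adds."""
    return new_share * new_keep + (1 - new_share) * existing_keep


# Assumed values: neither group's keep rate ever changes.
NEW_KEEP = 1.5
EXISTING_KEEP = 2.5
new_shares_by_year = [0.6, 0.4, 0.2]  # new adds shrink as a share of actives

blended = [blended_keep_rate(s, NEW_KEEP, EXISTING_KEEP)
           for s in new_shares_by_year]
print(blended)
```

In this toy setup the blended keep rate climbs every year even though no individual group improves, which is why an upward-sloping company-wide average cannot, by itself, distinguish algorithmic learning from a changing customer mix.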

Average keep rate by shipment number: The algorithm appears to get better at predicting what a user will keep the longer they stick around.

Since their algorithm needs to train on an individual, we would expect to see the average keep rate for an individual improve over time.   

Although there were survey takers who had received as many as 24 shipments, the number of people who received more than 10 shipments is low. 

So zooming in on people who’ve received 10 or fewer shipments, the data seems to show that the algorithm + stylist are able to improve the keep rate slightly with each subsequent shipment and feedback loop, as they learn more about a person.

Caution: as with the overall keep rate, there may be survivorship bias here. Long-time users may have always had a high keep rate. As shipments progress, lower keep-rate people drop out, leaving only the loyal high keep rate folks, and the average goes higher, creating an upward trend.
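
To show how much of an upward trend survivorship alone could produce, here is a toy calculation with two customer segments whose keep rates never change. All numbers (segment shares, keep rates, and per-shipment continuation probabilities) are assumptions chosen for illustration, and the function name is hypothetical.

```python
def average_keep_by_shipment(segments, max_shipments):
    """Expected average keep rate at each shipment number with no learning.

    segments: list of (initial_share, keep_rate, continue_prob), where
    continue_prob is the chance a user goes on to take the next shipment.
    """
    averages = []
    for n in range(1, max_shipments + 1):
        # Expected share of each segment that reaches shipment n.
        surviving = [share * cont ** (n - 1) for share, _, cont in segments]
        kept = sum(s * keep for s, (_, keep, _) in zip(surviving, segments))
        averages.append(kept / sum(surviving))
    return averages


# Assumed segments: loyal users keep 3 of 5 and rarely leave,
# casual users keep 1 of 5 and leave half the time after each shipment.
segments = [(0.5, 3.0, 0.9), (0.5, 1.0, 0.5)]
trend = average_keep_by_shipment(segments, max_shipments=5)
print([round(x, 2) for x in trend])
```

The average rises with each shipment purely because the low keep-rate segment drops out faster. Separating this from genuine learning would require tracking the same individuals across shipments rather than comparing averages.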

Average keep rate for 1st time shipments by month/quarter: There are lots of misses in 1st shipments, but the keep rate has recovered from lows.

The algorithm should also be getting better at predicting what a first-time user will keep, based on the questionnaire given to new users.

The data collected on this one is somewhat thin, making it difficult to reach strong conclusions. It appears the keep rate for new users stumbled due to entering new markets, but is steadily recovering to its highs.

Qualitatively, this metric seems harder to improve because all the engine has to work with is what a user says they like in the new-user questionnaire, not what they actually bought, and as Stitch Fix has noted, the two can vary widely.

Note the three dips in the chart: Q1FY17 is when Stitch Fix for Men launched. Q3FY17 is when Plus Size launched. Q1FY18 saw the launch of premium brands. It’s likely all these launches had steep learning curves, which would explain the drag on new-user keep rate.

But importantly, each time, the keep rate bounced back. With the algorithm improving for these new customer segments, it’s likely the steady-state 1st shipment keep rate will trend higher than it is today.

Average keep rate by month: Seasonality apparent in end of year holiday keep rates

Stacking 2016 and 2017 keep rates we see what appears to be some seasonality to the rate. Keep rates are higher in Nov-Jan.

Despite management’s claims that Stitch Fix doesn’t experience a holiday effect like other retailers, the data shows one. There are likely several reasons:

  1. People want to wear something new for holiday parties and family reunions,

  2. People gift a subscription and if you’re the recipient and not paying, you’re more likely to keep something,

  3. People treat themselves to a Fix for Christmas,

  4. People like a “new/fresh start” to the new year. Culturally, new starts are often marked with new clothes.

This may have implications for the next earnings announcement. 

Limitations

Surveys like this rely on self-reported behavior, not actual behavior. The two can differ for many reasons, including poor recall, pride, embarrassment, speeding through the survey, and others.

Additionally, participants for this survey were recruited from Amazon Mechanical Turk, and while Turkers are in many ways representative of the American public (see here and here), they do skew in some important ways like age, income, and internet proficiency (see here). As this was just a convenience sample, the folks that took this survey are not necessarily representative of Stitch Fix’s user population.

This survey gathers shipment and keep rate data back to Jan 2016, but Stitch Fix launched in 2011. Pre-2016 data would provide important context, but it’s simply too long ago for most people to accurately remember. Even early 2016 data probably contains more respondent “guesstimates” than recent months.

Lastly, it would’ve been very telling to segment responses between women, men, plus-sizes, and premium brand buyers to get a better picture of what’s happening with the core, but this segmentation data wasn’t collected.

Credits and further reading

This post was inspired and informed by the following posts:

https://www.linkedin.com/feed/update/urn:li:activity:6327139916694323200 and pretty much everything that Dan McCarthy writes

https://www.theinformation.com/articles/what-could-hamper-stitch-fixs-growth

https://techcrunch.com/2017/10/22/unboxing-stitchfixs-s-1/

http://www.goodwatercap.com/thesis/understanding-stitch-fix

http://blog.secondmeasure.com/2017/10/26/why-stitch-fix-isnt-like-blue-apron/
