Monday, June 14, 2021

Review Drift

Guest post by C. Thi Nguyen

Here are three stories about one thing. The first story is about social media and donuts.

Before COVID destroyed travel, I kept having this same experience. I’d be in some new city. I’d do a little online research and hear about some new donut shop that everybody was raving about. I’d go, wait in the enormous line, see all the stickers about winning awards, and admire the gorgeous donuts in joyous anticipation. And then I’d eat the donut — and it would turn out to be some horrible waxy cardboard thing. Each bite was, like, some kind of pasty mouth-death. And then I’d sit there on the curb, with my sad half-eaten donut, watching the line of people out the door, all chattering about their excitement to finally get to be able to get one of these very famous donuts, everybody carefully taking donut pics the whole while.

And weirdly, totally different cities would give me the same kind of bad donut. These donuts all had a similar kind of visual flair: they were vividly colored; and they were big, impressively structural affairs — like little sculptures in the medium of donut. But they all had that weird, tasteless, over-waxy chaw. My theory: these donuts were being optimized, not for deliciousness, but for Instagram pop. And that optimization can involve certain trade-offs. You need a dough that’s optimized for structural stability, and a frosting that’s optimized for intense color.

Right now, Instagram is where food goes viral. And I’m not saying that the visual quality is unimportant. Appearance is part of the aesthetics of food. But what makes food unique, in the aesthetic realm, is the eating part: the taste, the smell, and the texture. And I’m not saying it’s impossible to make a beautiful, yet delicious donut. I’ve had some, rarely. But Instagram seems to be enabling the rise of donuts made primarily for the eye. When Instagram becomes a primary medium for recommending food, you get this weird kind of aesthetic capture. Instagram rewards those restaurateurs who are willing to trade away taste and texture in exchange for more visual pop.

The second story is about clothes. I’ve been buying my clothes online these last few years, and I keep having the same experience. I buy something from a relatively new company with a lot of Internet presence and very good reviews. The clothes arrive. They look awesome; on the first wear, they’re incredibly comfortable. Then, pretty quickly, they start falling apart. They stretch out of shape, they start pilling, they fall apart at the seams.

Some of it is surely the current economics of fast fashion. And some of it is that a lot of these companies are spending more on advertising than on clothing quality. But that doesn’t explain the barrage of good reviews on uncurated sites. What I’m starting to suspect is that, in the online shopping world, a lot of companies are starting to specifically target the moment of review.

The vast majority of online reviews are submitted close to the moment of purchase, after only a few wears. So online reviews mostly capture short-term, and not long-term, data. So new companies are heavily incentivized to optimize for short-term satisfaction. A stiff piece of clothing that slowly breaks in to comfortable and lasts forever won’t get great reviews. A piece of clothing that has been acid-washed to the peak of softness will review quite well — and then fall apart a few months later. (You can find a similar effect on Twitter. Twitter Likes are usually recorded at the moment of first reading — so simple ideas we already agree with are more likely to get Likes, but long-burn difficult ideas that change our minds, eventually, get lost.)

Call this phenomenon review drift. Review drift happens whenever the context of review differs from the context of use. In the current online shopping environment, good online reviews drive sales. So companies are incentivized to make products for the context of review. If the context of review is typically short-term, then companies are incentivized to optimize for short-term satisfaction, even at the cost of long-term quality. (A related phenomenon is purchase drift: when the context of purchase differs from the context of use.)
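Here is a toy way to see the mechanism, sketched in Python. Everything in it is an assumption made up for illustration: the two garments, the 1-to-5 satisfaction scale, the straight-line change in satisfaction per month, and the idea that a review captures month zero while use is spread over a year. It is not data about any real product, just the argument above with numbers attached.

```python
# Hypothetical illustration of review drift; every number here is invented.

def satisfaction(initial, monthly_change, month):
    """Satisfaction on a 1-5 scale after `month` months of use, clamped to [1, 5]."""
    return max(1.0, min(5.0, initial + monthly_change * month))

def average(values):
    return sum(values) / len(values)

# Two made-up garments: (initial satisfaction, change per month of use).
PRODUCTS = {
    "acid-washed softie": (4.8, -0.5),  # great on day one, degrades fast
    "stiff break-in": (3.0, 0.25),      # unimpressive on day one, improves with wear
}

REVIEW_MONTH = 0           # assumption: reviews are written right after purchase
USE_MONTHS = range(0, 13)  # assumption: the garment is worn for a year (months 0-12)

def review_score(name):
    """Satisfaction in the context of review."""
    initial, change = PRODUCTS[name]
    return satisfaction(initial, change, REVIEW_MONTH)

def use_score(name):
    """Average satisfaction across the context of use."""
    initial, change = PRODUCTS[name]
    return average([satisfaction(initial, change, m) for m in USE_MONTHS])

def drift(name):
    """Positive drift means the review flatters the product relative to real use."""
    return review_score(name) - use_score(name)

for name in PRODUCTS:
    print(f"{name}: review={review_score(name):.2f} "
          f"use={use_score(name):.2f} drift={drift(name):+.2f}")
```

Under these made-up numbers, the garment that reviews better is the one that serves you worse over the year. A company that sees only review scores has every reason to build the first one.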

A third story: Seventeen years ago, I was backpacking and camping almost every weekend. In quick succession, I had three horrifying moments with some cheap folding knives. One of those left me cut to the bone. So I had an “As God is my witness, I’ll never use crappy knives again!” moment. I decided to ask some park rangers for recommendations. The next three park rangers I met all turned out to be carrying variations on the same pocket-knife, from the same company. And I read some reviews online praising these same knives to the stars, as lifelong companions. So I bought one.

Here is a picture of my own personal Spyderco Delica 4, which has served me incredibly well for 17 years.

It is basically indestructible. I dropped it off a 100-foot cliff once and it was fine. It also has a thousand subtle design features that took me years to really appreciate. One of the interesting things about Spyderco knives: they look fucking weird. I think we have a particular Platonic image of a knife — military, stabby, tough — and Spydercos don’t look like that. (A common complaint among bro-type dudes who want to look all tactical tough: “Spydercos look like wounded pelicans.”) But all those weird organic design swoops are amazing in the hand. Spyderco’s ergonomic design genius is well-known in the online knife appreciation community. The classic Spyderco designs just meld into your hand; they become fully intuitive, natural extensions of you. But it took me years to fully appreciate it. When you first see and hold one of these knives, especially the lightest and grippiest plastic-handled ones, they just feel cheap and weird.

A couple months ago, somebody stole my other favorite pocket-knife out of my car. It was the pandemic, and my brain was starved for sensation, so I had no other choice but to go looking at updated knife reviews. And what I found was that, in between my last knife-buying venture, 17 years ago, and the current one, a vast sprawling network of knife reviewers had arisen, mostly clustered around certain YouTube channels. There is now an entire online community that has sprung up, dedicated to constantly reviewing and collecting knives. And this community has developed an obsession with a feature called “fidget-quality”. This is how fun it is just to sit and open and close the knife, over and over again.

A folding knife has a quality called “action”. The way that it opens and closes — the speed, the feel of the flick, the satisfying hefty click of the locking mechanism — can all be aestheticized. There are even love-odes to which knives sound good — which ring like some kind of hyper-masculinized bell when they snap open and closed. And I’ll give it to you: good action is sweet. I’m totally up for aestheticizing anything and everything. But — and some of the Internet Knife Community[1] have started to notice this — some of these very expensive, wonderfully fidgety knives don’t actually cut that well. And some of them have handles made of really clean, pretty metals — which Instagram nicely, but which also turn out to be really, really slippery.

Here is a theory: knife sales right now are driven by the Internet Knife Community. The Internet Knife Community is driven by Instagram, but most heavily by YouTube knife reviewers — like the knife-review superstar Nick Shabazz. Nick is a great, fun, lively reviewer. But, to get popular, a reviewer has to put out a lot of regular content — like multiple knife reviews a week. But somebody who is sitting in their room, making multiple knife-review videos a week, isn’t out in the woods for years with the same knife. So what they’re doing, to review the knife, is cutting the few cardboard boxes they might have around, and then fidgeting with it — and paying lots of attention to the fidget-quality. The context of review exaggerates the importance of fidget-quality, compared to the importance of, you know, cutting stuff.

A similar thing seems to be happening in the boardgame community. Boardgames are, one might hope, made for hundreds and thousands of plays. One of the reasons boardgames are such a good value proposition is that you can slowly discover the depths of the game over years of repeat play. But the community is now getting driven by popular reviewers, often on YouTube, and getting popular requires putting out frequent and regular content — multiple reviews a week. Which means the most dominant voices, which drive the market, are playing each game a couple of times and then reviewing. And that drives the market in a particular direction. It drives it away from deep rich games that take a few plays to wrap your mind around. The current landscape of popular reviewers seems to be driving the market towards games which are immediately comprehensible, fun for a handful of plays, and then collapse into boring sameness.

So: the structure of the online environment right now seems to demand that superstar reviewers put up frequent updates. Which means reviewing lots of products in rapid succession. But if you’re reviewing the kind of thing that is subtle, that takes a long time to really get to know, then the context of review has drifted really far from the context of use. So we’re evolving this perverse ecosystem centered around influential reviewers — but, where, to become influential, their review-context must be really far from the standard use-context.

Review drift isn’t new. Every age has its own mediums for review and every review medium has its strengths and weaknesses. An earlier era was dominated by written reviews, which have their own limitations. (A lot of the time, I suspect that much art that’s been critically revered in the past has gotten that status, in part, because it’s the kind of thing that’s easy to write about. Like, clever symbolic intellectual stuff is easier for academics and clever art critics to write about than subtle, spare, moody stuff.)

The new wrinkle, I think, is the degree to which many modern contexts concentrate review drift and homogenize it. This is starting to become apparent in all kinds of technological circumstances. A lot of modern technologies create concentrated gateways, which channel the majority of the public’s attention through a single portal. So much of our collective attention is set by how, exactly, Google’s search engine algorithms work, and how they rank the results. So much of our collective purchasing is set by how exactly Amazon’s algorithm works. And one thing we know is: the more a single system becomes dominant, and the more legible its internal mechanics are, the easier it is for interested parties to game that system and to hyper-optimize. There are whole industries that exist around optimizing your Google search ranking and your Amazon product ranking.

So: there’s always going to be review drift; reviews can never be perfect. But if, at least, review drift happens for different reasons, and in a plurality of directions, then it’ll be a hard target for a big company to optimize for. But if there is some kind of systematic, structural feature that encourages the same kind of review drift across a whole reviewing community, then we create a clearer system for companies to target. And this can happen when a whole body of reviews gets filtered through a particular portal — like Instagram or YouTube — which homogenizes the patterns by which reviewers get famous, or strongly filters the kinds of reviews that get recorded. The more uniform the review drift, the more legible the target for the optimizers.

This is part of a larger pattern we’re starting to see more and more. We can call it the phenomenon of squashed evaluations. When an entire rich form of activity gets evaluated through one tiny window, then the importance of whatever’s in-frame gets over-exaggerated — and whatever’s outside of that frame gets swamped. So the same general kind of pressure that’s giving us high schools laser-focused on standardized tests, pre-meds obsessed with their GPAs, and journalists obsessed with click counts, is also giving us beautiful tasteless donuts and sexy flickable knives that aren’t good at cutting.


[1] This is their actual name for themselves.


Simon Whyatt said...

A couple of thoughts here.

1. The internet is full of fake reviews. I think this could definitely be the case for the donuts and the clothes.

Reviews on a company's own website are almost definitely fake, or cherry picked. Positive reviews and likes etc on other platforms can easily be purchased.

2. For the above reason I tend to consult 3rd party reviewers. Again however, one has to bear in mind there can be perverse incentives to give false positive reviews. (I.e. either direct payment or the desire to keep being sent free stuff to review).

3. There are good reviewers though, that give honest feedback and also do follow up reviews. This is the solution to the longevity problem.

4. Maybe marketplaces like Amazon could improve their reviews by sending out long term review requests 1 year+ after purchase?

5. Interesting and semi-related: I saw that there'd been a large drop in average ratings of many food businesses during COVID. The hypothesis being that it was driven by the loss of smell and taste caused by the virus.

Anonymous said...

Some thoughts from reading.

Even though the article focuses on reviewing, it's really about short-term impressions - which can only be marketable if consumers focus on them and hence play the game as few times as reviewers do. Reviewers aren't enough - remember the A Few Acres of Snow controversy 10 years ago? Reviewers couldn't detect the Halifax Hammer that made the game "br0k3n", but players did. What happened in these 10 years is that modern gamers wouldn't be able to detect any of that - players' mentality has changed.

Unlike with knives, where there still is an idea of "knives being able to cut things", with boardgames modern gamers do not understand what depth is. They're not aware of facets of gameplay that could emerge only with repeated play - be it depth or emergent narrative or anything that comes alive from the dialogue between the game and the group playing it, when both are comfortable with each other. We get modern gamers/hobbyists who can't even detect these attributes, let alone engage with them. And speaking to them about it seems like talking about myth, about things that aren't there.

It's like the modern internet knife community responding to your claim that knives should cut well with "what is cutting?" and "we just want to look cool with these" and "I buy knives to fidget".

Chris Schreiber said...

Thank you for the article. I came from boardgaming and stayed for the extended topic. What I appreciate most is chunking your ideas into a pleasant narrative that I could visualize and then making a constructive effort to introduce a workable vocabulary around these ideas.

Therefore, I have bookmarked the article, so that I may come back to it. In this way, the article made a good immediate impression on me but it will also provide a long-term reference when I want to refresh my ability to speak on this subject. Eventually, perhaps I will carry these ideas around in my back pocket.

If this article were a knife, it would have the immediate fidget appeal (loved the links!) but also a reliable edge to cut my thinking on the subject. I am more keen because of it. Cheers.

Arnold said...

A review of life on a small planet...

A while back the world's "basics supply processing" was challenged...
...Have values changed for awhile...

Should we all eat potluck together-whenever we can ... sort out anew our world's life sustaining processes...

We are here but we are not alone...

Unknown said...

If you're ever in Connecticut, Neil's Donuts in Wallingford is great. Or at least it was. I haven't been there since before Instagram existed.

chinaphil said...

This is the status quo argument against libertarians on healthcare.
I love libertarians - the idea is so optimistic. If we can really stop worrying about regulating things and just let review culture sort the sheep from the goats, that's fantastic. But this thing you call "review drift" seems to me likely to forever stymie this process for drugs/other disease treatments.
Firstly, there's a problematic bias built into all review mechanisms for drugs for fatal diseases: you'll only ever get (first hand) reviews from those who don't die. All the people who use crystals to treat their cancer and subsequently snuff it will not be around to tell the tale; all the tales that are told will be those told by the survivors. So in the reviews, every drug will have a 100% success rate.
More generally, people tend to like drugs that make them feel good - like alcohol and opiates - so it seems likely that reviews would be biased towards those drugs rather than towards the drugs that are good at curing their ailments. And I've never seen a clear explanation of how smart review statistics will fix this problem.