The Truth About WHOOP and Other Fitness Trackers: Part I
I spoke with WHOOP’s Principal Scientist and a critic of some wearable devices to help you understand the new world of wearables.
Housekeeping:
This post and its audio version, like all Monday posts, are free to all.
Wednesday will feature Part II of this post. Friday is The Expedition, the internet’s best roundup of information important for life and living it.
If you’d like full access to those posts, become a Member below.
This post is long. I’ve summarized it in four sentences below. Or you can listen to the audio version above (at 1.5 speed!).
This post summarized in four sentences:
Newer wearable fitness tracker companies like WHOOP are using made-up metrics to score your workouts and daily physical effort.
These grand metrics are all some degree of wrong because wrist-based heart rate tracking during movement is some degree of wrong.
There are cheaper, more accurate ways to measure workouts.
It’s up to you to determine whether these made-up metrics are useful to you.
Now onto the post …
A few months ago, I posted a thread about activity trackers on Twitter and Instagram. It included the following Tweet, which was part of a larger thread.
I heard from representatives from WHOOP shortly after. For the unfamiliar:
WHOOP is an activity tracker.
WHOOP doesn’t count steps. Instead, it uses the metric of Strain, which WHOOP believes better captures the totality of your movement compared to other activity metrics.
The company was recently valued at $3.6 billion and has deals with major sports leagues.
The WHOOP reps wanted to know if I’d be up for chatting with WHOOP’s Principal Scientist, Kristen Holmes.
Kristen recently got a Ph.D. in psychology and has worked with WHOOP for six years. Before that, she worked on her own health tech ventures and coached one of the most successful NCAA field hockey teams (12 Ivy League titles and one national championship!).
Of course I’d love to chat with Kristen.
I don’t think wearables are perfect. But I also think they can be useful depending on how we use them.
My goal is always to get as much information as possible and find what can help us live better.
So here’s what I did:
I spoke with Kristen from WHOOP for roughly an hour. WHOOP was also kind enough to send me one of their bands. I wore the device for 12 weeks to better understand it (more on that experience on Wednesday).
I read a lot of research on fitness trackers (WHOOP and others) and contacted some of the scientists on those papers.
I exchanged a series of emails with Marco Altini.
Marco is another of the world’s foremost experts on wearable technology.
He has a mix of degrees and experiences that make him an ideal mind for understanding wearable tech: a PhD in applied machine learning, a master’s in computer science engineering, and a master’s in human movement science and high-performance coaching.
He’s published over 50 papers and patents on the intersection between physiology, health, technology, and human performance. He’s also a high-level endurance athlete.
Today is Part I of a multi-part series on the new world of wearables.
This series will mainly focus on WHOOP. But that’s just because I spoke to the company and have questioned their Strain metric and other metrics.
Most of this post’s takeaways apply to other brands of wearables. For example, Oura Ring, Garmin, and Fitbit all have features similar to Strain and Recovery.
Today’s post will focus on:
Measuring fitness and activity with Strain and other scores.
The true science of Strain and other made-up metrics.
Strain Vs. Calories.
The most accurate way to track workouts.
Wednesday, we’ll hit:
Steps vs. Strain—upsides and downsides to each.
What I liked about WHOOP.
What I didn’t like about WHOOP.
Downsides of made-up scores
How I think we should use trackers like WHOOP.
Next week, we’ll look at recovery scores.
Let’s roll …
Measuring fitness with Strain
In short
Any way you measure it, the fitness score on your tracker will be a rough estimate of your exercise load and some degree of wrong.
The details
Because of my Tweet, WHOOP primarily wanted to talk about why they don’t count steps and instead use “Strain.”
Kristen pointed out, “not all steps are created equal. It’s really hard to understand your overall health when you don’t understand how hard your heart is working.”
For example, 100 steps on a leisurely stroll differ from 100 “steps” while sprinting. They impact your heart and entire body differently.
She is, of course, correct. (And we’ll run a more direct comparison of steps vs. Strain on Wednesday). Today we’re focusing on Strain and similar metrics.
How Strain works
WHOOP believes ignoring steps and only using Strain—which is measured through your heart rate typically taken through your wrist—allows the device to better capture how everything that happens in your life impacts you.
The idea is to create one “grand metric” that captures how everything you do in a day impacts your physiology.
For example, they believe Strain can more accurately reflect the totality of your walk around Walmart, your workout, and the heart rate spike you had in the office when your boss asked you to speak in a big meeting, etc.
Strain works off your heart rate, so it’s most helpful in capturing cardio exercise. But WHOOP has also developed a setting for weight training that tries to convert muscular load into Strain.
Each day, WHOOP users get a numeric Strain goal. The company says it’s the amount of “optimal Strain” you should try to hit that day through living daily life and exercise. It’s scored from 0 to 21.
You accumulate day Strain through your workouts and activities in everyday life.
Go under your target day Strain number, and you didn’t work hard enough.
Go over it, and you may push your body too hard.
Your daily target Strain is based on your daily “recovery score.” That figure is based on measurements of your sleep, heart rate variability (HRV), resting heart rate, respiratory rate, etc.
So, for example, if your daily recovery score was 50%, you’d have a lower Strain target that day compared to if your recovery score was 80%.
The true science of Strain
In short
Strain is not a physiological measurement. It’s a made-up score from WHOOP. Similar scores from other trackers are also made up.
The details
Strain is obviously not a real physiological phenomenon. It’s not like heart rate, HRV, or blood pressure, which can be directly measured and are standard in medicine.
Instead, Strain is a made-up score from WHOOP. WHOOP calculates Strain using your heart rate. “It’s purely heart rate,” Kristen said.
The upside of Strain is that the company is trying to break down something complex—how movement and life impact your heart rate and physiology—and make it digestible. A single number.
The science of measuring heart rate
But, out of the box, WHOOP measures your heart rate through your wrist, like most trackers do.
Whether wrist-based trackers get heart rate right depends on what a person is doing while it’s being tracked. In general:
Heart rate tracking is accurate when we’re doing nothing—i.e., sitting still.
But it’s not as accurate if we’re doing something—i.e., moving and exercising.
For example, one study in Australia found that WHOOP tracked heart rate just as well as a medical-grade machine. But that study was small (6 people).
That study also had to toss out 17% of the trials due to equipment malfunctions or error. Which seems important to note.
Another study Kristen pointed me to tracked the heart rates of 53 people. It found that WHOOP was basically spot on. Compared to a medical-grade device, WHOOP only underestimated heart rate by an average of 0.3 beats per minute.
WHOOP did better than five other devices tracking heart rate during sleep. “As a result, the Australian Institute of Sport picked WHOOP for their next two Olympics,” Kristen said.
Worth noting: Both studies came from labs that receive funding from WHOOP. I don’t think corporate funding automatically negates the results of a study—in fact, we wouldn’t know many important things if we threw out all studies funded by industry. But transparency is transparency.
And there’s one other, bigger catch: Those studies measured heart rate when the participants were resting or sleeping. Not when they were moving around and exercising.
This is important. Marco explained:
The wrist is one of the most challenging locations for optical sensors to provide accurate data. There is no direct access to arteries (which run at the bottom of the wrist, while sensors are typically designed to be worn on the opposite side, trying to capture changes in blood volume via capillaries), and there are artifacts due to movement, poor contact, etc.
At rest, (measuring heart rate through the wrist) is typically no problem, hence the data during sleep can be of high quality. However, this is typically not the case as soon as there is some movement. For certain applications, such as HRV analysis, not only movement but simply contracting your muscles as you type on a keyboard can cause issues that make the data extremely unreliable.
In short: Tracking heart rate through the wrist during movement isn’t as accurate.
For example, sports scientists in Denmark took 29 people and had them complete a workout while wearing the WHOOP 3.0 band and an Apple Watch 6. For comparison, the participants also wore a Polar H10 chest strap, which scientists consider the gold standard for measuring heart rate during exercise.
The participants did a weight training workout composed of five different exercises. Each exercise consisted of three sets of 15 repetitions.
In short, the scientists found that both the Apple Watch and WHOOP were off. They wrote:
Apple Watch provided the most accurate measures of heart rate relative to the (gold standard), as (error) values were between 1.6% and 14.0% for all measurements. The Whoop 3.0 was the least accurate, with (error) values ranging between 4.4% and 14.8%.
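If you're wondering what an "error" percentage like that means in practice, here's a rough sketch. I'm assuming the figure is a mean absolute percentage error, which is a common way to compare a device against a reference. The function name and the heart rate readings below are made up for illustration. They are not data from the study.

```python
def mean_absolute_percentage_error(reference_bpm, device_bpm):
    """Average of |device - reference| / reference, expressed as a percentage."""
    if len(reference_bpm) != len(device_bpm) or not reference_bpm:
        raise ValueError("need two equal-length, non-empty series")
    errors = [abs(d - r) / r for r, d in zip(reference_bpm, device_bpm)]
    return 100 * sum(errors) / len(errors)


# Hypothetical readings in beats per minute, taken at the same moments.
# These are illustrative numbers, NOT measurements from any study.
chest_strap = [100, 120, 150, 200]  # stand-in for the reference device
wrist_band = [95, 126, 135, 192]    # stand-in for a wrist tracker

mape = mean_absolute_percentage_error(chest_strap, wrist_band)
print(f"Wrist band error vs. chest strap: {mape:.1f}%")
```

Notice that the wrist band reads too low at some moments and too high at others. A percentage error like this tells you how far off the device is on average, not in which direction.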
Another study compared three wrist-based trackers and found that only one was accurate.
There’s a lot of research on this topic. Most of it comes to the same conclusion: Measuring heart rate at the wrist during exercise typically results in errors.
A review also suggests that wrist-based tracking may be more likely to be incorrect for darker skin tones.
This means tracking is likely to be even less accurate for people with darker skin, or with tattoos or birthmarks on their wrists.
This is all to say, Strain will be some degree of wrong because the heart rate data is some degree of wrong.
This is likely why another study found little correlation between Strain and markers of training stress in swimmers.
It’s also helpful to see what real people are saying. If you search some variation of “Reddit heart rate (insert heart rate device like WHOOP, Garmin, etc)” you’ll find plenty of people reporting incorrect heart rate tracking scores from basically every device.
Like, “I was taking a leisurely walk on the beach and it thought my heart rate was 150 when it was actually only 70.” That whacked-out oddity would render the entire day’s data useless. (This has happened to me with almost every wrist-based heart rate tracker I’ve used).
It calls into question overall utility. If a measurement is consistently wrong, you can still use it and improve. But if a measurement is inconsistently wrong, you can’t use it.
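Here's a small sketch of why that distinction matters. All the numbers are hypothetical: I'm imagining the same run repeated over five weeks, with a "true" average heart rate that drops as fitness improves. One pretend device always reads 10 beats high. The other is off by a different amount each session.

```python
# Hypothetical "true" average heart rates (bpm) for the same run over five
# weeks. Illustrative numbers only, not real measurements.
true_bpm = [160, 157, 154, 151, 148]

# Consistently wrong: the device always reads 10 bpm too high.
consistent = [bpm + 10 for bpm in true_bpm]

# Inconsistently wrong: the error changes from session to session.
session_errors = [-12, 9, 15, -6, 14]
inconsistent = [bpm + err for bpm, err in zip(true_bpm, session_errors)]


def week_to_week_changes(series):
    """Difference between each session and the one before it."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


print("true:        ", week_to_week_changes(true_bpm))
print("consistent:  ", week_to_week_changes(consistent))
print("inconsistent:", week_to_week_changes(inconsistent))
```

The consistently wrong device shows the same week-to-week changes as the true values, so the trend survives even though every reading is off. The inconsistently wrong device buries the trend in noise.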
And even though I like WHOOP’s idea of adding life stress into Strain, we need to remember that not all changes in heart rate at “rest” are from bad stress.
For example, Marco pointed out:
A higher heart rate at rest because your boss is yelling at you: probably bad!
A higher heart rate at rest because you’re laughing your ass off with friends: probably good!
In short: It’s challenging to bundle how life and movement impact our physiology into a single number (a problem every tracker faces).
Marco explained it like this:
When you estimate something, you are making it up: you do not have a sensor on the device that allows you to actually measure the parameter of interest. You exploit the relationship between a somewhat associated variable and the parameter of interest.
In some cases, these estimates are completely made up and cannot even be tested, for example: sleep quality scores, readiness and recovery scores, stress, body battery, Strain, etc. - stay away from all of these and don't let them mess with your head. They are not a thing.
Assumptions are made about how you should feel regardless of how you actually feel … wearable devices can be useful, but be critical. They typically measure very few things (heart rate, pulse rate variability, maybe temperature), and estimate lots.
I also asked the researcher who ran the WHOOP and Apple Watch exercise study about Strain. He wrote:
The (Strain) algorithm is not available to us as researchers (trade secrets) so it is difficult to guess as to which factors it includes.
But heart rate is, most likely, a main factor (in WHOOP’s Strain measure)—and, given that the WHOOP strap has displayed limited HR accuracy, I would be hesitant to use this in research anyway. One limitation is that there really is, to the best of my knowledge, no gold standard measurement for Strain that you could compare it to.
Is Strain just calories?
In short
Sort of?
The details
Marco mentioned something interesting. When I asked him about Strain, he wrote:
I tend not to dedicate much of my time to made-up scores (this reminds me of Nike Fuel Points and things like that). Eventually, it is just calories (and indeed, strain is highly correlated to calories), that these companies rebrand in ways that only add confusion.
For example, Oura Ring does something similar to WHOOP’s Strain and daily Strain goals. Oura gives you a daily “active calorie burn” goal. And then everything you do in a day feeds into that number.
Garmin uses “Acute Training Load,” a seven-day sum of all your activity. Etc. Etc. Etc.
If you want more info on this, here’s a full breakdown of the science of fitness tracker calorie counts.
In short: One extensive study of trackers and calorie counts found trackers overestimate calorie burn by anywhere from 27 to 93 percent. The scientists wrote:
“None of the devices provided estimates of energy expenditure that were within an acceptable range in any setting.”
In short: Any grand metric of the load on your body is some degree of wrong.
Is a band the best way to track your fitness?
In short
Old-school tools like pen, paper, clocks, and maps are more accurate.
The details
Many trackers offer ways of “scoring” the intensity of your workout, such as calorie burn or Strain.
But there are cheaper and more accurate ways to track intensity and the workload on your body. For example:
Your workload is higher if you do 100 squats with 100 pounds instead of 50 squats with 100 pounds.
Your workload is higher if you run one mile in seven minutes instead of eight minutes.
Your workload is higher if you run 10 miles instead of 6 miles at the same pace.
You get the point.
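If you like seeing the arithmetic, the examples above can be sketched in a few lines. This is only an illustration. The function names are mine, and "volume load" (reps times weight) is just one simple way to total up lifting work.

```python
def volume_load(reps, weight_lb):
    """Total weight moved: repetitions times the weight per repetition."""
    return reps * weight_lb


def pace_min_per_mile(minutes, miles):
    """Average pace; a lower number means a faster run."""
    return minutes / miles


# 100 squats vs. 50 squats, both with 100 pounds.
print(volume_load(100, 100))  # 10000
print(volume_load(50, 100))   # 5000

# One mile in seven minutes vs. eight minutes.
print(pace_min_per_mile(7, 1))  # 7.0
print(pace_min_per_mile(8, 1))  # 8.0

# 10 miles vs. 6 miles at the same eight-minute pace: same pace, more work.
print(pace_min_per_mile(80, 10) == pace_min_per_mile(48, 6))  # True
```

Everything here comes from a clock, a distance, and the weight on the bar. No sensor has to guess at anything.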
The most accurate way to track your workouts is with a clock, map, pen and paper.
Clocks, distance, and poundage don’t lie and are indisputable.
I asked Kristen why a person would measure workout Strain instead of using a pen and paper.
She said she enjoys comparing her Strain between the same workouts:
“For example, if I were running a mile, and I ran it eight weeks ago and maybe the workout was an 11 (Strain). But now I run the same mile today and it’s a 9 Strain, I can see that my cardiovascular system is more efficient.”
Big picture: I’m in favor of people measuring their workouts. Exercise and do whatever you can to make it fun. Track it somehow if you want to improve.
But I do worry that relying entirely on wrist-based heart rate tracking metrics to determine how your fitness changes may lead to flaws. For example, let’s say you did your second one-mile run but did any of the following:
Wore the band looser.
Wore the band at a slightly different area of your wrist.
Had sunscreen on your wrist.
This would change how the data was captured and your data might be off—it would be inconsistently wrong. Again, this would happen with any tracker.
Takeaway
Remember that my message is always “do what works for you.” If you benefit from using a device that gives you a grand “daily metric” of your activity, keep it up.
But I think we need to take all the data with a grain of salt. It can give us rough approximations that are highly varied and noisy.
Tracking fitness improvements will be most accurate when we’re tracking outcomes:
Did you finish the same ruck in a faster time?
Did you run the half marathon faster?
Did you lift more weight?
Did you reach the top of the mountain?
You get the point …
Ok, enough for today. Wednesday we’ll cover:
Steps Vs. Strain—upsides and downsides to each.
What I liked about WHOOP.
What I didn’t like about WHOOP.
How I think we should use trackers like WHOOP.
Have fun, don’t die, get fit—any way you measure it.
-Michael
This post couldn’t be more well timed. I’ve been wearing WHOOP religiously for about 8 months and just yesterday I took it off. It was starting to make me feel bad about myself, and causing me to focus on the wrong things (i.e., game-ifying my life and exercise instead of just having fun and not dying). That said, I learned a lot from it; it helped me drink less, go to bed earlier, and generally become more consistent with good habits. And now I know how my body responds to my typical workouts. But I’m not sure what the value-add is, now that I know all that.
I always use a chest strap HR monitor when I want accurate readings of my workout effort…typically when I am cycling. I use my Apple Watch for everything else just to track time under effort. Having other means of tracking/measuring my workout effort does not make me work out more or better. I just need to get my ass out the door and do something.