Wednesday, February 20, 2019

Data Analytics

If you know the story of my heart problems you know that I was misdiagnosed for 2 1/2 years prior to my emergency surgery. My symptoms over that whole period were limited to chest tightness and difficulty breathing when I exercised outdoors in the heat of summer. I'm kind of a nutcase about preferring to exercise outdoors whenever possible, even in unpleasant weather*. Every summer there'd be some heat wave when something would feel wrong in my chest when I ran outside. I would go to the doctor and get misdiagnosed as having exercise-induced asthma and would be sent on my way with a handful of pills and inhalers which, the doctor said, would take a month or so to take full effect. Sure enough, a month later I would indeed feel better - but it was because the heat had broken, not because of the meds. This happened in 2014. It happened again in 2015 (the doctor did send me for a cardiac stress test that year, but it failed to reveal any problems). When I showed up with the exact same symptoms again in 2016 the doctor kind of rolled her eyes that I was presenting with the same symptoms yet again like it was some big deal, when it was just my seasonal exercise-induced asthma acting up again. That was the year I finally decided to act on my hunch that something else was going on and kept pressuring my doctor to get to the bottom of it - which, as many readers know, took until spring of the following year and culminated in emergency surgery on March 15th, 2017. Beware the Ides of March.

Anyway, I recount that whole history here because I had a vague notion that my exercise performance had started slowing down in 2014 and that it was another missed sign of my medical issues; however, it hadn't occurred to me that I actually had a data set against which to evaluate that hypothesis. For the most part I didn't consistently track my exercise until I started using Strava in conjunction with Freezing Saddles in 2016, and besides, there are a lot of variables in most of my exercise. Kayaking performance is subject to wind, waves, and stopping to look at birds and stuff. Running could have been a source of data, as I have been running the same route for years, but I hadn't tracked my times over the years. Bicycling? Well, I tend to meander - not a reliable source of performance data. Then, just the other day it dawned on me that I have a complete, decade-long data set of my erg rowing. For some reason, I have always recorded my rows in Concept2's online logbook (Concept2 is the manufacturer of the rowing machine). Holy longitudinal data set, Batman! Furthermore, rowing was my most consistent exercise data set - the same machine in the same room, usually at the same time of day and approximately the same level of exertion every time. No weather. No birds to distract me. No hills. The only slight variation comes from my choice of entertainment - I watch videos when I row and admittedly, I row harder when watching action and adventure shows than when watching more mellow dramas or comedies. I row really hard to Fauda, an Israeli counter-terrorism action show available on Netflix. But that's minor variability.

Take a look at the plot below, which includes all my recorded rows since 2008 (minus a few which inexplicably downloaded from Concept2's web-based logbook with invalid values). In particular, focus on the orange line - a smoothed version of the data.** I start out with some really good numbers - low times per 500m (that's what the Y Axis scale is). When I first started in 2008 I got really into it and rowed really hard. It was Cyndi (KayakCyndi to some readers) who got me started rowing - and any workout one does with Cyndi is going to be balls to the wall. So I start out with low split times. Then, as the initial excitement faded (and I stopped competing with Cyndi) I slowed down a little and settled into a somewhat more moderate pace. Then the line is flat for about three years: mid-2011 through mid-2014. Then, suddenly, right around the time I started feeling symptoms my rowing times start slowing down. The split times start increasing - and keep increasing all the way until my surgery (where there's a gap in the data in early 2017). Then, since I started rowing again after my surgery the times have been coming back down. 

Figure 1: Rowing History 2008 - Present

But here's some good news. Look at the chart below - in this chart I've zoomed in on the period from when I started having symptoms up to the present. When you look at the data from 2014 until just before my surgery there's a trend of slowing down (the orange line is a linear fit, or trend line, to the pre-surgery data). Then look at the data from when I resumed rowing to the present and you see the trend line there is decreasing - my rowing performance is getting better (the grey line is a linear fit to my post-surgery data). In fact, if you look at my most recent rows you'll see my numbers are back in the range of my 2014 performance. So in rowing, at least, my performance has completely recovered to pre-illness levels, if not the times of ten years ago (too much to ask!). My running is still slower than before, as is my bicycling. Perhaps the spare parts they took out of my leg to repair my heart affect these leg-based sports more than rowing, which uses a wider range of muscles (core and upper body in addition to using legs for the catch and initial drive). I can only speculate.

I'll also point out that these variations are smaller than they appear, since I've zoomed in on the Y axis. In Figure 2 the difference between the duration of the slowest and fastest rows is about 4 minutes out of 30-ish minute workouts. Still, that's more than 10% - enough, in my mind, to represent real trends in the data and not just the randomness of one exercise performance vs. another.
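The arithmetic behind that "more than 10%" is quick to check. In this small sketch the 30-minute figure is an assumption (the workouts are only described as "30-ish" minutes), so the spread is computed against both a fast and a slow baseline.

```python
spread_min = 4.0    # slowest minus fastest row, from the text above
fastest_min = 30.0  # assumed baseline; the text only says "30-ish"
slowest_min = fastest_min + spread_min

# Relative spread measured against the fastest and the slowest row.
vs_fastest = spread_min / fastest_min
vs_slowest = spread_min / slowest_min
print(f"{vs_fastest:.1%} of the fastest row, {vs_slowest:.1%} of the slowest")
# prints "13.3% of the fastest row, 11.8% of the slowest"
```

Either way the spread stays above 10%.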

Figure 2: Rowing History during my period of illness and recovery

So there you have it. Clear signs of my cardiac issues years before the day I was belatedly recognized as an acute cardiac case. Plus, the data belies the initial seasonal exercise-induced asthma diagnosis, since if this had been a seasonal ailment presumably my athletic performance would have rebounded each year after the symptoms faded. Someday perhaps our doctors will make use of all this sensor data we're all gathering these days from Fitbits, Apple Watches, Strava, etc., in their diagnoses. To tell you the truth, having doctors mine my personal data is both a little bit exciting and equally scary. Certainly it'll require doctors to develop new skill sets, or maybe the emergence of new medical specialties of diagnostic data analytics, or something like that, along with new privacy regulations and practices. Well, into the future we go, one (recorded) step at a time.

============================
*I admit that lately I have developed somewhat of an aversion to exercising outdoors in snow and ice. 

**The orange line in Figure 1 is a fourth order polynomial fit to the data. The coefficients are shown in the figure. A fifth order polynomial tells a similar story, but the curves are a little more exaggerated. Lower order fits don't catch the nuances of the data.
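For the curious, a polynomial fit like this can be sketched in plain Python. This is only an illustration of least-squares polynomial fitting, not necessarily the tool or method used for the figure. The data values are made up, the helper names `polyfit` and `polyval` are hypothetical, and time is assumed to be rescaled to the range -1 to 1 so that the high powers stay numerically well behaved.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns c[0..degree] for c[0] + c[1]*x + ... + c[degree]*x**degree.
    """
    n = degree + 1
    # Normal equations A c = b: A[i][j] = sum(x**(i+j)), b[i] = sum(y * x**i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]

    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs


def polyval(coeffs, x):
    """Evaluate the polynomial at x using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


# Hypothetical data: x is time rescaled to [-1, 1]; y is a made-up split
# in seconds per 500 m. These are NOT real logbook values.
xs = [-1 + 2 * i / 10 for i in range(11)]
ys = [128, 131, 134, 135, 135, 135, 136, 139, 142, 140, 137]

fit4 = polyfit(xs, ys, 4)  # fourth order
fit5 = polyfit(xs, ys, 5)  # fifth order, for comparison
for x in (-1.0, 0.0, 1.0):
    print(f"x={x:+.1f}  4th: {polyval(fit4, x):.1f}  5th: {polyval(fit5, x):.1f}")
```

Comparing the two fitted curves at the same points is one way to see how much more a higher-order fit bends to follow the data.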
