Visualizing Apple Health Data with Python: A Descriptive Approach

Apple Watch is showing Awards in Activity

In the Apple Health app, we cannot view several health metrics side by side. With Python and a few supporting packages, we can build personalized charts of the health measurements we care about.

In my previous post, I prepared the dictionaries and DataFrames for further exploration.

As a reminder, this is the list of the DataFrames that we created:

  1. record_data → every element that has a <Record> tag in the XML file.
  2. record_data_cleaned → record_data reduced to type, date, day, value, and unit.
  3. record_data_df_dict → a dictionary of DataFrames built from record_data_cleaned, in which each DataFrame has been:
    • sorted by date
    • given a value column renamed to its data name
    • filtered to the data of interest.
  4. record_data_df_dict_daily → a dictionary of the record_data_df_dict DataFrames that need to be accumulated daily.
  5. record_data_df_dict_monthly → a dictionary of the record_data_df_dict DataFrames aggregated by month.
  6. final_workout_df → workout DataFrame
  7. final_workout_df_cleaned → cleaned workout DataFrame
  8. workout_routine_df → final_workout_df filtered to start at the beginning of the workout routine.

Just as we did when collecting and cleaning the data, we have to import these packages before going further:

  • Pandas 1.4.3
  • Plotly 5.4.1

Let’s get the descriptive visualization started!

Body Mass Transformation

We can see my body mass progress since I started wearing the Apple Watch by running the code below:

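import plotly.express as px

# Body Mass while using Apple Watch
# Assumption: record_data_df_BodyMass is the BodyMass DataFrame stored in record_data_df_dict
record_data_df_BodyMass = record_data_df_dict["BodyMass"]
fig = px.line(record_data_df_BodyMass, x="Date", y="BodyMass", markers=True)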
fig.update_layout(title_text="Body Mass Progress Since Wearing Apple Watch")
fig.show()

Output:

I entered my body weight into Apple Health as 70 kg sometime after October 2021 and started a diet program to lose a few kilograms. From September 2022, when I began my gym membership and changed my lifestyle (rather than just committing to a diet for a limited time), we can observe a downward trend and more frequent data entries.

To examine the weight loss during my lifestyle change more closely, we can run this code:

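# Body Mass after the lifestyle change (the workout routine starts on 1 September 2022)
record_data_df_BodyMass_after_workout = record_data_df_dict["BodyMass"].loc[record_data_df_dict["BodyMass"]["Date"] >= "2022-09-01"]
fig = px.line(record_data_df_BodyMass_after_workout, x="Date", y="BodyMass", markers=True)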
fig.update_layout(title_text="Body Mass Progress After Lifestyle Improvement")
fig.show()

Output:

As we can observe, my body weight fluctuated on a daily basis. However, what holds greater significance is the downward trend that occurs due to consistent exercise and mindful eating habits. Interestingly, during my vacation from December 2022 to January 2023, I indulged in all the foods I loved, yet I still managed to maintain my lifestyle, lose some weight, and have no regrets.

It is important to acknowledge that fluctuations in body weight are inevitable, possibly caused by factors such as high sodium intake (for example, when I ate instant noodles the day before) or hormonal changes that result in water retention. However, I am not bothered by these variations. I am content with my commitment to changing my lifestyle towards mindful eating and refraining from labeling foods as inherently good or bad.
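Because day-to-day weigh-ins are noisy, a rolling average is one way to make the underlying trend easier to read. The sketch below is only an illustration under stated assumptions: the weigh-in values are made up, and the column name BodyMass_7d_mean is hypothetical rather than something created in the previous post.

```python
import pandas as pd

# Hypothetical daily weigh-ins; the real data would come from record_data_df_dict["BodyMass"].
body_mass = pd.DataFrame({
    "Date": pd.date_range("2022-09-01", periods=10, freq="D"),
    "BodyMass": [70.0, 70.4, 69.8, 70.1, 69.6, 69.9, 69.3, 69.5, 69.0, 69.2],
})

# 7-row rolling mean; min_periods=1 lets the first rows average over the days available so far.
body_mass["BodyMass_7d_mean"] = (
    body_mass["BodyMass"].rolling(window=7, min_periods=1).mean()
)

print(body_mass.tail(3))
```

The smoothed column could then be plotted next to the raw values. Since real weigh-ins are usually not logged every day, a time-based window such as rolling("7D", on="Date") may be a better fit than a fixed number of rows.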

Spiking Active Energy Burned, Distance, and Step Count after Workout Routine

We will explore how my basal and active energy burned, walking-running distance, and step count differ, alongside my body weight progress, before and after I made moving my body a daily responsibility. I use 1 September 2022, the start date of my gym membership, as the dividing line.

First, we filter the DataFrames to the time frames of interest.

# Before vs After Workout
# The workout routine starts on 1 September 2022, so the same cutoff date is used for every metric
workout_start = '2022-09-01'

# Body mass progress after the workout routine starts
body_mass_df = record_data_df_dict["BodyMass"]
record_data_df_BodyMass_after_workout = body_mass_df.loc[body_mass_df['Date'] >= workout_start]

# Active Energy Burned before and after workout routine
active_df = record_data_df_dict_monthly["ActiveEnergyBurned"]
record_data_df_ActiveEnergyBurned_before_workout = active_df.loc[active_df['Date'] < workout_start]
record_data_df_ActiveEnergyBurned_after_workout = active_df.loc[active_df['Date'] >= workout_start]

# Basal Energy Burned before and after workout routine
basal_df = record_data_df_dict_monthly["BasalEnergyBurned"]
record_data_df_BasalEnergyBurned_before_workout = basal_df.loc[basal_df['Date'] < workout_start]
record_data_df_BasalEnergyBurned_after_workout = basal_df.loc[basal_df['Date'] >= workout_start]

# Distance Walking-Running before and after workout routine (months up to 2020-11-30 are excluded)
distance_df = record_data_df_dict_monthly["DistanceWalkingRunning"]
record_data_df_Distance_before_workout = distance_df.loc[(distance_df['Date'] < workout_start) & (distance_df['Date'] > '2020-11-30')]
record_data_df_Distance_after_workout = distance_df.loc[distance_df['Date'] >= workout_start]

# Step count before and after workout routine (months up to 2020-11-30 are excluded)
step_df = record_data_df_dict_monthly["StepCount"]
record_data_df_StepCount_before_workout = step_df.loc[(step_df['Date'] < workout_start) & (step_df['Date'] > '2020-11-30')]
record_data_df_StepCount_after_workout = step_df.loc[step_df['Date'] >= workout_start]
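The filters above repeat the same pattern for every metric, so a small helper could cut down on the copy and paste. The sketch below is one possible way to do it; split_by_date is a hypothetical function name, and the monthly totals are made-up numbers standing in for record_data_df_dict_monthly["StepCount"].

```python
import pandas as pd

def split_by_date(df, cutoff, date_col="Date"):
    """Return (before, after), where `after` holds rows dated on or after `cutoff`."""
    dates = pd.to_datetime(df[date_col])
    is_after = dates >= pd.Timestamp(cutoff)
    return df.loc[~is_after], df.loc[is_after]

# Hypothetical monthly totals, labeled by the last day of each month.
monthly_steps = pd.DataFrame({
    "Date": pd.to_datetime(["2022-07-31", "2022-08-31", "2022-09-30", "2022-10-31"]),
    "StepCount": [150_000, 160_000, 240_000, 260_000],
})

before, after = split_by_date(monthly_steps, "2022-09-01")
print(len(before), len(after))
```

With a cutoff of 1 September 2022, August lands in the "before" group and September in the "after" group, whether the monthly rows are labeled by the first or the last day of the month.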

The Plotly library lets us create multiple subplots for all the data above.

import plotly.graph_objects as go
from plotly.subplots import make_subplots

fig = make_subplots(
    rows=3, cols=2,
    specs = [[{'colspan':2}, None],
    [{},{}],
    [{},{}]],
    subplot_titles=("Body Mass", "Active Energy Burned", "Basal Energy Burned", "Distance Walking-Running", "Step Count"))

# Body Mass Chart
# The full series is drawn first; the portion from 1 September 2022 is drawn over it in a second color
fig.add_trace(go.Scatter(x=record_data_df_dict["BodyMass"]["Date"], y=record_data_df_dict["BodyMass"]["BodyMass"],
    name = "Before Workout Routine", marker_color='#063970'), row=1, col=1)

fig.add_trace(go.Scatter(x=record_data_df_BodyMass_after_workout["Date"], y=record_data_df_BodyMass_after_workout["BodyMass"],
    name = "After Workout Routine", marker_color='#e28743'), row=1, col=1)

# Active Energy Burned Chart
fig.add_trace(go.Bar(x=record_data_df_ActiveEnergyBurned_before_workout["Date"], y=record_data_df_ActiveEnergyBurned_before_workout["ActiveEnergyBurned"],
    name = "Before Workout Routine", marker_color='#063970'), row=2, col=1)

fig.add_trace(go.Bar(x=record_data_df_ActiveEnergyBurned_after_workout["Date"], y=record_data_df_ActiveEnergyBurned_after_workout["ActiveEnergyBurned"],
    name = "After Workout Routine", marker_color='#e28743'),row=2, col=1)

# Basal Energy Burned Chart
fig.add_trace(go.Bar(x=record_data_df_BasalEnergyBurned_before_workout["Date"], y=record_data_df_BasalEnergyBurned_before_workout["BasalEnergyBurned"],
    name = "Before Workout Routine", marker_color='#063970', showlegend=False), row=2, col=2)

fig.add_trace(go.Bar(x=record_data_df_BasalEnergyBurned_after_workout["Date"], y=record_data_df_BasalEnergyBurned_after_workout["BasalEnergyBurned"],
    name = "After Workout Routine", marker_color='#e28743', showlegend=False),row=2, col=2)

# Distance Chart
fig.add_trace(go.Bar(x=record_data_df_Distance_before_workout["Date"], y=record_data_df_Distance_before_workout["DistanceWalkingRunning"],
    name = "Before Workout Routine", marker_color='#063970', showlegend=False), row=3, col=1)

fig.add_trace(go.Bar(x=record_data_df_Distance_after_workout["Date"], y=record_data_df_Distance_after_workout["DistanceWalkingRunning"],
    name = "After Workout Routine", marker_color='#e28743', showlegend=False),row=3, col=1)

# Step Chart
fig.add_trace(go.Bar(x=record_data_df_StepCount_before_workout["Date"], y=record_data_df_StepCount_before_workout["StepCount"],
    name = "Before Workout Routine", marker_color='#063970', showlegend=False), row=3, col=2)

fig.add_trace(go.Bar(x=record_data_df_StepCount_after_workout["Date"], y=record_data_df_StepCount_after_workout["StepCount"],
    name = "After Workout Routine", marker_color='#e28743', showlegend=False),row=3, col=2)

# Update yaxes
fig.update_yaxes(title_text="Body Mass (kg)", row=1, col=1)
fig.update_yaxes(title_text="Calorie (kcal)", row=2, col=1)
fig.update_yaxes(title_text="Calorie (kcal)", row=2, col=2)
fig.update_yaxes(title_text="Distance (km)", row=3, col=1)
fig.update_yaxes(title_text="Number of Steps", row=3, col=2)

# Update height and width
fig.update_layout(height=800, width=1100, title_text="Before vs After Workout")
fig.show()

Output:

The number of calories burned shows a clear increase in active energy expenditure after I started the workout routine, with some months reaching nearly double the pre-routine amount. As Apple states on its developer page, this number refers to the energy an individual expends through physical activity and exercise (source: link).
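To put a number on "nearly double", the monthly averages of the two periods can be compared directly. The values below are hypothetical and only illustrate the calculation; with the real data, the two Series would be the ActiveEnergyBurned columns of the before and after DataFrames.

```python
import pandas as pd

# Hypothetical monthly active-energy totals in kcal (not the real numbers from the chart).
before = pd.Series([9000.0, 10000.0, 11000.0], name="ActiveEnergyBurned")
after = pd.Series([17000.0, 19000.0, 18000.0], name="ActiveEnergyBurned")

# Ratio of the average month after the routine started to the average month before it.
ratio = after.mean() / before.mean()
print(f"Average monthly active energy changed by a factor of {ratio:.2f}")
```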

On the other hand, the basal energy burned appears to remain relatively consistent both before and after the workout routine. This number represents the amount of energy my body expends to sustain its regular, non-active functions, such as respiration, blood circulation, and the regulation of cell growth and maintenance. My data for this metric is consistent with the study published by Bingham et al., which found no significant long-term elevation in basal metabolic rate (BMR) after exercise.

In terms of walking-running distance and step count, it is not surprising that the two charts look similar: as long as I maintain a consistent step length, the distance covered stays proportional to the number of steps taken. However, the notable feature of the graph is that, similar to the increase in active energy burned, there is a substantial rise in both the distance covered and the number of steps taken after a persistent fitness program.
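The link between the two charts is simple arithmetic: distance is roughly the step count multiplied by the step length. The sketch below uses an assumed step length of 0.7 m and made-up step counts purely for illustration; distance_km is a hypothetical helper, not part of the earlier code.

```python
def distance_km(steps, step_length_m):
    """Distance in kilometres for a number of steps at a fixed step length."""
    return steps * step_length_m / 1000

step_length_m = 0.7  # assumed average step length in metres, purely illustrative
for steps in (150_000, 240_000):
    print(f"{steps} steps is roughly {distance_km(steps, step_length_m):.0f} km")
```

With a fixed step length, a 60% increase in steps gives a 60% increase in distance, which is why the two bar charts share the same shape.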

It is not unexpected that the burning of calories through increased activity and daily steps taken would result in weight loss.

I move the most on Monday and Thursday

As I delve into examining my workout schedule, I am intrigued to determine the day when I experience the highest level of intense physical activity. Hopefully, I can identify a training program that aligns with my needs and is the most convenient for my schedule.

We will evaluate these metrics by day of the week.

fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=("Active Energy Burned","Stand Time", "Step Count", "Distance"))

fig.add_trace(go.Bar(x=record_data_df_dict_daily["ActiveEnergyBurned"]["Day"], y=record_data_df_dict_daily["ActiveEnergyBurned"]["ActiveEnergyBurned"]), 1, 1)

fig.add_trace(go.Bar(x=record_data_df_dict_daily["AppleStandTime"]["Day"], y=record_data_df_dict_daily["AppleStandTime"]["AppleStandTime"]), 1, 2)

fig.add_trace(go.Bar(x=record_data_df_dict_daily["StepCount"]["Day"], y=record_data_df_dict_daily["StepCount"]["StepCount"]), 2, 1)

fig.add_trace(go.Bar(x=record_data_df_dict_daily["DistanceWalkingRunning"]["Day"], y=record_data_df_dict_daily["DistanceWalkingRunning"]["DistanceWalkingRunning"]),  2, 2)

# Update xaxes
fig.update_xaxes(categoryorder="array", categoryarray=["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"])

# Update yaxes
fig.update_yaxes(title_text="Calorie (kcal)", row=1, col=1)
fig.update_yaxes(title_text="Stand Time (min)", row=1, col=2)
fig.update_yaxes(title_text="Number of Steps", row=2, col=1)
fig.update_yaxes(title_text="Distance (km)", row=2, col=2)

# Update layout
fig.update_layout(height=700, width=1100, showlegend=False)
fig.show()

Output:

Based on the figure, Thursday and Monday stand out as the days with the highest levels of activity. It also appears that I fully embrace my weekends, especially Sundays, when I tend to be less active than on other days.

I would also like to use this insight to create my own running training program. Considering that I am a couch potato on Sundays, I recognize that I cannot simply follow the free programs offered by most running blogs, which typically allocate Sundays for long runs. Perhaps Monday or Thursday would be the best day for me.
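To compare weekdays on an explicit statistic rather than reading it off the bars, the daily values can be averaged per weekday first. This is a sketch with made-up step counts standing in for record_data_df_dict_daily["StepCount"]; the variable names are hypothetical.

```python
import pandas as pd

weekday_order = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical daily totals; the real data would have one row per recorded day.
daily_steps = pd.DataFrame({
    "Day": ["Monday", "Thursday", "Sunday", "Monday", "Thursday", "Sunday"],
    "StepCount": [12000, 11000, 4000, 10000, 13000, 5000],
})

# Average per weekday, reordered Monday to Sunday; weekdays without data show up as NaN.
mean_by_day = daily_steps.groupby("Day")["StepCount"].mean().reindex(weekday_order)
print(mean_by_day)
```

Averaging also keeps a weekday from looking more active simply because it has more recorded days than the others.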

Running is the most frequently performed exercise

Next, we will see which sport takes up most of my Apple Health workout data. The Apple Watch can record many kinds of activities. However, I am not a very sporty person, so my range of workouts is rather narrow. Still, I would like to see what the chart looks like.

fig = px.bar(workout_routine_df, x='Day', y='duration', color='workoutActivityType', barmode='group')

# Update xaxes
fig.update_xaxes(categoryorder="array", categoryarray=["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"])

# Update yaxes
fig.update_yaxes(title_text="Duration (minutes)")

# Update layout
fig.update_layout(legend_title="Workout Activity Type")
fig.show()

Output:

Without a doubt, running takes precedence in my exercise regimen, with walking and strength training following behind. Other workouts, such as cycling, core training, and hiking, comprise only a minor portion of my overall physical activity.

Running burns the most of my active energy

Now that we know running takes up the most workout time, we would like to see how many active calories it burns compared with the other activities.

fig = px.bar(workout_routine_df, x='Day', y='activeEnergyBurned', color='workoutActivityType', barmode='group', title="Active Energy Burned by Every Workout")

# Update xaxes
fig.update_xaxes(categoryorder="array", categoryarray=["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"])

# Update yaxes
fig.update_yaxes(title_text="Calorie (kcal)")

# Update layout
fig.update_layout(legend_title="Workout Activity Type")
fig.show()

Output:

Clearly, just as with duration, running also accounts for the most active calories burned. Because running is a high-intensity activity, its lead over the other activities is even larger in calories than it is in duration.
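One way to separate intensity from sheer time spent is to divide the active energy by the duration for each activity type. The sketch below assumes the duration and activeEnergyBurned columns of workout_routine_df are in minutes and kcal; the rows and the kcal_per_minute column are made up for illustration.

```python
import pandas as pd

# Hypothetical workouts; column names follow workout_routine_df, the values are made up.
workouts = pd.DataFrame({
    "workoutActivityType": ["Running", "Running", "Walking", "Strength Training"],
    "duration": [30.0, 50.0, 40.0, 45.0],                # minutes
    "activeEnergyBurned": [300.0, 500.0, 160.0, 225.0],  # kcal
})

# Total minutes and kcal per activity, then the average burn rate.
totals = workouts.groupby("workoutActivityType")[["duration", "activeEnergyBurned"]].sum()
totals["kcal_per_minute"] = totals["activeEnergyBurned"] / totals["duration"]
print(totals.sort_values("kcal_per_minute", ascending=False))
```

If running still comes out on top per minute with the real data, that would support the point about intensity; if not, its lead in the chart would mostly reflect the time spent on it.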

Action plan derived from obtained insights

Taking into account the findings from the investigation of Apple Health data above, there are several actions and considerations that I should prioritize:

  • Engage in cross-training activities

Having a wider range of workout options allows me to engage and target a greater variety of muscles throughout my body.

  • Consider Thursday or Monday as the long run day in my running training program

The upcoming post will explore in depth the correlations between these health metrics.

Feel free to leave comments on what code to improve or which other visualizations I could explore from Apple Health.
