Take Home Exercise 04

Comparison of the daily routines of 2 selected participants from Engagement, Ohio, USA.

Rakendu Ramesh https://www.linkedin.com/in/rakendu-ramesh/ (Singapore Management University)https://www.linkedin.com/showcase/smumitb/
2022-05-23

Overview

In this take-home exercise, we study the daily routines of 2 participants. We analyse the similarities and differences between the 2 participants' daily patterns. We also analyse the patterns across different days of the week and over the whole available period, to understand how routines vary on weekends and, possibly, during vacations.

Getting Started

We first load the required packages using the code chunk below.

packages <- c('tidyverse','ViSiElse','lubridate','ggplot2','ggthemes','hrbrthemes','scales','ggdist','gghalves')
for(p in packages){
  # Install the package only if it cannot be loaded
  if(!require(p, character.only = TRUE)){
    install.packages(p)
  }
  library(p, character.only = TRUE)
}

Importing the dataset

participants_data <- read_csv("data/Participants.csv")

We select the data for the participants we are interested in and save it as an RDS file, so that we no longer need to work with the large, mostly unnecessary full dataset. The saved file is then read back in:

March_3rd_pattern <- read_rds("data/rds/partsample1.rds")
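The filtering and saving step itself is not shown in this write-up. A minimal sketch of what it could look like follows. The toy `logs` table and the temporary file path are hypothetical stand-ins for the full activity log and for data/rds/partsample1.rds; the participant IDs 0 and 173 are the ones used later in this exercise.

```r
library(dplyr)
library(readr)

# Hypothetical stand-in for the full activity log (not the real data)
logs <- tibble(
  participantId = c(0, 1, 173, 200),
  currentMode   = c("AtHome", "AtWork", "AtHome", "Transport")
)

# Keep only the two participants of interest
selected <- logs %>%
  filter(participantId %in% c(0, 173))

# Save the small subset once, then reload it whenever it is needed
rds_path <- tempfile(fileext = ".rds")  # assumed path, for illustration only
write_rds(selected, rds_path)
reloaded <- read_rds(rds_path)
```

Reloading the RDS file is much faster than re-reading and re-filtering the full log on every run.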

Data Wrangling

To visualize the daily routine of our selected participants, the data need to be extensively manipulated. Let's get to it!

In the code chunk below, we create a new column to indicate when the participant is eating. It is set when hungerStatus changes to ‘JustAte’. We also assume that a participant is eating for the whole time spent at a restaurant.

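# Start with eat = 0 everywhere, then flag every record logged at a restaurant
March_3rd_pattern$eat <- rep(0, length(March_3rd_pattern$timestamp))
March_3rd_pattern$eat[March_3rd_pattern$currentMode == 'AtRestaurant'] <- 1

# Sort chronologically so that lag() compares each record with the previous one
March_3rd_pattern <- March_3rd_pattern[order(as.POSIXct(March_3rd_pattern$timestamp)),]

# Flag the record where hungerStatus switches to 'JustAte', unless the previous
# record was already flagged; the lag() defaults avoid NA on the first record
March_3rd_pattern <- March_3rd_pattern %>%
  group_by(participantId) %>%
  mutate(eat = ifelse(lag(hungerStatus, default = first(hungerStatus)) != 'JustAte' &
                        hungerStatus == 'JustAte' &
                        lag(eat, default = 0) != 1, 1, eat)) %>%
  ungroup()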
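To see how the lag-based transition check behaves, here is a small sketch on made-up data. The `toy` table and the helper column `prev_status` are hypothetical and exist only for this illustration; the sketch keeps just the status-change condition and leaves out the restaurant rule.

```r
library(dplyr)

# Made-up records, already in time order within each participant
toy <- tibble(
  participantId = c(0, 0, 0, 0, 173, 173, 173),
  hungerStatus  = c("Hungry", "JustAte", "JustAte", "Fed",
                    "JustAte", "Hungry", "JustAte"),
  eat = 0
)

toy <- toy %>%
  group_by(participantId) %>%
  mutate(
    # Previous status within the same participant; the first record is
    # compared with itself, so it never counts as a change
    prev_status = lag(hungerStatus, default = first(hungerStatus)),
    # Flag only the record where the status switches to "JustAte"
    eat = ifelse(prev_status != "JustAte" & hungerStatus == "JustAte", 1, eat)
  ) %>%
  ungroup()
```

Grouping by participantId matters here: without it, the first record of one participant would be compared with the last record of the other.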

We then create new columns to indicate when each participant is travelling, at work, at a recreation activity, at home, and sleeping. Together with eating, these are the basic activities that we track to understand each routine.

We will then pivot the table to have activity as one column and a status column to indicate if the corresponding activity is ‘ON’ (1) or ‘OFF’ (0).

# One indicator column (1/0) per activity
March_3rd_pattern <- March_3rd_pattern %>%
  mutate(travel = ifelse(currentMode == 'Transport', 1, 0),
         work = ifelse(currentMode == 'AtWork', 1, 0),
         recreation = ifelse(currentMode == 'AtRecreation', 1, 0),
         athome = ifelse(currentMode == 'AtHome', 1, 0),
         sleep = ifelse(sleepStatus == 'Sleeping', 1, 0))


March_3rd_pattern <- March_3rd_pattern %>%
  select(timestamp,participantId,athome,sleep,eat, travel,work,recreation)


LongTable <- March_3rd_pattern %>%
  pivot_longer(cols = c(athome, sleep, eat, travel, work, recreation),
               names_to = "Activity", values_to = "Status")


Activity_Levels <- c('recreation','work','travel','eat','sleep','athome')
Status_Levels <- c(0,1)

# Recode Status to 0/1, then store Activity and Status as factors with a fixed
# level order; bare column names let each step see the result of the previous one
LongTable <- LongTable %>%
  mutate(Status = ifelse(Status != 0, 1, 0),
         Activity = factor(Activity, levels = Activity_Levels),
         Status = factor(Status, levels = Status_Levels))
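The wide-to-long reshaping can be illustrated on a two-row example. This is a sketch only: the `wide` table and its values are invented, and it uses just two of the six activities.

```r
library(dplyr)
library(tidyr)

# Invented records: one participant, two timestamps
wide <- tibble(
  timestamp     = as.POSIXct(c("2022-03-03 08:00:00", "2022-03-03 08:05:00"), tz = "UTC"),
  participantId = c(0, 0),
  currentMode   = c("AtHome", "Transport")
)

long <- wide %>%
  # One 1/0 indicator column per activity
  mutate(athome = ifelse(currentMode == "AtHome", 1, 0),
         travel = ifelse(currentMode == "Transport", 1, 0)) %>%
  select(timestamp, participantId, athome, travel) %>%
  # Each indicator column becomes one row per timestamp
  pivot_longer(cols = c(athome, travel),
               names_to = "Activity", values_to = "Status") %>%
  # Fixed level order controls how the activities are stacked on a plot axis
  mutate(Activity = factor(Activity, levels = c("travel", "athome")),
         Status   = factor(Status, levels = c(0, 1)))
```

Each input row yields one output row per activity, so 2 records and 2 activities give 4 rows in `long`.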

Visualising the Daily Routine of the participants

We will visualize the daily activities of the 2 participants using geom_tile().

LongTable %>%
  ggplot(aes(timestamp, Activity, fill = as.factor(Status)))+
  geom_tile(color = "white")+
  # One panel per participant, stacked vertically
  facet_wrap(~participantId, nrow = 2)+
  theme_tufte(base_family = "Helvetica")+
  scale_fill_manual(values =  c("white","skyblue"))+
  scale_x_datetime(date_labels = "%H:%M") +
  xlab("")+
  ylab("")+
  theme(legend.position = "none",
        axis.text = element_text(size = 14),
        axis.ticks = element_blank(),
        panel.grid.minor.y = element_line(size = 0.5, colour = "grey70")
        )+
  ggtitle("Daily Routine of selected Participants")
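For readers who want to try the tile layout without the full dataset, the following is a minimal sketch on invented hourly data. The `toy` data frame is hypothetical, and the named vector passed to scale_fill_manual() is one possible way to make the colour for each status explicit, not necessarily the approach taken in this exercise.

```r
library(ggplot2)

# Invented hourly records for two activities
toy <- data.frame(
  timestamp = rep(as.POSIXct("2022-03-03 00:00:00", tz = "UTC") + 3600 * 0:3, times = 2),
  Activity  = factor(rep(c("sleep", "work"), each = 4), levels = c("work", "sleep")),
  Status    = factor(c(1, 1, 0, 0, 0, 0, 1, 1), levels = c(0, 1))
)

p <- ggplot(toy, aes(timestamp, Activity, fill = Status)) +
  geom_tile(color = "white") +
  # Named values tie each colour to a status level, whatever the level order
  scale_fill_manual(values = c("0" = "white", "1" = "skyblue")) +
  scale_x_datetime(date_labels = "%H:%M") +
  labs(x = NULL, y = NULL) +
  theme(legend.position = "none")
```

Printing `p` draws one row of tiles per activity, coloured where the activity is on.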

Inference

We notice that there are major differences in the routines of the 2 participants. Let us look at the details of each participant to understand these differences.

For ease of discussion, let us call them John and Bob.

participantDetails <- participants_data %>%
  filter(participantId %in% c(0, 173)) %>%
  # Assign names by ID rather than by row order
  mutate(name = ifelse(participantId == 0, 'John', 'Bob'))

print.data.frame(participantDetails)
  participantId householdSize haveKids age      educationLevel interestGroup   joviality name
1             0             3     TRUE  36 HighSchoolOrCollege             H 0.001626703 John
2           173             1    FALSE  19            Graduate             C 0.785213730  Bob