Comparison of daily routines of 2 selected participants from Engagement, Ohio, USA.
In this take-home exercise, we study the daily routines of 2 participants. We analyse the similarities and differences between the 2 participants' daily patterns. We also analyse the patterns across the days of the week and over the available period, to understand how they vary on weekends and, possibly, vacations.
We first load the required packages using the code chunk below.
packages = c('tidyverse','ViSiElse','lubridate','ggplot2','ggthemes','hrbrthemes','scales','ggdist','gghalves')
# Install any package that is missing, then attach it.
for(p in packages){
  if(!require(p, character.only = TRUE)){
    install.packages(p)
  }
  library(p, character.only = TRUE)
}
participants_data <- read_csv("data/Participants.csv")
We select the data for the participants we are interested in and save it as an rds file, so that we no longer need to work with the large, mostly unneeded data set.
March_3rd_pattern <- read_rds("data/rds/partsample1.rds")
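The chunk above only reads the saved file back. A minimal sketch of the filter-and-save step described in the text might look like the following. The toy table status_log, its values and the temporary output path are assumptions made for illustration, since the raw log files are not shown here; the participant IDs 0 and 173 are the ones used later in this exercise.

```r
library(dplyr)
library(readr)
library(tibble)

# Hypothetical stand-in for the large raw status log (not the real data).
status_log <- tibble(
  timestamp     = as.POSIXct("2022-03-03 00:00:00", tz = "UTC") + 300 * (0:5),
  participantId = c(0, 1, 173, 0, 2, 173),
  currentMode   = c("AtHome", "AtWork", "AtHome", "Transport", "AtHome", "AtWork")
)

# Keep only the two participants of interest.
part_sample <- status_log %>%
  filter(participantId %in% c(0, 173))

# Save the small subset; a temporary file is used here instead of data/rds/.
rds_path <- tempfile(fileext = ".rds")
write_rds(part_sample, rds_path)

# Reading the file back returns the filtered rows only.
reloaded <- read_rds(rds_path)
```

Saving the subset once means later chunks can start from the small rds file rather than repeating the filter on the full log.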
To visualize the daily routine of our selected participants, the data needs to be extensively manipulated. Let's get to it!
In the code chunk below, we create a new column to indicate when the participant is eating. This is based on when hungerStatus changes to 'JustAte'. We also treat the participant as eating for the whole time he is at a restaurant.
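# Start with eat = 0 everywhere, then flag all time spent at a restaurant.
March_3rd_pattern$eat <- rep(0, length(March_3rd_pattern$timestamp))
March_3rd_pattern$eat[March_3rd_pattern$currentMode == 'AtRestaurant'] <- 1
# Sort chronologically so that lag() compares consecutive records.
March_3rd_pattern <- March_3rd_pattern[order(as.POSIXct(March_3rd_pattern$timestamp)),]
# Also flag the record where hungerStatus changes to 'JustAte',
# unless the previous record already counts as eating.
# The default arguments keep the first record of each participant from becoming NA.
March_3rd_pattern <- March_3rd_pattern %>%
  group_by(participantId) %>%
  mutate(eat = ifelse(lag(hungerStatus, default = first(hungerStatus)) != 'JustAte' & hungerStatus == 'JustAte' & lag(eat, default = 0) != 1, 1, eat)) %>%
  ungroup()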
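To make the eating rule concrete, here is a small worked sketch on made-up records for a single participant. The toy values, the placeholder label "Hungry" and the helper column became_just_ate are assumptions for illustration; only 'JustAte' and 'AtRestaurant' come from the chunk above.

```r
library(dplyr)
library(tibble)

# Six made-up consecutive records for one participant.
toy <- tibble(
  participantId = 0,
  timestamp     = as.POSIXct("2022-03-03 12:00:00", tz = "UTC") + 300 * (0:5),
  currentMode   = c("AtWork", "AtRestaurant", "AtRestaurant", "AtWork", "AtHome", "AtHome"),
  hungerStatus  = c("Hungry", "Hungry", "JustAte", "JustAte", "Hungry", "JustAte")
)

toy <- toy %>%
  arrange(timestamp) %>%
  group_by(participantId) %>%
  mutate(
    # Rule 1: any record at a restaurant counts as eating.
    eat = as.numeric(currentMode == "AtRestaurant"),
    # Rule 2: the record where hungerStatus switches to 'JustAte'.
    became_just_ate = hungerStatus == "JustAte" &
      lag(hungerStatus, default = first(hungerStatus)) != "JustAte",
    # Apply rule 2 only if the previous record was not already eating.
    eat = ifelse(became_just_ate & lag(eat, default = 0) != 1, 1, eat)
  ) %>%
  ungroup()

toy$eat
```

In this toy data the two restaurant records are flagged by the first rule, and the last record (a meal at home) is flagged by the second rule, because the status switches to 'JustAte' while the previous record was not an eating record.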
We create new columns to indicate when he is travelling, at work, at a recreation activity, at home, and sleeping. These are the basic activities that we track to understand his routine.
We then pivot the table so that it has an Activity column and a Status column that indicates whether the corresponding activity is 'ON' (1) or 'OFF' (0).
# One 0/1 indicator column per activity.
March_3rd_pattern <- March_3rd_pattern %>%
  mutate(travel = ifelse(currentMode == 'Transport', 1, 0),
         work = ifelse(currentMode == 'AtWork', 1, 0),
         recreation = ifelse(currentMode == 'AtRecreation', 1, 0),
         athome = ifelse(currentMode == 'AtHome', 1, 0),
         sleep = ifelse(sleepStatus == 'Sleeping', 1, 0))
# Keep only the columns needed for the plot.
March_3rd_pattern <- March_3rd_pattern %>%
  select(timestamp, participantId, athome, sleep, eat, travel, work, recreation)
# Reshape to one row per timestamp, participant and activity.
LongTable <- March_3rd_pattern %>%
  pivot_longer(cols = c(athome, sleep, eat, travel, work, recreation), names_to = "Activity", values_to = "Status")
Activity_Levels <- c('recreation','work','travel','eat','sleep','athome')
Status_Levels <- c(0,1)
# Refer to the columns by name: LongTable$Status inside the pipe would read
# the original column and ignore the recoding done in the previous step.
LongTable <- LongTable %>%
  mutate(Status = ifelse(Status != 0, 1, 0),
         Activity = factor(Activity, levels = Activity_Levels),
         Status = factor(Status, levels = Status_Levels))
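As a small illustration of what the pivot produces, the sketch below reshapes two made-up records with only two activity columns. The toy values and the object names wide and long are assumptions for illustration.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Two made-up records in wide form: one 0/1 column per activity.
wide <- tibble(
  timestamp     = as.POSIXct("2022-03-03 08:00:00", tz = "UTC") + 300 * (0:1),
  participantId = 0,
  athome        = c(1, 0),
  travel        = c(0, 1)
)

# Long form: one row per timestamp and activity, with its 0/1 status.
long <- wide %>%
  pivot_longer(cols = c(athome, travel), names_to = "Activity", values_to = "Status") %>%
  mutate(
    # The factor levels fix the order of the activities on the plot axis.
    Activity = factor(Activity, levels = c("travel", "athome")),
    Status   = factor(Status, levels = c(0, 1))
  )

long
```

Each wide record becomes one long row per activity, so the two records above turn into four rows. Fixing the factor levels is what controls the order of the activities on the y-axis of the tile plot.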
We visualize the daily activities of the 2 participants using geom_tile().
LongTable %>%
  ggplot(aes(timestamp, Activity, fill = as.factor(Status))) +
  geom_tile(color = "white") +
  # One panel per participant, stacked vertically.
  facet_wrap(~participantId, nrow = 2) +
  theme_tufte(base_family = "Helvetica") +
  # Status 0 is drawn in white and status 1 in sky blue.
  scale_fill_manual(values = c("white", "skyblue")) +
  scale_x_datetime(date_labels = "%H:%M") +
  xlab("") +
  ylab("") +
  # linewidth replaces the size argument for lines in ggplot2 3.4.0 and later.
  theme(legend.position = "none",
        axis.text = element_text(size = 14),
        axis.ticks = element_blank(),
        panel.grid.minor.y = element_line(linewidth = 0.5, colour = "grey70")
  ) +
  ggtitle("Daily Routine of selected Participants")
We notice major differences in the routines of the 2 participants. Let us look at the details of each participant to understand these differences.
For ease of discussion, let us call them John and Bob.
participantDetails <- participants_data %>%
  filter(participantId %in% c(0, 173)) %>%
  # Sort by ID so that the names line up: 0 is John, 173 is Bob.
  arrange(participantId) %>%
  mutate(name = c('John', 'Bob'))
print.data.frame(participantDetails)
  participantId householdSize haveKids age      educationLevel
1             0             3     TRUE  36 HighSchoolOrCollege
2           173             1    FALSE  19            Graduate
  interestGroup   joviality name
1             H 0.001626703 John
2             C 0.785213730  Bob
John goes to bed early, whereas Bob is a late sleeper. The age difference (John is 36 and Bob is 19) could be the reason.
John spends longer travelling to work, which suggests that he lives farther from his workplace. Bob evidently lives closer to his workplace and hence gets to sleep a little longer.
Bob spends a long stretch at work in the morning and goes for lunch a little later than John.
Bob spends considerably more time on recreational activities in the evening than John.
John has only 2 meals a day, which may indicate that he is more health conscious. Bob has a late-night snack, probably because he is hungry after his recreational activities.
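These observations are read off the tile plot. One rough way to put numbers on them would be to count the records in which each activity is on and convert the counts to minutes. The sketch below does this on made-up data; the table toy_long, its values and the 5-minute logging interval are assumptions for illustration and should be replaced with LongTable and the actual interval of the data.

```r
library(dplyr)
library(tibble)

# Assumed time between consecutive records, in minutes (hypothetical).
interval_minutes <- 5

# Made-up long-format records in the same shape as LongTable.
toy_long <- tibble(
  participantId = c(0, 0, 0, 173, 173, 173, 173),
  Activity      = c("travel", "travel", "recreation",
                    "travel", "recreation", "recreation", "recreation"),
  Status        = c(1, 1, 1, 1, 1, 1, 0)
)

# Count the records where the activity is on, then convert to minutes.
time_spent <- toy_long %>%
  filter(Status == 1) %>%
  count(participantId, Activity, name = "records") %>%
  mutate(minutes = records * interval_minutes)

time_spent
```

The filter on Status == 1 also works when Status is a factor with levels 0 and 1, as it is in LongTable, so the same pipeline could be applied to the real table once the interval is known.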