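library(dplyr)     # provides %>% and mutate()
library(lubridate) # provides today(), now(), dmy(), interval() and friends
options(width = 300) # Widen the console output so that wide data frames print on one line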

2 Data

We create some data for this example:

data<-read.table(text="id   Tag_Monat_Jahr  Monat_Tag_Jahr  dmy_hms dob dod date_event  date_release_white_album    start_rooftop_concert   end_rooftop_concert
1   28.1.2020   1.28.2020   28:01:2020:17:55:05 18.06.1942  NA  3.01.1962   22.11.1968  30.01.1969.12:30    30.01.1969.13:12
2   03.2.2019   2.3.2019    03.2.2019:09:30:10  09.10.1940  08.12.1980  3.01.1962   22.11.1968  30.01.1969.12:30    30.01.1969.13:12
3   01.12.1987  12.01.1987  01/12/1987/20:30:10 25.02.1943  29.11.2001  3.01.1962   22.11.1968  30.01.1969.12:30    30.01.1969.13:12
4   8.9.1821    9.8.1821    08:09:1821:10:45:20 7.7.1940    NA  3.01.1962   22.11.1968  30.01.1969.12:30    30.01.1969.13:12", 
                 header=TRUE)

3 Add the current date (without time)

data<-data %>% 
  mutate(date_exercise=today())
head(data)
##   id Tag_Monat_Jahr Monat_Tag_Jahr             dmy_hms        dob        dod date_event date_release_white_album start_rooftop_concert end_rooftop_concert date_exercise
## 1  1      28.1.2020      1.28.2020 28:01:2020:17:55:05 18.06.1942       <NA>  3.01.1962               22.11.1968      30.01.1969.12:30    30.01.1969.13:12    2022-01-22
## 2  2      03.2.2019       2.3.2019  03.2.2019:09:30:10 09.10.1940 08.12.1980  3.01.1962               22.11.1968      30.01.1969.12:30    30.01.1969.13:12    2022-01-22
## 3  3     01.12.1987     12.01.1987 01/12/1987/20:30:10 25.02.1943 29.11.2001  3.01.1962               22.11.1968      30.01.1969.12:30    30.01.1969.13:12    2022-01-22
## 4  4       8.9.1821       9.8.1821 08:09:1821:10:45:20   7.7.1940       <NA>  3.01.1962               22.11.1968      30.01.1969.12:30    30.01.1969.13:12    2022-01-22

4 Add the current date and time

data<-data %>% 
  mutate(date_time_exercise=now())
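
As a quick check of what these two functions return, here is a minimal sketch, assuming lubridate is attached. The variable names and the "UTC" time zone are chosen only for illustration.

```r
library(lubridate)

current_date <- today()                 # calendar date only, class "Date"
current_time <- now()                   # date and time, class "POSIXct"
current_time_utc <- now(tzone = "UTC")  # same instant, displayed in UTC

class(current_date)
class(current_time)
attr(current_time_utc, "tzone")
```

The tzone argument only matters if the exercise is run on machines in different time zones; without it, both functions use the system time zone.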

5 Look at the data before transforming them to date formats

We first inspect the structure of the data with str and then summarise it with the skim function from the skimr package:

str(data)
## 'data.frame':    4 obs. of  12 variables:
##  $ id                      : int  1 2 3 4
##  $ Tag_Monat_Jahr          : chr  "28.1.2020" "03.2.2019" "01.12.1987" "8.9.1821"
##  $ Monat_Tag_Jahr          : chr  "1.28.2020" "2.3.2019" "12.01.1987" "9.8.1821"
##  $ dmy_hms                 : chr  "28:01:2020:17:55:05" "03.2.2019:09:30:10" "01/12/1987/20:30:10" "08:09:1821:10:45:20"
##  $ dob                     : chr  "18.06.1942" "09.10.1940" "25.02.1943" "7.7.1940"
##  $ dod                     : chr  NA "08.12.1980" "29.11.2001" NA
##  $ date_event              : chr  "3.01.1962" "3.01.1962" "3.01.1962" "3.01.1962"
##  $ date_release_white_album: chr  "22.11.1968" "22.11.1968" "22.11.1968" "22.11.1968"
##  $ start_rooftop_concert   : chr  "30.01.1969.12:30" "30.01.1969.12:30" "30.01.1969.12:30" "30.01.1969.12:30"
##  $ end_rooftop_concert     : chr  "30.01.1969.13:12" "30.01.1969.13:12" "30.01.1969.13:12" "30.01.1969.13:12"
##  $ date_exercise           : Date, format: "2022-01-22" "2022-01-22" "2022-01-22" "2022-01-22"
##  $ date_time_exercise      : POSIXct, format: "2022-01-22 14:44:11" "2022-01-22 14:44:11" "2022-01-22 14:44:11" "2022-01-22 14:44:11"

The variable Tag_Monat_Jahr (German for day_month_year) is not yet in a date format; it is still a character vector.

class(data$Tag_Monat_Jahr)
## [1] "character"

The variable date_exercise is already in the correct format, because we created it with the lubridate function today.

class(data$date_exercise)
## [1] "Date"
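
Checking one column at a time gets tedious with many variables. The following sketch lists the first class of every column at once. The toy data frame and its column names are hypothetical stand-ins for data.

```r
# Toy data frame standing in for `data`; the column names are illustrative
example_data <- data.frame(
  raw_date = c("28.1.2020", "03.2.2019"),
  parsed_date = as.Date(c("2020-01-28", "2019-02-03")),
  stringsAsFactors = FALSE
)

# First class of each column: "character" columns still need parsing
column_classes <- sapply(example_data, function(column) class(column)[1])
column_classes

# Names of the columns that are still plain text
names(column_classes)[column_classes == "character"]
```
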
skimr::skim(data)
Data summary
Name                     data
Number of rows           4
Number of columns        12
_______________________
Column type frequency:
character                9
Date                     1
numeric                  1
POSIXct                  1
________________________
Group variables          None

Variable type: character

skim_variable             n_missing  complete_rate  min  max  empty  n_unique  whitespace
Tag_Monat_Jahr                    0            1.0    8   10      0         4           0
Monat_Tag_Jahr                    0            1.0    8   10      0         4           0
dmy_hms                           0            1.0   18   19      0         4           0
dob                               0            1.0    8   10      0         4           0
dod                               2            0.5   10   10      0         2           0
date_event                        0            1.0    9    9      0         1           0
date_release_white_album          0            1.0   10   10      0         1           0
start_rooftop_concert             0            1.0   16   16      0         1           0
end_rooftop_concert               0            1.0   16   16      0         1           0

Variable type: Date

skim_variable  n_missing  complete_rate  min         max         median      n_unique
date_exercise          0              1  2022-01-22  2022-01-22  2022-01-22         1

Variable type: numeric

skim_variable  n_missing  complete_rate  mean    sd  p0   p25  p50   p75  p100  hist
id                     0              1   2.5  1.29   1  1.75  2.5  3.25     4  ▇▇▁▇▇

Variable type: POSIXct

skim_variable       n_missing  complete_rate  min                  max                  median               n_unique
date_time_exercise          0              1  2022-01-22 14:44:11  2022-01-22 14:44:11  2022-01-22 14:44:11         1

6 Transform data to dates

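First, we try to do it the wrong way. The mdy function expects the order month-day-year, but Tag_Monat_Jahr is stored as day-month-year. The first value fails to parse because 28 is not a valid month. The other three values parse without a warning but are wrong, because day and month are swapped (for example, 3 February 2019 becomes 2 March 2019).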

data$Tag_Monat_Jahr
## [1] "28.1.2020"  "03.2.2019"  "01.12.1987" "8.9.1821"
mdy(data$Tag_Monat_Jahr) # wrong order: the first value fails because 28 cannot be a month
## Warning: 1 failed to parse.
## [1] NA           "2019-03-02" "1987-01-12" "1821-08-09"
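
The failed value is easy to spot, but the swapped ones are not. The sketch below compares both parsers on the same values to make the silent errors visible. The variable names are illustrative, and quiet = TRUE is used only to suppress the warning.

```r
library(lubridate)

raw_dates <- c("28.1.2020", "03.2.2019", "01.12.1987", "8.9.1821")

parsed_as_mdy <- mdy(raw_dates, quiet = TRUE)  # wrong order, warning suppressed
parsed_as_dmy <- dmy(raw_dates)                # correct order for these values

# TRUE where the wrong order failed outright (easy to spot)
failed_outright <- is.na(parsed_as_mdy)
failed_outright

# TRUE where the wrong order parsed but returned a different date (easy to miss)
silently_wrong <- !is.na(parsed_as_mdy) & parsed_as_mdy != parsed_as_dmy
silently_wrong
```

Only one of the four values raises a warning, so a missing warning is no proof that the chosen order is right. Comparing a few parsed values against the raw strings is a cheap safeguard.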

7 The correct way to do it (with the dplyr/tidyverse approach)

data<-data %>% 
  mutate(Tag_Monat_Jahr=dmy(Tag_Monat_Jahr), 
         Monat_Tag_Jahr=mdy(Monat_Tag_Jahr), 
         dmy_hms=dmy_hms(dmy_hms), 
         dob=dmy(dob), 
         dod=dmy(dod), 
         date_event=dmy(date_event),
         date_release_white_album=dmy(date_release_white_album),
         start_rooftop_concert=dmy_hm(start_rooftop_concert), 
         end_rooftop_concert=dmy_hm(end_rooftop_concert))
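
When several columns share the same format, the repeated calls can be shortened. This is a sketch of one option, assuming dplyr 1.0.0 or later, which provides across. The toy data frame is a stand-in for data.

```r
library(dplyr)
library(lubridate)

# Toy data frame standing in for `data`
example_data <- data.frame(
  dob = c("18.06.1942", "09.10.1940"),
  dod = c(NA, "08.12.1980"),
  stringsAsFactors = FALSE
)

# Apply dmy() to every listed column in one step; NA values stay NA
example_data <- example_data %>%
  mutate(across(c(dob, dod), dmy))

str(example_data)
```
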

8 Calculations with dates and times

Calculate the age at death, the age at the time this exercise was created, and the age at the rooftop concert (all in years):

data<-data %>% 
  mutate(age_at_death=time_length(interval(dob, dod), "years"), 
         age_at_time_of_exercise=time_length(interval(dob, date_exercise), "years"), 
         age_at_rooftop_concert=time_length(interval(dob, start_rooftop_concert), "years"))
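
The ages above are decimal years. If completed years are needed instead (the usual meaning of age), one option is to round down, as in this sketch with a single pair of dates taken from the data:

```r
library(lubridate)

dob <- dmy("09.10.1940")
dod <- dmy("08.12.1980")

exact_age <- time_length(interval(dob, dod), "years")  # decimal years
completed_years <- floor(exact_age)                     # whole years only

exact_age
completed_years
```
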

Calculate the duration of the rooftop concert in minutes:

data<-data %>% 
  mutate(duration_minutes_rooftop_concert=time_length(interval(start_rooftop_concert, end_rooftop_concert), "minutes"))
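
As a cross-check, the same duration can be computed with base R's difftime. This is a sketch with the two timestamps from the data; the variable names are illustrative.

```r
library(lubridate)

concert_start <- dmy_hm("30.01.1969.12:30")
concert_end <- dmy_hm("30.01.1969.13:12")

# lubridate: length of the interval in minutes
duration_lubridate <- time_length(interval(concert_start, concert_end), "minutes")

# base R: difference between the two date-times in minutes
duration_base <- as.numeric(difftime(concert_end, concert_start, units = "mins"))

duration_lubridate
duration_base
```
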

Calculate the time in months between the release of the White Album and the rooftop concert:

data<-data %>% 
  mutate(interval_months_white_album_rooftop_concert=time_length(interval(date_release_white_album, start_rooftop_concert), "months"))
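
A decimal number of months is not always easy to read. The sketch below shows the same gap as a calendar breakdown using as.period, here with dates only (no time of day) to keep the result simple:

```r
library(lubridate)

album_release <- dmy("22.11.1968")
concert_day <- dmy("30.01.1969")

gap <- interval(album_release, concert_day)

gap_in_months <- time_length(gap, "months")  # decimal months
gap_period <- as.period(gap)                 # months and days as separate units

gap_in_months
gap_period
```
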

9 Extract elements

Extract the year, month, day of the month, day of the year, and weekday from the start of the rooftop concert:

data<-data %>% 
  mutate(year_rooftop_concert=year(start_rooftop_concert), 
         months_rooftop_concert=month(start_rooftop_concert), 
         day_rooftop_concert=mday(start_rooftop_concert), 
         day_in_year_rooftop_concert=yday(start_rooftop_concert), 
         day_in_week_rooftop_concert=wday(start_rooftop_concert), 
         weekday_name_rooftop_concert=wday(start_rooftop_concert, label=TRUE, abbr=FALSE))
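
The time of day can be extracted in the same way. The sketch below also shows that the numeric weekday depends on which day the week starts; week_start is set explicitly here so the result does not depend on local options.

```r
library(lubridate)

concert_start <- dmy_hm("30.01.1969.12:30")

concert_hour <- hour(concert_start)      # hour of the day
concert_minute <- minute(concert_start)  # minute of the hour

# Numeric weekday with Sunday counted as day 1
weekday_sunday_first <- wday(concert_start, week_start = 7)

# Numeric weekday with Monday counted as day 1
weekday_monday_first <- wday(concert_start, week_start = 1)

c(concert_hour, concert_minute, weekday_sunday_first, weekday_monday_first)
```
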

DT::datatable(data, filter='top', style = 'bootstrap')

We can now use the extracted elements directly in the text by referencing R variables inline.

The rooftop concert was in the year 1969. It was the 1st month of the year and the 30th day of the month. This was a Thursday, and the duration was 42 minutes.
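
Because every row holds the same value for the concert variables, printing a whole column inline would repeat that value once per row. A minimal sketch of one way around this is to reduce each column to its unique value before building the sentence. The toy vectors below stand in for the columns of data.

```r
# Toy vectors standing in for the extracted columns; every row holds the
# same value, as in `data`
year_rooftop_concert <- rep(1969, 4)
duration_minutes_rooftop_concert <- rep(42, 4)

# unique() collapses the four identical values into one
concert_sentence <- sprintf(
  "The rooftop concert was in the year %d and lasted %d minutes.",
  as.integer(unique(year_rooftop_concert)),
  as.integer(unique(duration_minutes_rooftop_concert))
)

concert_sentence
```

Taking the first element (for example year_rooftop_concert[1]) would work as well; unique has the advantage of returning more than one value if the rows unexpectedly differ.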

If you want to learn more, a good video is the following one (about 12 minutes long and with way more details than I provided above):

So that’s it. With these examples you will be able to do most things you want to do with dates and times. If you wondered what the rooftop concert was: it was the last public performance, a concert of sorts, by The Beatles.

As one of the variables was the release date of the album The Beatles, better known as the White Album, I have to share a cover version of my favourite song from this album. This version is not as good as the original song, but there is an excellent guitar solo performed by Prince (at the end of the video).