Convert to Date in R

How to Convert Character Strings into Dates using lubridate

By Daniel D. Bonneau in R Basics

June 23, 2022


Character to Date in R

Often, when we read in a data set that is supposed to contain date values, R will treat them as character types. This can cause problems if we’re attempting to use that date for a visualization, to extract the individual components (month, day, year), or simply order our data set from the most recent data to the oldest.

Luckily, we can convert our character strings to date variable types in R with just a few short lines of code. In this article, I’ll show you how to do this using the handy lubridate package, which is part of the tidyverse.
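To make the ordering problem concrete, here is a minimal sketch. The three date strings are made up for illustration, and mdy() is the lubridate parser covered later in this article.

```r
library(lubridate)

# Made-up dates, written the same way as the data used in this article
dates_chr <- c("6/2/2022", "6/10/2022", "6/1/2022")

# Sorted as text, the strings are compared character by character,
# so "6/10/2022" lands ahead of "6/2/2022"
sorted_chr <- sort(dates_chr)
sorted_chr

# Parsed as dates first, the same values sort chronologically
sorted_date <- sort(mdy(dates_chr))
sorted_date
```

The exact position of "6/1/2022" in the text sort can depend on your locale, but in either case the character version is not in calendar order.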

Reading in Our Data

To get started, we are going to use the same simple revenue data set that we used in the How to Load a CSV in R tutorial.

With that in mind, we first need to load in the lubridate package (or install it using the install.packages() function if you haven’t done so yet). I’ll also load in dplyr to make use of some of its functionality later on.

library(lubridate)
library(dplyr)

revenue <- read.csv("simple_revenue.csv")

Now that we’ve loaded our data set, let’s take a look at it as a reminder of what the file contains:

glimpse(revenue)
## Rows: 30
## Columns: 2
## $ date    <chr> "6/1/2022", "6/2/2022", "6/3/2022", "6/4/2022", "6/5/2022", "6…
## $ revenue <chr> "$307.00 ", "$557.00 ", "$549.00 ", "$1,159.00 ", "$1,525.00 "…

Here, we can see that we have two columns, date and revenue, both of which are read in as characters. We won’t deal with the revenue column in this article (you can see that here), though it would also need to be converted to a numeric type. Our focus here is on the date column.

Month Day Year Format

In order to transform a date column of this structure into a proper date format, we can use the mdy() function from the lubridate package. mdy() just stands for ‘Month Day Year’ and will work for separators of / and -, which are likely to be the way your date is separated between the individual values. We use this particular function because that is the way our data is structured.

If our data were in day/month/year format, we would use the dmy() function, and if it were year/month/day we could use the ymd() function, for example.
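As a rough sketch of how these parsers line up with different orderings and separators, the strings below are made up for illustration and all describe June 1, 2022.

```r
library(lubridate)

# Made-up strings for the same day in different layouts
same_day <- c(
  mdy("6/1/2022"),  # month/day/year with slashes
  mdy("6-1-2022"),  # month-day-year with dashes
  dmy("1/6/2022"),  # day/month/year
  ymd("2022/6/1")   # year/month/day
)

# Every element is the same Date, 2022-06-01
same_day
```

Our revenue data is in month/day/year order, so mdy() is the one that applies to it: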

revenue %>%
  mutate(new_date = mdy(date)) %>%
  glimpse()
## Rows: 30
## Columns: 3
## $ date     <chr> "6/1/2022", "6/2/2022", "6/3/2022", "6/4/2022", "6/5/2022", "…
## $ revenue  <chr> "$307.00 ", "$557.00 ", "$549.00 ", "$1,159.00 ", "$1,525.00 …
## $ new_date <date> 2022-06-01, 2022-06-02, 2022-06-03, 2022-06-04, 2022-06-05, …

Here, we create a new column using dplyr’s mutate() and name it new_date. We do this just so we can keep our original date column and get a peek at the difference. As you can see from the output above, our new_date variable has a data type of <date>. Now you may be wondering: ‘Okay, but what was the point of that? It looks like it just reformatted it a bit.’

I’m so glad you asked.

Using Date Variables to Extract Details

Now that we have a proper date variable, we’re able to use a handful of functions within the lubridate package to extract details about our date. For example, the day(), month(), and year() functions will extract the components of their same name. Let’s take a look.

revenue %>%
  mutate(new_date = mdy(date)) %>%
  mutate(day = day(new_date),
         month = month(new_date),
         year = year(new_date)) %>%
  head()
##       date    revenue   new_date day month year
## 1 6/1/2022   $307.00  2022-06-01   1     6 2022
## 2 6/2/2022   $557.00  2022-06-02   2     6 2022
## 3 6/3/2022   $549.00  2022-06-03   3     6 2022
## 4 6/4/2022 $1,159.00  2022-06-04   4     6 2022
## 5 6/5/2022 $1,525.00  2022-06-05   5     6 2022
## 6 6/6/2022 $1,310.00  2022-06-06   6     6 2022

Notice that all we did was apply those functions to the new_date variable we created and it allowed us to very easily separate out the basic components that make up our date. If we were to try that with our regular date column, an error would be thrown.

revenue %>%
  mutate(day = day(date),
         month = month(date),
         year = year(date)) %>%
  head()
## Error in `mutate()`:
## ! Problem while computing `day = day(date)`.
## Caused by error in `as.POSIXlt.character()`:
## ! character string is not in a standard unambiguous format

The only change in the code above is that we passed in the original date column, which is a character type. This gives us the error ‘character string is not in a standard unambiguous format’. If you get something like this, use the glimpse() or head() functions to look at the data and each variable’s type, or the str() function to see similar information. If it says the column you’re passing into day(), month(), or year() is a character, convert it to a Date type using the steps from above.
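One way to run that check in code is sketched below. It assumes a small stand-in data frame, small_revenue (a hypothetical name), holding the first few dates shown earlier, and uses lubridate’s is.Date() to decide whether a conversion is needed.

```r
library(lubridate)

# Hypothetical stand-in for the first rows of the revenue data
small_revenue <- data.frame(
  date = c("6/1/2022", "6/2/2022", "6/3/2022"),
  stringsAsFactors = FALSE
)

# Check the column's type before calling day(), month(), or year()
class(small_revenue$date)    # "character"
is.Date(small_revenue$date)  # FALSE

# Convert only when the column is not already a Date
if (!is.Date(small_revenue$date)) {
  small_revenue$date <- mdy(small_revenue$date)
}

class(small_revenue$date)    # "Date"
day(small_revenue$date)      # 1 2 3
```

Unlike the mutate() examples above, this sketch overwrites the date column in place, which is a reasonable choice once you no longer need the original strings.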

Dates Formatted Differently and Other Parsing Functions

While the date in our example was structured in a way that looks like it should automatically be treated as a date type, we can also parse this information if it is written in a more text-heavy way. For example:

mdy("July 4th, 2022")
## [1] "2022-07-04"
dmy("4th of July '22")
## [1] "2022-07-04"

As you can see, the lubridate package gives us a lot of flexibility when dealing with dates. After we convert this information to a formal Date variable, many more functions are at our disposal. For example, we can get the week with the week() and isoweek() functions, the day of the week with the wday() function, and even check if the year is a leap year with leap_year(), among many other things.
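Here is a short sketch of those functions applied to the July 4th, 2022 date parsed above. The variable name independence_day is just for illustration, and the wday() result assumes lubridate’s default setting, where the week starts on Sunday.

```r
library(lubridate)

independence_day <- mdy("July 4th, 2022")

week(independence_day)       # 27: seven-day blocks counted from January 1
isoweek(independence_day)    # 27: ISO 8601 week number
wday(independence_day)       # 2: Sunday is 1 by default, so Monday is 2
wday(independence_day, label = TRUE)  # the weekday as a labelled factor
leap_year(independence_day)  # FALSE: 2022 is not a leap year
```

week() and isoweek() happen to agree for this date, but they count weeks differently and can diverge, especially around the start of a year.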

We’ll likely continue covering the lubridate package throughout this series, as working with dates can be an important part of analyzing your data. For now, I hope this has helped you convert your character strings into a formal Date variable so you can use it properly.

For a cheat-sheet on the lubridate package, check out the lubridate website.