Heat warning weather at YYZ

I have, somewhat by accident, become a bit of a cold weather expert for Toronto. My published research (Anderson and Gough 2017) and my blog explorations made me a natural choice to ask questions of when the weather got cold. So, when I got an email this week from UTSC communications to ask about heat in the city, I was a little taken aback. Was the sweltering weather that we saw at the start of July something that we can expect more of? I wasn’t sure, but as someone who is too frugal to pay for AC, this topic is of interest to me! So, I decided to take a look. Read on!

Definitions

To start this post, we need to set out some definitions. In Toronto, a heat warning is issued when the hot weather could pose a risk to people or pets. The city follows the Environment and Climate Change Canada (ECCC) thresholds. You can find the definitions for ECCC here, and for Toronto here. In short, heat warnings are issued by ECCC when two or more consecutive days have projected \(T_{max}\) at or above 31°C and \(T_{min}\) at or above 20°C, or, when the humidex is projected to reach or exceed 40 on two or more consecutive days. Toronto uses two classes of heat warnings: “normal” heat warnings when the conditions are forecast to last two days, or “extended” heat warnings, when the forecast conditions will be around for three or more days.

It is also important to define the humidex, a uniquely Canadian metric. According to the ECCC Meteorological Service of Canada glossary, the humidex is essentially a measure of how hot the weather “feels” when we factor in the humidity. The following equations summarize the humidex:

\[ \begin{align*} \mathrm{Humidex} &= T_{air} + h \\ h &= 0.5555 \times (e - 10.0)\\ e &= 6.11 \times \mathrm{exp}^{(5417.7530 \times (\frac{1}{273.16} - \frac{1}{T_{dew}} ))}\\ \mathrm{exp} &= 2.71828 \end{align*} \]

In the above equations, \(T_{dew}\) refers to the dew-point temperature, converted to Kelvin. While the glossary definition used 273.16 K as the conversion to 0°C, I assume that this is a typo, or an error. See this very detailed StackExchange Answer for some hints about where the number might have come from. I’ll use 273.15 K as my 0°C point (but it won’t make much of a difference). The \(\mathrm{exp}\) in the above equation represents the Euler number, a special constant. The very large number is a rounded constant based on the properties of water and vapour. The humidex is expressed in integers, so we round the above to the nearest whole number.

We can wrap all this up into a nice R function

hdx <- function(tair, tdew) {
  tdew = tdew + 273.15
  e = 6.11 * exp(1)^(5417.7530  * ((1 / 273.16) - (1 / tdew)))
  h = 0.5555 * (e - 10)
  round(tair + h)
}

So, for a temperature of 30 °C and a dew-point temperature of 15 °C, we’d have a humidex of 34.

hdx(30, 15)
## [1] 34

Obviously heat warnings are issued before a heat event, but in this case, my analysis is retrospective. As such, I’ll look at what I’ll call “heat warning weather”, that is, past conditions that met the forecast conditions and would have merited a heat warning. This is what I’ll be referring to from now on. I can’t guarantee that heat warning were actually emitted for the events that I detect.

Preparing the analysis

I typically work with daily data, but daily data doesn’t include humidex values. We’ll have to use hourly data instead. First, I’ll load the libraries I’ll use throughout this post, and then I’ll do a search of stations with hourly data for the past ~30 years

library(canadaHCDx)
library(claut)
library(dplyr)
library(lubridate)
library(plotly)
library(purrr)
library(tibble)
library(tidyr)

Finding the station

find_station("Toronto", baseline = 1988:2018, type = "hourly")
## Warning in get_station_data(assume_yes, quiet = TRUE): We can't download
## data, but none exists in the cache. An error occured during our update
## check. We will try again within 15 minutes. Falling back to built-in
## data.frame.
## # A tibble: 0 x 7
## # ... with 7 variables: Name <chr>, Province <chr>, StationID <int>,
## #   LatitudeDD <dbl>, LongitudeDD <dbl>, HourlyFirstYr <int>,
## #   HourlyLastYr <int>

Uhh ohh! There were no stations found that have 30 years of data. This is due to station code changes. I’ve accounted for this in the canadaHCDx search function with a special argument. Let’s search again with the “duplicates” parameter set to TRUE.

find_station("Toronto", baseline = 1988:2018, type = "hourly", duplicates = TRUE)
## Warning in get_station_data(assume_yes, quiet = TRUE): We can't download
## data, but none exists in the cache. An error occured during our update
## check. We will try again within 15 minutes. Falling back to built-in
## data.frame.
## Note: In addition to the stations found, the following combinations may provide sufficient baseline data.
## 
## >> Combination 1 at coordinates 43.6 -79.4 
## 
## 5085: TORONTO ISLAND A
## 5086: TORONTO IS A (AUT)
## 30247: TORONTO CITY CENTRE
## 48549: TORONTO CITY CENTRE
## 
## >> Combination 2 at coordinates 43.7 -79.4 
## 
## 5051: TORONTO
## 31688: TORONTO CITY
## 
## >> Combination 3 at coordinates 43.7 -79.6 
## 
## 5097: TORONTO LESTER B. PEARSON INT'L A
## 51459: TORONTO INTL A
## 
## >> Combination 4 at coordinates 43.9 -79.4 
## 
## 4841: TORONTO BUTTONVILLE A
## 53678: TORONTO BUTTONVILLE A
## 54239: TORONTO BUTTONVILLE A
## # A tibble: 0 x 7
## # ... with 7 variables: Name <chr>, Province <chr>, StationID <int>,
## #   LatitudeDD <dbl>, LongitudeDD <dbl>, HourlyFirstYr <int>,
## #   HourlyLastYr <int>

It looks like we have three options. I’ll use the Pearson airport station for this analysis. First, I should check the years of available data.

find_station(name = c("TORONTO LESTER B. PEARSON INT'L A", "TORONTO INTL A"), type = "hourly") %>%
  select(-Province)
## Warning in get_station_data(assume_yes, quiet = TRUE): We can't download
## data, but none exists in the cache. An error occured during our update
## check. We will try again within 15 minutes. Falling back to built-in
## data.frame.
## # A tibble: 2 x 6
##   Name         StationID LatitudeDD LongitudeDD HourlyFirstYr HourlyLastYr
##   <chr>            <int>      <dbl>       <dbl>         <int>        <int>
## 1 TORONTO INT…     51459       43.7       -79.6          2013         2018
## 2 TORONTO LES…      5097       43.7       -79.6          1953         2013

It would seem that in 2013, The “Toronto Lester B. Pearson Int’l A” station was renamed to “Toronto Intl A” and recoded. Let’s pull in data from these two stations for the summer months: June, July, and August (JJA).

tpe <- hcd_hourly(station = 5097, year = 1988:2013, month = 6:8)
tpe2 <- hcd_hourly(station = 51459, year = 2013:2018, month = 6:8)

Let’s see where we can merge these data. An important first step is to set the time zone of these data, which doesn’t happen by default in canadaHCD. The DateTime column of my data is set to UTC time. If I change the timezone, R automatically converts the time to the new timezone, i.e. midnight becomes 8 pm. I don’t want this. I could convert the time and then add four hours, but the force_tz() function in the lubridate package makes short work of this. Note that the ECCC data is always in the local standard time, even when we’re observing daylight savings time. (Thanks, Andrew for the reminder!)

tpe$DateTime <- force_tz(tpe$DateTime, tzone = "Etc/GMT-5")
tpe2$DateTime <- force_tz(tpe2$DateTime, tzone = "Etc/GMT-5")

Now to see about completeness. I’ll search for rows with less than five missing values. My use of five here is arbitrary. There are usually a couple of missing values in a row, such as WindChill, which doesn’t apply to the summer data, so my threshold needs to be above 2. Any day that is completely missing would have 10 missing values.

tpe[rowSums(is.na(tpe)) < 5,] %>% arrange(DateTime) %>% tail()
## # A tibble: 6 x 12
##   Station DateTime             Temp DewPointTemp RelHumidity WindDir
##     <int> <dttm>              <dbl>        <dbl>       <int>   <int>
## 1    5097 2013-06-13 08:00:00  15.6         15.2          97       1
## 2    5097 2013-06-13 09:00:00  15.9         14.8          93       1
## 3    5097 2013-06-13 10:00:00  17.4         15.3          87      36
## 4    5097 2013-06-13 11:00:00  17.9         15            83       1
## 5    5097 2013-06-13 12:00:00  20.9         15.6          72       4
## 6    5097 2013-06-13 13:00:00  21.8         14.5          63       3
## # ... with 6 more variables: WindSpeed <int>, Visibility <dbl>,
## #   Pressure <dbl>, Humidex <int>, WindChill <int>, Weather <chr>

Station 5097 collected data until June 13, 2013 at 1 pm.

tpe2[rowSums(is.na(tpe2)) < 6,] %>% arrange(DateTime) %>% head()
## # A tibble: 6 x 12
##   Station DateTime             Temp DewPointTemp RelHumidity WindDir
##     <int> <dttm>              <dbl>        <dbl>       <int>   <int>
## 1   51459 2013-06-11 09:00:00  17.8         16            89      34
## 2   51459 2013-06-11 10:00:00  17.4         15.9          91      34
## 3   51459 2013-06-11 11:00:00  19.5         16.2          81      33
## 4   51459 2013-06-11 12:00:00  19.6         15.3          76      32
## 5   51459 2013-06-11 13:00:00  21           14.7          67      33
## 6   51459 2013-06-11 14:00:00  21.4         14.9          66      33
## # ... with 6 more variables: WindSpeed <int>, Visibility <dbl>,
## #   Pressure <dbl>, Humidex <int>, WindChill <int>, Weather <chr>

Station 51459 collected data starting on June 11, 2013 at 9:00 am. Since we have overlapping data, we can have a quick look to see how the stations compare. I’ll look at the Temp column for this comparison.

val <- tibble(datetime = seq(from = as.POSIXct("2013-06-11 09:00:00", tz = "Etc/GMT-5"),
                             to = as.POSIXct("2013-06-13 13:00:00", tz = "Etc/GMT-5"),
                             by = 3600),
             tpe = filter(tpe, DateTime %in% datetime) %>% select(Temp) %>% unlist(),
             tpe2 = filter(tpe2, DateTime %in% datetime) %>% select(Temp) %>% unlist())
val %>% gather(key = "station", value = "Temp", -datetime) %>% 
  plot_ly(x = ~datetime, y = ~Temp, color = ~station) %>% add_markers()
summary(abs(val$tpe - val$tpe2))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.100   0.200   0.208   0.300   0.700

There seems to be a very minor cold bias at station 51459 overnight between June 12 and 13, but not enough to perform any significant bias correction. There is an average difference between points of just over two tenths of a degree Celsius.

What about the dew-point temperature?

val <- tibble(datetime = seq(from = as.POSIXct("2013-06-11 09:00:00", tz = "Etc/GMT-5"),
                             to = as.POSIXct("2013-06-13 13:00:00", tz = "Etc/GMT-5"),
                             by = 3600),
             tpe = filter(tpe, DateTime %in% datetime) %>% select(DewPointTemp) %>% unlist(),
             tpe2 = filter(tpe2, DateTime %in% datetime) %>% select(DewPointTemp) %>% unlist())
summary(abs(val$tpe - val$tpe2))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.10    0.80    0.90    1.04    1.20    3.10

There is actually a larger deviation between the DewPointTemp variable at the two stations, with differences of up to 3.1°C and an average of 1.0°C. This could influence the humidex results by a few points, but it is hard to predict, since the metric depends on the magnitude of the dew point temperature. I will leave the data as is. After all, I don’t want to end up fudging the numbers the wrong way.

For convenience, I’ll join the two station data sets between June 10 and 11.

tpe <- bind_rows(filter(tpe, DateTime < as.POSIXct("2013-06-12 00:00:00", tz = "Etc/GMT-5")),
                 filter(tpe2, DateTime >= as.POSIXct("2013-06-12 00:00:00", tz = "Etc/GMT-5"))) %>%
  arrange(DateTime)

As of the time of writing, I have fairly complete data up to July 20, 2018.

Calculating the humidex

Now for our humidex values.

Environment Canada seems to have calculated the humidex … sometimes. I can ensure that my equation is correct by performing a quick validation on their values.

val <- filter_at(tpe, vars(Temp, DewPointTemp, Humidex), all_vars(!is.na(.))) %>%
  mutate(hdx = hdx(Temp, DewPointTemp))
sum(abs(val$Humidex - val$hdx))
## [1] 0

Good, it seems that there is no difference between my calculations and theirs. Let’s fill in the missing values.

tpe <- mutate(tpe, Humidex = hdx(Temp, DewPointTemp))

Aggregating to daily

To generate daily data, Environment Canada uses 6 am UTC as the “start” of a new meteorological day. This is standard across the country. For this analysis, that means that the overnight temperatures after midnight but before 1 am tomorrow morning will still officially be “tonight”. This method isn’t necessarily optimal. We can take a look at an examples.

Let’s first take a look at the late evening on June 28 and the early morning hours of June 29, 2018.

tpe %>% filter(DateTime >= as.POSIXct("2018-06-28 22:00:00", tz = "Etc/GMT-5") &
                 DateTime <= as.POSIXct("2018-06-29 08:00:00", tz = "Etc/GMT-5"))
## # A tibble: 11 x 12
##    Station DateTime             Temp DewPointTemp RelHumidity WindDir
##      <int> <dttm>              <dbl>        <dbl>       <int>   <int>
##  1   51459 2018-06-28 22:00:00  22.9         16.7          68      28
##  2   51459 2018-06-28 23:00:00  22.6         17.2          71      27
##  3   51459 2018-06-29 00:00:00  21.2         17.2          78      27
##  4   51459 2018-06-29 01:00:00  19.9         16.8          82      25
##  5   51459 2018-06-29 02:00:00  19.9         17.3          85      29
##  6   51459 2018-06-29 03:00:00  20.1         17.8          86      29
##  7   51459 2018-06-29 04:00:00  20.3         18            86      30
##  8   51459 2018-06-29 05:00:00  19.4         16.5          83      36
##  9   51459 2018-06-29 06:00:00  21.1         17.2          78      30
## 10   51459 2018-06-29 07:00:00  23.5         17.5          69      25
## 11   51459 2018-06-29 08:00:00  25.9         16.8          57      27
## # ... with 6 more variables: WindSpeed <int>, Visibility <dbl>,
## #   Pressure <dbl>, Humidex <dbl>, WindChill <int>, Weather <chr>

As we can see, above, the temperatures overnight between June 28 and June 29 fell to as low as 19.4°C at 5 am. That means that we didn’t see weather that merited a heat alert.

Now on the other side, late evening on June 29 and the early morning on June 30, 2018.

tpe %>% filter(DateTime >= as.POSIXct("2018-06-29 22:00:00", tz = "Etc/GMT-5") &
                 DateTime <= as.POSIXct("2018-06-30 08:00:00", tz = "Etc/GMT-5"))
## # A tibble: 11 x 12
##    Station DateTime             Temp DewPointTemp RelHumidity WindDir
##      <int> <dttm>              <dbl>        <dbl>       <int>   <int>
##  1   51459 2018-06-29 22:00:00  26.1         18.1          60      21
##  2   51459 2018-06-29 23:00:00  25.4         18.4          65      19
##  3   51459 2018-06-30 00:00:00  24.5         18.6          69      19
##  4   51459 2018-06-30 01:00:00  24           18.2          70      22
##  5   51459 2018-06-30 02:00:00  23.7         18.1          70      20
##  6   51459 2018-06-30 03:00:00  22.4         17.8          75      25
##  7   51459 2018-06-30 04:00:00  22.2         18.5          79      24
##  8   51459 2018-06-30 05:00:00  23.2         18.8          76      18
##  9   51459 2018-06-30 06:00:00  23.6         19.3          77      22
## 10   51459 2018-06-30 07:00:00  24.1         19.6          76      13
## 11   51459 2018-06-30 08:00:00  25.5         20.8          75      19
## # ... with 6 more variables: WindSpeed <int>, Visibility <dbl>,
## #   Pressure <dbl>, Humidex <dbl>, WindChill <int>, Weather <chr>

Temperatures overnight from June 29 into June 30 did not fall below 22.2°C. Thus, given that it got as hot as 31°C on June 29, and the temperature didn’t fall below 20°C, that day should be classed as heat warning weather. If we choose 1 am as the cutoff time, however, the cooler period overnight from June 28 to June 29 means that June 29 did not meet the heat warning criteria.

When a heat warning is issued, it is issued for forecast temperatures. ECCC doesn’t say, “It is going to be hot tonight but since it was cool at 1 am last night, you have nothing to worry about”. We, likewise, live chronologically. What do I care that it was cool in the wee hours of the previous night if I am suffering now? This is where the so-called “climatological observing window” (COW) becomes important. See this great article on the subject by Žaknić-Ćatović and Gough (2018).

Given that I am trying to detect days when the temperatures got hot, and didn’t drop below 20°C overnight, I’m going to use the sun-based COWND proposed by Žaknić-Ćatović and Gough (2018), which I have implemented in the claut package (See: COW_ND()). COWND defines each climatological day as one hour past sunrise until one hour past sunrise the following calendar day. This way, we can avoid double-counting late-evening cooling on both sides of 1 am, and thus, avoid having a day shaved off a heat event due to temperatures from the night before. I’ll create a data set of the maximum daily temperature (\(T_{max}\)), minimum daily temperature (\(T_{min}\)), the average of those two values (\(T_{mean}\)), and the peak daily humidex value. I’ll also add a Boolean column to flag days that met the criteria for a heat warning.

tpe_dly <- COW_ND(dat = select(tpe, DateTime, Temp, Humidex), lat = 43.68, lon = -79.63, tz = -5) %>%
  select(-MinHumidex) %>%
  mutate(MeanTemp = round((MaxTemp + MinTemp)/2, 1),
         Criteria = (MaxTemp >= 31 & MinTemp >= 20) | MaxHumidex >= 40) %>%
  filter(Date >= "1988-06-01" & Date <= "2018-07-19")

Let’s take a look at the weather since late June, for instance.

tpe_dly %>% filter(Date >= "2018-06-28")
## # A tibble: 22 x 6
##    Date       MaxTemp MaxHumidex MinTemp MeanTemp Criteria
##    <date>       <dbl>      <dbl>   <dbl>    <dbl> <lgl>   
##  1 2018-06-28    29.4         34    19.4     24.4 FALSE   
##  2 2018-06-29    31           37    22.2     26.6 TRUE    
##  3 2018-06-30    34.8         46    24       29.4 TRUE    
##  4 2018-07-01    34.1         43    24.5     29.3 TRUE    
##  5 2018-07-02    33.1         43    19.3     26.2 TRUE    
##  6 2018-07-03    30.2         34    19.7     24.9 FALSE   
##  7 2018-07-04    32.7         39    23.4     28.1 TRUE    
##  8 2018-07-05    33.2         43    17.3     25.2 TRUE    
##  9 2018-07-06    22.8         24    NA       NA   FALSE   
## 10 2018-07-07    25.3         26    15.8     20.6 FALSE   
## # ... with 12 more rows

Analyzing heat events

I established a methodology of analyzing short, non-uniform periods when I looked at cold snaps back in January. My methodology hasn’t changed much since the winter, but I have wrapped it up in a cleaner function and shipped it in the claut package. See: describe_snaps() for the code. So, let’s take a look at some summertime heat warnings.

tpe_warns <- tpe_dly %>% 
  group_by(Year = year(Date)) %>%
  mutate(Start = as.Date(describe_snaps(rle(Criteria), 2, Date, min), origin = origin),
         End = as.Date(describe_snaps(rle(Criteria), 2, Date, max), origin = origin),
         Length = describe_snaps(rle(Criteria), 2)) %>%
  filter(!is.na(Length)) %>%
  group_by(Year, Start, End, Length) %>%
  summarize_all(mean) %>%
  select(-Criteria, -Date)

tpe_warns
## # A tibble: 68 x 8
## # Groups:   Year, Start, End [68]
##     Year Start      End        Length MaxTemp MaxHumidex MinTemp MeanTemp
##    <dbl> <date>     <date>      <int>   <dbl>      <dbl>   <dbl>    <dbl>
##  1  1988 1988-06-19 1988-06-20      2    32.7       37      18.1     25.4
##  2  1988 1988-07-07 1988-07-10      4    36.0       39.8    21.1     28.5
##  3  1988 1988-08-02 1988-08-05      4    32.5       42.5    23.6     28.1
##  4  1988 1988-08-12 1988-08-14      3    33.2       42      23.3     28.3
##  5  1991 1991-06-27 1991-06-28      2    32.8       38.5    22.0     27.4
##  6  1991 1991-07-17 1991-07-20      4    32.9       39.2    21.6     27.3
##  7  1993 1993-07-08 1993-07-09      2    31.8       41      21.2     26.6
##  8  1993 1993-08-26 1993-08-27      2    32.5       40      20.8     26.6
##  9  1994 1994-06-16 1994-06-18      3    33.5       41      21       27.2
## 10  1994 1994-07-05 1994-07-06      2    31.5       41      20.8     26.2
## # ... with 58 more rows

Since June 1988, there have been 68 weather events that merited a heat warning in Toronto. Of those, 27 should have been “extended warnings”. That leaves 41 events that brought exactly two-days of extra hot weather. The average length of heat wearning weather is 2.63 days. In the plot below, I show these heat warnings. The y-axis represents the average daily maximum humidex. The colour represents the average daily maximum temperature. The size of the points represents the length of the heat warning weather.

plot_ly(data = tpe_warns, x = ~Start, y = ~MaxHumidex, color = ~MaxTemp, size = ~Length,
        hoverinfo = "text", hovertext = paste("Length:", tpe_warns$Length,
                                              "days")) %>% add_markers() %>%
  layout(
    title = "Summertime heat warning weather in Toronto, 1988-2018 (YTD)",
    yaxis = list(title = "Average Daily Maximum Humidex"),
    xaxis = list(title = "Start Date",
                      range = c(min(tpe_warns$Start - 1000), max(tpe_warns$Start) + 1000)))
## Warning: `line.width` does not currently support multiple values.

Heat warning rankings

Let’s look at some top five lists.

Longest heat events

tpe_warns %>% arrange(desc(Length)) %>% head(5)
## # A tibble: 5 x 8
## # Groups:   Year, Start, End [5]
##    Year Start      End        Length MaxTemp MaxHumidex MinTemp MeanTemp
##   <dbl> <date>     <date>      <int>   <dbl>      <dbl>   <dbl>    <dbl>
## 1  1999 1999-07-22 1999-07-26      5    32.4       36.8    21.2     26.8
## 2  2001 2001-08-05 2001-08-09      5    35.1       41      NA       NA  
## 3  2005 2005-07-10 2005-07-14      5    33.5       39.2    22.9     28.2
## 4  2010 2010-07-04 2010-07-08      5    32.7       41.6    23.4     28.1
## 5  2016 2016-08-09 2016-08-13      5    32.6       39.6    22.9     27.7

Hottest heat events

tpe_warns %>% arrange(desc(MaxTemp)) %>% head(5)
## # A tibble: 5 x 8
## # Groups:   Year, Start, End [5]
##    Year Start      End        Length MaxTemp MaxHumidex MinTemp MeanTemp
##   <dbl> <date>     <date>      <int>   <dbl>      <dbl>   <dbl>    <dbl>
## 1  2011 2011-07-20 2011-07-21      2    36.4       44.5    24.5     30.4
## 2  1988 1988-07-07 1988-07-10      4    36.0       39.8    21.1     28.5
## 3  2001 2001-08-05 2001-08-09      5    35.1       41      NA       NA  
## 4  2011 2011-07-17 2011-07-18      2    34.6       41.5    23.6     29.1
## 5  1995 1995-07-13 1995-07-15      3    34.4       44.3    21.7     28.0

Least relief

tpe_warns %>% arrange(desc(MinTemp)) %>% head(5)
## # A tibble: 5 x 8
## # Groups:   Year, Start, End [5]
##    Year Start      End        Length MaxTemp MaxHumidex MinTemp MeanTemp
##   <dbl> <date>     <date>      <int>   <dbl>      <dbl>   <dbl>    <dbl>
## 1  2006 2006-07-31 2006-08-02      3    34.4       45.7    25.0     29.7
## 2  2011 2011-07-20 2011-07-21      2    36.4       44.5    24.5     30.4
## 3  2002 2002-07-31 2002-08-01      2    33.3       40      24.3     28.8
## 4  2013 2013-07-16 2013-07-19      4    33.5       42.8    23.9     28.7
## 5  2002 2002-06-30 2002-07-03      4    34.0       41.8    23.8     28.8

Highest Humidex

tpe_warns %>% arrange(desc(MaxHumidex)) %>% head(5)
## # A tibble: 5 x 8
## # Groups:   Year, Start, End [5]
##    Year Start      End        Length MaxTemp MaxHumidex MinTemp MeanTemp
##   <dbl> <date>     <date>      <int>   <dbl>      <dbl>   <dbl>    <dbl>
## 1  2006 2006-07-31 2006-08-02      3    34.4       45.7    25.0     29.7
## 2  2011 2011-07-20 2011-07-21      2    36.4       44.5    24.5     30.4
## 3  1995 1995-07-13 1995-07-15      3    34.4       44.3    21.7     28.0
## 4  2002 2002-07-28 2002-07-29      2    32.1       43.5    21.6     26.8
## 5  1999 1999-07-04 1999-07-06      3    33.6       43.3    23       28.3

We can also summarize these events by year. For the temperature and humidex, I’ll weight the averages by the heat alert length.

summ <- left_join(tpe_warns %>% group_by(Year) %>% count(),
                  tpe_warns %>% group_by(Year) %>%
                    summarize(AvgLen = mean(Length),
                              AvgTmax = weighted.mean(MaxTemp, Length),
                              AvgTmax = weighted.mean(MinTemp, Length),
                              AvgTmean = weighted.mean(MeanTemp, Length),
                              AvgMaxHumidex = weighted.mean(MaxHumidex, Length)),
                  by = "Year")

print(summ, n = Inf)
## # A tibble: 25 x 6
## # Groups:   Year [?]
##     Year     n AvgLen AvgTmax AvgTmean AvgMaxHumidex
##    <dbl> <int>  <dbl>   <dbl>    <dbl>         <dbl>
##  1  1988     4   3.25    21.9     27.8          40.7
##  2  1991     2   3       21.8     27.3          39  
##  3  1993     2   2       21.0     26.6          40.5
##  4  1994     2   2.5     20.9     26.8          41  
##  5  1995     3   2.67    21.4     27.4          40.4
##  6  1997     1   2       21.4     27.0          39  
##  7  1998     3   2       21.3     26.4          38.3
##  8  1999     4   3.25    21.8     27.3          39.6
##  9  2001     3   3       NA       NA            39.9
## 10  2002     6   2.5     22.7     27.9          39.9
## 11  2003     3   2.67    20.7     26.4          38.5
## 12  2004     1   2       19.3     25.2          39  
## 13  2005     4   2.75    22.2     27.8          40.1
## 14  2006     3   2.67    23.0     28.2          41.9
## 15  2007     3   2.67    22.0     27.6          40  
## 16  2008     1   2       19.4     25.9          39.5
## 17  2009     1   2       19.9     24.6          40.5
## 18  2010     3   3       22.7     27.7          41.3
## 19  2011     4   2       23.0     28.2          40.6
## 20  2012     2   3       22.2     27.9          40.5
## 21  2013     2   3       23.3     28.1          41.8
## 22  2015     2   2       20.3     25.9          38.5
## 23  2016     5   2.8     21.5     27.1          38.5
## 24  2017     1   2       21.6     26.8          38.5
## 25  2018     3   2.67    21.4     27.2          41.5

According to this summary, we have already seen heat warning weather three times in 2018, which is above the average of 2.72. If this weather keeps up, we might beat the record of 6 heat events in 2002. Let’s take a look on a plot. I have added an extra variable, n, so I’ll have to discard one of my climate variables. In the below plot, the y-axis represents the number of heat waves per year, the size of the point is the average length, and the colour is the average daily maximum humidex during heat warnings.

plot_ly(data = summ, x = ~Year, y = ~n, color = ~AvgMaxHumidex, size = ~AvgLen,
        hoverinfo = "text", hovertext = paste(summ$n,
                                              "event(s) with mean duration of",
                                              round(summ$AvgLen, 1),
                                              "days")) %>% add_markers() %>%
  layout(
    title = "Annual summertime heat warning weather in Toronto, 1988-2018 (YTD)",
    yaxis = list(title = "Number of heat warnings"))
## Warning: `line.width` does not currently support multiple values.

What about 2018?

If we aren’t seeing increasing trends in the last 30 years, and the warm events so far this year have been relatively short, what was all the fuss about this month? Was it just a case of our memory biasing more recent events? Do we Canadians just love to complain about the weather.1 Not exactly. I think we found the heatwave in July to be particularly unbearable because of the lack of relief. The first ECCC weather alert (which someone has, thankfully, archived here) came out on June 26, forecasting extreme weather from June 28 or June 29 through to July 2. Their forecast was spot on. On July 3, another alert was issued for extreme weather through to Thursday July 5. ECCC called it “the most significant heat event in the past few years”. Indeed, two of the three heat events so far this year were June 29 to July 2, and July 4 to July 5. July 3 saw temperatures above 30 degrees, but missed the \(T_{max}\) threshold by 0.8°C and the \(T_{min}\) threshold by just three tenths of a degree Celsius. We can tweak our criteria a little. Let’s say, “if the two days before were super hot, and the two days after were super hot, then it was damn hot!”

tpe_dly <- tpe_dly %>%
  mutate(Criteria = Criteria | 
           (lead(Criteria, 2) & lead(Criteria) & lag(Criteria) & lag(Criteria, 2)))

tpe_warns_lax <- tpe_dly %>% 
  group_by(Year = year(Date)) %>%
  mutate(Start = as.Date(describe_snaps(rle(Criteria), 2, Date, min),
                         origin = origin),
         End = as.Date(describe_snaps(rle(Criteria), 2, Date, max),
                         origin = origin),
         Length = describe_snaps(rle(Criteria), 2)) %>%
  filter(!is.na(Length)) %>%
  group_by(Year, Start, End, Length) %>%
  summarize_all(mean) %>%
  select(-Criteria, -Date)

Now, let’s take a look at the length of the top 3 events.

tpe_warns_lax %>% arrange(desc(Length)) %>% head(3)
## # A tibble: 3 x 8
## # Groups:   Year, Start, End [3]
##    Year Start      End        Length MaxTemp MaxHumidex MinTemp MeanTemp
##   <dbl> <date>     <date>      <int>   <dbl>      <dbl>   <dbl>    <dbl>
## 1  2018 2018-06-29 2018-07-05      7    32.7       40.7    21.5     27.1
## 2  1999 1999-07-22 1999-07-26      5    32.4       36.8    21.2     26.8
## 3  2001 2001-08-05 2001-08-09      5    35.1       41      NA       NA

There you have it! The Canada Day 2018 heat event was the longest in the past 30 years, if we leave room for one brief moment of “relief”.

Origin of air during heat events

So, where does this hot weather come from? I think this goes without saying, but it certainly isn’t polar air in the city when we’re in the middle of a heat wave. Indeed, ECCC’s heat warnings often contain language like “a passing warm front” or, like the June 26 warning linked to above, “hot, humid air from the Gulf of Mexico”. So, is it always warm, tropical air that causes us to feel the heat? Let’s look at the SSC2 air masses by Sheridan (2002) that I’ve used in my previous posts.

I will bin these values, as I have done in the past, but I will split the tropical air into dry and moist bins, which were less important for winter weather, but are useful here.

  1. ‘cool’: SSC codes 2 (dry polar) and 5 (moist polar)
  2. ‘mod’: SSC codes 1 (dry moderate) and 4 (moist moderate)
  3. ‘dt’: SSC code 3 (dry tropical)
  4. ‘mt’: SSC codes 6 (moist tropical), 66 (moist tropical plus), and 67 (moist tropical double plus)
  5. ‘trans’: SCC code 7 (transition)
f <- tempfile()
download.file("http://sheridan.geog.kent.edu/ssc/files/YYZ.dbdmt", f, quiet = TRUE)
air <- read_table(f, col_names = c("Station", "Date", "Air"), col_types = cols(col_character(), col_date(format = "%Y%m%d"), col_factor(levels = c(1, 2, 3, 4, 5, 6, 7, 66, 67))))
levels(air$Air) <- list(cool = c(2, 5), mod = c(1, 4), dt = 3, mt = c(6, 66, 67), trans = "7")

I will add these air mass codes to the daily data set, and determine the proportion of each air mass type during heat warning weather.

tpe_air <- left_join(tpe_dly, select(air, -Station), by = "Date") %>%
  filter(Criteria) %>%
  group_by(Air) %>%
  count() %>%
  mutate(prop = n/sum(tpe_dly$Criteria, na.rm = TRUE)) %>%
  select(-n) %>%
  ungroup %>%
  spread(Air, prop, fill = 0)

tpe_air
## # A tibble: 1 x 5
##     mod    dt    mt trans `<NA>`
##   <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 0.012 0.472  0.46 0.048  0.008

It’s true! When we experience super hot weather in the city, it is due to the presence of a warm front of hot, tropical air over the city. Of all the summer days where the criteria for a heat warning were met, none showed air of polar origin, and only 1.2% showed continental, “moderate” air. Indeed, a full 93.2% of days that merited a heat warning showed air of tropical origin (47.2% dry tropical and 46% moist tropical). That’s right. Toronto, at 43°N latitude, is affected by air from the tropics! No need to visit the Caribbean to get that clima caribeño!

Outro

So, that’s that. A brief exploration of the kind of weather that provokes heat warnings in Toronto. So remember, next time you are sweating buckets and wishing the heat would just go away, head over to Toronto Island and pretend you’re in some tropical paradise. If that doesn’t work, wait a minute. It’ll be winter in no time!

References

Anderson, Conor I., and William A. Gough. 2017. “Evolution of Winter Temperature in Toronto, Ontario, Canada: A Case Study of Winters 2013/14 and 2014/15.” Journal of Climate 30 (14): 5361–76. https://doi.org/10.1175/JCLI-D-16-0562.1.

Anderson, Conor I., William A. Gough, and Tanzina Mohsin. 2018. “Characterization of the Urban Heat Island at Toronto: Revisiting the Choice of Rural Sites Using a Measure of Day-to-Day Variation.” Urban Climate 25 (September): 187–95. https://doi.org/10.1016/J.UCLIM.2018.07.002.

Mohsin, Tanzina, and William A. Gough. 2012. “Characterization and Estimation of Urban Heat Island at Toronto: Impact of the Choice of Rural Sites.” Theoretical and Applied Climatology 108 (April): 105–17. https://doi.org/10.1007/s00704-011-0516-7.

Sheridan, Scott C. 2002. “The Redevelopment of a Weather-Type Classification Scheme for North America.” International Journal of Climatology 22 (1): 51–68. https://doi.org/10.1002/joc.709.

Žaknić-Ćatović, Ana, and William A. Gough. 2018. “A Comparison of Climatological Observing Windows and Their Impact on Detecting Daily Temperature Extrema.” Theoretical and Applied Climatology 132 (1-2): 41–54. https://doi.org/10.1007/S00704-017-2068-Y.


  1. Yes! Of course!

Related

comments powered by Disqus