This package makes it easier to search for and download multiple months or years of historical weather data from the Environment and Climate Change Canada (ECCC) website.
Bear in mind that these downloads can be fairly large, and performing multiple downloads may use up ECCC's bandwidth unnecessarily. Try to stick to what you need.
For more details and tutorials, check out the weathercan website.
Use the devtools package to install R packages directly from GitHub:
install.packages("devtools") # If not already installed
devtools::install_github("ropensci/weathercan")
To build the vignettes (tutorials) locally, use:
devtools::install_github("ropensci/weathercan", build_vignettes = TRUE)
View the available vignettes with vignette(package = "weathercan")
View a particular vignette with, for example, vignette("weathercan", package = "weathercan")
To download data, you first need to know the station_id associated with the station you're interested in.
weathercan includes a data frame called stations, which lists the stations and their details (including station_id).
head(stations)
## # A tibble: 6 x 12
## prov station_name station_id clima… WMO_id TC_id lat lon elev inte… start end
## <fctr> <chr> <fctr> <fctr> <fctr> <fct> <dbl> <dbl> <dbl> <chr> <int> <int>
## 1 BC ACTIVE PASS 14 10100… <NA> <NA> 48.9 -123 4.00 hour NA NA
## 2 BC ALBERT HEAD 15 10102… <NA> <NA> 48.4 -123 17.0 hour NA NA
## 3 BC BAMBERTON OCEAN CEMENT 16 10105… <NA> <NA> 48.6 -124 85.3 hour NA NA
## 4 BC BEAR CREEK 17 10107… <NA> <NA> 48.5 -124 350 hour NA NA
## 5 BC BEAVER LAKE 18 10107… <NA> <NA> 48.5 -123 61.0 hour NA NA
## 6 BC BECHER BAY 19 10107… <NA> <NA> 48.3 -124 12.2 hour NA NA
glimpse(stations)
## Observations: 26,232
## Variables: 12
## $ prov <fctr> BC, BC, BC, BC, BC, BC, BC, BC, BC, BC, BC, BC, BC, BC, BC, BC, BC, BC, B...
## $ station_name <chr> "ACTIVE PASS", "ALBERT HEAD", "BAMBERTON OCEAN CEMENT", "BEAR CREEK", "BEA...
## $ station_id <fctr> 14, 15, 16, 17, 18, 19, 20, 21, 22, 25, 24, 23, 26, 27, 28, 29, 30, 31, 3...
## $ climate_id <fctr> 1010066, 1010235, 1010595, 1010720, 1010774, 1010780, 1010960, 1010961, 1...
## $ WMO_id <fctr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
## $ TC_id <fctr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
## $ lat <dbl> 48.87, 48.40, 48.58, 48.50, 48.50, 48.33, 48.60, 48.57, 48.57, 48.58, 48.5...
## $ lon <dbl> -123.28, -123.48, -123.52, -124.00, -123.35, -123.63, -123.47, -123.45, -1...
## $ elev <dbl> 4.00, 17.00, 85.30, 350.50, 61.00, 12.20, 38.00, 30.50, 91.40, 53.30, 38.0...
## $ interval <chr> "hour", "hour", "hour", "hour", "hour", "hour", "hour", "hour", "hour", "h...
## $ start <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ end <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
You can look through this data frame directly, or you can use the stations_search function:
stations_search("Kamloops", interval = "hour")
## # A tibble: 3 x 12
## prov station_name station_id climate_id WMO_id TC_id lat lon elev interval start end
## <fctr> <chr> <fctr> <fctr> <fctr> <fctr> <dbl> <dbl> <dbl> <chr> <int> <int>
## 1 BC KAMLOOPS A 1275 1163780 71887 YKA 50.7 -120 345 hour 1953 2013
## 2 BC KAMLOOPS A 51423 1163781 71887 YKA 50.7 -120 345 hour 2013 2018
## 3 BC KAMLOOPS AUT 42203 1163842 71741 ZKA 50.7 -120 345 hour 2006 2018
The interval (time frame) must be one of "hour", "day", or "month".
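As for looking through the stations data frame directly, one option is plain subsetting. The sketch below is only an illustration: stations_in_year is a hypothetical helper (not part of weathercan), and the toy data frame copies three rows from the stations_search output above so that the example runs without the package or a network connection.

```r
# Hypothetical helper (not part of weathercan): keep stations whose name
# matches `pattern`, whose interval matches, and whose record covers `year`.
stations_in_year <- function(stn, pattern, year, interval = "hour") {
  keep <- grepl(pattern, stn$station_name, ignore.case = TRUE) &
    stn$interval == interval &
    !is.na(stn$start) & !is.na(stn$end) &
    stn$start <= year & stn$end >= year
  # which() drops any NA left in the logical index
  stn[which(keep), , drop = FALSE]
}

# Toy data: three rows copied from the stations_search() output above
toy <- data.frame(
  station_name = c("KAMLOOPS A", "KAMLOOPS A", "KAMLOOPS AUT"),
  station_id = c(1275, 51423, 42203),
  interval = "hour",
  start = c(1953L, 2013L, 2006L),
  end = c(2013L, 2018L, 2018L),
  stringsAsFactors = FALSE
)

active_2016 <- stations_in_year(toy, "kamloops", 2016)
print(active_2016$station_id)
```

With the package loaded, stations_in_year(stations, "kamloops", 2016) would apply the same filter to the full data frame, assuming it has the columns shown above.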
You can also search by proximity:
stations_search(coords = c(50.667492, -120.329049), dist = 20, interval = "hour")
## # A tibble: 3 x 13
## prov station_name station_id climate_id WMO_id TC_id lat lon elev inte… start end dist…
## <fctr> <chr> <fctr> <fctr> <fctr> <fctr> <dbl> <dbl> <dbl> <chr> <int> <int> <dbl>
## 1 BC KAMLOOPS A 1275 1163780 71887 YKA 50.7 -120 345 hour 1953 2013 8.64
## 2 BC KAMLOOPS AUT 42203 1163842 71741 ZKA 50.7 -120 345 hour 2006 2018 8.64
## 3 BC KAMLOOPS A 51423 1163781 71887 YKA 50.7 -120 345 hour 2013 2018 9.28
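The last column of this output gives each station's distance from the supplied coordinates, presumably in kilometres given that dist = 20 is the search radius. How weathercan computes this internally is not stated here. As a rough illustration only, the sketch below uses the haversine great-circle formula, which is an assumption on our part; haversine_km is a hypothetical helper and 6371 km is the commonly used mean Earth radius.

```r
# Hypothetical helper (not part of weathercan): great-circle distance in km
# between two points given in decimal degrees, using the haversine formula.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing `a` slightly above 1
  2 * radius_km * asin(sqrt(pmin(1, a)))
}

# One degree of longitude along the equator
one_degree <- haversine_km(0, 0, 0, 1)
print(one_degree)
```

Because the function is vectorised, passing the lat and lon columns of a stations table as the second pair of arguments would return one distance per station.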
Once you have your station_id(s), you can download weather data:
kam <- weather_dl(station_ids = 51423, start = "2016-01-01", end = "2016-02-15")
kam
## # A tibble: 1,104 x 35
## stat… stat… prov lat lon elev clim… WMO_… TC_id date time year month
## * <chr> <dbl> <fct> <dbl> <dbl> <dbl> <chr> <chr> <chr> <date> <dttm> <chr> <chr>
## 1 KAML… 51423 BC 50.7 -120 345 1163… 71887 YKA 2016-01-01 2016-01-01 00:00:00 2016 01
## 2 KAML… 51423 BC 50.7 -120 345 1163… 71887 YKA 2016-01-01 2016-01-01 01:00:00 2016 01
## 3 KAML… 51423 BC 50.7 -120 345 1163… 71887 YKA 2016-01-01 2016-01-01 02:00:00 2016 01
## 4 KAML… 51423 BC 50.7 -120 345 1163… 71887 YKA 2016-01-01 2016-01-01 03:00:00 2016 01
## 5 KAML… 51423 BC 50.7 -120 345 1163… 71887 YKA 2016-01-01 2016-01-01 04:00:00 2016 01
## 6 KAML… 51423 BC 50.7 -120 345 1163… 71887 YKA 2016-01-01 2016-01-01 05:00:00 2016 01
## 7 KAML… 51423 BC 50.7 -120 345 1163… 71887 YKA 2016-01-01 2016-01-01 06:00:00 2016 01
## 8 KAML… 51423 BC 50.7 -120 345 1163… 71887 YKA 2016-01-01 2016-01-01 07:00:00 2016 01
## 9 KAML… 51423 BC 50.7 -120 345 1163… 71887 YKA 2016-01-01 2016-01-01 08:00:00 2016 01
## 10 KAML… 51423 BC 50.7 -120 345 1163… 71887 YKA 2016-01-01 2016-01-01 09:00:00 2016 01
## # ... with 1,094 more rows, and 22 more variables
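In keeping with the earlier note about ECCC's bandwidth, it may be worth saving a download to disk so that re-running a script does not fetch the same data again. The following is a minimal sketch using base R's saveRDS and readRDS; cached and fake_download are hypothetical names introduced for illustration, and fake_download stands in for a real weather_dl call so the example runs offline.

```r
# Hypothetical helper: return the object cached at `path` if it exists,
# otherwise call `fetch()`, save its result to `path`, and return it.
cached <- function(path, fetch) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- fetch()
  saveRDS(result, path)
  result
}

# Stand-in for a real download, with made-up values
calls <- 0
fake_download <- function() {
  calls <<- calls + 1
  data.frame(temp = c(-2.1, -1.8))
}

path <- tempfile(fileext = ".rds")
first <- cached(path, fake_download)   # runs fake_download() and saves
second <- cached(path, fake_download)  # reads from disk, no second call
print(calls)
```

In real use, the second argument would wrap the download itself, for example function() weather_dl(station_ids = 51423, start = "2016-01-01", end = "2016-02-15"), with a file path of your choosing.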
You can also download data from multiple stations at once:
kam_pg <- weather_dl(station_ids = c(48248, 51423), start = "2016-01-01", end = "2016-02-15")
kam_pg
## # A tibble: 2,208 x 35
## stat… stat… prov lat lon elev clim… WMO_… TC_id date time year month
## * <chr> <dbl> <fct> <dbl> <dbl> <dbl> <chr> <chr> <chr> <date> <dttm> <chr> <chr>
## 1 PRIN… 48248 BC 53.9 -123 680 1096… 71302 VXS 2016-01-01 2016-01-01 00:00:00 2016 01
## 2 PRIN… 48248 BC 53.9 -123 680 1096… 71302 VXS 2016-01-01 2016-01-01 01:00:00 2016 01
## 3 PRIN… 48248 BC 53.9 -123 680 1096… 71302 VXS 2016-01-01 2016-01-01 02:00:00 2016 01
## 4 PRIN… 48248 BC 53.9 -123 680 1096… 71302 VXS 2016-01-01 2016-01-01 03:00:00 2016 01
## 5 PRIN… 48248 BC 53.9 -123 680 1096… 71302 VXS 2016-01-01 2016-01-01 04:00:00 2016 01
## 6 PRIN… 48248 BC 53.9 -123 680 1096… 71302 VXS 2016-01-01 2016-01-01 05:00:00 2016 01
## 7 PRIN… 48248 BC 53.9 -123 680 1096… 71302 VXS 2016-01-01 2016-01-01 06:00:00 2016 01
## 8 PRIN… 48248 BC 53.9 -123 680 1096… 71302 VXS 2016-01-01 2016-01-01 07:00:00 2016 01
## 9 PRIN… 48248 BC 53.9 -123 680 1096… 71302 VXS 2016-01-01 2016-01-01 08:00:00 2016 01
## 10 PRIN… 48248 BC 53.9 -123 680 1096… 71302 VXS 2016-01-01 2016-01-01 09:00:00 2016 01
## # ... with 2,198 more rows, and 22 more variables
And plot it:
library(ggplot2)
ggplot(data = kam_pg, aes(x = time, y = temp, group = station_name, colour = station_name)) +
  theme_minimal() +
  geom_line()
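Hourly lines can get noisy over longer periods, so it may help to summarise the data before plotting. The sketch below computes daily mean temperatures per station with base R's aggregate. It assumes the columns station_name, date and temp seen above; the toy data frame, its station names and its temperature values are made up so that the example runs on its own.

```r
# Toy hourly data with made-up values; columns mirror a subset of the
# weather_dl() output (station_name, date, temp)
toy <- data.frame(
  station_name = rep(c("STATION A", "STATION B"), each = 4),
  date = as.Date(rep(c("2016-01-01", "2016-01-01",
                       "2016-01-02", "2016-01-02"), times = 2)),
  temp = c(-4, -2, NA, 1, -10, -8, -6, -6),
  stringsAsFactors = FALSE
)

# Mean temperature per station and day; rows with a missing temp are
# dropped by the formula interface's default na.action (na.omit)
daily <- aggregate(temp ~ station_name + date, data = toy, FUN = mean)
print(daily)
```

Running the same aggregate call with data = kam_pg should give a table that can be passed to ggplot with x = date instead of x = time.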
citation("weathercan")
##
## To cite 'weathercan' in publications, please use:
##
## LaZerte, Stefanie E and Sam Albers (2018). weathercan: Download and format weather data
## from Environment and Climate Change Canada. The Journal of Open Source Software
## 3(22):571. doi:10.21105/joss.00571.
##
## A BibTeX entry for LaTeX users is
##
## @Article{,
## title = {{weathercan}: {D}ownload and format weather data from Environment and Climate Change Canada},
## author = {Stefanie E LaZerte and Sam Albers},
## journal = {The Journal of Open Source Software},
## volume = {3},
## number = {22},
## pages = {571},
## year = {2018},
## url = {http://joss.theoj.org/papers/10.21105/joss.00571},
## }
The data and the code in this repository are licensed under multiple licences. All code is licensed under GPL-3. All weather data is licensed under the Open Government Licence - Canada.
weathercan and rclimateca were developed at roughly the same time and, as a result, both present up-to-date methods for accessing and downloading data from ECCC. The largest differences between the two packages are: a) weathercan includes functions for interpolating weather data and directly integrating it into other data sources; b) weathercan actively seeks to apply tidy data principles in R and integrates well with the tidyverse, including using tibbles and nested listcols; c) rclimateca contains arguments for specifying short vs. long data formats; d) rclimateca has the option of formatting data in the MUData format using the mudata2 package by the same author.
CHCN is an older package last updated in 2012. Unfortunately, ECCC updated its services within the last couple of years, which caused a great many of the previous web scrapers to fail. CHCN relies on one of these older web scrapers and so is currently broken.
We welcome any and all contributions! To make the process as painless as possible for all involved, please see our guide to contributing.
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.