Comments (4)
Hi @Armin-RS
The error message is not very informative, but I believe there's something wrong with the file for May 2021.
I downloaded the same file on my laptops (Mac, Linux) and the message is the same: the file is corrupted. However, it works in a browser and in the system shell, so the problem seems to lie in how R tries to extract this file on Linux/Unix systems. I think it might be related to the zlib library, which is used by the unzip function on non-Windows systems:
"The internal C code uses zlib and is in particular based on the contributed minizip application in the zlib sources (from https://zlib.net/)"
What I can recommend is to:
- (1) check the same command on Windows
- (2) slightly more complicated: overwrite R's unzip
Solution for no. 2:
# first - modify the default unzip:
unzip = function (zipfile, files = NULL, list = FALSE, overwrite = TRUE,
                  junkpaths = FALSE, exdir = ".", unzip = "internal", setTimes = FALSE) {
  if (identical(unzip, "internal") && !list) {
    # extraction with the default method: call the system unzip instead of
    # R's internal code; files, overwrite and junkpaths are ignored here
    if (!missing(exdir))
      dir.create(exdir, showWarnings = FALSE, recursive = TRUE)
    # quote the paths so that spaces do not break the shell command
    system(paste("/usr/bin/unzip -o", shQuote(path.expand(zipfile)),
                 "-d", shQuote(exdir)))
    invisible(NULL)
  }
  else {
    # listing with the default method also goes through the system unzip
    if (identical(unzip, "internal"))
      unzip <- "/usr/bin/unzip"
    WINDOWS <- .Platform$OS.type == "windows"
    if (!is.character(unzip) || length(unzip) != 1L || !nzchar(unzip))
      stop("'unzip' must be a single character string")
    zipfile <- path.expand(zipfile)
    if (list) {
      res <- if (WINDOWS)
        system2(unzip, c("-ql", shQuote(zipfile)), stdout = TRUE)
      else system2(unzip, c("-ql", shQuote(zipfile)), stdout = TRUE,
                   env = c("TZ=UTC"))
      l <- length(res)
      res2 <- res[-c(2, l - 1, l)]
      res3 <- gsub(" *([^ ]+) +([^ ]+) +([^ ]+) +(.*)",
                   "\\1 \\2 \\3 \"\\4\"", res2)
      con <- textConnection(res3)
      on.exit(close(con))
      z <- read.table(con, header = TRUE, as.is = TRUE)
      dt <- paste(z$Date, z$Time)
      formats <- if (max(nchar(z$Date) > 8))
        c("%Y-%m-%d", "%d-%m-%Y", "%m-%d-%Y")
      else c("%m-%d-%y", "%d-%m-%y", "%y-%m-%d")
      slash <- any(grepl("/", z$Date))
      if (slash)
        formats <- gsub("-", "/", formats, fixed = TRUE)
      formats <- paste(formats, "%H:%M")
      for (f in formats) {
        zz <- as.POSIXct(dt, tz = "UTC", format = f)
        if (all(!is.na(zz)))
          break
      }
      z[, "Date"] <- zz
      z[c("Name", "Length", "Date")]
    }
    else {
      args <- character()
      if (junkpaths)
        args <- c(args, "-j")
      if (overwrite)
        args <- c(args, "-oq", shQuote(zipfile))
      else args <- c(args, "-nq", shQuote(zipfile))
      if (length(files))
        args <- c(args, shQuote(files))
      if (exdir != ".")
        args <- c(args, "-d", shQuote(exdir))
      if (WINDOWS)
        system2(unzip, args, stdout = NULL, stderr = NULL,
                invisible = TRUE)
      else system2(unzip, args, stdout = NULL, stderr = NULL)
      invisible(NULL)
    }
  }
}
# overwrite the unzip in utils package:
assignInNamespace("unzip", unzip, ns = "utils")
# activate climate with the newer unzip command:
library(climate)
m = meteo_imgw(interval = "daily", rank = "precip", year = 2021,
               coords = TRUE, status = TRUE, col_names = "full")
from climate.
Hi @bczernecki,
thank you for your very detailed reply!
I tried the same command on a Windows 10 computer and it failed with the same error message.
I also believe that there is something wrong with these files, because when I extract the June 2021 file on the Windows computer, it has fewer than 5000 lines, compared to the typical 11000 for most other months.
For now, I worked around the problem by manually downloading, extracting and reformatting the 2021 precipitation files.
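For reference, the extraction step of such a manual workaround could look roughly like the sketch below. It assumes a system `unzip` binary is available on the PATH; `extract_and_count()` is a hypothetical helper (not part of climate) that extracts an already downloaded archive and reports the number of lines in each extracted file, which makes it easy to compare months.

```r
# Hypothetical helper: extract a ZIP with the system unzip and count the
# lines of every extracted file. Assumes an `unzip` binary is on the PATH.
extract_and_count <- function(zipfile, exdir = tempfile("unzipped_")) {
  dir.create(exdir, showWarnings = FALSE, recursive = TRUE)
  status <- system2("unzip",
                    c("-o", "-q", shQuote(path.expand(zipfile)), "-d", shQuote(exdir)),
                    stdout = FALSE, stderr = FALSE)
  # The system unzip may still extract usable data despite reporting a problem
  if (status != 0)
    warning("unzip exited with status ", status)
  files <- list.files(exdir, full.names = TRUE, recursive = TRUE)
  # Named integer vector: one line count per extracted file
  vapply(files, function(f) length(readLines(f, warn = FALSE)), integer(1))
}
```

Calling `extract_and_count("2021_05_o.zip")` on a downloaded archive would return one line count per file inside it.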
Is there anyone at IMGW whom I could notify about the corrupt files?
Thanks again and have a nice weekend,
Armin
The solution that I provided above returned over 11k rows for May 2021, so it seems to be working fine. Here's the output for everything that is available so far for 2021: 2021.xlsx
I have never encountered this kind of situation before, but you can try contacting IMGW via the official form: https://imgw.pl/kontakt/reklamacje (it is in Polish, but writing in English shouldn't be a problem). I can also ask some of my friends who work there who the right contact person would be.
Further investigation (with "zip -F --out fixed.zip 2021_05_o.zip") showed that the CRC checksum of the ZIP archive is wrong. That's probably why R does not want to decompress the file.
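A similar integrity check can be scripted from R. The sketch below assumes Info-ZIP's `unzip` is on the PATH and relies on its test mode (`-t`) exiting with a non-zero status when an archive fails verification; `zip_is_intact()` is a hypothetical helper name, not part of R or climate.

```r
# Hypothetical helper: TRUE if `unzip -t` reports no errors for the archive.
# Assumes Info-ZIP's `unzip` is on the PATH.
zip_is_intact <- function(zipfile) {
  status <- system2("unzip", c("-tq", shQuote(path.expand(zipfile))),
                    stdout = FALSE, stderr = FALSE)
  # Exit status 0 means every entry passed the test
  identical(as.integer(status), 0L)
}
```

Running it over all downloaded monthly archives, for example with `vapply(zip_paths, zip_is_intact, logical(1))`, would show which files fail the check before any data is read.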
You are right, the May 2021 file has 11k rows. But the June 2021 file has only 4.7k rows, which I find a bit odd: it would mean that many stations suddenly stopped reporting in June.
Anyway, as this is not a bug in the "climate" package, I will close this issue.