Comments (6)
Thanks for the report! While I agree that this is a poor error message, this error isn't really under ggplot2's control as it is thrown from {grid}. Unrelated to linetype, I'm not sure that doing (computationally 'expensive') checks on input data is the best idea as it can hamper some performance.
from ggplot2.
Thanks for clarifying. I understand better now why the error is thrown, seems like dash
is getting passed as a hex string to grid, and of course s
and h
are not valid hex.
Related to performance, I'm not a maintainer of a huge codebase used by 100k's of people, so I don't understand everyone's use case. However I fail to see how input validation could impose a substantial performance cost. I'll venture that the time it took me to understand this error was several orders of magnitude greater than the runtime of e.g. linetype %in% c("dashed", "dot" ...) | str_detect(linetype, "^[0-9a-f]+$")
...
If performance is a concern (e.g. for folks making 1000's of plots), why not include a validate_input
argument that defaults to true?
from ggplot2.
Please don't take the following personally, I do really empathise with the issue of encountering bad error messages. However, I have some remarks about why I think ggplot2 shouldn't validate all data input.
I'll venture that the time it took me to understand this error was several orders of magnitude greater
Yes, but the more relevant comparison is whether the runtime of the checks in all the plots that you ever made that didn't throw warnings exceeds the time it took for you to understand this error.
However I fail to see how input validation could impose a substantial performance cost.
For the sake of argument, let's say ggplot implements a validity check for linetype
. Next, people are going to ask ggplot to also do a validity check for colour
. Then, shape
, or maybe the interaction between shape
and fill
. So the ask to do input validation for linetype
can easily escalate to do many input validation steps.
Aside from the escalation argument, it should be noted that {grid} already does validity checks (as evident by the errors you've encountered). I appreciate a good error message as much as the next person, but I don't think ggplot2 should reimplement hairy parts of grid to throw friendlier messages. It might be more appropriate to petition {grid} directly to improve the messages.
Now any one validation step might not be expensive in 99% of plots, but let me illustrate the following.
The function below checks whether every value in a vector is a valid colour that {grid} understands.
# perhaps there is a faster way, but this is what I got at the moment
is_color <- function(x) {
grepl("^#(([[:xdigit:]]{2}){3,4}|([[:xdigit:]]){3,4})$", x) |
x %in% grDevices::colours()
}
In some fields, it is not uncommon to plot giant scatterplots with millions of points. To check if a million values are valid colours takes ~100ms.
input <- sample(c(colours(), NA, "foobar"), 1e6, replace = TRUE)
bench::mark(is_color(input))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 is_color(input) 116ms 116ms 8.62 22.9MB 43.1
However, the rest of the plumbing in making a plot (here represented by a small scatterplot) only takes ~35ms.
So relative to the time it takes to render a plot, the check is 'expensive'.
library(ggplot2)
p <- ggplot(mpg, aes(displ, hwy)) +
geom_point()
tmp <- tempfile(fileext = ".pdf")
pdf(tmp)
bench::mark(print(p))
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 print(p) 33.2ms 35.3ms 28.2 6.76MB 43.2
dev.off()
#> png
#> 2
Created on 2024-04-25 with reprex v2.1.0
from ggplot2.
Ah, that makes perfect sense. Thank you for clarifying!
I don't know that the error message is particularly bad from grid
's perspective. For all it knows, it's just being passed an invalid hex color. And trying to ask them to change the error message just for ggplot
creates a sort of 'accidental api', which I doubt would be mutually beneficial.
It's a bigger ask, but perhaps we could consider whether the 'linetype' argument is a little too flexible? I understand the attraction of flexibility, but I really think the error message is pretty bad... :-)
from ggplot2.
I don't think limiting what can be considered a linetype is the right way forward, as it'll hurt backward compatibility. All valid input is documented in the vignette. Grid's error message could just be Invalid linetype: "dash"
like they do for colours. It doesn't need to be tailored to ggplot2. Saying that the hex code is wrong (incorrectly) assumes that the user was trying to pass a linetype as a hex code, which might start off the user on the wrong path to solving the issue.
In any case, while I agree with you that the error message can be improved, I don't think it is ggplot2's place to do input validation for the reasons outlined in my previous message. I'll close the issue here and we can reopen if other maintainers disagree with me.
from ggplot2.
Great, thank you very much!
from ggplot2.
Related Issues (20)
- Release ggplot2 3.5.1
- custom guide with `theme_void()` will throw a warning message HOT 1
- VSCode and ggplot (randomly) throw error "Error in UseMethod("depth")". HOT 7
- Feature request: flexible parametrisation of rectangles
- AsIs sometimes not preserved when computing geom parameters HOT 4
- Version 3.5.0 or 3.5.1 cannot render a plot made with 3.4.4 HOT 2
- Standardising calls to `gpar()`
- Fill and color guides for factor with `drop=FALSE` don't show color. HOT 7
- `guide_coloursteps(show.limits = T)` produces strange lower limit HOT 1
- Warnings due to partial matching HOT 2
- Reproducible example for `position_jitter()` not working HOT 1
- geom_sf_label() returns an error HOT 3
- date_breaks and date_breaks minor don't check argument type HOT 2
- `scale_*_*` `labels` argument often doesn't work as expected with a function HOT 8
- geom_histogram produces wrong number of bins in special cases HOT 2
- x/ylim Arguments in coord_fixed Don’t Seem to Be Applied to Contours: HOT 1
- Feature request: allow adding aesthetics together HOT 2
- Adding bootstrap customization for stat_summary(fun.data="mean_cl_boot") HOT 1
- Have header font be part of theme specification HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from ggplot2.