Comments (3)

teunbrand commented on July 21, 2024

Hi, thanks for the suggestions!

Yes, using areas for histograms satisfies the proportional ink principle, but below are a few reasons I don't think we should do it.

  • Users have come to expect counts by default. We have parted with defaults before, but I don't think we should depart from a very clear and simple metric (counts) in favour of more complicated metrics.
  • Counts and the proposed metric are only the same when the width of the bars is 1. If you replace the breaks by binwidth = 0.01, you see several values reach 200 with the proposed metric, whereas the data only has 32 observations in total.
  • after_stat(sum(count) * density) sums the counts over groups, which it shouldn't as density is calculated within groups. The appropriate metric would be after_stat(count / width). As this is available as a simple combination of already available computed variables, I don't think this merits a novel computed variable.
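The difference between the two metrics in the last point can be sketched in base R, outside ggplot2. This is only an illustration under assumptions: mtcars$mpg split by cyl and the breaks below are stand-ins chosen here, not the data or code from the issue, and hist() is used to mimic per-group binning.

```r
# Assumed stand-in data: mtcars$mpg grouped by cyl (not taken from the thread).
breaks <- seq(10, 35, by = 5)             # equal-width bins, width 5
groups <- split(mtcars$mpg, mtcars$cyl)   # one vector of mpg values per group

# Bin each group separately, so density is computed within the group.
per_group <- lapply(groups, function(x) {
  h <- hist(x, breaks = breaks, plot = FALSE)
  data.frame(count = h$counts, width = diff(h$breaks), density = h$density)
})
stats <- do.call(rbind, per_group)

# count / width uses only within-group quantities.
stats$count_per_width <- stats$count / stats$width

# sum(count) * density scales a within-group density by the total over ALL groups.
stats$pooled <- sum(stats$count) * stats$density

print(stats)
```

With more than one group, the pooled column is larger than count / width in every non-empty bin, because each within-group density is scaled by the overall total rather than the group's own size. With a single group the two columns coincide.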

from ggplot2.

mattansb commented on July 21, 2024

Yes, changing a default is a pain... IMO it's worth it, but I don't have a huge community to serve ;)

At the very least, I think this should be written somewhere in the docs (as this is how histograms are commonly defined*). Additionally, an example with after_stat(count / width) can be added, with or without (or both) non-equi-width bins.
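A possible shape for such an example, as a minimal sketch: count and width are computed variables of stat_bin(), while the dataset (mtcars), the variable (mpg) and the unequal breaks are assumptions picked for illustration, not something proposed in the thread.

```r
library(ggplot2)

# Hypothetical unequal-width breaks, chosen only for illustration.
breaks <- c(10, 15, 18, 21, 25, 35)

# Bar height is count / width, so bar area (height * width) equals the bin count.
p <- ggplot(mtcars, aes(mpg)) +
  geom_histogram(aes(y = after_stat(count / width)), breaks = breaks) +
  labs(y = "count per unit of mpg")

print(p)
```

Because height times width gives back the count in each bin, the total bar area equals the number of observations regardless of how the breaks are spaced.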

I'm willing to make (the world's smallest) PR if you'd like.


> If you replace the breaks by binwidth = 0.01, you see several values reach 200 with the proposed metric, whereas the data only has 32 observations in total.

I don't see this as an issue - in PDFs, densities can also exceed 1 - it's just stats being stats 🤷‍♂️
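Both sides of this exchange can be checked with a few lines of base R. This is a rough sketch: mtcars$mpg is assumed as stand-in data with 32 observations, since the thread does not show the original code.

```r
# A proper density can exceed 1: uniform on [0, 0.5] has height 2.
uniform_height <- dunif(0.25, min = 0, max = 0.5)

# Narrow bins: with width 0.01, count / width is about 100 times the count.
x <- mtcars$mpg                           # assumed stand-in data
h <- hist(x, breaks = seq(10, 34, by = 0.01), plot = FALSE)
width <- diff(h$breaks)
count_per_width <- h$counts / width

max(count_per_width)                      # far above the number of observations
sum(count_per_width * width)              # total area still equals length(x)
```

The heights grow as the bins shrink, yet the total area stays equal to the number of observations, which is the sense in which area, not height, carries the count.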


* I only came to notice this when I was teaching histograms and a student pointed out that my plot didn't match what I had just said.

teunbrand commented on July 21, 2024

> it's just stats being stats

Agreed, but it was meant to illustrate how it departed from counts even for equi-bins 🤓

Adding an example is a good idea; we'd welcome a PR for this.
