Comments (6)

eitanlees commented on July 17, 2024

Here are the current weather-related datasets:

  • annual-precip.json
  • climate.json
  • co2-concentration.csv
  • seattle-temps.csv
  • seattle-weather.csv
  • sf-temps.csv
  • weather.csv
  • weather.json
  • windvectors.csv

What were you thinking with respect to consolidation?

from vega-datasets.

domoritz commented on July 17, 2024

I think we can probably merge the following, after looking at the examples and the data more carefully:

  • seattle-temps.csv
  • seattle-weather.csv
  • sf-temps.csv
  • weather.csv

weather.json is similar, but it contains a prediction and is used for a specific chart, so let's keep it.

eitanlees commented on July 17, 2024

Thoughts on weather data

There seems to be one set of daily records and one set of hourly records.

Daily Weather

seattle-weather.csv contains daily weather information.

   date        precipitation  temp_max  temp_min  wind  weather
0  2012/01/01  0              12.8      5         4.7   drizzle
1  2012/01/02  10.9           10.6      2.8       4.5   rain
2  2012/01/03  0.8            11.7      7.2       2.3   rain
3  2012/01/04  20.3           12.2      5.6       4.7   rain
4  2012/01/05  1.3            8.9       2.8       6.1   rain

weather.csv contains the same data as seattle-weather.csv, plus data for New York.

   location  date        precipitation  temp_max  temp_min  wind  weather
0  Seattle   2012-01-01  0              12.8      5         4.7   drizzle
1  Seattle   2012-01-02  10.9           10.6      2.8       4.5   rain
2  Seattle   2012-01-03  0.8            11.7      7.2       2.3   rain
3  Seattle   2012-01-04  20.3           12.2      5.6       4.7   rain
4  Seattle   2012-01-05  1.3            8.9       2.8       6.1   rain

Note: The two files use slightly different date formats (2012/01/01 in seattle-weather.csv, 2012-01-01 in weather.csv).
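A minimal sketch of how the overlap could be checked despite the date-format difference, assuming the files are plain comma-separated text with the columns shown in the previews. The inline strings hold only the first two preview rows, and the helper names (parse_date, load) are made up for illustration.

```python
import csv
import io
from datetime import datetime

# First rows copied from the previews above; the real files are much longer.
SEATTLE_WEATHER = """date,precipitation,temp_max,temp_min,wind,weather
2012/01/01,0,12.8,5,4.7,drizzle
2012/01/02,10.9,10.6,2.8,4.5,rain
"""
WEATHER = """location,date,precipitation,temp_max,temp_min,wind,weather
Seattle,2012-01-01,0,12.8,5,4.7,drizzle
Seattle,2012-01-02,10.9,10.6,2.8,4.5,rain
"""


def parse_date(text):
    """Accept either 2012/01/01 or 2012-01-01 and return a date object."""
    for fmt in ("%Y/%m/%d", "%Y-%m-%d"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def load(text):
    """Read CSV text into a list of dicts with the date column normalized."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["date"] = parse_date(row["date"])
    return rows


seattle_only = load(SEATTLE_WEATHER)

# Keep the Seattle rows of the combined file and drop the location column
# so both sides have the same keys.
seattle_from_combined = [
    {key: value for key, value in row.items() if key != "location"}
    for row in load(WEATHER)
    if row["location"] == "Seattle"
]

rows_match = seattle_only == seattle_from_combined
print(rows_match)
```

Run against the full files, a check like this would show whether seattle-weather.csv really is a strict subset of weather.csv before anything is merged.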

Hourly Weather

seattle-temps.csv contains hourly weather information.

   date                       temp
0  2010-01-01T01:00:00-08:00  39.2
1  2010-01-01T02:00:00-08:00  39
2  2010-01-01T03:00:00-08:00  38.9
3  2010-01-01T04:00:00-08:00  38.8
4  2010-01-01T05:00:00-08:00  38.7

sf-temps.csv also contains hourly weather information with the same dates.

   date                       temp
0  2010-01-01T01:00:00-08:00  47.4
1  2010-01-01T02:00:00-08:00  46.9
2  2010-01-01T03:00:00-08:00  46.5
3  2010-01-01T04:00:00-08:00  46
4  2010-01-01T05:00:00-08:00  45.9

They can be easily concatenated with the addition of a location variable.
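As a rough sketch of that concatenation, assuming both files are comma-separated with the date and temp columns shown above: the "Seattle" and "San Francisco" labels and the tag_rows helper are my own choices for illustration, and only the first two preview rows of each file are inlined.

```python
import csv
import io

# First rows copied from the previews above.
SEATTLE_TEMPS = """date,temp
2010-01-01T01:00:00-08:00,39.2
2010-01-01T02:00:00-08:00,39
"""
SF_TEMPS = """date,temp
2010-01-01T01:00:00-08:00,47.4
2010-01-01T02:00:00-08:00,46.9
"""


def tag_rows(text, location):
    """Read CSV text and prepend a location field to every row."""
    return [
        {"location": location, **row}
        for row in csv.DictReader(io.StringIO(text))
    ]


# Label each file's rows, then stack them into one list.
hourly = tag_rows(SEATTLE_TEMPS, "Seattle") + tag_rows(SF_TEMPS, "San Francisco")

# Write the combined rows back out as a single CSV document.
out = io.StringIO()
writer = csv.DictWriter(
    out, fieldnames=["location", "date", "temp"], lineterminator="\n"
)
writer.writeheader()
writer.writerows(hourly)
combined_csv = out.getvalue()
print(combined_csv)
```

The result has the same shape as weather.csv (a location column followed by the measurements), which would keep the daily and hourly files consistent with each other.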

Proposal

I think a sensible consolidation would be to combine the daily datasets into one file and the hourly datasets into another:

  • weather-daily.csv
  • weather-hourly.csv

Concerns

Removing seattle-weather.csv would break many examples and tutorials. If the data is combined, every example would have to filter by location before visualizing.

I haven't checked how the other datasets are used, but I expect similar issues.

domoritz commented on July 17, 2024

Thank you for the analysis, @eitanlees. I agree that removing the seattle-weather dataset would break too many examples, so let's keep it. I like that you can quickly import it and create a demo visualization without having to filter or facet by location.

seattle-temps.csv and sf-temps.csv are not used much and contain only temperature information. The source description (which I know you wrote) says "30-year temperature averages recorded hourly from the Seattle Tacoma International Airport weather station", but the dates are all in 2010. I'm inclined to remove the two datasets and add seattle-weather-hourly.csv instead.

The data for Seattle is used in https://vega.github.io/vega-lite/examples/trellis_area_seattle.html, https://vega.github.io/vega/examples/annual-temperature/, and https://vega.github.io/vega/examples/heatmap/.

eitanlees commented on July 17, 2024

The temperature data comes from a 30-year observation period, and the values reported are averages over that period. The study ended in 2010, which is where the date comes from. This type of measurement is called an Hourly Normal; it is not a direct temperature measurement at a single point in time. It is used for studying broader trends in climate science. For more information, see the NOAA Hourly Normal Documentation.

I agree with the recommended changes.

domoritz commented on July 17, 2024

Oh, I see. In that case, it might be nice to join the normal temperatures (and other values) into the dataset so you can compare actual to normal temperatures.
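One possible shape for such a join, offered only as a sketch: average the hourly normals per calendar day, then attach that value to each daily record by matching month and day. Everything here is an assumption rather than an agreed design, including the temp_normal column name, the choice of a daily mean, and the month/day matching. The two files may also use different temperature units, which the sketch does not reconcile. The inline rows are taken from the previews earlier in the thread, with the daily record trimmed to three columns.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Rows copied from the seattle-temps.csv and seattle-weather.csv previews.
HOURLY_NORMALS = """date,temp
2010-01-01T01:00:00-08:00,39.2
2010-01-01T02:00:00-08:00,39
2010-01-01T03:00:00-08:00,38.9
"""
DAILY_ACTUAL = """date,temp_max,temp_min
2012/01/01,12.8,5
"""

# Group the hourly normals by (month, day), ignoring the nominal year 2010.
by_day = defaultdict(list)
for row in csv.DictReader(io.StringIO(HOURLY_NORMALS)):
    stamp = datetime.fromisoformat(row["date"])
    by_day[(stamp.month, stamp.day)].append(float(row["temp"]))

# Hypothetical summary: one mean normal temperature per calendar day.
normal_by_day = {key: mean(values) for key, values in by_day.items()}

# Attach the normal to each daily record; days without a normal get None.
# Note: units are left as found in each file and may not be comparable yet.
joined = []
for row in csv.DictReader(io.StringIO(DAILY_ACTUAL)):
    day = datetime.strptime(row["date"], "%Y/%m/%d")
    joined.append({**row, "temp_normal": normal_by_day.get((day.month, day.day))})

print(joined)
```

Matching on month and day rather than on the full date is what lets normals dated 2010 line up with observations from 2012.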
