Here are the current weather-related datasets:
- annual-precip.json
- climate.json
- co2-concentration.csv
- seattle-temps.csv
- seattle-weather.csv
- sf-temps.csv
- weather.csv
- weather.json
- windvectors.csv
What were you thinking with respect to consolidation?
from vega-datasets.
I think we can probably merge the following (after looking at the examples and the data more carefully):
- seattle-temps.csv
- seattle-weather.csv
- sf-temps.csv
- weather.csv
weather.json is similar but contains a prediction and is used for a specific chart, so let's keep it.
from vega-datasets.
Thoughts on weather data
There seems to be one set of daily records and one set of hourly records.
Daily Weather
seattle-weather.csv contains daily weather information.
| | date | precipitation | temp_max | temp_min | wind | weather |
|---|---|---|---|---|---|---|
| 0 | 2012/01/01 | 0 | 12.8 | 5 | 4.7 | drizzle |
| 1 | 2012/01/02 | 10.9 | 10.6 | 2.8 | 4.5 | rain |
| 2 | 2012/01/03 | 0.8 | 11.7 | 7.2 | 2.3 | rain |
| 3 | 2012/01/04 | 20.3 | 12.2 | 5.6 | 4.7 | rain |
| 4 | 2012/01/05 | 1.3 | 8.9 | 2.8 | 6.1 | rain |
weather.csv contains the same data as seattle-weather.csv but also includes data for New York.
| | location | date | precipitation | temp_max | temp_min | wind | weather |
|---|---|---|---|---|---|---|---|
| 0 | Seattle | 2012-01-01 | 0 | 12.8 | 5 | 4.7 | drizzle |
| 1 | Seattle | 2012-01-02 | 10.9 | 10.6 | 2.8 | 4.5 | rain |
| 2 | Seattle | 2012-01-03 | 0.8 | 11.7 | 7.2 | 2.3 | rain |
| 3 | Seattle | 2012-01-04 | 20.3 | 12.2 | 5.6 | 4.7 | rain |
| 4 | Seattle | 2012-01-05 | 1.3 | 8.9 | 2.8 | 6.1 | rain |
Note: The date formats differ slightly (2012/01/01 in seattle-weather.csv vs. 2012-01-01 in weather.csv).
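To make the comparison concrete, here is a minimal pandas sketch that parses both date formats and checks that the Seattle rows match. It uses only the first two rows from each table above, inlined as strings so it runs without the files. The variable names are my own, and it says nothing about whether the full files match.

```python
import io
import pandas as pd

# First rows of each file, copied from the tables above.
seattle_csv = """date,precipitation,temp_max,temp_min,wind,weather
2012/01/01,0,12.8,5,4.7,drizzle
2012/01/02,10.9,10.6,2.8,4.5,rain
"""
weather_csv = """location,date,precipitation,temp_max,temp_min,wind,weather
Seattle,2012-01-01,0,12.8,5,4.7,drizzle
Seattle,2012-01-02,10.9,10.6,2.8,4.5,rain
"""

seattle = pd.read_csv(io.StringIO(seattle_csv))
weather = pd.read_csv(io.StringIO(weather_csv))

# Parse each file with its own explicit format so both become datetimes.
seattle["date"] = pd.to_datetime(seattle["date"], format="%Y/%m/%d")
weather["date"] = pd.to_datetime(weather["date"], format="%Y-%m-%d")

# Take the Seattle subset of weather.csv and drop the extra column.
seattle_subset = (
    weather[weather["location"] == "Seattle"]
    .drop(columns="location")
    .reset_index(drop=True)
)

# True when the two frames hold the same values with the same dtypes.
same = seattle_subset.equals(seattle)
print(same)
```

The same filter on `location` is what any example would need if it switched from seattle-weather.csv to the combined file.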
Hourly Weather
seattle-temps.csv contains hourly weather information.
| | date | temp |
|---|---|---|
| 0 | 2010-01-01T01:00:00-08:00 | 39.2 |
| 1 | 2010-01-01T02:00:00-08:00 | 39 |
| 2 | 2010-01-01T03:00:00-08:00 | 38.9 |
| 3 | 2010-01-01T04:00:00-08:00 | 38.8 |
| 4 | 2010-01-01T05:00:00-08:00 | 38.7 |
sf-temps.csv also contains hourly weather information with the same dates.
| | date | temp |
|---|---|---|
| 0 | 2010-01-01T01:00:00-08:00 | 47.4 |
| 1 | 2010-01-01T02:00:00-08:00 | 46.9 |
| 2 | 2010-01-01T03:00:00-08:00 | 46.5 |
| 3 | 2010-01-01T04:00:00-08:00 | 46 |
| 4 | 2010-01-01T05:00:00-08:00 | 45.9 |
They can be easily concatenated with the addition of a location variable.
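A rough sketch of that concatenation in pandas, using the first two rows of each table above inlined as strings. The `location` labels ("Seattle", "San Francisco") and the column order are my assumptions, not something already decided for the dataset.

```python
import io
import pandas as pd

# First rows of each hourly file, copied from the tables above.
seattle_csv = """date,temp
2010-01-01T01:00:00-08:00,39.2
2010-01-01T02:00:00-08:00,39
"""
sf_csv = """date,temp
2010-01-01T01:00:00-08:00,47.4
2010-01-01T02:00:00-08:00,46.9
"""

# Tag each frame with its location (labels are assumed, see above).
seattle = pd.read_csv(io.StringIO(seattle_csv)).assign(location="Seattle")
sf = pd.read_csv(io.StringIO(sf_csv)).assign(location="San Francisco")

# Stack the two frames and put location first, mirroring weather.csv.
hourly = pd.concat([seattle, sf], ignore_index=True)[["location", "date", "temp"]]
print(hourly)
```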
Proposal
I think a sensible consolidation would be to combine the daily datasets into one file and the hourly datasets into another:
- weather-daily.csv
- weather-hourly.csv
Concerns
Removing seattle-weather.csv would break many examples and tutorials. If the data were combined, every example would have to filter by location before visualizing.
I haven't checked the usage of the other datasets, but there would be similar issues.
from vega-datasets.
Thank you for the analysis @eitanlees. I agree that removing the seattle-weather dataset would break too many examples, so let's keep it. I like that you can quickly import it and create a demo visualization without having to filter/facet by location.
seattle-temps.csv and sf-temps.csv are not used that much and only contain temperature information. The source (which I know you created) also says "30-year temperature averages recorded hourly from the Seattle Tacoma International Airport weather station", but the dates are all for 2010. I'm inclined to just remove the two datasets and add seattle-weather-hourly.csv instead.
The data for Seattle is used in https://vega.github.io/vega-lite/examples/trellis_area_seattle.html, https://vega.github.io/vega/examples/annual-temperature/, and https://vega.github.io/vega/examples/heatmap/.
from vega-datasets.
The temperature data comes from a 30-year observation period, and the reported values are the average temperatures over that period. The study ended in 2010, which is where the date comes from. This type of measurement is called an Hourly Normal, rather than a direct temperature measurement at a single time. It's used for studying broader trends in climate science. For more information, see the NOAA Hourly Normal Documentation.
I agree with the recommended changes.
from vega-datasets.
Oh, I see. In that case, maybe it would be nice to join the normal temperatures (and other values) into the dataset so you can compare actual to normal temperatures.
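A rough sketch of what such a join could look like in pandas, using a few sample rows from the tables earlier in the thread. Several things here are assumptions on my part: that the hourly normals are in °F and the daily values in °C, that matching on month and day is the right key, and the output column names (`normal_max`, `normal_min`, `max_anomaly`). With only three hourly rows the aggregated normals are not meaningful values, so this only shows the shape of the join.

```python
import io
import pandas as pd

# Sample rows copied from the seattle-temps.csv and seattle-weather.csv tables.
normals_csv = """date,temp
2010-01-01T01:00:00-08:00,39.2
2010-01-01T02:00:00-08:00,39
2010-01-01T03:00:00-08:00,38.9
"""
actual_csv = """date,precipitation,temp_max,temp_min,wind,weather
2012/01/01,0,12.8,5,4.7,drizzle
"""

normals = pd.read_csv(io.StringIO(normals_csv))
actual = pd.read_csv(io.StringIO(actual_csv))

# Keep the local wall-clock time by dropping the UTC offset before parsing.
normals["date"] = pd.to_datetime(normals["date"].str.slice(0, 19))
actual["date"] = pd.to_datetime(actual["date"], format="%Y/%m/%d")

# ASSUMPTION: normals are in Fahrenheit and actuals in Celsius.
normals["temp_c"] = (normals["temp"] - 32) * 5 / 9

# Collapse the hourly normals to one max/min per calendar day.
daily_normals = (
    normals.assign(month=normals["date"].dt.month, day=normals["date"].dt.day)
    .groupby(["month", "day"], as_index=False)
    .agg(normal_max=("temp_c", "max"), normal_min=("temp_c", "min"))
)

# Join on month and day, since the normals carry a placeholder year (2010).
actual = actual.assign(month=actual["date"].dt.month, day=actual["date"].dt.day)
compared = actual.merge(daily_normals, on=["month", "day"], how="left")
compared["max_anomaly"] = compared["temp_max"] - compared["normal_max"]
print(compared[["date", "temp_max", "normal_max", "max_anomaly"]])
```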
from vega-datasets.