ds-python-geospatial's Introduction

Python for GIS and Geoscience

Introduction

An important aspect of daily work in geographic information science and earth sciences is the handling of potentially large amounts of data. Reading in spatial data, exploring the data, creating visualisations and preparing the data for further analysis can become tedious tasks. Hence, increasing efficiency and reproducibility in this process without the need for a GUI is beneficial for many scientists. High-level scripting languages such as R and Python are increasingly popular for these tasks thanks to the development of GIS-oriented packages.

This course trains students to use Python effectively for these tasks, with a focus on geospatial data. It covers both vector and raster data. The course introduces the main Python packages for handling such data (GeoPandas, NumPy, Rasterio and Xarray) and shows how to use those packages for importing, exploring, visualizing and manipulating geospatial data. The aim is to give students an understanding of the data structures used in Python to represent geospatial data (geospatial dataframes, (multi-dimensional) arrays and composite netCDF-like multi-dimensional datasets), while also providing pointers to the broader ecosystem of Python packages for GIS and geosciences.

The course has been developed as a specialist course for the Doctoral schools of Ghent University, but can be taught to others upon request.

Aim & scope

This course targets researchers who want to enhance their general data manipulation and analysis skills in Python, specifically for handling geospatial data.

The course does not aim to provide a course in specific spatial analysis and statistics, cartography, remote sensing, OGC web services, ... or general Geographical Information Management (GIS). It aims to provide researchers with the means to effectively tackle commonly encountered spatial data handling tasks in order to increase the overall efficiency of their research. The course does not tackle desktop GIS Python extensions such as arcpy or pyqgis.

Getting started

The course uses Python 3, data analysis packages such as Pandas, NumPy and Matplotlib, and geospatial packages such as GeoPandas, Rasterio and Xarray. To install the required libraries, we highly recommend Anaconda or Miniconda (https://www.anaconda.com/download/) or another Python distribution that includes the scientific libraries (this recommendation applies to all platforms: Windows, Linux and Mac).

For detailed instructions to get started on your local machine, see the setup instructions.

In case you do not want to install everything and just want to try out the course material, use the environment set up on Binder and open the notebooks right away.

Contributing

Found a typo or have a suggestion? See how to contribute.

Meta

Authors: Joris Van den Bossche, Stijn Van Hoey

ds-python-geospatial's Issues

Wrong variable name when assigning coords to `gent` variable (visualization-04-interactive)

gent = xr.open_dataarray("./data/gent/raster/2020-09-17_Sentinel_2_L1C_B0408.tiff", engine="rasterio", mask_and_scale=False)
gent = xr_array.assign_coords(band=("band", ["b4", "b8"]))

should be

gent = xr.open_dataarray("./data/gent/raster/2020-09-17_Sentinel_2_L1C_B0408.tiff", engine="rasterio", mask_and_scale=False)
gent = gent.assign_coords(band=("band", ["b4", "b8"]))
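
The effect of the corrected line can be checked without the Sentinel-2 file. The following is a minimal sketch that assumes a synthetic two-band array in place of the TIFF; the array shape and the initial band labels 1 and 2 are illustrative assumptions, not taken from the course data.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the two-band raster (assumption: shape and labels are illustrative)
gent = xr.DataArray(
    np.zeros((2, 3, 4)),
    dims=("band", "y", "x"),
    coords={"band": [1, 2]},
)

# assign_coords returns a new object, so it must be called on `gent` itself
# and the result assigned back to `gent`
gent = gent.assign_coords(band=("band", ["b4", "b8"]))

# The bands can now be selected by label
b4 = gent.sel(band="b4")
print(gent.band.values, b4.shape)
```

Calling the method on an undefined name such as `xr_array` raises a NameError before any coordinates are assigned, which is the failure the issue describes.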

Feedback xarray intro

  • Better explain DataArray vs Dataset:

    • Probably we can just start with explaining DataArray first (and do exercises on it), and only later Dataset
    • The raster satellite image might be less suited to explain the Dataset; we can better take an example where it's clearer that there are multiple variables (like temperature and pressure, ...)
  • Only explain "there can be more coordinates than dimensions" later on (e.g. when it actually occurs in the data of the exercises)

  • Be consistent with .values vs .data

  • (for pandas -> also use the .plot.<type>() pattern instead of .plot(kind=<type>) for consistency)

  • mistake in exercise: use salinity instead of pressure to calculate the buoyancy

  • only explain conversion of dataarray <-> dataset when both concepts are already known (and then can show benefit of either way)
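
The distinction raised in this feedback can be illustrated with a small sketch, assuming two made-up variables (temperature and pressure) on a shared grid; the values and the dimension name "variable" are assumptions chosen for illustration.

```python
import numpy as np
import xarray as xr

# Two different physical variables on the same grid (made-up values)
temperature = xr.DataArray(np.full((2, 3), 15.0), dims=("y", "x"))
pressure = xr.DataArray(np.full((2, 3), 1013.0), dims=("y", "x"))

# A Dataset groups several named variables that share dimensions
ds = xr.Dataset({"temperature": temperature, "pressure": pressure})

# Dataset -> DataArray: the variables become entries along a new dimension
stacked = ds.to_array(dim="variable")

# DataArray -> Dataset: split that dimension back into named variables
roundtrip = stacked.to_dataset(dim="variable")

print(stacked.dims, sorted(roundtrip.data_vars))
```

Showing the conversion only after both structures are known, as suggested above, lets the round trip serve as a recap rather than as a new concept.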

Some typos in case-curieuzeneuzen-air-quality

  • Ex. 3 of "Combining with municipilaties": "TODO HINTS" can be removed.
  • "Combining with land use data": the code for downloading/cropping LC is not yet included ([INCLUDE LINK])
  • "The goal is now to to query" -> "The goal is now to query"

country areas are not areas

In notebook 02, the areas you compute with countries.geometry.area are not real areas: they are computed as if geographic coordinates were Cartesian, so their unit is squared degrees. That unit is meaningless, because the ground distance covered by one degree of longitude depends on latitude.
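
To make the latitude dependence concrete, here is a minimal sketch using only the standard library. It assumes a spherical Earth with a mean radius of 6371.0088 km (a simplification, not what a GIS library would use), and the helper name `cell_area_km2` is hypothetical. It compares two boxes that both measure exactly one "squared degree".

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumption: spherical Earth with mean radius


def cell_area_km2(lat_south, lat_north, lon_west, lon_east, radius_km=EARTH_RADIUS_KM):
    """Area of a latitude/longitude box on a sphere, in square kilometres."""
    d_lon = math.radians(lon_east - lon_west)
    # Difference of sines accounts for meridians converging towards the poles
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return radius_km ** 2 * d_lon * band


# Both boxes span 1 degree by 1 degree, yet their ground areas differ
area_equator = cell_area_km2(0, 1, 0, 1)
area_60n = cell_area_km2(60, 61, 0, 1)
print(f"equator: {area_equator:.0f} km2, 60N: {area_60n:.0f} km2")
```

The box at 60°N covers roughly half the ground area of the box at the equator. In GeoPandas, the usual remedy is to reproject with to_crs to a suitable projected (ideally equal-area) CRS before calling .area.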

raster rework for next course

@jorisvandenbossche, I think we should do some rework on the current state of the raster info in the repo. My proposal:

  • Remove 11-numpy.ipynb. Could be an option to integrate the convolution example at the end in an advanced exercise or use-case as it is a useful GIS element
  • Remove 12-rasterio.ipynb
  • Combine 10-introduction-raster.ipynb, 13-xarray.ipynb & 14-xarray-intro.ipynb to have a single introduction on rasters, but using xarray and rioxarray directly. Focus on xarray based on analogy with geopandas instead of analogy/extension of numpy ('adds context to NumPy'). Start with a single data source, i.e. DataArray; add a write-to-file example/exercise

For clarity/consistency during the course: we handle bands/channels of a single data source as a DataArray dimension (RGB, ...), and we handle different kinds of data (temperature, salinity, ...) as Datasets. We might make a remark at the end that switching is fine.

  • 15-xarray-datasets.ipynb -> rework towards Datasets containing different variables instead of the b4/b8 example; add a write-to-file example/exercise

  • Combine 16-raster-processing.ipynb and 20-raster-vector-tools.ipynb into a single notebook on raster/vector tooling. Proposal of topics to cover: clip region (with/without buffer), conversion raster/vector, proximity (xarray-spatial) and rasterstats; move "Cloud: only download what you need" to the 'big-data' notebook. I would add an addendum on 'calling external tools' like https://www.whiteboxgeo.com/, http://www.saga-gis.org/saga_tool_doc/2.2.7/a2z.html,...

  • 21-xarray-dask-big-data.ipynb: extend with the COG info; also mention https://geemap.org/ (requires a Google Earth Engine account)?

  • What to use for the zonal statistics -> what about https://github.com/corteva/geocube vs https://xarray-spatial.org/user_guide/zonal.html vs rasterstats package?

cleanup raster processing notebook 13

Some sections are now a bit redundant in showing clip and rasterize; we should improve the structure to make clear what is repetition and what is new.

  • mask_and_scale has already been tackled before -> shorten in this notebook (just a recap)
  • there is a lot of clipping now (both in the initial section and in the DEM example); it might be better to stick to one of those, or just start from the clipped DEM directly to keep the focus on rasterize only.

Add more background on link epsg to cartopy ccrs

When the data are in unprojected lat/lon coordinates, use PlateCarree. For other situations, the link between EPSG codes and projections in cartopy needs to be clarified. Specifically, add the function ccrs.epsg(...).

E.g. in the case of Lambert72 (EPSG:31370), ccrs.epsg(31370) works and also zooms to the proper extent. See also ccrs.epsg(3035).

Typo#1 in 01-introduction-tabular-data

There is a typo in the table "An overview of the possible comparison operations"
It should be:

An overview of the possible comparison operations:

Operator Description
== Equal
!= Not equal
> Greater than
>= Greater than or equal
< Less than
<= Less than or equal

and to combine multiple conditions:

Operator Description
& And (cond1 & cond2)
| Or (cond1 | cond2)
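
As an illustration of how these operators combine in practice, here is a minimal sketch on a small made-up DataFrame; the column names and values are assumptions for illustration and not the dataset used in the notebook.

```python
import pandas as pd

# Made-up data (assumption: not the course dataset)
df = pd.DataFrame(
    {
        "age": [22, 4, 35, 58],
        "fare": [7.25, 16.70, 53.10, 26.55],
    }
)

# Each comparison yields a boolean Series; the parentheses are required because
# & and | bind more tightly than comparison operators in Python
adults_cheap = df[(df["age"] >= 18) & (df["fare"] < 20)]
young_or_expensive = df[(df["age"] < 18) | (df["fare"] > 50)]

print(len(adults_cheap), len(young_or_expensive))
```

Using the keywords `and` / `or` instead of `&` / `|` on a Series raises a ValueError, because the truth value of a whole Series is ambiguous.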

Feedback numpy notebook

  • Do we need squeeze() ?
    • Or can we use indexing as well?
    • If we use it for numpy, should maybe use it for geopandas as well? (instead of .item())
  • Leave where for the "extra" section at the bottom
  • Add some more visual illustrations (see new numpy docs)
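
For the squeeze() question above, a minimal sketch comparing the options on a made-up single-band array (the shape is an assumption for illustration):

```python
import numpy as np

# Made-up single-band raster with a leading band axis of length 1
band = np.arange(12).reshape(1, 3, 4)

# squeeze() drops every axis of length 1, whichever position it is in
squeezed = band.squeeze()

# Indexing drops exactly the axis you name, which is more explicit
indexed = band[0]

# For a single element, .item() returns a plain Python scalar,
# while squeeze() returns a zero-dimensional array
single = np.array([[42]])
as_scalar = single.item()
as_0d = single.squeeze()

print(squeezed.shape, indexed.shape, as_scalar, as_0d.ndim)
```

Both routes give the same 2D array here; the difference is that indexing fails loudly if the axis is missing, while squeeze() silently leaves an array without length-1 axes unchanged.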

Update description of dataarray

The statement "Other metadata (such as the spatial information) provided by the tiff are stored in the Attributes" is not fully correct, as we now read the metadata in the crs using DataArray.
