
geoservice_harvester_poc's Introduction


Geoservice Harvester POC: open geo services reported by Swiss government agencies and third parties

Important note

Find and use high-quality data published by our data colleagues of the SWISS FEDERAL ADMINISTRATION and the CANTONS, for all cantons and FL:

  1. GeoServiceHarvesterPOC visualized (Dashboard): https://davidoesch.github.io/geoservice_harvester_poc
  2. geo.admin.ch visualized (Dashboard): https://docs.google.com/spreadsheets/d/1QdxYv6RYWe9PFIq5XQ-BNLnKfgtFcDkfZ58ftZ_Va9I/edit?usp=sharing
  3. all published as 'open government data': https://opendata.swiss/de/dataset/?groups=geography

Aim of this repository

The aim of this repository is to provide a POC for open, OGC-compliant geodata services provided by the Swiss Confederation, cantons, municipalities, the Principality of Liechtenstein and third parties. It was inspired by the pre-POC wmsChecker and driven by the 4th Geounconference workshop: Sujet / Thema 16 - Service-Verzeichnis. For key findings, see the blog post. Updates of services are infrequent.

If you have any questions, please don't hesitate to contact us:

Datasets of all services : geoservices_CH.csv

General description
This data is generated and validated weekly using automated procedures. Note that we only publish data that is OGC compliant, so gaps may occur.

Data

https://github.com/davidoesch/geoservice_harvester_poc/blob/main/data/geoservices_CH.csv
Description: Data description for each layer separately
Spatial unit: Swiss cantons and Principality of Liechtenstein covered
Updated: weekly
Format: csv
Additional remark:

| Field Name | Description | Format | Note |
|---|---|---|---|
| OWNER | Reflects both owner type and owner name; these also correspond to the matching .py file with scraper info. The owner of the data is typically the organization or agency that created or manages it. | Text | <OwnerType_OwnerName> |
| TITLE | The title of the dataset, which briefly describes what the data represents. | Text | |
| NAME | The name or identifier of this particular data layer or feature within the larger dataset. | Text | |
| TREE | Layer tree derived from the service info: a hierarchical category or grouping for this data layer. | Text | tree separator "/" |
| GROUP | Group name of aggregated datasets: a grouping or category for this data layer, which may be used for organization or visualization purposes. | Text | only applicable for WMS |
| ABSTRACT | A brief summary of the data, which may provide context or details beyond the title. | Text | |
| KEYWORDS | A list of keywords or tags associated with this data layer, which helps users discover and filter relevant data. | Keywords, comma-separated | |
| LEGEND | Link to the legend for the symbols or colors used to represent this data layer, which helps in interpreting the data. | URL to image | |
| CONTACT | Contact information for the individual or organization responsible for maintaining or providing access to the data. | Text, email | |
| SERVICELINK | Link to GetCapabilities. | URL | |
| METADATA | A link to additional metadata or documentation about the data, such as data dictionaries or technical specifications. | URL | |
| UPDATE | Publication date. | Text | |
| SERVICETYPE | OGC service type. | Text | WMTS, WMS or WFS |
| MAX ZOOM | Zoom level (map zoom) at which the data is visible. | Int | |
| CENTER_LAT | Latitude of the center of the data (WGS84). | Float | |
| CENTER_LON | Longitude of the center of the data (WGS84). | Float | |
| MAPGEO | Permalink to map.geo.admin.ch. | URL | |
| BBOX | The bounding box or extent of this data layer, given as four coordinates in the order west, south, east, north. | Text | |
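
As a rough illustration of how the multi-valued fields above could be unpacked, here is a minimal sketch using only the Python standard library. It assumes that the CSV header uses the field names from the table, that BBOX values are separated by whitespace, and that only the first four BBOX tokens are coordinates. The helper name `parse_row`, the owner `KT_XX` and the sample row are made up for this example and are not taken from the real file.

```python
import csv
import io


def parse_row(row: dict) -> dict:
    """Unpack the multi-valued fields of one geoservices_CH.csv row."""
    parsed = dict(row)
    # TREE uses "/" as separator between hierarchy levels.
    parsed["TREE"] = [p for p in row.get("TREE", "").split("/") if p]
    # KEYWORDS is a comma-separated list.
    parsed["KEYWORDS"] = [
        k.strip() for k in row.get("KEYWORDS", "").split(",") if k.strip()
    ]
    # BBOX: assumed to be whitespace-separated, ordered west south east north.
    tokens = row.get("BBOX", "").split()
    try:
        parsed["BBOX"] = [float(t) for t in tokens[:4]] if len(tokens) >= 4 else None
    except ValueError:
        parsed["BBOX"] = None
    return parsed


# Made-up sample data, not taken from the real file.
SAMPLE = (
    "OWNER,TITLE,NAME,TREE,KEYWORDS,SERVICETYPE,BBOX\n"
    'KT_XX,Example layer,example_layer,Root/Theme,"water, rivers",WMS,5.9 45.8 10.5 47.8\n'
)

rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
wms_rows = [r for r in rows if r["SERVICETYPE"] == "WMS"]
print(wms_rows[0]["TREE"], wms_rows[0]["KEYWORDS"], wms_rows[0]["BBOX"])
```

For the real file, `io.StringIO(SAMPLE)` would be replaced by an opened copy of geoservices_CH.csv.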

Unified data : geodata_CH.csv

General description
This data is generated and validated weekly using automated procedures based on geoservices_CH.csv (1NF / normalization). Note that we only publish data that is OGC compliant, so gaps may occur.

Data

https://github.com/davidoesch/geoservice_harvester_poc/blob/main/data/geodata_CH.csv
Description: This dataset aggregates the entries from all services, so it is more comprehensive and includes every relevant occurrence from each service that contributes to a dataset.
Spatial unit: Swiss cantons and Principality of Liechtenstein covered
Updated: weekly
Format: csv
Additional remark:

| Field Name | Description | Format | Note |
|---|---|---|---|
| OWNER | Reflects both owner type and owner name; these also correspond to the matching .py file with scraper info. | Text | <OwnerType_OwnerName> |
| TITLE | Title of the dataset. | Text | |
| NAME | Name of the dataset. | Text | |
| MAPGEO | Permalink to map.geo.admin.ch. | URL | |
| CONTACT | Contact info. | Text, email | |
| WMSGetCap | Link to the WMS GetCapabilities. | Link | |
| WMTSGetCap | Link to the WMTS GetCapabilities. | Link | |
| WFSGetCap | Link to the WFS GetCapabilities. | Link | |
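
The step from per-service rows to one row per dataset could look roughly like the sketch below. It is not the repository's actual code: the grouping key (OWNER plus NAME), the rule of keeping the first non-empty value, and the function name `unify` are assumptions made for illustration. The sample rows are invented.

```python
# Column in geodata_CH.csv that receives the link, per service type.
GETCAP_COLUMN = {"WMS": "WMSGetCap", "WMTS": "WMTSGetCap", "WFS": "WFSGetCap"}


def unify(service_rows):
    """Merge per-service rows into one row per dataset (assumed key: OWNER + NAME)."""
    merged = {}
    for row in service_rows:
        key = (row["OWNER"], row["NAME"])
        target = merged.setdefault(key, {
            "OWNER": row["OWNER"], "TITLE": row["TITLE"], "NAME": row["NAME"],
            "MAPGEO": "", "CONTACT": "",
            "WMSGetCap": "", "WMTSGetCap": "", "WFSGetCap": "",
        })
        # Put the service link into the column that matches the service type.
        column = GETCAP_COLUMN.get(row["SERVICETYPE"])
        if column and not target[column]:
            target[column] = row["SERVICELINK"]
        # Keep the first non-empty value for fields shared across services.
        for field in ("MAPGEO", "CONTACT"):
            if not target[field] and row.get(field):
                target[field] = row[field]
    return list(merged.values())


# Invented sample rows: the same dataset offered as WMS and as WFS.
services = [
    {"OWNER": "KT_XX", "TITLE": "Example", "NAME": "example", "SERVICETYPE": "WMS",
     "SERVICELINK": "https://example.org/wms", "MAPGEO": "https://example.org/map",
     "CONTACT": ""},
    {"OWNER": "KT_XX", "TITLE": "Example", "NAME": "example", "SERVICETYPE": "WFS",
     "SERVICELINK": "https://example.org/wfs", "MAPGEO": "", "CONTACT": "gis@example.org"},
]
unified = unify(services)
print(unified[0])
```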

Dataset title : geodata_simple_CH.csv

General description
This data is generated and validated weekly using automated procedures based on geodata_CH.csv. Note that we only publish data that is OGC compliant, so gaps may occur.

Data

https://github.com/davidoesch/geoservice_harvester_poc/blob/main/data/geodata_simple_CH.csv
Description: This dataset contains only the title of the dataset and a link to map.geo.admin.ch (no link if only WFS is available). Its sole purpose was to serve as the source for https://davidoesch.github.io/geoservice_harvester_poc
Spatial unit: Swiss cantons and Principality of Liechtenstein covered
Updated: weekly
Format: csv
Additional remark: will be decommissioned soon, since it is not needed

| Field Name | Description | Format | Note |
|---|---|---|---|
| OWNER | Reflects both owner type and owner name; these also correspond to the matching .py file with scraper info. | Text | <OwnerType_OwnerName> |
| TITLE | Title of the dataset. | Text | |
| MAPGEO | Permalink to map.geo.admin.ch. | URL | if only a WFS is available, the link will not work |

Dataset title : geodata_stats_CH.csv

General description
This data is generated and validated weekly using automated procedures based on geoservices_CH.csv and is used to show overall quality statistics.

Data

https://github.com/davidoesch/geoservice_harvester_poc/blob/main/data/geodata_stats_CH.csv
Description: This dataset contains the OWNER, the total number of datasets and the corresponding completeness / existence of parameters. Its sole purpose is to serve as the source for https://davidoesch.github.io/geoservice_harvester_poc/#anchor-QualityContro
Spatial unit: Swiss cantons and Principality of Liechtenstein covered
Updated: weekly
Format: csv
Additional remark:
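
A per-owner completeness figure of this kind could be computed along the following lines. This is only a sketch: the selection of checked fields, the output keys and the function name `completeness_by_owner` are assumptions, since the actual columns of geodata_stats_CH.csv are not listed here. The sample rows are invented.

```python
from collections import defaultdict

# Fields whose presence is counted; an assumed selection for illustration.
CHECKED_FIELDS = ("ABSTRACT", "KEYWORDS", "CONTACT", "METADATA")


def completeness_by_owner(rows):
    """Return {owner: {"TOTAL": n, field: share of non-empty values, ...}}."""
    totals = defaultdict(int)
    filled = defaultdict(lambda: defaultdict(int))
    for row in rows:
        owner = row["OWNER"]
        totals[owner] += 1
        for field in CHECKED_FIELDS:
            # A field counts as present if it holds more than whitespace.
            if (row.get(field) or "").strip():
                filled[owner][field] += 1
    return {
        owner: {"TOTAL": n, **{f: filled[owner][f] / n for f in CHECKED_FIELDS}}
        for owner, n in totals.items()
    }


# Invented sample rows for one owner.
sample_rows = [
    {"OWNER": "KT_XX", "ABSTRACT": "Some text", "KEYWORDS": "",
     "CONTACT": "gis@example.org", "METADATA": ""},
    {"OWNER": "KT_XX", "ABSTRACT": "", "KEYWORDS": "",
     "CONTACT": "gis@example.org", "METADATA": ""},
]
stats = completeness_by_owner(sample_rows)
print(stats["KT_XX"])
```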

How to fix / add additional WMS / WMTS services

  1. Fix or add your service in sources.csv, following the OWNER naming convention and giving the URL (https only) of the service endpoint
  2. Make a pull request
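
Before opening a pull request, an entry could be sanity-checked with something like the sketch below. The exact rules are assumptions: the pattern only encodes one reading of the <OwnerType_OwnerName> convention (two non-empty parts joined by an underscore), and `check_source` is a hypothetical helper, not part of the repository.

```python
import re
from urllib.parse import urlparse

# Assumed reading of the naming convention <OwnerType_OwnerName>:
# two non-empty parts joined by the first underscore.
OWNER_PATTERN = re.compile(r"^[^_\s]+_\S+$")


def check_source(owner: str, url: str) -> list:
    """Return a list of problems found in one sources.csv entry (empty if none)."""
    problems = []
    if not OWNER_PATTERN.match(owner):
        problems.append("OWNER does not look like <OwnerType_OwnerName>")
    parts = urlparse(url)
    if parts.scheme != "https":
        problems.append("URL must use https")
    if not parts.netloc:
        problems.append("URL has no host")
    return problems


# Invented entries for demonstration.
print(check_source("KT_XX", "https://example.org/wms"))
print(check_source("KTXX", "http://example.org/wms"))
```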

Current status Provider & Services

| Provider | Status | Notes |
|---|---|---|
| AG | ok | Source: https://www.ag.ch/de/verwaltung/dfr/geoportal/geodienste-(wms) |
| AI | ok | Source: https://www.ai.ch/themen/planen-und-bauen/geodaten-und-plaene/geobasisdaten (some layers cover AI, AR and SG) |
| AR | ok | Source: reverse engineered from AI (some layers cover AI, AR and SG) |
| BE | ok | Source: https://www.agi.dij.be.ch/de/start/geoportal/geodienste/angebot-an-geodiensten.html (FR to be done) |
| BL | ok | Source: https://www.baselland.ch/politik-und-behorden/direktionen/volkswirtschafts-und-gesundheitsdirektion/amt-fur-geoinformation/geoportal/geodienste |
| BS | ok | Source: https://www.geo.bs.ch/geodaten/geodienste.html |
| FR | ok | Source: https://geo.fr.ch/ags/rest/services |
| GE | ok | Source: https://ge.ch/sitg/services/services-carto/open-data (3 datasets cannot be parsed, see error log in /tools) |
| GL | ok | Source: https://www.gl.ch/verwaltung/bau-und-umwelt/hochbau/raumentwicklung-und-geoinformation/geoportal-kanton-glarus.html/808 (drops WFS warnings but seems to be ok, see error log in /tools) |
| GR | ok | Source: https://geo.gr.ch/geodienste/katalog |
| JU | ok | Source: reverse engineered from the once-working NE, no love for Web Mercator |
| LU | ok | |
| NE | ok | |
| NW | ok | |
| OW | ok | |
| SG | ok | Source: extracted from the PDF at https://www.sg.ch/bauen/geoinformation/gi/geodienste.html |
| SH | ok | Source: https://sh.ch/CMS/Webseite/Kanton-Schaffhausen/Beh-rde/Verwaltung/Volkswirtschaftsdepartement/Amt-f-r-Geoinformation-2303920-DE.html |
| SO | ok | Source: https://so.ch/verwaltung/bau-und-justizdepartement/amt-fuer-geoinformation/geoportal/geodienste/wmts-web-map-tile-service/ |
| SZ | ok | Source: https://www.sz.ch/behoerden/vermessung-geoinformation/geoportal/daten-und-dienste.html/72-416-414-1762-1761 |
| TG | ok | Source: scraped from the opendata.swiss API with https://ckan.opendata.swiss/api/3/action/package_search?fq=organization:kanton-thurgau%20AND%20res_format:WMS&rows=10000 and https://ckan.opendata.swiss/api/3/action/package_search?fq=organization:kanton-thurgau%20AND%20res_format:WFS&rows=10000 |
| TI | ok | Source: https://www4.ti.ch/dt/sg/sai/ugeo/temi/geoportale-ticino/geoportale/geoservizi/ |
| UR | ok | Source: https://oereb.ur.ch/?basemap=AV&lat=46.87491213706447&lng=8.645001065327628&zoom=13.75 sources and www.geo.ur.ch |
| VD | ok | Source: https://www.ogc.vd.ch/public/services/OGC/wmsVD/Mapserver/WMSServer? |
| VS | ok | |
| ZG | ok | Source: https://www.zg.ch/behoerden/direktion-des-innern/geoportal/geodaten-einbinden |
| ZH | ok | Source: https://ckan.opendata.swiss/api/3/action/package_search?fq=organization:geoinformation-kanton-zuerich%20AND%20res_format:WMS&rows=10000 and https://ckan.opendata.swiss/api/3/action/package_search?fq=organization:geoinformation-kanton-zuerich%20AND%20res_format:WFS&rows=10000 |
| LI | ok | Source: https://www.llv.li/inhalt/11694/amtsstellen/internet-kartendienst-geowebservices |
| Geodienste | ok | Source: https://github.com/geoadmin/mf-geoadmin3/blob/master/src/js/ImportController.js (FR, IT to be done) |
| Bund | ok | Source: https://www.geo.admin.ch/de/geo-dienstleistungen/geodienste/darstellungsdienste-webmapping-webgis-anwendungen.html (FR, IT, EN, RM to be done) |

Operation

scraper.py runs automatically every day via the GitHub Actions scheduler. The scraper results are logged in debug.log; faulty or offline services listed in sources.csv are logged in tools. Harvested data is written to geoservices_CH.csv.

Roadmap and Ideas

Roadmap items and ideas are collected in Issues.

geoservice_harvester_poc's People

Contributors

davidoesch, p1d1d1, rastrau


geoservice_harvester_poc's Issues

GROUP

@davidoesch so far, for the attribute "GROUP" you are setting the closest parent of a layer. Would it be possible to leave this empty if the closest parent is the root of the entire hierarchy?

SERVICELINK as GetCap

Currently the service link contains a GetMap-style URL.

To offer a link that users can simply COPY, a GetCap URL would be nice. This is actually the source (with the version that is retrieved in scraper.py).
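
One way such a GetCapabilities link could be derived from an endpoint is sketched below with the standard library. It assumes a key-value-pair (KVP) endpoint; RESTful WMTS endpoints work differently and are not covered. The function name `get_capabilities_url` and the example URLs are made up, and this is not the code used in scraper.py.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def get_capabilities_url(endpoint: str, service: str, version: str = "") -> str:
    """Build a KVP GetCapabilities URL from a service endpoint."""
    parts = urlsplit(endpoint)
    # Drop the parameters this function sets itself (compared case-insensitively)
    # and keep everything else, e.g. a server-specific "map" parameter.
    reserved = {"service", "request", "version"}
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in reserved
    ]
    params = kept + [("SERVICE", service), ("REQUEST", "GetCapabilities")]
    if version:
        params.append(("VERSION", version))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Invented endpoints for demonstration.
plain = get_capabilities_url("https://example.org/wms?", "WMS")
from_getmap = get_capabilities_url(
    "https://example.org/wms?request=GetMap&service=WMS&map=demo", "WMS", "1.3.0"
)
print(plain)
print(from_getmap)
```

Note that only SERVICE, REQUEST and VERSION are replaced; other GetMap parameters such as LAYERS would have to be stripped separately if they occur.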

Datamodel harvester

Which information is available in most services? Based on this, define the attributes to scrape.

GetFeature Attributes

To improve discoverability: any chance to retrieve all the GetFeature variable names?

Language

  • recognize multilang services
  • detect language
  • translate into the missing languages (DE, FR, IT, EN) using Python libs like googletrans or deepl

Add WFS as well

Add WFS as well (it can't be shown in map.geo.admin.ch).
WFS is supported by OWSLib.

Wrong assignments (or rather comparisons instead of assignments)

[image]
E.g. on the second-to-last line: in the example, version is None on a KeyError. The second-to-last line evaluates to a silent True, and the return returns None. In other words, it probably makes no difference here.
Btw, comparisons with None should be written as follows:

if x is None:

(but things other than None can also evaluate to False, so a lot of care is needed with the second one)
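
The two points of this issue, assignment versus comparison and `is None` versus a plain truthiness check, can be illustrated with a small sketch. The functions `read_version` and `describe` are hypothetical and are not the code from the screenshot.

```python
def read_version(info: dict):
    """Return info["version"], or None if the key is missing."""
    try:
        version = info["version"]
    except KeyError:
        # An assignment is needed here. Writing `version == None` would only
        # compare and discard the result, leaving `version` unset.
        version = None
    return version


def describe(x) -> str:
    """Distinguish a missing value from values that merely evaluate to False."""
    if x is None:
        return "missing"
    if not x:
        # Reached for 0, "", [] and other falsy values that are not None.
        return "empty or zero"
    return "set"


print(read_version({}), read_version({"version": "1.3.0"}))
print(describe(None), describe(0), describe(""), describe("1.3.0"))
```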

thanks @rastrau for raising the issue

BBOX in WGS84

    layer_data["CENTER_LAT"]=(service[i].boundingBoxWGS84[1]+service[i].boundingBoxWGS84[3])/2
    layer_data["CENTER_LON"]=(service[i].boundingBoxWGS84[0]+service[i].boundingBoxWGS84[2])/2
    layer_data["BBOX"]=' '.join([str(elem) for elem in (service.contents[i].boundingBox)])

-> check for WGS84 by default
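
A possible shape of that change is sketched below: all three fields are derived from the WGS84 box only. The sketch assumes that the layer object exposes `boundingBoxWGS84` as a (west, south, east, north) tuple, as the snippet above suggests. `layer_extent` is a hypothetical helper, and a `SimpleNamespace` stands in for a real layer object so the example runs without a network call.

```python
from types import SimpleNamespace


def layer_extent(layer) -> dict:
    """Derive CENTER_LAT, CENTER_LON and BBOX from the WGS84 bounding box only."""
    bbox = getattr(layer, "boundingBoxWGS84", None)
    if bbox is None:
        # No WGS84 box advertised: leave the fields empty rather than mixing
        # coordinate reference systems.
        return {"CENTER_LAT": None, "CENTER_LON": None, "BBOX": ""}
    west, south, east, north = bbox[:4]
    return {
        "CENTER_LAT": (south + north) / 2,
        "CENTER_LON": (west + east) / 2,
        "BBOX": " ".join(str(v) for v in (west, south, east, north)),
    }


# Stand-in for a layer object; the numbers are made up.
layer = SimpleNamespace(boundingBoxWGS84=(6.0, 46.0, 10.0, 48.0))
extent = layer_extent(layer)
print(extent)
```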

Tag to INSPIRE or eCH catalog

Using NLP on the abstract, name and title, try to create tags/categories.

There are several NLP Python libraries that can be used to analyze text and add it to a predefined group. One popular library is the Natural Language Toolkit (NLTK).

NLTK provides a wide range of tools for natural language processing, including text classification. You can use NLTK's nltk.classify module to train a classifier on a dataset of labeled text, and then use the trained classifier to classify new text into predefined groups.

Another popular library is scikit-learn, a machine learning library for Python that provides various tools for natural language processing, including text classification. With the sklearn.feature_extraction.text.CountVectorizer and sklearn.feature_extraction.text.TfidfVectorizer classes, you can convert a collection of text documents to a matrix of token counts (or TF-IDF values) that can be used as input for a classifier. sklearn.naive_bayes.MultinomialNB, sklearn.svm.SVC and sklearn.linear_model.LogisticRegression are some of the classifiers provided by scikit-learn that can be used for text classification.

You can also use other libraries such as spaCy, TextBlob or Gensim; each has its own features and capabilities that can be used to classify text into predefined groups.

Note that, before using these libraries, you need labeled data to train the classifier, and the text data has to be preprocessed.
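
To make the idea concrete without depending on any of the libraries named above, here is a deliberately tiny, hand-written stand-in: a bag-of-words multinomial naive Bayes classifier with add-one smoothing. It is not NLTK or scikit-learn code, and the class name `TinyNaiveBayes`, the training texts and the two categories are invented for illustration. A real setup would use labeled abstracts and titles and one of the libraries mentioned.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-zäöüéèà]+", text.lower())


class TinyNaiveBayes:
    """Multinomial naive Bayes with add-one smoothing over word counts."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.doc_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus log likelihood of every known word.
            score = math.log(n_docs / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training data with two made-up categories.
train_texts = ["river flood hazard map", "lake water quality",
               "road network traffic", "railway and road noise"]
train_labels = ["water", "water", "transport", "transport"]

model = TinyNaiveBayes().fit(train_texts, train_labels)
print(model.predict("flood zones along the river"))
print(model.predict("road traffic counts"))
```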

Add all cantons

Search (e.g. with Google) for all official and unofficial cantonal WMS, WMTS and WFS services and try to add them.
