Models of deprivation sub-domains for the IDEAMAPS data ecosystem project. This repo contains the source code used to run the models and the model outputs, together with the logic to upload model outputs to the IDEAMAPS platform.
@Gtregon to coordinate the reference data team (@Adenikemie + Alex) in creating the training dataset for the new morphological informality model, based on the reference data created in #9
The task involves using 3-point reference data to generate training data from the following datasets:
Satellite imagery (Sentinel)
Google Buildings (Building Density)
Irregular Layout
Road connectivity (TBC) **
Population density (TBC) **
The datasets marked with ** are new to the modelling process and will require some time to familiarise ourselves with.
This task can be completed when we have a training dataset for the morphological informality model, and this dataset is referenced from within GitHub and stored in an accessible place such as CRIB.
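As a rough illustration of what assembling the training dataset involves, the sketch below joins 3-point reference labels to per-cell feature values. All layer names, cell IDs and values are invented placeholders, not the project's actual data:

```python
# Minimal sketch: assemble a training table by joining 3-point reference
# labels (high / medium / low) with per-cell feature values.
# Cell IDs, feature names and values are illustrative only.

reference = {          # cell_id -> 3-point reference label
    "cell_001": "high",
    "cell_002": "low",
    "cell_003": "medium",
}

features = {           # cell_id -> feature values from the input datasets
    "cell_001": {"building_density": 0.82, "irregular_layout": 0.91},
    "cell_002": {"building_density": 0.12, "irregular_layout": 0.05},
    "cell_003": {"building_density": 0.47, "irregular_layout": 0.55},
}

def build_training_rows(reference, features):
    """Pair each labelled cell with its feature vector; skip cells
    missing either a label or features."""
    rows = []
    for cell_id, label in reference.items():
        if cell_id in features:
            row = dict(features[cell_id])
            row["cell_id"] = cell_id
            row["label"] = label
            rows.append(row)
    return rows

training = build_training_rows(reference, features)
print(len(training))  # 3
```

In practice the feature values would come from the Sentinel, Google Buildings and other layers listed above, aggregated to the same grid as the reference data.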
We need to create a model-ready dataset for road connectivity for Nairobi.
The starting point of this issue is that two datasets are ready to go within CRIB:
OSM Road Data
Million Neighbourhood Block Data
This process will involve:
Calculate the length of internal streets within blocks that are connected to the external street network (m)
Calculate the length of internal streets within blocks that are NOT connected to the external street network (m)
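A minimal sketch of the two length calculations, assuming the OSM/block network analysis has already flagged each internal segment as connected or not (the segment records below are invented):

```python
# Sketch: split a block's internal streets into "connected to the external
# street network" vs "not connected", and total their lengths in metres.
# The connectivity flag would come from the OSM / block network analysis;
# the data here is illustrative only.

internal_streets = [
    {"block": "B1", "length_m": 120.0, "connected_external": True},
    {"block": "B1", "length_m": 45.5,  "connected_external": False},
    {"block": "B2", "length_m": 300.0, "connected_external": True},
]

def street_length_totals(segments):
    connected = sum(s["length_m"] for s in segments if s["connected_external"])
    disconnected = sum(s["length_m"] for s in segments if not s["connected_external"])
    return connected, disconnected

connected_m, disconnected_m = street_length_totals(internal_streets)
print(connected_m, disconnected_m)  # 420.0 45.5
```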
This issue can be closed when there is an output dataset that is stored on CRIB and is model ready for NAIROBI.
In this task, @Gtregon will review existing literature relating to WP2/3 activities to ground himself in novel methods. This will enable @Gtregon to justify his actions/decisions during sprint meetings.
What is ongoing at the moment?
What is the gap?
How are we going to fill that gap?
This task can be closed when @Gtregon has completed a literature review of the top 10 papers and their relevance to the project. Link to literature review here.
The full sub-domain modelling network has met (previously w/c 15th April). Going forward, teams will meet within their individual sub-domain groups every two weeks. In the intervening weeks, the team will also regroup at the WP2/3 meetings (also held every two weeks).
The first meeting will help to define a roadmap for each sub-domain.
Envisaged tasks include:
define model spec
define roles and responsibilities within the team
define model parameters and thresholds
define datasets associated with model (or activities for data collection/acquisition)
We need to render documentation pages so that the markdown documents produced in the Docs repository can be rendered automatically on the interface.
This code should reflect the existing UBDC approach to rendering documentation in either the CCTV or UBDC Web Starter Kit repos, so long as this approach is able to render in React Native applications.
See feedback from #44. We need to name files in a standardised way so that future modellers (not us) are able to locate files and understand how they relate to the methods outlined in #43. A meeting has been scheduled so that this work can be discussed with @Gtregon, @AlexandraMiddleton and @Adenikemie, and instructions can be given.
✅ Definition of Done
1. Define acceptance criteria.
2. Assess the need for a review process. If a review process is required, the issue states:
Who is involved in the review?
When will the review take place?
Who is responsible for taking on the feedback?
What additional tasks are involved and are they visible on the backlog?
We need to update the DPIA in the ethics application to reflect the use of Nhost as a new cloud provider.
We will update the following sections:
Once this is updated, we pass to JPA for review and submission to the ethics board. This issue can be closed when the updates are submitted to the ethics board.
We need to discuss with the WP2/3 team about what datasets to use and the approach to modelling building density and irregular layout.
There has been a discussion about using the outputs from phase 1 (small dense structures & irregular settlement layout) as inputs to the new model. However, this depends on the quality of these outputs, given the feedback from the validation activity.
@Gtregon will need to source the datasets that will be used for this purpose and process them so that they are ready for use.
@Gtregon will need to provide a link to a document that describes the method. This should not be a long document that takes a long time to prepare, just something that captures our working method.
The outcome of this task is a dataset that is prepared for the training process.
We need to create a notebook to analyse the validation data from phase 1.
This notebook will ...
load validation data
provide summary statistics of validation data at a grid level.
These summary statistics will directly inform the reference data team and provide them with a new layer containing the summary statistics at a grid level.
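A minimal sketch of the grid-level summary step, with invented field names and records standing in for the phase 1 validation data:

```python
# Sketch: grid-level summary statistics for validation records.
# Field names and values are assumptions; the real notebook would
# load the phase 1 validation data instead of this inline sample.
from statistics import mean

validation = [
    {"grid_id": "g1", "agreement": 1.0},
    {"grid_id": "g1", "agreement": 0.0},
    {"grid_id": "g2", "agreement": 1.0},
]

def summarise_by_grid(records):
    """Group validation records by grid cell and compute simple stats."""
    by_grid = {}
    for rec in records:
        by_grid.setdefault(rec["grid_id"], []).append(rec["agreement"])
    return {
        gid: {"n": len(vals), "mean_agreement": mean(vals)}
        for gid, vals in by_grid.items()
    }

stats = summarise_by_grid(validation)
print(stats["g1"])  # {'n': 2, 'mean_agreement': 0.5}
```

The resulting per-cell statistics could then be exported as the new layer for the reference data team.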
This task can be closed when one notebook has been created, is readable, and has been uploaded to GitHub. @Gtregon to provide further details on the location later.
Also, the reference data team should have the layer in order to close this issue.
We need to create a simple Word document for internal review of the onboarding questions that we will ask users as they initially gain access to the interface.
This issue can be closed after this document has been reviewed by WP4/5.
Refine the scripts used for ISL, SDS and OD, i.e. pull out the metric scripts used for each model.
This issue can be closed when each sub-domain model has a list of scripts that are explicitly used to model the sub-domain i.e. SDS contains scripts for the 11 metrics used to model SDS etc.
We need to follow the traceability process outlined in the image below for creating model-ready datasets for buildings (small dense settlements + irregular layout) - FOR LAGOS ONLY.
Data required is already on CRIB.
This process will involve:
Reperform block analysis
Perform metric clustering
Analyse metric importance
Define metrics for irregularity
Define metrics for SDS
Aggregate metrics to the grid level
K-Means clustering
Create city-level data product.
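As a rough illustration of the K-Means step in the list above, the snippet below clusters invented grid-cell metric vectors with a small pure-stdlib K-Means; the real pipeline would more likely use a library implementation such as scikit-learn's KMeans, and the metric values here are placeholders:

```python
# Sketch: K-Means clustering of grid cells on their aggregated metrics.
# Pure-stdlib implementation for illustration only.
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: assign points to nearest centroid, then
    recompute centroids as cluster means, for a fixed iteration count."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(vals) / len(vals) for vals in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

# Two obvious groups of cells: low-metric and high-metric.
cells = [(0.1, 0.2), (0.15, 0.1), (0.9, 0.8), (0.95, 0.85)]
centroids, clusters = kmeans(cells, k=2)
print(sorted(len(c) for c in clusters))  # [2, 2]
```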
The issue can be closed when LAGOS has a city level data product ready for the modelling process and this dataset is stored on CRIB. Access information for the dataset should be included in this issue.
We need to move the raw data to CRIB. The reason for doing this is to have a single source of truth for all training and reference data (and other data) used for the modelling. This is a secure location that removes the need for separate data management considerations.
This issue can be closed when all spatial data relating to the project has been migrated to CRIB.
We need to discuss with the WP2/3 team about what datasets to use and the approach to modelling population density.
There have already been discussions about using building density as a proxy for population density.
@Gtregon will need to source the datasets that will be used for this purpose and process them so that they are ready for use.
@Gtregon will need to provide a link to a document that describes the method. This should not be a long document that takes a long time to prepare, just something that captures our working method.
The outcome of this task is a dataset that is prepared for the training process.
Use the summary statistics generated in #8 to cross-reference the original reference datasets and determine what can be reused.
Reference data team to create a document that defines the characteristics of 'reusable data' (1st draft).
Use the validation data to determine a range of HIGH - MEDIUM - LOW values that can be used for NEW reference data.
The outcome of this task is a GeoPackage file that contains a range of values matched to grid cells, representing a HIGH - MEDIUM - LOW likelihood that each grid cell is a deprived area.
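One way the HIGH - MEDIUM - LOW assignment could look, sketched with placeholder cut-points that are not agreed thresholds:

```python
# Sketch: map a continuous per-cell score onto the HIGH / MEDIUM / LOW
# classes used for new reference data. The cut-points (0.33, 0.66) and
# cell scores are placeholders, not agreed project thresholds.

def classify(score, low_cut=0.33, high_cut=0.66):
    """Bucket a score in [0, 1] into three likelihood classes."""
    if score >= high_cut:
        return "HIGH"
    if score >= low_cut:
        return "MEDIUM"
    return "LOW"

cells = {"g1": 0.8, "g2": 0.5, "g3": 0.1}
labels = {gid: classify(score) for gid, score in cells.items()}
print(labels)  # {'g1': 'HIGH', 'g2': 'MEDIUM', 'g3': 'LOW'}
```

The labelled cells would then be written to the GeoPackage alongside their grid geometries.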
We need to follow the traceability process outlined in the image below for creating model-ready datasets for buildings (small dense settlements + irregular layout) - FOR KANO ONLY.
Data required is already on CRIB.
This process will involve:
Reperform block analysis
Perform metric clustering
Analyse metric importance
Define metrics for irregularity
Define metrics for SDS
Aggregate metrics to the grid level
K-Means clustering
Create city-level data product.
The issue can be closed when KANO has a city level data product ready for the modelling process and this dataset is stored on CRIB. Access information for the dataset should be included in this issue.
We need to speak to the Data Protection and Freedom of Information Office at UofG about the requirements for switching to Vercel/Nhost.
We may need to liaise further with the Procurement Office.
For now, we are planning to pay for Vercel/Nhost via credit card and we will NOT store any data there until we have approvals in place from the university.
In order to ensure our model parameters for buildings, roads and population align with real world thresholds, @Gtregon will investigate existing examples of representative thresholds within case studies e.g. what thresholds exist within Lagos, Kano and Nairobi (and other small, medium and large cities)? Do these align within our model parameters?
This information will then be used to set thresholds for our model parameters and data acquisition/data preprocessing can begin.
This issue can be closed once thresholds for model parameters (high, med, low) have been established. A draft of model parameters will be sent to the WP2/3 team on Friday 19th and will be discussed during the next WP2/3 meeting on Tuesday 23rd April.
We need to discuss with the WP2/3 team about what datasets to use and the approach to modelling road connectivity.
There is no clear plan yet for how to move forward with this.
@Gtregon will need to source the datasets that will be used for this purpose and process them so that they are ready for use.
@Gtregon will need to provide a link to a document that describes the method. This should not be a long document that takes a long time to prepare, just something that captures our working method.
The outcome of this task is a dataset that is prepared for the training process.
The source code developed in #14 will be used to run a random forest model, using reference data developed during #15, #16 and #17 together with Sentinel-2 imagery, to generate three classes of morphological informality (high, medium and low). Supplementary data will also be used to conduct an accuracy assessment of the model outputs, e.g. 80% of the reference/satellite data will be ingested into the RF model, whilst 20% will be used to validate/assess the accuracy of model outputs.
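A minimal sketch of the 80/20 split and accuracy check described above; the split helper and sample data are illustrative, not the project's actual code:

```python
# Sketch: shuffled 80/20 train/test split plus a simple accuracy score.
# The real pipeline would feed the 80% into the random forest and score
# its predictions on the held-out 20%.
import random

def train_test_split(items, test_fraction=0.2, seed=42):
    """Shuffle items deterministically and split off a test set."""
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predicted, actual):
    """Fraction of predictions that match the reference labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

samples = list(range(100))      # stand-in for labelled reference cells
train, test = train_test_split(samples)
print(len(train), len(test))    # 80 20
print(accuracy(["high", "low"], ["high", "med"]))  # 0.5
```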
This issue can be closed when the model has been run for all three pilot cities and the outputs have been uploaded to the GitHub repo.
We need to speak to Yevdokia about paying for the new cloud services via credit card. We need to ask her if there is an option to pay by direct debit. Yevdokia returns from annual leave on May 20th or 21st.