rvmaretto / deepgeo
DeepGeo: Deep Learning for Earth Observation data ToolBox
License: GNU General Public License v3.0
Implement a notebook to plot the results of data augmentation.
In the current version, there is a clear bottleneck in the CPU integration of the network weights. Implement profiling options to analyse the execution on each device (CPUs and GPUs).
Implement a first, simpler CNN, such as VGG or something similar.
Samples are currently saved only as PNG files. Save them as GeoTIFF files so they can be visualized in GIS software.
Data augmentation operations are not working on multiple GPUs, and rotation operations are not even running on GPUs, only on one CPU. Fix it.
Implement unittests for the SampleGenerator class.
The synthetic band generated by the computeEVI function seems to be wrong. Verify the results and the formula.
Inside the src folder, create a new folder "deepleeo", which will be the top folder of the package. Verify the structure. Must the main __init__.py be inside this folder?
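As a reference point for the check, here is a minimal NumPy sketch of the standard EVI formula, EVI = 2.5 * (NIR - RED) / (NIR + 6 * RED - 7.5 * BLUE + 1). The function name `compute_evi` and its signature are hypothetical and are not the project's computeEVI. The sketch assumes the inputs are surface reflectance scaled to the 0-1 range; if the bands are stored as scaled integers, the constant term no longer matches and the result can look wrong.

```python
import numpy as np


def compute_evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, soil=1.0):
    """Enhanced Vegetation Index for reflectance bands in the 0-1 range.

    Hypothetical helper for comparison only; the coefficients are the
    commonly used defaults (G=2.5, C1=6, C2=7.5, L=1).
    """
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    blue = np.asarray(blue, dtype=np.float64)

    denominator = nir + c1 * red - c2 * blue + soil
    # Avoid warnings on zero denominators; those pixels become NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        evi = gain * (nir - red) / denominator
    return np.where(denominator == 0, np.nan, evi)
```

Comparing the output of this sketch with the existing synthetic band on a few pixels should show whether the discrepancy comes from the formula or from the scaling of the input bands.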
After this, include the code coverage in the Travis script:
nosetests --with-coverage --cover-erase --cover-package=deepleeo --cover-html
Chips are being saved with some shift.
The Rasterizer class must work with any number of bands. Verify the behavior when the base raster has more than 3 bands. Fix if necessary.
Implement the following strategies:
Implement the U-Net with the possibility of early fusion, stacking images BEFORE the encoder.
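The stacking step of early fusion can be sketched with NumPy alone, assuming each image is an array of shape (height, width, channels) and all images share the same spatial size. The function name `early_fusion_stack` is hypothetical; the U-Net itself is not shown.

```python
import numpy as np


def early_fusion_stack(images):
    """Concatenate co-registered images along the channel axis.

    Hypothetical helper: the result would be fed to the encoder as a
    single input with sum(channels) bands.
    """
    arrays = [np.asarray(img) for img in images]
    if not arrays:
        raise ValueError("at least one image is required")
    spatial_shape = arrays[0].shape[:2]
    for arr in arrays:
        if arr.ndim != 3 or arr.shape[:2] != spatial_shape:
            raise ValueError("all images must be (H, W, C) with the same H and W")
    return np.concatenate(arrays, axis=-1)
```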
This function must generate a new map with false negatives, false positives and pixels correctly classified. It must plot it with matplotlib or seaborn.
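A minimal sketch of the map computation, assuming a binary problem where one label value marks the class of interest. The function name `error_map` and the output codes are hypothetical; plotting with matplotlib or seaborn is left out.

```python
import numpy as np

# Hypothetical output codes for the error map.
CORRECT, FALSE_POSITIVE, FALSE_NEGATIVE = 0, 1, 2


def error_map(truth, predicted, positive=1):
    """Label each pixel as correct, false positive or false negative."""
    truth = np.asarray(truth)
    predicted = np.asarray(predicted)
    if truth.shape != predicted.shape:
        raise ValueError("truth and predicted must have the same shape")

    result = np.full(truth.shape, CORRECT, dtype=np.uint8)
    # Predicted as the positive class, but the reference says otherwise.
    result[(predicted == positive) & (truth != positive)] = FALSE_POSITIVE
    # Reference is the positive class, but the prediction missed it.
    result[(predicted != positive) & (truth == positive)] = FALSE_NEGATIVE
    return result
```

The returned array can be passed to an image plotting call with a three-color colormap.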
Rename module utils to common.
This method must compare the final classification map with the ground truth using the metrics computed in quality_metrics.compute_quality_metrics.
When opened in TerraView, the saved rasters are all black. Verify whether they were saved correctly.
It must be something like data_manager or similar.
Another solution is to move the data augmentation to the utils module.
Implement a Jupyter notebook to rasterize and plot a reference shape file and a reference Landsat image. It will work as a visual test for the rasterizer.
Implement a function to print a summary of the model in the same way as Keras model.summary().
This file must contain the functions to plot original image, labeled image, samples and data augmentation.
Plot also crossentropy and f1-score in tensorboard. Merge them with the accuracy in the same "group".
This method should make predictions on some chips and compute metrics such as F1-score, IoU, overall accuracy, precision, recall, confusion matrix, etc.
This method should also save these results in a folder ./validation inside the training model directory.
Implement a predefined function to compute PCA in the preprocessor. It must receive the number of components to keep as a parameter. To compute it, try to use the following scikit-learn function:
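The metrics listed above can all be derived from a confusion matrix. The sketch below uses only NumPy and assumes integer labels in the range 0 to n_classes - 1; the function names are hypothetical and saving to ./validation is not shown.

```python
import numpy as np


def confusion_matrix(truth, predicted, n_classes):
    """Rows are reference classes, columns are predicted classes."""
    truth = np.asarray(truth).ravel().astype(np.int64)
    predicted = np.asarray(predicted).ravel().astype(np.int64)
    flat_index = truth * n_classes + predicted
    counts = np.bincount(flat_index, minlength=n_classes ** 2)
    return counts.reshape(n_classes, n_classes)


def _safe_divide(numerator, denominator):
    # Classes with a zero denominator get a score of 0 instead of NaN.
    return np.divide(numerator, denominator,
                     out=np.zeros_like(numerator), where=denominator != 0)


def metrics_from_confusion(matrix):
    """Per-class precision, recall, F1 and IoU, plus overall accuracy."""
    matrix = np.asarray(matrix, dtype=np.float64)
    true_pos = np.diag(matrix)
    false_pos = matrix.sum(axis=0) - true_pos
    false_neg = matrix.sum(axis=1) - true_pos

    precision = _safe_divide(true_pos, true_pos + false_pos)
    recall = _safe_divide(true_pos, true_pos + false_neg)
    return {
        "overall_accuracy": true_pos.sum() / matrix.sum(),
        "precision": precision,
        "recall": recall,
        "f1": _safe_divide(2 * precision * recall, precision + recall),
        "iou": _safe_divide(true_pos, true_pos + false_pos + false_neg),
    }
```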
http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html
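To show the intended input and output shapes, here is a dependency-free sketch that computes PCA over the band axis with a NumPy SVD. This is a stand-in, not the scikit-learn function linked above: in the real implementation the centering and projection would be delegated to sklearn.decomposition.PCA with n_components set to the number of components to keep. The function name `pca_bands` and the (height, width, bands) layout are assumptions.

```python
import numpy as np


def pca_bands(image, n_components):
    """Project the bands of an (H, W, B) image onto its first principal axes."""
    image = np.asarray(image, dtype=np.float64)
    height, width, n_bands = image.shape
    if not 1 <= n_components <= n_bands:
        raise ValueError("n_components must be between 1 and the number of bands")

    # Treat every pixel as one observation with n_bands features.
    pixels = image.reshape(-1, n_bands)
    centered = pixels - pixels.mean(axis=0)

    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:n_components].T
    return projected.reshape(height, width, n_components)
```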
Allow the user to select the classes of interest in the rasterizer. Keep the selected classes as "interest classes" and map the remainder to a single "non-class", which will have the same value everywhere in the rasterized data.
The mixture model needs some samples of vegetation, soil and shadow. Verify whether there is a way to automate this process. There are some methodologies to compute it from spectral libraries. Verify.
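On an already rasterized label array, the selection could look like the sketch below. The function name `keep_interest_classes` and the default non-class value of 0 are hypothetical, and the sketch assumes the non-class value is not itself one of the interest classes.

```python
import numpy as np


def keep_interest_classes(labels, interest_classes, non_class=0):
    """Keep the selected class values and map everything else to non_class."""
    labels = np.asarray(labels)
    keep = np.isin(labels, list(interest_classes))
    return np.where(keep, labels, non_class)
```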
To make it easier to visualize the spatial distribution of the samples, instead of saving several geotiffs, save a shape file with the extent of all the samples.
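The extent of each sample can be derived from the raster geotransform and the chip's pixel window. The sketch below assumes a GDAL-style six-element geotransform (x origin, pixel width, row rotation, y origin, column rotation, pixel height) and a hypothetical function name `chip_extent`. Writing the extents to a shape file is not shown.

```python
def chip_extent(geotransform, col_offset, row_offset, width, height):
    """Return (xmin, ymin, xmax, ymax) of a chip in map coordinates."""
    x0, px_w, rot_x, y0, rot_y, px_h = geotransform

    def pixel_to_map(col, row):
        # Affine transform from pixel indices to map coordinates.
        return (x0 + col * px_w + row * rot_x,
                y0 + col * rot_y + row * px_h)

    corners = [
        pixel_to_map(col_offset, row_offset),
        pixel_to_map(col_offset + width, row_offset),
        pixel_to_map(col_offset, row_offset + height),
        pixel_to_map(col_offset + width, row_offset + height),
    ]
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)
```

The same offsets applied to the origin give the geotransform of the chip itself, which may help when checking chips that are saved with a shift.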
The class would be responsible for allowing the user to extract some synthetic data or indexes (NDVI, EVI, etc) in a synthetic band to compose the dataset. The API must provide some predefined functions, like the NDVI and EVI, but allow the user to pass a customized function as parameter. Thus, the system must be able to compute a synthetic band based on this (or these) functions (It must allow to produce more than just one band).
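One possible shape for this API is sketched below: each function receives the whole image and returns a single band, and the results are appended as new channels. The names `ndvi` and `add_synthetic_bands`, the (height, width, bands) layout and the band indices are all assumptions made for illustration.

```python
import numpy as np


def ndvi(image, red_index=0, nir_index=1):
    """NDVI = (NIR - RED) / (NIR + RED); band indices are hypothetical."""
    red = image[..., red_index]
    nir = image[..., nir_index]
    total = nir + red
    # Pixels where NIR + RED is zero get an NDVI of 0.
    return np.divide(nir - red, total, out=np.zeros_like(total), where=total != 0)


def add_synthetic_bands(image, functions):
    """Append one new band per function to an (H, W, B) image."""
    image = np.asarray(image, dtype=np.float32)
    new_bands = []
    for function in functions:
        band = np.asarray(function(image), dtype=image.dtype)
        if band.shape != image.shape[:2]:
            raise ValueError("each function must return an (H, W) band")
        new_bands.append(band[..., np.newaxis])
    return np.concatenate([image] + new_bands, axis=-1)
```

A user-defined index is passed the same way as a predefined one, for example `add_synthetic_bands(image, [ndvi, lambda img: img[..., 1] - img[..., 0]])`.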
This class must also be able to remove classes that the user is not interested in, also cropping the base raster.
Implement AUC metric through the function tf.metrics.auc
reference: https://www.tensorflow.org/api_docs/python/tf/metrics/auc
This class must be able to generate chips for a list of images and shapefiles using the chipGenerator with a given strategy and produce a single dataset.
Verify the impact of the patch size in the classification accuracy. Try from 64 to 256. Is it dependent on the target sizes?
To improve the DNN performance, the data must be normalized, usually between -1 and 1, with the mean centered at 0. Verify this information in the literature, and implement this normalization.
It could use synthetic data, like a 3x3 matrix or something similar.
Is the method geofunctions.load_image really necessary?
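Two common options are sketched below, both applied per band on an array of shape (..., bands). Min-max scaling guarantees the -1 to 1 range but not a zero mean, while standardization guarantees a zero mean but not a fixed range, so which one matches the requirement still needs to be confirmed in the literature. The function names are hypothetical.

```python
import numpy as np


def scale_to_unit_range(image):
    """Min-max scale each band to the range [-1, 1]."""
    image = np.asarray(image, dtype=np.float32)
    axes = tuple(range(image.ndim - 1))  # every axis except the band axis
    low = image.min(axis=axes, keepdims=True)
    high = image.max(axis=axes, keepdims=True)
    # Constant bands would divide by zero; they are mapped to -1 instead.
    span = np.where(high > low, high - low, 1.0)
    return 2.0 * (image - low) / span - 1.0


def standardize(image):
    """Subtract the per-band mean and divide by the per-band standard deviation."""
    image = np.asarray(image, dtype=np.float32)
    axes = tuple(range(image.ndim - 1))
    mean = image.mean(axis=axes, keepdims=True)
    std = image.std(axis=axes, keepdims=True)
    return (image - mean) / np.where(std > 0, std, 1.0)
```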
Is it necessary to convert the data to float32?
Is it necessary to mask it?
Review this method and refactor it if necessary.
Several of the implemented methods have print statements in their body. Remove them.
Keras seems to have some data augmentation functionalities. Verify whether there are other packages with these functionalities. It is better to use them instead of implementing them. Verify whether TensorFlow provides some of these functionalities.
Generate smaller test data to use in the unit tests.
The contrast is not working in the plot_chips method. Try to use plot_img_rgb or plot_labels in it, depending on the number of channels passed as a parameter or on the other parameters (classes, etc.).
Verify whether samples containing "no data" values can confuse the network. If so, implement a strategy to avoid patches containing "no data" values.
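One simple strategy is to discard chips whose share of "no data" pixels exceeds a threshold. The sketch below assumes chips are NumPy arrays and that the "no data" value is known (NaN is handled separately); the function names and the `max_fraction` parameter are hypothetical.

```python
import numpy as np


def no_data_fraction(chip, no_data):
    """Fraction of values in the chip equal to the no-data marker."""
    chip = np.asarray(chip)
    if isinstance(no_data, float) and np.isnan(no_data):
        mask = np.isnan(chip)  # NaN never compares equal to itself
    else:
        mask = chip == no_data
    return float(mask.mean())


def drop_no_data_chips(chips, no_data, max_fraction=0.0):
    """Keep only chips with at most max_fraction of no-data values."""
    return [chip for chip in chips
            if no_data_fraction(chip, no_data) <= max_fraction]
```

With the default `max_fraction=0.0`, any chip containing a single "no data" value is rejected.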
Unit tests are now using bigger data. Change it to use cropped data to fix failing build on Travis.
Implement a U-Net version with the time fusion between the encoder and the decoder.
Implement a method or a class to, given a numpy array, retrieve a raster band, or even a class that will encapsulate the numpy array. It would make it easier for the user to deal with the raster. The class can also take the path to the input raster as a parameter.
Implement a method to split the dataset into train, test and evaluation sets and save it to a TFRecord. Then, load it as a tf.data.Dataset to make it possible to use the MirroredStrategy.
Allow the user to aggregate different classes in the shape file into a new class. Create a new column for this. For example, prodes NAO_FLORESTA and NAO FLORESTA 2 aggregated into the new NAO_FLORESTA class.
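The splitting step can be sketched independently of TensorFlow by shuffling sample indices and cutting them into three parts. The function name `split_indices` and the default fractions are hypothetical; writing the TFRecord and loading it back as a dataset are not shown.

```python
import numpy as np


def split_indices(n_samples, train_fraction=0.7, test_fraction=0.15, seed=0):
    """Shuffle sample indices and split them into train, test and evaluation."""
    if train_fraction < 0 or test_fraction < 0 or train_fraction + test_fraction > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")

    order = np.random.default_rng(seed).permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    n_test = int(round(test_fraction * n_samples))

    train = order[:n_train]
    test = order[n_train:n_train + n_test]
    evaluation = order[n_train + n_test:]  # whatever remains
    return train, test, evaluation
```

Using a fixed seed keeps the split reproducible across runs.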
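The aggregation can be expressed as a lookup from original class names to the new class, applied to the values of the class column to produce the new column. The function name `aggregate_classes` is hypothetical, and the FLORESTA value in the usage example is an invented placeholder.

```python
def aggregate_classes(class_names, groups):
    """Map each class name to its aggregated class.

    groups maps a new class name to the list of original names it replaces;
    names that are not listed are kept unchanged.
    """
    lookup = {}
    for new_name, old_names in groups.items():
        for old_name in old_names:
            lookup[old_name] = new_name
    return [lookup.get(name, name) for name in class_names]


# Usage with the example from this issue (FLORESTA is a placeholder value).
original = ["NAO_FLORESTA", "NAO FLORESTA 2", "FLORESTA"]
groups = {"NAO_FLORESTA": ["NAO_FLORESTA", "NAO FLORESTA 2"]}
aggregated = aggregate_classes(original, groups)
```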
The rasterizer must save a PNG file with the labeled map and a legend with the colors of each class.
Travis is currently running tests only on Linux. Create environments for Windows and Mac OSX.