sentinel-hub / eo-grow
Earth observation framework for scaled-up processing in Python
Home Page: https://eo-grow.readthedocs.io/en/latest/
License: MIT License
Usually we treat the input-data folder as the default when running pipelines: it is the location of the input data to be used in the specific pipeline. It was usually not the case that one had to provide a specific folder key for input-data, so I would suggest using the input-data location as the default when the tiff_folder_key parameter is not provided to the ImportTiffPipeline. This would make the tiff_folder_key parameter optional.
IMHO, since input-data is always there and there is just one, there is no need to explicitly provide it as a folder key.
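The fallback described above could be sketched roughly like this (a minimal sketch using a plain dataclass rather than eo-grow's actual config schema; the class and attribute names are assumptions for illustration only):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ImportTiffConfig:
    # Hypothetical sketch, not eo-grow's real schema class: the folder key
    # becomes optional and falls back to the always-present "input-data".
    tiff_folder_key: Optional[str] = None

    @property
    def resolved_folder_key(self) -> str:
        # Use the explicit key when given, otherwise default to input-data.
        return self.tiff_folder_key or "input-data"

assert ImportTiffConfig().resolved_folder_key == "input-data"
assert ImportTiffConfig(tiff_folder_key="custom").resolved_folder_key == "custom"
```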
What is the problem? Please describe.
Imagine a scenario where you are researching a workflow of nodes which are acyclic in nature. You write a task and add it to the node. You mess around, change things, explore, like researchers do. In the end you use the nodes to construct the workflow and run the workflow.
What can happen (speaking from experience):
Alternatives
It would be helpful if this was somehow better managed, to offer the user an easier way to construct a list of nodes with less potential mistakes.
The first idea I had was an additional method on EOTask, used like this:
nodes_list = []

my_created_task = MyCreatedTask(*args, **kwargs)
my_created_node = my_created_task.get_node(input_nodes=[], nodes_list=nodes_list)

my_next_created_task = MyNextCreatedTask(*args, **kwargs)
my_next_created_node = my_next_created_task.get_node(input_nodes=[my_created_node], nodes_list=nodes_list)
...
The nodes my_*_created_node get created and appended automatically to the nodes_list object.
For simple linear graphs the input_nodes argument could default to [nodes_list[-1]], which points to the last node added to the list.
Again, this is just the first thing that came to mind; I'm not sure it's the best option. I also thought about using decorators, but didn't manage to find a way in which they could be used.
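To make the idea concrete, here is a minimal sketch of the proposed get_node helper, using simplified stand-ins for eolearn.core's EOTask and EONode (the class names, task names, and parameters below are assumptions for illustration, not the real API):

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

# Simplified stand-in for eolearn.core's EONode, for illustration only.
@dataclass
class Node:
    task: Any
    input_nodes: List["Node"] = field(default_factory=list)

class TrackedTask:
    """Illustrative task base with the proposed convenience method."""

    def get_node(
        self,
        input_nodes: Optional[List[Node]] = None,
        nodes_list: Optional[List[Node]] = None,
    ) -> Node:
        # For simple linear graphs, default to the last node added so far.
        if input_nodes is None:
            input_nodes = [nodes_list[-1]] if nodes_list else []
        node = Node(self, input_nodes)
        if nodes_list is not None:
            # Register the node automatically, so the user never forgets
            # to add it (or adds a stale one) when building the workflow.
            nodes_list.append(node)
        return node

class LoadTask(TrackedTask): ...
class ProcessTask(TrackedTask): ...

nodes_list: List[Node] = []
load_node = LoadTask().get_node(input_nodes=[], nodes_list=nodes_list)
process_node = ProcessTask().get_node(nodes_list=nodes_list)  # defaults to [load_node]
```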
Question
I have successfully run the batch download pipeline and would like to convert the batch tiles to eopatches. After locally fixing #12 I've managed to run the batch_to_eopatch pipeline, but I get the following exception in the logs:
Summary of exceptions
LoadUserDataTask (LoadUserDataTask-29825b248e7b11ecbc3b-f57730fc0853):
14 times:
TypeError: execute() missing 1 required positional argument: 'eopatch'
Which is weird, because LoadUserDataTask is the first task, so no eopatch argument should be expected.
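For illustration, the error itself is easy to reproduce outside of eo-grow: it is what happens when a task's execute signature requires an eopatch but the executor invokes it with no arguments, as for a node it treats as having no inputs (the class below is a bare stand-in, not the real LoadUserDataTask):

```python
# Bare stand-in illustrating the TypeError, not the real eo-grow class:
# a task whose `execute` requires an input EOPatch.
class LoadUserDataTask:
    def execute(self, eopatch):
        return eopatch

# Invoking the task with no arguments reproduces the logged exception.
try:
    LoadUserDataTask().execute()
except TypeError as err:
    # The message ends with: missing 1 required positional argument: 'eopatch'
    message = str(err)

print(message)
```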
Here is my config:
{
"pipeline": "eogrow.pipelines.batch_to_eopatch.BatchToEOPatchPipeline",
"folder_key": "data",
"mapping": [
{"batch_files": ["B01.tif"], "feature_type": "data", "feature_name": "B01", "multiply_factor": 1e-4},
{"batch_files": ["B02.tif"], "feature_type": "data", "feature_name": "B02", "multiply_factor": 1e-4},
{"batch_files": ["B03.tif"], "feature_type": "data", "feature_name": "B03", "multiply_factor": 1e-4},
{"batch_files": ["B04.tif"], "feature_type": "data", "feature_name": "B04", "multiply_factor": 1e-4},
{"batch_files": ["B05.tif"], "feature_type": "data", "feature_name": "B05", "multiply_factor": 1e-4},
{"batch_files": ["B06.tif"], "feature_type": "data", "feature_name": "B06", "multiply_factor": 1e-4},
{"batch_files": ["B07.tif"], "feature_type": "data", "feature_name": "B07", "multiply_factor": 1e-4},
{"batch_files": ["B08.tif"], "feature_type": "data", "feature_name": "B08", "multiply_factor": 1e-4},
{"batch_files": ["B8A.tif"], "feature_type": "data", "feature_name": "B8A", "multiply_factor": 1e-4},
{"batch_files": ["B09.tif"], "feature_type": "data", "feature_name": "B09", "multiply_factor": 1e-4},
{"batch_files": ["B10.tif"], "feature_type": "data", "feature_name": "B10", "multiply_factor": 1e-4},
{"batch_files": ["B11.tif"], "feature_type": "data", "feature_name": "B11", "multiply_factor": 1e-4},
{"batch_files": ["B12.tif"], "feature_type": "data", "feature_name": "B12", "multiply_factor": 1e-4},
{"batch_files": ["CLP.tif"], "feature_type": "data", "feature_name": "CLP", "multiply_factor": 0.00392156862745098},
{"batch_files": ["CLM.tif"], "feature_type": "mask", "feature_name": "CLM"},
{"batch_files": ["dataMask.tif"], "feature_type": "mask", "feature_name": "dataMask"}
],
"userdata_feature_name": "BATCH_INFO",
"userdata_timestamp_reader": "eogrow.utils.batch.read_timestamps_from_orbits",
"**global_settings": "${config_path}/sentinel2_l1c_batch_config.json"
}
Let me know if you need to see what sentinel2_l1c_batch_config.json looks like.
process killed while running eogrow.pipelines.prediction.ClassificationPredictionPipeline
To Reproduce
The issue is reproducible when the property "label_encoder_filename" is specified in the configuration. In that case, the LabelEncoder (LE) which was used in the training step is used to decode the labels.
This happens in eogrow.pipelines.prediction.ClassificationPredictionTask at line 163, where we try to reshape the predictions feature into a column feature. But because we are using an LE for decoding, it converts the column predictions array back into a row array. This causes the function transform_to_feature_form to fail without raising any error.
Environment
Python 3.9
Stack trace or screenshots
The issue doesn't give any stack trace; the process is killed with just a small warning, as follows:
............
2023-01-19 15:34:11,381 graphviz._tools DEBUG deprecate positional args: graphviz.graphs.BaseGraph.__init__(['comment', 'filename', 'directory', 'format', 'engine', 'encoding', 'graph_attr', 'node_attr', 'edge_attr', 'body', 'strict'])
2023-01-19 15:34:11,382 graphviz._tools DEBUG deprecate positional args: graphviz.sources.Source.from_file(['directory', 'format', 'engine', 'encoding', 'renderer', 'formatter'])
2023-01-19 15:34:11,383 graphviz._tools DEBUG deprecate positional args: graphviz.sources.Source.__init__(['filename', 'directory', 'format', 'engine', 'encoding'])
2023-01-19 15:34:11,383 graphviz._tools DEBUG deprecate positional args: graphviz.sources.Source.save(['directory'])
2023-01-19 15:35:16,970 py.warnings WARNING /Users/dhiman/envs/uc_land-cover/lib/python3.9/site-packages/sklearn/preprocessing/_label.py:155: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
y = column_or_1d(y, warn=True)
Solution
I was able to fix the issue by moving the line where the new axis is added (i.e. line 163 in eogrow.pipelines.prediction => predictions = predictions[..., np.newaxis]) to after the labels are decoded. I tried to push my branch with the fix to create a PR, but I don't have the rights to the repo.
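The shape problem can be illustrated without sklearn: decoding with a LabelEncoder internally ravels its input to 1-D (hence the DataConversionWarning in the logs above), so a new axis added before decoding is silently dropped, while adding it after decoding preserves the column shape. A minimal numpy sketch, with a classes array standing in for the encoder's mapping (values are made up for illustration):

```python
import numpy as np

# Hypothetical encoded -> original class mapping, standing in for
# LabelEncoder.classes_; index i holds the original label for code i.
classes = np.array([10, 20, 30])

predictions = np.array([0, 2, 1])  # raw model output codes, shape (3,)

# Buggy order: add the axis first, then decode. The decode step ravels
# its input (as sklearn's column_or_1d does), so the axis is lost again.
col = predictions[..., np.newaxis]    # shape (3, 1)
decoded_buggy = classes[col.ravel()]  # shape (3,) -- no longer a column

# Fixed order: decode first, then add the axis, keeping a column feature.
decoded = classes[predictions][..., np.newaxis]  # shape (3, 1)

print(decoded_buggy.shape, decoded.shape)
```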
Describe the bug
Seems like even with the changes from #12 I still get issues if boundless=True is used in the src.read part of ImportFromTiffTask. I tried setting boundless to False and then it works. This makes sense if eopatch is None, but not necessarily if it already exists, so this needs to be tested.
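For context, boundless=True in rasterio's src.read allows the requested window to extend past the raster bounds, filling out-of-bounds pixels instead of raising. A hypothetical pure-numpy illustration of that behavior (this helper is an illustration of the concept, not rasterio's implementation):

```python
import numpy as np

def boundless_read(raster: np.ndarray, row0: int, col0: int,
                   h: int, w: int, fill_value=0) -> np.ndarray:
    """Read an h x w window at (row0, col0); out-of-bounds pixels get fill_value."""
    out = np.full((h, w), fill_value, dtype=raster.dtype)
    # Intersection of the requested window with the raster extent.
    r0, c0 = max(row0, 0), max(col0, 0)
    r1 = min(row0 + h, raster.shape[0])
    c1 = min(col0 + w, raster.shape[1])
    if r1 > r0 and c1 > c0:
        out[r0 - row0 : r1 - row0, c0 - col0 : c1 - col0] = raster[r0:r1, c0:c1]
    return out

tile = np.arange(16).reshape(4, 4)
# Window extends past the top-left corner; the overhang is zero-filled.
window = boundless_read(tile, -1, -1, 3, 3)
print(window)
```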
To Reproduce
What is the problem? Please describe.
The batch download pipeline doesn't have the option to set a maxcc (maximum cloud coverage) filter. This would allow the user to filter out tiles above a given cloud coverage and reduce the size of the downloaded data.
It's possible that I'm missing some fact and that this was perhaps intentional.
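If the option were added, the pipeline config might expose it as a single parameter, presumably forwarded to the maxcc argument of sentinelhub-py's SentinelHubRequest.input_data. Both the key name and the pipeline path below are assumptions, purely illustrative:

```json
{
    "pipeline": "eogrow.pipelines.download_batch.BatchDownloadPipeline",
    "maxcc": 0.3
}
```

Here maxcc would be the maximum allowed cloud coverage as a fraction in [0, 1].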
Describe the bug
When running the batch_to_eopatch pipeline an import error is thrown.
Stack trace or screenshots
File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/eogrow/pipelines/batch_to_eopatch.py", line 9, in <module>
from eolearn.core import (
ImportError: cannot import name 'RemoveFeature' from 'eolearn.core' (/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/eolearn/core/__init__.py)
and
File "/Users/mlubej/work/projects/sh-project/eo-grow/eogrow/pipelines/batch_to_eopatch.py", line 19, in <module>
from eolearn.io import ImportFromTiff
ImportError: cannot import name 'ImportFromTiff' from 'eolearn.io' (/Users/mlubej/work/projects/sh-project/eo-learn/io/eolearn/io/__init__.py)
Solution
I guess it should be RemoveFeatureTask instead of RemoveFeature and ImportFromTiffTask instead of ImportFromTiff.
Environment
Both eo-learn and eo-grow are at v1.0.0.