sentinel-hub / eo-grow
Earth observation framework for scaled-up processing in Python
Home Page: https://eo-grow.readthedocs.io/en/latest/
License: MIT License
Usually we treat the input-data folder as the default when running pipelines: it is the location of the input data to be used in the specific pipeline. It was usually not the case that one had to provide a specific folder key for input-data, so I would suggest using the input-data location as the default when the tiff_folder_key parameter is not provided to the ImportTiffPipeline. This would make the tiff_folder_key parameter optional.
IMHO, since input-data is always there and there is just one, there is no need to explicitly provide it as a folder key.
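The fallback described above could be sketched roughly like this (a minimal sketch using a plain dataclass rather than eo-grow's actual config schema; the class and attribute names are assumptions for illustration only):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ImportTiffConfig:
    # Hypothetical sketch, not eo-grow's real schema class: the folder key
    # becomes optional and falls back to the always-present "input-data".
    tiff_folder_key: Optional[str] = None

    @property
    def resolved_folder_key(self) -> str:
        # Use the explicit key when given, otherwise default to input-data.
        return self.tiff_folder_key or "input-data"

assert ImportTiffConfig().resolved_folder_key == "input-data"
assert ImportTiffConfig(tiff_folder_key="custom").resolved_folder_key == "custom"
```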
What is the problem? Please describe.
Imagine a scenario where you are researching a workflow of nodes which are acyclic in nature. You write a task and add it to the node. You mess around, change things, explore, like researchers do. In the end you use the nodes to construct the workflow and run the workflow.
What can happen (speaking from experience):
Alternatives
It would be helpful if this was somehow better managed, to offer the user an easier way to construct a list of nodes with less potential mistakes.
The first idea I had was an additional method on EOTask, used like this:
nodes_list = []

my_created_task = MyCreatedTask(*args, **kwargs)
my_created_node = my_created_task.get_node(input_nodes=[], nodes_list=nodes_list)

my_next_created_task = MyNextCreatedTask(*args, **kwargs)
my_next_created_node = my_next_created_task.get_node(input_nodes=[my_created_node], nodes_list=nodes_list)
...
The nodes my_*_created_node get created and appended automatically to the nodes_list object.
For simple linear graphs the input_nodes argument could default to [nodes_list[-1]], which points to the last node added to the list.
Again, this is just the first thing that came to mind; I'm not sure it's the best option. I also thought about using decorators, but didn't manage to find a way in which they could be used.
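To make the idea concrete, here is a minimal sketch of the proposed get_node helper, using simplified stand-ins for eolearn.core's EOTask and EONode (the class names, task names, and parameters below are assumptions for illustration, not the real API):

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

# Simplified stand-in for eolearn.core's EONode, for illustration only.
@dataclass
class Node:
    task: Any
    input_nodes: List["Node"] = field(default_factory=list)

class TrackedTask:
    """Illustrative task base with the proposed convenience method."""

    def get_node(
        self,
        input_nodes: Optional[List[Node]] = None,
        nodes_list: Optional[List[Node]] = None,
    ) -> Node:
        # For simple linear graphs, default to the last node added so far.
        if input_nodes is None:
            input_nodes = [nodes_list[-1]] if nodes_list else []
        node = Node(self, input_nodes)
        if nodes_list is not None:
            # Register the node automatically, so the user never forgets
            # to add it (or adds a stale one) when building the workflow.
            nodes_list.append(node)
        return node

class LoadTask(TrackedTask): ...
class ProcessTask(TrackedTask): ...

nodes_list: List[Node] = []
load_node = LoadTask().get_node(input_nodes=[], nodes_list=nodes_list)
process_node = ProcessTask().get_node(nodes_list=nodes_list)  # defaults to [load_node]
```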
Question
I have successfully run the batch download pipeline and would like to convert the batch tiles to eopatches. After locally fixing #12 I've managed to run the batch_to_eopatch pipeline, but I get the following exception in the logs:
Summary of exceptions
LoadUserDataTask (LoadUserDataTask-29825b248e7b11ecbc3b-f57730fc0853):
14 times:
TypeError: execute() missing 1 required positional argument: 'eopatch'
Which is weird, because LoadUserDataTask is the first task, so no eopatch argument should be expected.
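For illustration, the error itself is easy to reproduce outside of eo-grow: it is what happens when a task's execute signature requires an eopatch but the executor invokes it with no arguments, as for a node it treats as having no inputs (the class below is a bare stand-in, not the real LoadUserDataTask):

```python
# Bare stand-in illustrating the TypeError, not the real eo-grow class:
# a task whose `execute` requires an input EOPatch.
class LoadUserDataTask:
    def execute(self, eopatch):
        return eopatch

# Invoking the task with no arguments reproduces the logged exception.
try:
    LoadUserDataTask().execute()
except TypeError as err:
    # The message ends with: missing 1 required positional argument: 'eopatch'
    message = str(err)

print(message)
```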
Here is my config:
{
"pipeline": "eogrow.pipelines.batch_to_eopatch.BatchToEOPatchPipeline",
"folder_key": "data",
"mapping": [
{"batch_files": ["B01.tif"], "feature_type": "data", "feature_name": "B01", "multiply_factor": 1e-4},
{"batch_files": ["B02.tif"], "feature_type": "data", "feature_name": "B02", "multiply_factor": 1e-4},
{"batch_files": ["B03.tif"], "feature_type": "data", "feature_name": "B03", "multiply_factor": 1e-4},
{"batch_files": ["B04.tif"], "feature_type": "data", "feature_name": "B04", "multiply_factor": 1e-4},
{"batch_files": ["B05.tif"], "feature_type": "data", "feature_name": "B05", "multiply_factor": 1e-4},
{"batch_files": ["B06.tif"], "feature_type": "data", "feature_name": "B06", "multiply_factor": 1e-4},
{"batch_files": ["B07.tif"], "feature_type": "data", "feature_name": "B07", "multiply_factor": 1e-4},
{"batch_files": ["B08.tif"], "feature_type": "data", "feature_name": "B08", "multiply_factor": 1e-4},
{"batch_files": ["B8A.tif"], "feature_type": "data", "feature_name": "B8A", "multiply_factor": 1e-4},
{"batch_files": ["B09.tif"], "feature_type": "data", "feature_name": "B09", "multiply_factor": 1e-4},
{"batch_files": ["B10.tif"], "feature_type": "data", "feature_name": "B10", "multiply_factor": 1e-4},
{"batch_files": ["B11.tif"], "feature_type": "data", "feature_name": "B11", "multiply_factor": 1e-4},
{"batch_files": ["B12.tif"], "feature_type": "data", "feature_name": "B12", "multiply_factor": 1e-4},
{"batch_files": ["CLP.tif"], "feature_type": "data", "feature_name": "CLP", "multiply_factor": 0.00392156862745098},
{"batch_files": ["CLM.tif"], "feature_type": "mask", "feature_name": "CLM"},
{"batch_files": ["dataMask.tif"], "feature_type": "mask", "feature_name": "dataMask"}
],
"userdata_feature_name": "BATCH_INFO",
"userdata_timestamp_reader": "eogrow.utils.batch.read_timestamps_from_orbits",
"**global_settings": "${config_path}/sentinel2_l1c_batch_config.json"
}
Let me know if you need to see what sentinel2_l1c_batch_config.json looks like.
process killed while running eogrow.pipelines.prediction.ClassificationPredictionPipeline
To Reproduce
The issue is reproducible when the property "label_encoder_filename" is specified in the configuration. In that case, the LabelEncoder (LE) which was used in the training step is used to decode the labels.
This happens in eogrow.pipelines.prediction.ClassificationPredictionTask at line 163, where we try to reshape the predictions feature into a column feature. But because we are using an LE for decoding, it converts the column predictions array back into a row array. This causes the function transform_to_feature_form to fail without raising any error.
Environment
Python 3.9
Stack trace or screenshots
The issue doesn't give any stack trace; the process is killed with just a small warning, as follows:
............
2023-01-19 15:34:11,381 graphviz._tools DEBUG deprecate positional args: graphviz.graphs.BaseGraph.__init__(['comment', 'filename', 'directory', 'format', 'engine', 'encoding', 'graph_attr', 'node_attr', 'edge_attr', 'body', 'strict'])
2023-01-19 15:34:11,382 graphviz._tools DEBUG deprecate positional args: graphviz.sources.Source.from_file(['directory', 'format', 'engine', 'encoding', 'renderer', 'formatter'])
2023-01-19 15:34:11,383 graphviz._tools DEBUG deprecate positional args: graphviz.sources.Source.__init__(['filename', 'directory', 'format', 'engine', 'encoding'])
2023-01-19 15:34:11,383 graphviz._tools DEBUG deprecate positional args: graphviz.sources.Source.save(['directory'])
2023-01-19 15:35:16,970 py.warnings WARNING /Users/dhiman/envs/uc_land-cover/lib/python3.9/site-packages/sklearn/preprocessing/_label.py:155: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
y = column_or_1d(y, warn=True)
Solution
I was able to fix the issue by moving the line where the new axis is added (i.e. line 163 in eogrow.pipelines.prediction => predictions = predictions[..., np.newaxis]) to after the labels are decoded. I tried to push my branch with the fix to create a PR, but I don't have the rights to the repo.
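The shape problem can be illustrated without sklearn: decoding with a LabelEncoder internally ravels its input to 1-D (hence the DataConversionWarning in the logs above), so a new axis added before decoding is silently dropped, while adding it after decoding preserves the column shape. A minimal numpy sketch, with a classes array standing in for the encoder's mapping (values are made up for illustration):

```python
import numpy as np

# Hypothetical encoded -> original class mapping, standing in for
# LabelEncoder.classes_; index i holds the original label for code i.
classes = np.array([10, 20, 30])

predictions = np.array([0, 2, 1])  # raw model output codes, shape (3,)

# Buggy order: add the axis first, then decode. The decode step ravels
# its input (as sklearn's column_or_1d does), so the axis is lost again.
col = predictions[..., np.newaxis]    # shape (3, 1)
decoded_buggy = classes[col.ravel()]  # shape (3,) -- no longer a column

# Fixed order: decode first, then add the axis, keeping a column feature.
decoded = classes[predictions][..., np.newaxis]  # shape (3, 1)

print(decoded_buggy.shape, decoded.shape)
```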
Describe the bug
Seems like even with the changes from #12 I still get issues if boundless=True is used in the src.read part of ImportFromTiffTask. I tried setting boundless to False and then it works. This makes sense if eopatch is None, but not necessarily if it already exists, so this needs to be tested.
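For context, boundless=True in rasterio's src.read allows the requested window to extend past the raster bounds, filling out-of-bounds pixels instead of raising. A hypothetical pure-numpy illustration of that behavior (this helper is an illustration of the concept, not rasterio's implementation):

```python
import numpy as np

def boundless_read(raster: np.ndarray, row0: int, col0: int,
                   h: int, w: int, fill_value=0) -> np.ndarray:
    """Read an h x w window at (row0, col0); out-of-bounds pixels get fill_value."""
    out = np.full((h, w), fill_value, dtype=raster.dtype)
    # Intersection of the requested window with the raster extent.
    r0, c0 = max(row0, 0), max(col0, 0)
    r1 = min(row0 + h, raster.shape[0])
    c1 = min(col0 + w, raster.shape[1])
    if r1 > r0 and c1 > c0:
        out[r0 - row0 : r1 - row0, c0 - col0 : c1 - col0] = raster[r0:r1, c0:c1]
    return out

tile = np.arange(16).reshape(4, 4)
# Window extends past the top-left corner; the overhang is zero-filled.
window = boundless_read(tile, -1, -1, 3, 3)
print(window)
```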
To Reproduce
What is the problem? Please describe.
The batch download pipeline doesn't have the option to set a maxcc (maximum cloud coverage) filter. This would allow the user to filter out tiles above a given cloud coverage and reduce the size of the downloaded data.
It's possible that I'm missing some fact and that this was perhaps intentional.
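If the option were added, the pipeline config might expose it as a single parameter, presumably forwarded to the maxcc argument of sentinelhub-py's SentinelHubRequest.input_data. Both the key name and the pipeline path below are assumptions, purely illustrative:

```json
{
    "pipeline": "eogrow.pipelines.download_batch.BatchDownloadPipeline",
    "maxcc": 0.3
}
```

Here maxcc would be the maximum allowed cloud coverage as a fraction in [0, 1].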
Describe the bug
When running the batch_to_eopatch pipeline an import error is thrown.
Stack trace or screenshots
File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/eogrow/pipelines/batch_to_eopatch.py", line 9, in <module>
from eolearn.core import (
ImportError: cannot import name 'RemoveFeature' from 'eolearn.core' (/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/eolearn/core/__init__.py)
and
File "/Users/mlubej/work/projects/sh-project/eo-grow/eogrow/pipelines/batch_to_eopatch.py", line 19, in <module>
from eolearn.io import ImportFromTiff
ImportError: cannot import name 'ImportFromTiff' from 'eolearn.io' (/Users/mlubej/work/projects/sh-project/eo-learn/io/eolearn/io/__init__.py)
Solution
I guess it should be RemoveFeatureTask instead of RemoveFeature and ImportFromTiffTask instead of ImportFromTiff.
Environment
Both eo-learn and eo-grow are at v1.0.0.