thelper's Introduction

Overview

Requires Python 3.6+.

This package provides a training framework and CLI for PyTorch-based machine learning projects. It is free software distributed under the Apache Software License version 2.0, built by researchers and developers from the Centre de Recherche Informatique de Montréal / Computer Research Institute of Montreal (CRIM).

To get a general idea of what this framework can be used for, visit the FAQ page. For installation instructions, refer to the installation guide. For usage instructions, refer to the user guide. The auto-generated documentation is available via readthedocs.io.

Notes

Development is still ongoing; the API and internal classes may change in the future.

The project's structure was originally generated by cookiecutter via ionelmc's template.

thelper's People

Contributors

crim-beaulima, fmigneault, fmigneault-crim, plstcharles

thelper's Issues

Possible pillow update

# PILLOW_VERSION removed which causes import to fail with torchvision
pillow<7

The constant has been re-added because its removal broke too many codebases. We could consider moving to pillow>=7.1 to get the latest updates.
https://pillow.readthedocs.io/en/stable/releasenotes/7.1.0.html#pillow-version-constant

There are a few other changes to consider that could affect some pre-processing:
https://pillow.readthedocs.io/en/stable/releasenotes/7.0.0.html#default-resampling-filter

I don't think other listed changes will impact us.

@plstcharles your call

Problem with ImageFolderGDataset

Hello Pierre-Luc,

I'm having a problem with the class ImageFolderGDataset that seems unavailable:

The code is here:
https://colab.research.google.com/gist/sfoucher/0eac5c4d42668cd1515e3a3361abe703/classif_model_packaging.ipynb

Also, if I change the configuration in this cell to the following, which adds the export config:

config = {"name": model_name, "model": model_config, "datasets": datasets_config, "loaders": loaders_config, "export": export_config}

The problem disappears but I suspect the path is just different internally.

Thank you!

Samuel

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-23-6f05e83ea160> in <module>()
      1 config = {"name": model_name, "model": model_config, "datasets": datasets_config, "loaders": loaders_config}
      2 thelper.utils.bypass_queries = True     # avoid blocking ui query
----> 3 thelper.cli.export_model(config, '/content/thelper-export')

2 frames
/content/thelper/thelper/cli.py in export_model(config, save_dir)
    455         task = thelper.tasks.create_task(task)
    456     if task is None and "datasets" in config:
--> 457         _, task = thelper.data.create_parsers(config)  # try to load via datasets...
    458     assert task is not None, "could not get proper task object from export config or data parsers"
    459     if isinstance(trace_input, str):

/content/thelper/thelper/data/utils.py in create_parsers(config, base_transforms)
    323             if "type" not in dataset_config:
    324                 raise AssertionError("missing field 'type' for instantiation of dataset '%s'" % dataset_name)
--> 325             dataset_type = thelper.utils.import_class(dataset_config["type"])
    326             dataset_params = thelper.utils.get_key_def(["params", "parameters"], dataset_config, {})
    327             transforms = None

/content/thelper/thelper/utils.py in import_class(fullname)
    730         module_name, class_name = fullname.rsplit('.', 1)
    731         module = importlib.import_module(module_name)
--> 732     return getattr(module, class_name)
    733 
    734 

AttributeError: module 'thelper.data.geo' has no attribute 'ImageFolderGDataset'
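
The last frame of the traceback shows the class being resolved with importlib.import_module followed by getattr. As a debugging aid, the sketch below does the same lookup but lists the public names the module actually exposes when the requested one is missing. The function name import_class_checked is hypothetical and is not part of thelper; the real thelper.utils.import_class may handle additional cases not visible in the traceback.

```python
import importlib


def import_class_checked(fullname):
    """Resolve 'package.module.ClassName' and fail with a descriptive error.

    Hypothetical helper for illustration only; not thelper's implementation.
    """
    # Split the dotted path into the module part and the final attribute name.
    module_name, class_name = fullname.rsplit(".", 1)
    module = importlib.import_module(module_name)
    if not hasattr(module, class_name):
        # List public names so a typo or a missing optional class is easy to spot.
        available = sorted(n for n in dir(module) if not n.startswith("_"))
        raise AttributeError(
            "module %r has no attribute %r; available names: %s"
            % (module_name, class_name, available)
        )
    return getattr(module, class_name)


# Usage with a standard-library class, so the example runs without thelper.
ordered_dict_cls = import_class_checked("collections.OrderedDict")
print(ordered_dict_cls.__name__)
```

Running the same call against the module named in the error would show which dataset classes that installed version exposes.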

Validate class names not as string

When a dict[int, int] is passed, thelper ingests it without raising anything (ClassNamesHandler simply stores the dict), but an error is raised later, during dataloader execution, by a check that specifically expects dict[str, int].

There should be a pre-check or warning at Task creation that reports the invalid format immediately, to avoid an assert much later. The later an erroneous value is reported, the harder it generally is to track where it came from.

In the special case of dict[int, int], thelper could also convert the labels automatically, since having IDs instead of plain names is a pretty common scenario.
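
The pre-check and automatic conversion proposed above could be sketched as follows. The function normalize_class_names is a hypothetical helper, not an existing thelper API, and it assumes the class names arrive as a dict mapping names to indices. It rejects unsupported key types immediately and converts integer keys to strings.

```python
def normalize_class_names(class_names):
    """Validate a class-name mapping early and convert integer keys to strings.

    Hypothetical helper for illustration only; not part of thelper.
    """
    if not isinstance(class_names, dict):
        raise TypeError("expected a dict of class names, got %s" % type(class_names).__name__)
    normalized = {}
    for name, index in class_names.items():
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(name, bool) or not isinstance(name, (str, int)):
            raise TypeError("class name %r must be a string or an integer ID" % (name,))
        key = str(name)
        if key in normalized:
            # e.g. both 1 and "1" were given as distinct keys.
            raise ValueError("duplicate class name after conversion: %r" % key)
        normalized[key] = index
    return normalized


# Usage with a few of the IDs from the task info shown in the log below.
print(normalize_class_names({235: 0, 231: 1, 240: 2}))
```

Calling such a check when the task object is built would report the problem at its source rather than at dataloader time.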

Here is an example of Task creation working with dict[int, int] and failing right after at dataloader instantiation.

[2020-10-14 15:18:02,554 - thelper.data.utils] DEBUG : loading dataset 'deepglobe' configuration...
[2020-10-14 15:18:04,161 - thelper.data.utils] WARNING : 'task' field detected in dataset 'deepglobe' config; dataset's default task will be ignored
[2020-10-14 15:18:04,163 - thelper.data.utils] INFO : parsed dataset: ginmodelrepo.util.BatchTestPatchesBaseSegDatasetLoader(transforms=thelper.transforms.composers.Compose(transforms=[
	thelper.transforms.wrappers.TransformWrapper(operation=thelper.transforms.operations.SelectChannels(channels={0: 0, 1: 1, 2: 2}), params={}, probability=1, convert_pil=False, target_keys=['data'], linked_fate=True),
	thelper.transforms.wrappers.TransformWrapper(operation=thelper.transforms.operations.CenterCrop(size=(128, 128), bordertype=0, borderval=0), params={}, probability=1, convert_pil=False, target_keys=None, linked_fate=True),
	thelper.transforms.wrappers.TransformWrapper(operation=thelper.transforms.operations.NormalizeMinMax(min=[0.], max=[255.], out_type=<class 'numpy.float32'>), params={}, probability=1, convert_pil=False, target_keys=['data'], linked_fate=True),
	thelper.transforms.wrappers.TransformWrapper(operation=thelper.transforms.operations.NormalizeZeroMeanUnitVar(mean=[0.485 0.456 0.406], std=[0.229 0.224 0.225], out_type=<class 'numpy.float32'>), params={}, probability=1, convert_pil=False, target_keys=['data'], linked_fate=True),
	thelper.transforms.wrappers.TransformWrapper(operation=thelper.transforms.operations.Transpose(axes=[2 0 1]), params={}, probability=1, convert_pil=False, target_keys=['data'], linked_fate=True)
]), deepcopy=False)
[2020-10-14 15:18:04,164 - thelper.data.utils] INFO : task info: thelper.tasks.segm.Segmentation(class_names={235: 0, 231: 1, 240: 2, 203: 3, 255: 4, 223: 5, 242: 6, 83: 7, 63: 8, 161: 9}, input_key='data', label_map_key='mask', meta_keys=[], dontcare=-1, color_map={})
[2020-10-14 15:18:04,164 - thelper.data.utils] DEBUG : splitting datasets and creating loaders...
[2020-10-14 15:18:04,164 - thelper.data.loaders] INFO : splitting datasets with parsed sizes = {'deepglobe': 631}
[2020-10-14 15:18:04,171 - thelper.data.loaders] INFO : initialized loaders with batch counts:
	train = 32
	valid = 8
Traceback (most recent call last):
  File "/usr/local/bin/thelper", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/thelper/cli.py", line 604, in main
    resume_session(ckptdata, save_dir, config=override_config, eval_only=args.eval_only, task_compat=args.task_compat)
  File "/usr/local/lib/python3.6/dist-packages/thelper/cli.py", line 113, in resume_session
    old_task = thelper.tasks.create_task(ckptdata["task"]) if isinstance(ckptdata["task"], str) else ckptdata["task"]
  File "/usr/local/lib/python3.6/dist-packages/thelper/tasks/utils.py", line 58, in create_task
    task = eval(config)
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/thelper/tasks/segm.py", line 62, in __init__
    ClassNamesHandler.__init__(self, class_names=class_names)
  File "/usr/local/lib/python3.6/dist-packages/thelper/ifaces.py", line 92, in __init__
    self.class_names = class_names
  File "/usr/local/lib/python3.6/dist-packages/thelper/ifaces.py", line 128, in class_names
    assert all([isinstance(name, str) for name in class_names]), "all classes must be named with strings"
AssertionError: all classes must be named with strings

Small issue with torch version number under colab

Hello Pierre-Luc,

Weird issue with thelper using Google colab:

ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/usr/local/lib/python3.6/dist-packages/thelper/data/loaders.py", line 39, in default_collate
    torch_ver = [int(v) for v in torch.__version__.split(".")]
  File "/usr/local/lib/python3.6/dist-packages/thelper/data/loaders.py", line 39, in <listcomp>
    torch_ver = [int(v) for v in torch.__version__.split(".")]
ValueError: invalid literal for int() with base 10: '0+cu101'

For some reason, print(torch.__version__) gives 1.5.0+cu101.

cu101 seems to indicate CUDA 10.1

so torch_ver = [int(v) for v in torch.__version__.split(".")] in default_collate crashes.
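
One way to make the version check tolerant of local labels such as +cu101 is sketched below. The helper parse_version is hypothetical and is not the fix applied in thelper. It drops everything after the "+" and keeps only the leading numeric part of each component.

```python
import re


def parse_version(version):
    """Return the leading numeric release components of a version string.

    Hypothetical helper for illustration only; not part of thelper.
    """
    # Drop any local version label such as "+cu101" before splitting.
    public = version.split("+", 1)[0]
    parts = []
    for piece in public.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component (e.g. "dev20200601")
        parts.append(int(match.group()))
    return tuple(parts)


print(parse_version("1.5.0+cu101"))  # (1, 5, 0)
```

Returning a tuple keeps comparisons such as parse_version(v) >= (1, 2) straightforward.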

Samuel
