binitai / betterloader

A better PyTorch data loader capable of custom image operations and image subsets

Home Page: https://binitai.github.io/BetterLoader/

License: MIT License

Python 99.40% Makefile 0.60%

pytorch-dataloader pytorch torch torchvision deep-learning pytorch-implmention pytorch-dataloader-objects pytorch-transformers

betterloader's People

Forkers

ishaanchandratreya

betterloader's Issues

Add support for SubsetRandomSampler

Feature Request

Adding a sampler param option for the BetterLoader

Description of Problem:

As we begin moving towards supporting unsupervised learning, one of the first steps will be allowing a user to pass a SubsetRandomSampler object into the BetterLoader, which would then be used to load an arbitrary subset of the data.

Potential Solutions:

Let's use #19 to do this.
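As a rough illustration of what such a sampler does, here is a pure-Python stand-in for the behavior of PyTorch's SubsetRandomSampler, which draws the given indices in random order without replacement. The class name `SubsetRandomOrder` and the toy dataset are hypothetical. This is not BetterLoader code, and a real implementation would presumably hand the actual PyTorch sampler through to the underlying DataLoader.

```python
import random


class SubsetRandomOrder:
    """Hypothetical stand-in: yields the given indices in random order, each once."""

    def __init__(self, indices, seed=None):
        self.indices = list(indices)
        self._rng = random.Random(seed)

    def __iter__(self):
        shuffled = self.indices[:]  # copy so the caller's order is left untouched
        self._rng.shuffle(shuffled)
        return iter(shuffled)

    def __len__(self):
        return len(self.indices)


# Toy dataset: ten items, of which only the even positions should be loaded.
dataset = [f"img_{i}.jpg" for i in range(10)]
sampler = SubsetRandomOrder(indices=[0, 2, 4, 6, 8], seed=0)
loaded = [dataset[i] for i in sampler]
```

Only the listed indices are ever visited, which is what makes this kind of sampler useful for loading arbitrary subsets.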

Add support for DataLoaderParams Metadata key

Feature Request

As we require more and more custom params to be set at the underlying DataLoader level, it would be useful to specify all of them in a single dictionary stored under one key of the metadata object.

Description of Problem:

We're going to end up adding more and more constructor args to mimic DataLoader args. This deals with that whole problem entirely.

Potential Solutions:

Add a dataloader_params key to the dataset_metadata parameter passed into the BetterLoader. This would contain a dict of key-value pairs that we want to set at the DataLoader level.
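One way this could look, as a minimal sketch: the helper name `build_dataloader_kwargs` and the default values are assumptions made for illustration, not part of BetterLoader. Only the `dataloader_params` key comes from the proposal above, and `batch_size`, `shuffle`, and `num_workers` are ordinary PyTorch DataLoader arguments.

```python
def build_dataloader_kwargs(dataset_metadata, defaults=None):
    """Merge user-supplied DataLoader settings over defaults (hypothetical helper)."""
    kwargs = dict(defaults or {})
    params = (dataset_metadata or {}).get("dataloader_params", {})
    if not isinstance(params, dict):
        raise TypeError("dataloader_params must be a dict of DataLoader keyword arguments")
    kwargs.update(params)  # user-supplied values win over defaults
    return kwargs


dataset_metadata = {"dataloader_params": {"batch_size": 32, "num_workers": 4}}
kwargs = build_dataloader_kwargs(
    dataset_metadata, defaults={"batch_size": 1, "shuffle": True}
)
# kwargs could then be forwarded as DataLoader(dataset, **kwargs)
```

Forwarding the dict with `**kwargs` means new DataLoader options would not need a matching constructor argument on the BetterLoader side.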

Fix BetterLoader landing page

Feature Request

We really need to give the landing page a facelift.

Description of Problem:

There are too many Docusaurus defaults on the landing page that we never changed.

Potential Solutions:

Customise the icons on the landing page
Pick a consistent color scheme

Change index and subset file inputs to object inputs instead

Feature Request

Description of Problem:

You don't always have the subset and index files saved to disk - sometimes you want to generate them dynamically. Being able to pass in objects/arrays is more useful than passing in just filenames, especially since a caller who does have files can simply open them and pass in the contents.

Potential Solutions:

We may have to tweak the metadata functions that dynamically read these files, but aside from that, I think it would be a useful value-add.

@JamesBollas thoughts?
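A minimal sketch of accepting either form, assuming the index and subset files are JSON. The function name `resolve_json_input` and the sample index contents are hypothetical, not BetterLoader's API.

```python
import json
import os
import tempfile


def resolve_json_input(value):
    """Return parsed data whether `value` is a path to a JSON file or an in-memory object."""
    if isinstance(value, (str, os.PathLike)):
        with open(value, "r", encoding="utf-8") as handle:
            return json.load(handle)
    if isinstance(value, (dict, list)):
        return value
    raise TypeError(f"expected a file path, dict, or list, got {type(value).__name__}")


# In-memory object generated on the fly
index_from_object = resolve_json_input({"cat": ["a.jpg"], "dog": ["b.jpg"]})

# The same content read from a file on disk
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"cat": ["a.jpg"], "dog": ["b.jpg"]}, handle)
    index_from_file = resolve_json_input(path)
```

Keeping the path branch would let existing callers who pass filenames continue to work unchanged.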

Implement transforms as dictionaries

Feature Request

Description of Problem:

Right now, we just pass a single transform object in. This is inconvenient if we want different transforms for train, test, and val, as mentioned in #25.

Potential Solutions:

Rename the transform parameter to transforms and treat it as a dictionary instead. We can then split it within fetch_segmented_dataloaders. We would also want to update our tests to reflect this change and cover the edge case where the transforms parameter is either None or {}.
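The splitting step might be sketched as follows. The helper name `split_transforms`, the key names, and the choice to reject unknown keys are assumptions for illustration; the actual logic inside fetch_segmented_dataloaders may differ.

```python
SPLITS = ("train", "test", "val")


def split_transforms(transforms=None):
    """Return one transform (or None) per split from a possibly empty dict."""
    if transforms is None:
        transforms = {}
    if not isinstance(transforms, dict):
        raise TypeError("transforms must be None or a dict keyed by 'train', 'test', 'val'")
    unknown = set(transforms) - set(SPLITS)
    if unknown:
        raise KeyError(f"unknown transform keys: {sorted(unknown)}")
    # Missing keys fall back to None, meaning "no transform for this split".
    return tuple(transforms.get(split) for split in SPLITS)


train_tf, test_tf, val_tf = split_transforms({"train": str.upper})
none_case = split_transforms(None)
empty_case = split_transforms({})
```

Both edge cases, None and {}, resolve to no transform for every split, so downstream code only has to handle a single "absent" value.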

Baseline Integration Test Suite

Feature Request

Description of Problem:

We currently have 2 integration tests, which exist only for the sake of having them. This needs to be fixed.

Potential Solutions:

Given BetterLoader's modular nature, a comprehensive integration test suite would be key moving forward. Testing various types of index files, as well as the consequent functions that would handle them would all be an integral part of ensuring that we don't break what we've already got :)

Landing page refresh

Feature Request

Description of Problem:

We can probably make the website clearer overall.

Potential Solutions:

Make the actual documentation more informative
Get rid of all the extra detail, and move that to an API documentation page

Rectify all the default variables being passed around

Feature Request

Description of Problem:

We've taken a fairly risky/bad approach by just resorting to setting function params to arbitrary defaults when they aren't passed in. While this works sometimes, I think we've overused this approach and should really trim parts of it down.

Potential Solutions:

Eliminate optional parameters in any non-public function, and actually throw exceptions when things aren't right, rather than passing None values around.

Usage Documentation

Feature Request

Description of Problem:

Again, our usage documentation is extremely minimal, to the point where the library is probably unusable until we write usage docs.

Potential Solutions:

We just need to take a few hours out and document what we've built so far :)

data_transforms

Hi!
Is it possible to use a data transform dictionary?
Like this:

from torchvision import transforms

data_transforms = {
    'train': transforms.Compose([
        transforms.Resize([224,224]),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize([224,224]),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'test': transforms.Compose([
        transforms.Resize([224,224]),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

And do you have an example for the dataset_metadata?

Actually write unit tests

Pretty self-explanatory - I wrote out a skeleton, but didn't actually implement anything.

Review the README.md

Feature Request

Description of Problem:

The README.md for this project could probably be more succinct, and documentation about things like the Makefile could probably be removed.

Potential Solutions:

Maybe we could use something like PyTorch's README as a baseline (ours should be less elaborate, of course). The idea would be to make the README more concise and refined while keeping it to the point.

Baseline unit test suites

Feature Request

Description of Problem:

We currently have no unit tests, just some really basic integration tests that were written to get us off the ground.

Potential Solutions:

We need to write a baseline test suite that unit tests the individual custom classes and their helper methods, and maybe adds a few overall integration tests as well.
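A baseline unit test could be as small as the sketch below. The helper `count_per_class` and the index layout (class name mapped to a list of files) are hypothetical stand-ins for a real BetterLoader helper method; only the `unittest` calls are standard library API.

```python
import unittest


def count_per_class(index):
    """Hypothetical helper under test: map each class name to its number of files."""
    return {label: len(files) for label, files in index.items()}


class CountPerClassTest(unittest.TestCase):
    def test_counts_files_for_each_class(self):
        index = {"cat": ["a.jpg", "b.jpg"], "dog": ["c.jpg"]}
        self.assertEqual(count_per_class(index), {"cat": 2, "dog": 1})

    def test_empty_index_gives_empty_counts(self):
        self.assertEqual(count_per_class({}), {})


# Run the suite programmatically so the result object can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountPerClassTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each helper method would get one such test case covering a normal input and at least one edge case.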

Explain dataset metadata better

Feature Request

Description of Problem:

The Dataset Metadata section of the Getting Started docs page definitely needs some details for the callable function parameters passed as key-value pairs.

The current description is confusing and really difficult to understand without peeking under the hood.

Potential Solutions:

We should probably add a short function example, along with a docstring, for every callable parameter listed.

Feature Request

Description of Problem:

BetterLoader should support unsupervised learning tasks too.

Potential Solutions:

The first step is definitely to review the data loading process for models like autoencoders and chart out a game plan based on that. Updates to this ticket are coming soon.
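For context, an autoencoder is trained to reconstruct its own input, so the loader needs no labels: each sample can serve as its own target. A minimal sketch of that pattern follows; the class name `ReconstructionDataset` and the toy data are hypothetical, and nothing here reflects how BetterLoader will actually implement it.

```python
class ReconstructionDataset:
    """Wrap an indexable collection so each item comes back as (input, target) with target == input."""

    def __init__(self, samples, transform=None):
        self.samples = samples
        self.transform = transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        sample = self.samples[idx]
        if self.transform is not None:
            sample = self.transform(sample)
        # No label lookup: the (transformed) sample is its own target.
        return sample, sample


images = [[0.0, 0.5], [1.0, 0.25]]  # toy stand-ins for image tensors
dataset = ReconstructionDataset(images)
x, target = dataset[1]
```

Because the class only implements `__len__` and `__getitem__`, it follows the same map-style dataset protocol that PyTorch's DataLoader consumes.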
