betterloader's People

Contributors

devopsbinit, ishaanchandratreya, jamesbollas, raghavmecheri

betterloader's Issues

Add support for SubsetRandomSampler

Feature Request

Adding a sampler param option for the BetterLoader

Description of Problem:

As we begin moving towards supporting unsupervised learning, one of the first steps will be allowing a user to pass a SubsetRandomSampler object into the BetterLoader, which would then be used to load data in an arbitrary order.

Potential Solutions:

Let's use #19 to do this.

Add support for DataLoaderParams Metadata key

Feature Request

As we need to set more and more custom params at the underlying DataLoader level, it would be useful to specify all of them in a single dict stored under one key of the metadata object.

Description of Problem:

We're going to end up adding more and more constructor args to mimic DataLoader args. This deals with that whole problem entirely.

Potential Solutions:

Add a dataloader_params key to the dataset_metadata parameter passed into the BetterLoader. This would contain a dict of key-value pairs that we want to set at the DataLoader level.
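
A minimal sketch of how such a key might be consumed. The helper name `resolve_dataloader_params` and the default values are hypothetical and chosen for illustration only; they are not taken from the BetterLoader codebase.

```python
# Hypothetical defaults for illustration; the library's real defaults may differ.
DEFAULT_DATALOADER_PARAMS = {"batch_size": 1, "shuffle": False, "num_workers": 0}


def resolve_dataloader_params(dataset_metadata=None):
    """Merge user-supplied DataLoader kwargs over the defaults."""
    metadata = dataset_metadata or {}
    overrides = metadata.get("dataloader_params", {})
    if not isinstance(overrides, dict):
        raise TypeError("dataloader_params must be a dict of keyword arguments")
    # Later keys win, so user overrides replace the defaults.
    return {**DEFAULT_DATALOADER_PARAMS, **overrides}


params = resolve_dataloader_params(
    {"dataloader_params": {"batch_size": 32, "shuffle": True}}
)
```

The merged dict could then be forwarded as keyword arguments when the underlying DataLoader is constructed, so new DataLoader options would not each need their own constructor arg.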

Fix BetterLoader landing page

Feature Request

We really need to give the landing page a facelift

Description of Problem:

  1. There are too many Docusaurus defaults on the landing page that we never changed

Potential Solutions:

  1. Customise the icons on the landing page
  2. Pick a consistent color scheme

Change index and subset file inputs to object inputs instead

Feature Request

Description of Problem:

You don't always have subset and index files saved on disk; sometimes you want to generate them dynamically. Being able to pass in objects/arrays is more useful than passing in just filenames, especially since a caller who does have files can simply open them and pass in their contents.

Potential Solutions:

We may have to tweak the metadata functions that dynamically read these files, but aside from that, I think it would be a useful value-add.
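
A rough sketch of one way to accept both forms. The function name `load_index` is hypothetical, and the sketch assumes index files are stored as JSON, which may not match the actual file format.

```python
import json
import os
import tempfile


def load_index(index):
    """Return index data from either a file path or an in-memory object."""
    if isinstance(index, (str, os.PathLike)):
        # Assumption: index files on disk are JSON.
        with open(index, "r", encoding="utf-8") as handle:
            return json.load(handle)
    if isinstance(index, (dict, list)):
        return index
    raise TypeError(f"Unsupported index type: {type(index).__name__}")


# Both call styles yield the same data.
in_memory = {"cat": ["cat1.jpg"], "dog": ["dog1.jpg"]}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(in_memory, handle)
    from_file = load_index(path)
from_object = load_index(in_memory)
```

Normalising the input in one place like this would keep existing filename-based callers working while the metadata functions only ever see the loaded object.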

@JamesBollas thoughts?

Implement transforms as dictionaries

Feature Request

Description of Problem:

Right now, we just pass a single transform object in. This is inconvenient if we want different transforms for train, test, and val, as mentioned in #25.

Potential Solutions:

Rename the transform parameter to transforms and treat it as a dictionary instead. We can then split it within fetch_segmented_dataloaders. We would also want to update our tests to reflect this change and to cover the edge case where the transforms parameter is either None or {}.
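
A small sketch of the splitting step, including the None and {} edge cases. The helper `split_transforms` and the fixed split names are assumptions made for illustration; the real logic inside fetch_segmented_dataloaders may look different.

```python
SPLITS = ("train", "test", "val")


def split_transforms(transforms=None):
    """Map each split name to its transform, or None when unspecified."""
    if not transforms:  # covers both None and {}
        return {split: None for split in SPLITS}
    if not isinstance(transforms, dict):
        raise TypeError("transforms must be a dict keyed by split name")
    unknown = set(transforms) - set(SPLITS)
    if unknown:
        raise KeyError(f"Unknown split(s): {sorted(unknown)}")
    return {split: transforms.get(split) for split in SPLITS}


# Any callables work here; plain functions stand in for real transform objects.
resolved = split_transforms({"train": str.upper, "val": str.lower})
```

Here a split missing from the dictionary resolves to None, which is one possible convention; falling back to a shared default transform would be another.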

Baseline Integration Test Suite

Feature Request

Description of Problem:

We currently have 2 integration tests, which exist mostly for the sake of having them. This needs to be fixed.

Potential Solutions:

Given BetterLoader's modular nature, a comprehensive integration test suite will be key moving forward. Testing various types of index files, as well as the functions that handle them, will be an integral part of ensuring that we don't break what we've already got :)

Landing page refresh

Feature Request

Description of Problem:

We can probably make the website clearer overall.

Potential Solutions:

  1. Make the actual documentation more informative
  2. Get rid of all the extra detail, and move that to an API documentation page

Rectify all the default variables being passed around

Feature Request

Description of Problem:

We've taken a fairly risky approach by setting function params to arbitrary defaults whenever they aren't passed in. While this works sometimes, I think we've overused it and should trim it down.

Potential Solutions:

Eliminate optional parameters in any non-public function, and actually throw exceptions when things aren't right, rather than passing None values around.

Usage Documentation

Feature Request

Description of Problem:

Again, our usage documentation is extremely minimal, to the point where the library is probably unusable until we write usage docs.

Potential Solutions:

We just need to set aside a few hours and document what we have so far :)

data_transforms

Hi!
Is it possible to use a data transform dictionary, like this?

from torchvision import transforms

data_transforms = {
    'train': transforms.Compose([
        transforms.Resize([224,224]),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize([224,224]),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'test': transforms.Compose([
        transforms.Resize([224,224]),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

Also, do you have an example of the dataset_metadata?

Review the README.md

Feature Request

Description of Problem:

The README.md for this project could probably be more succinct, and documentation about things like the Makefile could probably be removed.

Potential Solutions:

Maybe we could use something like PyTorch's README as a baseline (ours should be less elaborate, of course). The idea would be to make the README more concise and polished while keeping it to the point.

Baseline unit test suites

Feature Request

Description of Problem:

We currently have no unit tests, only a few very basic integration tests to get us off the ground.

Potential Solutions:

We need to write a baseline test suite that unit tests the individual custom classes and their helper methods, and perhaps adds a few overall integration tests as well.

Explain dataset metadata better

Feature Request

Description of Problem:

The Dataset Metadata section of the Getting Started docs page needs more detail on the callable function parameters passed as key-value pairs.

The current description is confusing and difficult to understand without looking under the hood.

Potential Solutions:

We should probably add a short function example along with a docstring for every callable parameter listed.

Shuffle

Hi!
Is it possible to shuffle the data?
I could not find this in the documentation.

Unsupervised Learning support

Feature Request

Description of Problem:

BetterLoader should support unsupervised learning tasks too.

Potential Solutions:

The first step is to review the data loading process for models like autoencoders and chart out a game plan based on that. Updates to this ticket are coming soon.
