https://github.com/openjournals/joss-reviews/issues/3934

[joss] feature request: accessible utility to import a dataset

sergioburdisso commented on May 21, 2024

[joss] feature request: accessible utility to import a dataset

Comments (4)

hbaniecki commented on May 21, 2024

No worries. Thanks! Works great.

hbaniecki commented on May 21, 2024

Thanks! If the data won't be saved on disk and only loaded into x_train, y_train etc. then load_from_url(url, folder=None) makes perfect sense.

sergioburdisso commented on May 21, 2024

Hi again @hbaniecki!

Wow, this is an amazing idea 👏👏👏

What do you think about adding a method called load_from_url() to the Dataset class? It would do the same thing as the current Dataset.load_from_files(), but instead of loading the dataset from disk, it would load it from a URL, as you suggested.

Perhaps load_from_url() should take two arguments, load_from_url(url, folder=None): first, the URL from which to download the zipped dataset, and second, an optional argument called something like folder that lets the user specify a particular folder to use from inside the zipped dataset. The example would end up being something like:

from pyss3 import SS3
from pyss3.util import Dataset

x_train, y_train = Dataset.load_from_url("https://url/to/movie_review.zip", "train")
x_test, y_test = Dataset.load_from_url("https://url/to/movie_review.zip", "test")

clf = SS3()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)

What do you think?
(Again, thank you for this suggestion, I think it is an awesome idea 💪😎👍)

sergioburdisso commented on May 21, 2024

Hi @hbaniecki! Sorry for the delay, I just had to wait for the weekend to get down to this. I've added the suggested methods and also updated the README.md. Just check it out and let me know if everything is OK 💪 🤓 👍

Below I'm pasting the commit message that marked this issue as closed:

Now datasets can be directly loaded via a given url, not only from disk.
To achieve this, two methods have been added to Dataset class:

Dataset.load_from_url(...)

Dataset.load_from_url_multilabel(...)

These methods download and extract the zip file (given by the url)
into the system's temporary folder and then call
Dataset.load_from_files() to load it
(or Dataset.load_from_files_multilabel(), respectively).

Note: If the same url is used consecutively, the already downloaded
files will be used as a cache (to avoid downloading and extracting
them again).
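
For readers curious how the download, extract and cache steps described in the commit message might fit together, here is a minimal standard-library sketch. It is not pyss3's actual implementation: the function name fetch_and_extract, the cache_root parameter, the "dataset_cache" folder name and the idea of keying the cache on a hash of the URL are all assumptions made for illustration.

```python
import hashlib
import shutil
import tempfile
import urllib.request
import zipfile
from pathlib import Path


def fetch_and_extract(url, cache_root=None):
    """Download the zip at `url`, extract it, and reuse the result on repeat calls.

    Hypothetical helper: the name, the parameters and the cache layout are
    illustrative assumptions, not part of pyss3.
    """
    # Assumed cache layout: <temp dir>/dataset_cache/<sha256 of the url>
    root = Path(cache_root or tempfile.gettempdir()) / "dataset_cache"
    target = root / hashlib.sha256(url.encode("utf-8")).hexdigest()
    if target.is_dir():
        return target  # cache hit: skip downloading and extracting again

    root.mkdir(parents=True, exist_ok=True)
    # Work in a staging folder so a failed download never looks like a cache hit.
    staging = Path(tempfile.mkdtemp(dir=root))
    try:
        archive = staging / "dataset.zip"
        with urllib.request.urlopen(url) as response, open(archive, "wb") as out:
            shutil.copyfileobj(response, out)
        extracted = staging / "extracted"
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(extracted)
        extracted.rename(target)  # publish the finished folder under its cache key
    finally:
        shutil.rmtree(staging, ignore_errors=True)
    return target
```

A folder returned this way could then be handed to a disk-based loader such as Dataset.load_from_files(), which is the hand-off the commit message describes.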
