First, thanks for an elegant library that has saved me a significant amount of time over the past couple of years.

Now, the problem: since the name change, I've been trying to refactor some old code to work with newer versions of torch (specifically `torch==1.8.1+cu101`). While doing so, I seem to have uncovered a confusing issue that shows up upon installation of torchdatasets: neither `pip install torchdatasets` nor `pip install torchdatasets==0.2.0` results in a version of the package identical to the one tagged `0.2.0` on GitHub. This became a problem for me as soon as I tried a simple import:
```
(ffcv-test) jrose3@serrep3:/media/data/jacob/GitHub/ffcv/examples/cifar$ python
Python 3.8.12 | packaged by conda-forge | (default, Jan 30 2022, 23:42:07)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torchdatasets as torchdata
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/media/data/conda/jrose3/envs/ffcv-test/lib/python3.8/site-packages/torchdatasets/__init__.py", line 60, in <module>
    from . import cachers, datasets, maps, modifiers, samplers
  File "/media/data/conda/jrose3/envs/ffcv-test/lib/python3.8/site-packages/torchdatasets/datasets.py", line 28, in <module>
    from torch.utils.data import _typing
ImportError: cannot import name '_typing' from 'torch.utils.data' (/media/data/conda/jrose3/envs/ffcv-test/lib/python3.8/site-packages/torch/utils/data/__init__.py)
```
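For anyone wanting to confirm this in their own environment without triggering the full import, here's a quick stdlib-only check (the helper name is mine) that reports whether `torch.utils.data._typing` is importable:

```python
import importlib.util

def module_available(name: str) -> bool:
    """True if `name` resolves to an importable module;
    False if it, or any parent package, is missing."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        return False

# False on torch==1.8.1, True on torch>=1.9.0 (where _typing was added).
print(module_available("torch.utils.data._typing"))
```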
Examining the two implicated Python files (one in torch, one in torchdatasets), I realized that the module `torch.utils.data._typing` isn't actually introduced into torch until version `torch==1.9.0`, while I'm currently using `torch==1.8.1`; and as far as I can tell, the only stated requirement of the torchdatasets library is the `torch>=1.2.0` listed in `requirements.txt`.
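Since `_typing` first shipped in `torch==1.9.0`, the mismatch with the stated `torch>=1.2.0` requirement is easy to check mechanically. A minimal sketch (hypothetical helper, stdlib only) that compares a torch version string against that threshold:

```python
def torch_has_typing(version: str) -> bool:
    """Rough check: torch.utils.data._typing first shipped in torch 1.9.0.
    Strips local version segments like '+cu101' before comparing."""
    base = version.split("+")[0]
    major, minor = (int(p) for p in base.split(".")[:2])
    return (major, minor) >= (1, 9)

print(torch_has_typing("1.8.1+cu101"))  # False: my version predates _typing
print(torch_has_typing("1.9.0"))        # True
```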
Looking further into the torchdatasets file that relies on `torch.utils.data._typing`, namely `torchdatasets/datasets.py`, I found that it's only used once, for a comically unnecessary type hint in a placeholder class's definition:
```python
class MetaIterableWrapper(MetaIterable, GenericMeta, _typing._DataPipeMeta): pass
```
My assumption is that this was introduced as part of an effort to integrate the new torch DataPipe pattern, but at some point it leaked into the released package and broke a bunch of other, significant assumptions necessary for a smooth install. Since I can only find it in my locally installed pip version and not on GitHub, I have no clear way of tracking down when it was introduced or by whom.
My recommendation is removing these two lines from the `torchdatasets/datasets.py` hosted on PyPI for version 0.2.0 (I'm not sure whether they can be revised without bumping the version as well). Thoughts?