Comments (9)
@opcode81 could you look into it?
from tianshou.
This is a well-known Poetry limitation. By default, installing torch via Poetry will use a torch build that was built against a default version of CUDA (it is not a CPU-only version); and the version of CUDA it uses depends on the torch version. For example, torch 2.0 might have used CUDA 11 and later versions might now use CUDA 12 by default.
So to get CUDA support in later torch versions on your system, you have the following options:
- Upgrade your system to use CUDA 12 (if possible) OR
- Install a torch build (of the same version) that works with CUDA 11 (or whatever CUDA version you may have)
When using Poetry,
- a hacky way to do the latter (which you'll need to repeat after every `poetry install`) is to run `pip uninstall torch` and then, for example, `pip install torch==<desired version> --index-url https://download.pytorch.org/whl/cu118` (see the PyTorch site for available URLs when using pip)
- a cleaner way to do the latter is to configure a "source" in Poetry and then specify that torch shall be installed via that source.
When using conda to manage your env (often a better choice!), a clean way to do the latter is to use the channels `pytorch` and `nvidia` and to depend on `pytorch-cuda=11` in addition to `pytorch` itself.
Let's add some instructions to the readme, then close this issue. I can do that, or leave it to you, if you want :)
> This is a well-known Poetry limitation. By default, installing torch via Poetry will use a torch build that was built against a default version of CUDA (it is not a CPU-only version); and the version of CUDA it uses depends on the torch version. For example, torch 2.0 might have used CUDA 11 and later versions might now use CUDA 12 by default.
I'm a bit confused about this. According to your explanation, a "default" PyTorch build that supports CUDA should have been installed. However, on my Windows machine it installed a version that doesn't support CUDA at all, not just one built for a different CUDA version (11 vs. 12). But it doesn't matter; I'd prefer to install it manually anyway.
> - a cleaner way to do the latter is to configure a "source" in Poetry and then specify that torch shall be installed via that source.
Perhaps some modifications can be made in `pyproject.toml` according to this (but this could potentially cause some inconvenience for users who experience slow network speeds when accessing https://download.pytorch.org/).
Anyway, I think it'd be a good idea to add some instructions in the readme, explaining the limitations of the default torch installation and suggesting (or reminding users of) the recommended manual installation methods like above.
Thank you~
> This is a well-known Poetry limitation. By default, installing torch via Poetry will use a torch build that was built against a default version of CUDA (it is not a CPU-only version); and the version of CUDA it uses depends on the torch version. For example, torch 2.0 might have used CUDA 11 and later versions might now use CUDA 12 by default.
>
> I'm a bit confused about this. According to your explanation, a "default" version of PyTorch that supports CUDA should be installed. However, the result on my Windows shows that it installed a version that doesn't support CUDA, not just a difference between CUDA 11 or 12.
No, the result on your Windows machine does not show this. The version that is designated as purely "2.1.1" does support CUDA, just not necessarily your version of CUDA. Like I said, every default build of torch installed by Poetry supports a particular version of CUDA, and the "2.1.1" version happens to support CUDA 12. Did you try upgrading to CUDA 12/your Nvidia drivers?
By contrast, builds installed from https://download.pytorch.org will specifically designate the CUDA version in a suffix (e.g. you might get "2.1.1+cu118"), but the default builds do not indicate the CUDA version they are compatible with in the version string.
> No, the result on your Windows machine does not show this. The version that is designated as purely "2.1.1" does support CUDA, just not necessarily your version of CUDA. Like I said, every default build of torch installed by Poetry supports a particular version of CUDA, and the "2.1.1" version happens to support CUDA 12. Did you try upgrading to CUDA 12/your Nvidia drivers? By contrast, builds installed from https://download.pytorch.org will specifically designate the CUDA version in a suffix (e.g. you might get "2.1.1+cu118"), but the default builds do not indicate the CUDA version they are compatible with in the version string.
I didn't install CUDA through the official NVIDIA website, but CUDA 12 is indeed present on my computer. I suspect it might have been installed via PyTorch in some other conda environment.
However, in the Poetry test environment:

```
> nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 546.33  Driver Version: 546.33  CUDA Version: 12.3                          |
|-----------------------------------------+----------------------+----------------------+

> pip list
torch    2.1.1    // default version from poetry, just as we have been discussing
```

and in python:

```
>>> import torch
>>> torch.cuda.is_available()
False
```

Furthermore,

```
>>> torch.tensor([0]).to("cuda")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\cmzb\miniconda3\envs\temp\Lib\site-packages\torch\cuda\__init__.py", line 289, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
```
So, I'm still not quite sure whether the default build of torch installed by Poetry supports CUDA.
@coolermzb3 Maybe you can add the following text to the corresponding block in `pyproject.toml` to FORCE Poetry to use the PyTorch GPU source.

```toml
[tool.poetry.dependencies]
torch = {version = "^2.3.1", source = "pytorch"}

[[tool.poetry.source]]
name = "pytorch"
url = "https://download.pytorch.org/whl/cu121"
priority = "explicit"
```
A `[[tool.poetry.source]]` entry contains three fields: `name`, `url`, and `priority`. The custom `name` is referenced in `[tool.poetry.dependencies]` so that Poetry can identify which source a given dependency should be installed from. `url` specifies the source location. And `priority` is set to `explicit`, meaning the source will only be searched if a package's configuration explicitly indicates that it should be found on this source.
@phoinix-chen Thank you! It works!
@coolermzb3 I had literally suggested the same solution in this link:
> - a cleaner way to do the latter is to configure a "source" in Poetry and then specify that torch shall be installed via that source.
Regarding CUDA support in the "default" torch installation: The policy has changed.
> By default, installing torch via Poetry will use a torch build that was built against a default version of CUDA (it is not a CPU-only version); and the version of CUDA it uses depends on the torch version. For example, torch 2.0 might have used CUDA 11 and later versions might now use CUDA 12 by default.
This statement is no longer true for Windows, but it is still true for Linux. Version "2.1.0", for example, supports CUDA 12 on Linux but is CPU-only on Windows. It's an unfortunate and seemingly arbitrary decision, because Windows does, of course, still support CUDA.