Comments (7)
@linzzzzzz You're only using one process, so you can remove the distributed launch entirely:
python main.py --batch_size 2 --no_aux_loss --eval --resume https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth --coco_path /path/to/coco
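To illustrate why dropping the launcher is safe for a single process, here is a minimal sketch of how a script can tell the two launch modes apart. It assumes the launcher exports `RANK`, `WORLD_SIZE` and `LOCAL_RANK` as environment variables (as `torch.distributed.launch` does with `--use_env`). The names `DistConfig` and `detect_distributed` are hypothetical and are not taken from the DETR code base.

```python
import os
from typing import Mapping, NamedTuple, Optional


class DistConfig(NamedTuple):
    """Hypothetical container for the launch settings a script needs."""
    distributed: bool
    rank: int
    world_size: int
    local_rank: int


def detect_distributed(env: Optional[Mapping[str, str]] = None) -> DistConfig:
    """Return distributed settings if a launcher exported them, else single-process defaults."""
    env = os.environ if env is None else env
    # Assumption: the launcher sets RANK and WORLD_SIZE for every worker it starts.
    if "RANK" in env and "WORLD_SIZE" in env:
        return DistConfig(
            distributed=True,
            rank=int(env["RANK"]),
            world_size=int(env["WORLD_SIZE"]),
            local_rank=int(env.get("LOCAL_RANK", 0)),
        )
    # Plain `python main.py ...`: no launcher variables, so run as a single process.
    return DistConfig(distributed=False, rank=0, world_size=1, local_rank=0)


if __name__ == "__main__":
    print(detect_distributed())
```

With a plain `python main.py ...` call none of these variables are set, so a script written this way falls through to the single-process branch.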
from detr.
Hi @gaopengcuhk
If you are going to use slurm, we recommend using Submitit.
To run your eval on one node, one gpu, you can run
python run_with_submitit.py --ngpus 1 --nodes 1 --timeout 360 --batch_size 2 --no_aux_loss --eval --resume https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth --coco_path ../../dataset/
As a side note, your version of PyTorch is a bit old; we recommend using PyTorch 1.5.
Thank you very much. I need to modify part of the code to run DETR with PyTorch 1.3. I will switch to PyTorch 1.5 later.
Thank you very much for your quick response.
python run_with_submitit.py --ngpus 1 --nodes 1 --timeout 360 --batch_size 2 --no_aux_loss --eval --resume ./saved_model/detr-r50-e632da11.pth --coco_path ../../dataset/
When I run your code, I run into the following error.
submitit INFO (2020-06-02 21:34:32,188) - Starting with JobEnvironment(job_id=587603, hostname=SH-IDC1-10-198-6-145, local_rank=0(1), node=0(1), global_rank=0(1))
submitit INFO (2020-06-02 21:34:32,188) - Loading pickle: experiments/587603/587603_submitted.pkl
Process group: 1 tasks, rank: 0
submitit ERROR (2020-06-02 21:35:20,838) - Submitted job triggered an exception
~
@gaopengcuhk can you paste the exception you got? This might be an issue with submitit rather than with DETR.
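One way to find the full traceback is to read the tail of the job's stderr log. The sketch below assumes the job folder is `experiments/587603` (taken from the log above) and that stderr files end in `.err`; that naming is an assumption, so adjust `pattern` to whatever your folder contains. The helper `tail_error_logs` is hypothetical and is not part of submitit.

```python
from pathlib import Path
from typing import Dict


def tail_error_logs(job_dir: str, pattern: str = "*.err", max_lines: int = 40) -> Dict[str, str]:
    """Return the last `max_lines` lines of every file in `job_dir` matching `pattern`."""
    tails: Dict[str, str] = {}
    for path in sorted(Path(job_dir).glob(pattern)):
        # errors="replace" keeps the helper from failing on odd bytes in the log.
        lines = path.read_text(errors="replace").splitlines()
        tails[path.name] = "\n".join(lines[-max_lines:])
    return tails


if __name__ == "__main__":
    # Assumed job folder, based on the "Loading pickle" line in the log above.
    for name, tail in tail_error_logs("experiments/587603").items():
        print(f"==== {name} ====\n{tail}")
```

The exception that submitit reports as "Submitted job triggered an exception" should appear near the end of that output, which is the part worth pasting into the issue.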
I came across a similar issue and ended up using the command below:
python -m torch.distributed.launch --nproc_per_node=1 --use_env main.py --batch_size 2 --no_aux_loss --eval --resume https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth --coco_path /path/to/coco
Hi, I am using on 1984 and get the same problem. Could you tell me the solution?