ait's Issues
Error(s) in loading state_dict for VQVAE
Thank you for your nice work!
However, after training the VQVAE on depth estimation, when I tried to train the task solver on depth estimation, the following error came out:
Error(s) in loading state_dict for VQVAE:
Missing key(s) in state_dict: "encoder.0.weight", "encoder.0.bias", "encoder.2.weight", "encoder.2.bias", "encoder.4.weight", "encoder.4.bias", "encoder.6.weight", "encoder.6.bias", "encoder.8.weight", "encoder.8.bias", "encoder.10.net.0.weight", "encoder.10.net.0.bias", "encoder.10.net.2.weight", "encoder.10.net.2.bias", "encoder.10.net.4.weight", "encoder.10.net.4.bias", "encoder.11.net.0.weight", "encoder.11.net.0.bias", "encoder.11.net.2.weight", "encoder.11.net.2.bias", "encoder.11.net.4.weight", "encoder.11.net.4.bias", "encoder.12.weight", "encoder.12.bias", "decoder.0.weight", "decoder.0.bias", "decoder.2.net.0.weight", "decoder.2.net.0.bias", "decoder.2.net.2.weight", "decoder.2.net.2.bias", "decoder.2.net.4.weight", "decoder.2.net.4.bias", "decoder.3.net.0.weight", "decoder.3.net.0.bias", "decoder.3.net.2.weight", "decoder.3.net.2.bias", "decoder.3.net.4.weight", "decoder.3.net.4.bias", "decoder.4.weight", "decoder.4.bias", "decoder.6.weight", "decoder.6.bias", "decoder.8.weight", "decoder.8.bias", "decoder.10.weight", "decoder.10.bias", "decoder.12.weight", "decoder.12.bias", "decoder.14.weight", "decoder.14.bias", "_vq_vae._embedding", "_vq_vae._ema_cluster_size", "_vq_vae._ema_w".
How can I solve it? Thank you.
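For anyone debugging this, a quick way to narrow it down is to compare the keys stored in the checkpoint with the keys the VQVAE expects; this is only a rough sketch (the checkpoint path and its layout are assumptions, not taken from the AiT code):

import torch

# Rough diagnostic sketch: 'vqvae_depth.pt' is a hypothetical checkpoint path.
ckpt = torch.load("vqvae_depth.pt", map_location="cpu")
state = ckpt["state_dict"] if isinstance(ckpt, dict) and "state_dict" in ckpt else ckpt
print("checkpoint keys:", list(state.keys())[:10])
# Compare against the keys the model expects, e.g.:
# print("model keys:", list(model.state_dict().keys())[:10])

A mismatch here usually means the state_dict is nested under another key or was saved from a different architecture/config than the one being loaded.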
Single Image Inference
How can I perform inference on my own set of images? What changes do I need to make for data pre-processing? Do I need to change the val dict under data in AiT/ait/configs/swinv2b_480reso_depthonly.py?
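For reference, a minimal single-image preprocessing sketch; the 480x480 size and ImageNet normalization are assumptions suggested by the config name, and the exact pipeline should be taken from the val dict in that config:

import cv2
import numpy as np
import torch

# Hypothetical single-image preprocessing; 'my_image.jpg', mean/std and the
# 480x480 size are assumptions, not the repo's actual val pipeline.
img = cv2.cvtColor(cv2.imread("my_image.jpg"), cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
img = cv2.resize(img, (480, 480))
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
img = (img - mean) / std
tensor = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0)  # 1 x 3 x 480 x 480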
RuntimeError: The size of tensor a (256) must match the size of tensor b (225) at non-singleton dimension 1
Dear author:
Thanks for your meaningful work. During inference, I encountered 'RuntimeError: The size of tensor a (256) must match the size of tensor b (225) at non-singleton dimension 1'.
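For what it's worth, 256 = 16 x 16 and 225 = 15 x 15, so a mismatch like this often means the input resolution implies a different token grid than the checkpoint expects (an assumption, not confirmed for this repo). A tiny sketch relating input size to token count, assuming a hypothetical total downsampling stride of 32:

# 480 // 32 = 15 -> 15 * 15 = 225 tokens; 512 // 32 = 16 -> 16 * 16 = 256 tokens.
def token_count(h, w, stride=32):
    return (h // stride) * (w // stride)

for size in (480, 512):
    print(size, token_count(size, size))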
Small typo
Hi, great work! I just noticed a small typo:
In the inference section of the README, what should be <model_checkpoint> is written as <model_checkpiont>.
Swin-S and Swin-Ti weights
Thank you for releasing your code! I am wondering if you happen to have any pre-trained checkpoints for Swin-S and Swin-Ti, or even just the ImageNet-1k weights? The ImageNet-1k pre-trained weights would be preferable, as I can't seem to find them released anywhere with matching sizes.
Thanks!
Some problems with visualizing the depth of pred and gt.
Thanks for your work. I am having some problems visualizing the depth of the prediction (pred) and ground truth (gt). Here is where I visualize them, in
AiT/ait/code/model/depth/depth.py
Lines 157 to 159 in ca2c2d1
for pred_d, depth_gt in results:
    # visualize 'pred_d' here
    pred_crop, gt_crop = cropping_img(pred_d, depth_gt)
    # after reshaping, visualize 'pred_crop' and 'gt_crop' here
    computed_result = eval_depth(pred_crop, gt_crop)
This is the command:
CUDA_VISIBLE_DEVICES=5,6,7 python -m torch.distributed.launch --nproc_per_node=3 code/train.py configs/swinv2b_480reso_depthonly.py --cfg-options model.task_heads.depth.vae_cfg.pretrained=vqvae_depth.pt --eval ait_joint_swinv2b.pth
However, the results of pred_d, pred_crop, and gt_crop are very similar. They all look like the attached picture (the picture is almost white).
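An almost-white image usually means the raw metric depth values were written out directly. Here is a minimal sketch I would try for saving a normalized, colormapped depth image (it assumes pred_d and depth_gt are 2-D depth maps in metres with a max depth of about 10 m, as for NYU; the helper name is made up):

import numpy as np
import cv2

def save_depth_vis(depth, path, max_depth=10.0):
    # Accept either a torch tensor or a numpy array.
    d = depth.detach().cpu().numpy() if hasattr(depth, "detach") else np.asarray(depth)
    d = np.clip(d / max_depth, 0.0, 1.0)      # normalize to [0, 1]
    d = (d * 255.0).astype(np.uint8)          # scale to [0, 255]
    cv2.imwrite(path, cv2.applyColorMap(d, cv2.COLORMAP_INFERNO))

# Inside the loop shown above:
# save_depth_vis(pred_d, "pred_d.png")
# save_depth_vis(depth_gt, "depth_gt.png")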
Training time
Hi, interesting work! Can you share the approximate time to train the VQVAE and the task solver on both tasks? Thanks!
train/visualize on single GPU
Hello!
I am trying to evaluate it on one GPU, but I ran into a lot of errors.
I am new to this; do you have code for running on a single GPU?
Best wishes
'PublicAccessNotPermitted' when downloading the checkpoints
Hi, thank you for the excellent work!
I ran into some trouble when downloading the checkpoints using wget: it raises a 'PublicAccessNotPermitted' error. I would like to know how to download them properly, especially the pre-trained backbone models.
Thank you in advance!
Unable to evaluate the results
Hello,
I am trying to run these models to evaluate the results, but I am not able to do so due to errors at runtime.
The best "result" I could get is with this Dockerfile (at the root of the project):
FROM nvidia/cuda:11.4.3-cudnn8-devel-ubuntu18.04
ARG DEBIAN_FRONTEND=noninteractive
ENV TZ=Etc/UTC
ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8
# Install system dependencies
RUN apt-get update && \
    apt-get install -y \
        git \
        wget \
        python3-pip \
        python3-dev \
        python3-opencv \
        python3-six
RUN python3 -m pip install --upgrade pip
RUN pip3 install setuptools openmim
# Install PyTorch and torchvision
RUN pip3 install torch torchvision torchaudio -f https://download.pytorch.org/whl/cu111/torch_stable.html
RUN python3 -m pip install h5py albumentations tensorboardX gdown scipy
RUN python3 -m mim install mmcv
# Download the NYU Depth V2 data and the GLPDepth helper scripts
WORKDIR /
RUN wget http://horatio.cs.nyu.edu/mit/silberman/nyu_depth_v2/nyu_depth_v2_labeled.mat -O nyu_depth_v2_labeled.mat
RUN git clone https://github.com/vinvino02/GLPDepth.git --depth 1
RUN mv GLPDepth/code/utils/logging.py GLPDepth/code/utils/glp_depth_logging.py
# Set the working directory
WORKDIR /app
RUN python3 ../GLPDepth/code/utils/extract_official_train_test_set_from_mat.py ../nyu_depth_v2_labeled.mat ../GLPDepth/datasets/splits.mat ./data/nyu_depth_v2/official_splits/
# RUN ln -s data ait/data
COPY requirements.txt requirements.txt
RUN python3 -m pip install -r requirements.txt
COPY . .
RUN rm -rf .git
I built the image with:
sudo docker build -t mde . -f Dockerfile
And ran it with:
sudo docker run --name mde-test --gpus all --ipc=host -it --rm -v $(pwd):/app mde
Finally, I ran the evaluation command. For example:
cd ait
python3 -m torch.distributed.launch --nproc_per_node=1 code/train.py configs/swinv2b_480reso_parallel_depthonly.py --cfg-options model.task_heads.depth.vae_cfg.pretrained=../models/vqvae_depth_2bp.pt --eval ../models/ait_depth_swinv2b_parallel.pth
This launches the inference process, but eventually it fails with an uninformative error:
eval task depth
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 654/654, 2.5 task/s, elapsed: 262s, ETA: 0sERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -9) local_rank: 0 (pid: 34) of binary: /usr/bin/python3
Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launch.py", line 193, in <module>
main()
File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/usr/local/lib/python3.6/dist-packages/torch/distributed/run.py", line 713, in run
)(*cmd_args)
File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launcher/api.py", line 261, in launch_agent
failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
===================================================
code/train.py FAILED
---------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
---------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2023-08-26_03:01:18
host : f50427e7ad50
rank : 0 (local_rank: 0)
exitcode : -9 (pid: 34)
error_file: <N/A>
traceback : Signal 9 (SIGKILL) received by PID 34
===================================================
Are the authors able to provide the versions of all the software they are using? In particular:
- Linux version and distribution
- CUDA version
- Python version
- Package versions (some versions are missing from the requirements)
- Any other relevant information about the environment
Thanks.
There may be a bug in the dataset; it might cause over-fitting.
Thank you for sharing.
transform = [
    A.Crop(x_min=41, y_min=0, x_max=601, y_max=480),
    A.HorizontalFlip(),
    A.RandomCrop(crop_size[0], crop_size[1]),
]
In dataset/nyudepthv2.py, I found that the image is first cropped to a fixed region, and after that a random crop to the target size (480, 480) is applied.
Maybe albumentations could change the transform sequence?
I am not sure.
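As far as I can tell, A.Compose applies the transforms in the order they are listed, so the fixed Crop should always run before RandomCrop. A small sketch (dummy image; shapes are assumptions) that uses A.ReplayCompose to record and print which transforms were applied, in order:

import albumentations as A
import numpy as np

# Dummy 480x640 image just to exercise the pipeline; ReplayCompose records what ran.
transform = A.ReplayCompose([
    A.Crop(x_min=41, y_min=0, x_max=601, y_max=480),
    A.HorizontalFlip(),
    A.RandomCrop(480, 480),
])
out = transform(image=np.zeros((480, 640, 3), dtype=np.uint8))
for t in out["replay"]["transforms"]:
    print(t["__class_fullname__"], "applied:", t["applied"])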
denorm twice in eval_coco.py
Hello! I find that vae/utils/eval_coco.py de-normalizes the reconstructed image twice; the second de-normalization is at line 45.
if hasattr(vae, 'get_codebook_indices'):
    code = vae.get_codebook_indices(mask)
    remask = vae.decode(code)[0, 0, :, :].cpu().numpy() * 0.5 + 0.5  # why denorm here?
This is because in the decode method the attribute use_norm is True, so decode already de-normalizes the image, but this code de-normalizes it again after decoding.
I will try to investigate the effect when evaluating.
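If decode indeed already de-normalizes when use_norm is True, a possible fix (an assumption on my part, worth verifying against the decode implementation) would be to rely on decode's own de-normalization and drop the extra shift:

# Possible fix: let decode handle de-normalization and drop '* 0.5 + 0.5'.
remask = vae.decode(code)[0, 0, :, :].cpu().numpy()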