mvsnet_pytorch's Introduction

An Unofficial PyTorch Implementation of MVSNet

MVSNet: Depth Inference for Unstructured Multi-view Stereo. Yao Yao, Zixin Luo, Shiwei Li, Tian Fang, Long Quan. ECCV 2018. MVSNet is a deep learning architecture for depth map inference from unstructured multi-view images.

This is an unofficial PyTorch implementation of MVSNet.

How to Use

Environment

  • Python 3.6 (Anaconda)
  • PyTorch 1.0.1

Training

  • Download the preprocessed DTU training data (fixed training cameras, from the original MVSNet), and unzip it as the MVS_TRAINING folder
  • In train.sh, set MVS_TRAINING to your training data path
  • Create a logdir called checkpoints
  • Train MVSNet: ./train.sh

Testing

  • Download the preprocessed DTU testing data (from the original MVSNet) and unzip it as the DTU_TESTING folder, which should contain one cams folder, one images folder and one pair.txt file.
  • In test.sh, set DTU_TESTING to your testing data path and CKPT_FILE to your checkpoint file. You can also download my pretrained model.
  • Test MVSNet: ./test.sh

Fusion

In eval.py, I implemented a simple version of depth map fusion; the sketch below outlines the general idea. Contributions to improve the code are welcome.
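For anyone who wants to improve it: fusion of this kind generally filters each depth map by photometric confidence and by geometric consistency with neighboring views, then back-projects the surviving pixels into one point cloud. Below is a minimal sketch of a reprojection-based geometric consistency check, assuming 3x3 intrinsics and 4x4 world-to-camera extrinsics; it illustrates the idea rather than reproducing the exact code in eval.py.

import numpy as np

def geometric_consistency_mask(depth_ref, K_ref, E_ref, depth_src, K_src, E_src,
                               pix_thresh=1.0, rel_depth_thresh=0.01):
    # Keep reference pixels whose depth reprojects consistently into a source
    # view: back-project, look up the source depth, reproject, and compare.
    h, w = depth_ref.shape
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    xs, ys, d_ref = xs.reshape(-1), ys.reshape(-1), depth_ref.reshape(-1)

    # reference pixels -> 3D points in the source camera frame
    rel = E_src @ np.linalg.inv(E_ref)  # reference camera -> source camera
    pts_src = rel[:3, :3] @ (np.linalg.inv(K_ref)
                             @ (np.stack([xs, ys, np.ones_like(xs)]) * d_ref)) + rel[:3, 3:4]

    # project into the source image and sample its depth (nearest neighbor)
    uvz = K_src @ pts_src
    u, v = uvz[0] / uvz[2], uvz[1] / uvz[2]
    d_src = depth_src[np.clip(np.round(v).astype(int), 0, h - 1),
                      np.clip(np.round(u).astype(int), 0, w - 1)]

    # back-project the sampled source depth and return to the reference view
    rel_inv = np.linalg.inv(rel)
    pts_back = rel_inv[:3, :3] @ (np.linalg.inv(K_src)
                                  @ (np.stack([u, v, np.ones_like(u)]) * d_src)) + rel_inv[:3, 3:4]
    uvz_back = K_ref @ pts_back

    # consistent = lands back near the starting pixel with a similar depth
    pix_err = np.hypot(uvz_back[0] / uvz_back[2] - xs, uvz_back[1] / uvz_back[2] - ys)
    depth_err = np.abs(uvz_back[2] - d_ref) / np.maximum(d_ref, 1e-8)
    return ((pix_err < pix_thresh) & (depth_err < rel_depth_thresh)).reshape(h, w)

A fusion step would then keep pixels that pass this check for enough source views and back-project them, with the reference colors, into the final point cloud.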

Results on DTU

                        Acc.    Comp.   Overall
MVSNet (D=256)          0.396   0.527   0.462
PyTorch-MVSNet (D=192)  0.4492  0.3796  0.4144

Due to the memory limit, we only train the model with D=192; the fusion code is also different from the original repo's.


mvsnet_pytorch's Issues

About the warp problem

Many thanks for your great work. Recently I built a multi-view human image dataset like DTU. When I warp images using the homo_warping function, errors occur: if there is occlusion at the sampled location, the wrong place gets sampled. Have you considered this problem? Can you tell me how to solve it? Looking forward to your reply.

about the depth_visual file

What does the depth_visual file mean? I guess it generates a mask for out-of-range depth pixels, as done in the official MVSNet repo. But after checking the max depth value in depth_gt[mask], I found it still has out-of-range values. For example, the max depth value can be 1092.7748, which is larger than 933.8 = 425 + 2.5 × 192 × 1.06.
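As a side note, the 933.8 bound quoted above is just the end of the sampled depth range under this repo's default settings (depth_min = 425 and interval = 2.5 from the DTU camera files; numdepth = 192 and interval_scale = 1.06 from the training defaults, as in the args dump later on this page):

depth_min, depth_interval = 425.0, 2.5   # from the DTU camera files
num_depth, interval_scale = 192, 1.06    # this repo's training defaults
depth_max = depth_min + depth_interval * num_depth * interval_scale
print(depth_max)  # 933.8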

Error when testing the Tanks and Temples dataset

[screenshot of the original error]
After I changed the code to assert np_img.shape[:2] == (1080, 1920), I get the error "RuntimeError: The size of tensor a (31) must match the size of tensor b (32) at non-singleton dimension 3" in /MVSNet_pytorch/models/mvsnet.py.

Dimension problem when running the test set

(120, 160)
(128, 160, 3)
Traceback (most recent call last):
File "/home/stu5/dl_mvs/MVSNet_注释代码/eval.py", line 347, in
filter_depth(scan_folder, out_folder, os.path.join(args.outdir, 'mvsnet{:0>3}_l3.ply'.format(scan_id)))
File "/home/stu5/dl_mvs/MVSNet_注释代码/eval.py", line 295, in filter_depth
color = ref_img[1:-2:4, 1::4, :][valid_points]
IndexError: boolean index did not match indexed array along dimension 0; dimension is 128 but corresponding boolean dimension is 120

Problem with testing

Hello, I want to run the main function of eval.py, but it reports the following error. How can I solve it? Thanks.
Traceback (most recent call last):
File "/home/cv/MVSNet/MVSNet_pytorch-master/eval.py", line 306, in
with open(args.testlist) as f:
TypeError: expected str, bytes or os.PathLike object, not NoneType

The DTU training set

Hello, I'd like to ask how the DTU training set is cropped to 614*512 resolution. Is there an open-source version of the already-cropped training set?

f-score

Hi, I found that DTU does not provide an f-score computation, while T&T provides an online submission interface. Could you share how you compute the f-score on DTU?
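For context, the Tanks and Temples f-score is the harmonic mean of precision and recall under a distance threshold, and the same definition can be applied to DTU point clouds against the ground-truth scans. A minimal sketch, assuming SciPy is available and leaving the threshold choice open:

import numpy as np
from scipy.spatial import cKDTree

def f_score(pred_pts, gt_pts, thresh):
    # pred_pts, gt_pts: (N, 3) point arrays; thresh: distance in scene units
    d_pred = cKDTree(gt_pts).query(pred_pts)[0]  # predicted point -> nearest GT
    d_gt = cKDTree(pred_pts).query(gt_pts)[0]    # GT point -> nearest prediction
    precision = (d_pred < thresh).mean()
    recall = (d_gt < thresh).mean()
    return 2 * precision * recall / max(precision + recall, 1e-8)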

Depth refinement

Hi there,

great work! Is there any reason why you exclude the depth refinement from the evaluation?

Regards

Where is your test.sh

Hello, I am trying to run test.sh using your provided pretrained model.
I can't find it anywhere, please help!

Out of GPU memory when testing on a 1080 Ti

Hello,
Thank you very much for building this project. When I run eval.sh on a 1080 Ti, I run out of GPU memory. May I ask what the minimum GPU requirement is for testing and training?
Best wishes!

Some problems with evaluation metrics

[screenshot showing the three evaluation metrics]

Hello, I'm new to this field. Recently I managed to run mvsnet_pytorch successfully and used eval.py to fuse the depth maps into point cloud .ply files, but I don't know how to get the three metrics shown above.

Are there existing tools that can evaluate these? Please tell me how to get these three metrics.

RuntimeError: The size of tensor a (31) must match the size of tensor b (32) at non-singleton dimension 3

Traceback (most recent call last):
  File "eval.py", line 307, in
    save_depth()
  File "eval.py", line 118, in save_depth
    outputs = model(sample_cuda["imgs"], sample_cuda["proj_matrices"], sample_cuda["depth_values"])
  File "/home/amax/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/amax/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 143, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/amax/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 153, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/amax/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 83, in parallel_apply
    raise output
  File "/home/amax/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 59, in _worker
    output = module(*input, **kwargs)
  File "/home/amax/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/amax/shenye/colmapTest1/MVSNet_pytorch-master/models/mvsnet.py", line 132, in forward
    cost_reg = self.cost_regularization(volume_variance)
  File "/home/amax/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/amax/shenye/colmapTest1/MVSNet_pytorch-master/models/mvsnet.py", line 66, in forward
    x = conv4 + self.conv7(x)
RuntimeError: The size of tensor a (31) must match the size of tensor b (32) at non-singleton dimension 3

Hello, how do you deal with this problem?
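A note for anyone hitting this error: MVSNet extracts features at 1/4 resolution and the cost-volume regularization then downsamples three more times, so the input height and width generally need to be divisible by 32; otherwise the skip connection x = conv4 + self.conv7(x) sees mismatched sizes (31 vs. 32 here). One common workaround is to crop the inputs to a multiple of 32 and shift the principal point to match; a hypothetical helper, not code from this repo:

import numpy as np

def crop_to_multiple(img, intrinsics, base=32):
    # Center-crop an (H, W, 3) image so H and W are multiples of `base`,
    # and shift the principal point (cx, cy) to account for the crop.
    h, w = img.shape[:2]
    new_h, new_w = (h // base) * base, (w // base) * base
    top, left = (h - new_h) // 2, (w - new_w) // 2
    K = intrinsics.copy()
    K[0, 2] -= left
    K[1, 2] -= top
    return img[top:top + new_h, left:left + new_w], K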

CUDA error: unknown error when running train.sh

Thank you for sharing this code. But I got the following error when I run ./train.sh:

argv: ['--dataset=dtu_yao', '--batch_size=4', '--trainpath=/mnt/f/datasets/DTU/dtu_training', '--trainlist', 'lists/dtu/train.txt', '--testlist', 'lists/dtu/test.txt', '--numdepth=192', '--logdir', './checkpoints/d192']
################################  args  ################################
mode      	train                         	<class 'str'>       
model     	mvsnet                        	<class 'str'>       
dataset   	dtu_yao                       	<class 'str'>       
trainpath 	/mnt/f/datasets/DTU/dtu_training	<class 'str'>       
testpath  	/mnt/f/datasets/DTU/dtu_training	<class 'str'>       
trainlist 	lists/dtu/train.txt           	<class 'str'>       
testlist  	lists/dtu/test.txt            	<class 'str'>       
epochs    	16                            	<class 'int'>       
lr        	0.001                         	<class 'float'>     
lrepochs  	10,12,14:2                    	<class 'str'>       
wd        	0.0                           	<class 'float'>     
batch_size	4                             	<class 'int'>       
numdepth  	192                           	<class 'int'>       
interval_scale	1.06                          	<class 'float'>     
loadckpt  	None                          	<class 'NoneType'>  
logdir    	./checkpoints/d192            	<class 'str'>       
resume    	False                         	<class 'bool'>      
summary_freq	20                            	<class 'int'>       
save_freq 	1                             	<class 'int'>       
seed      	1                             	<class 'int'>       
########################################################################
dataset train metas: 27097
dataset test metas: 7546
start at epoch 0
Number of model parameters: 338129
Epoch 0:
/home/lixudong/.local/lib/python3.10/site-packages/torch/functional.py:501: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3149.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Traceback (most recent call last):
  File "/mnt/f/code/MVSNet_pytorch/train.py", line 276, in <module>
    train()
  File "/mnt/f/code/MVSNet_pytorch/train.py", line 131, in train
    loss, scalar_outputs, image_outputs = train_sample(sample, detailed_summary=do_summary)
  File "/mnt/f/code/MVSNet_pytorch/train.py", line 193, in train_sample
    outputs = model(sample_cuda["imgs"], sample_cuda["proj_matrices"], sample_cuda["depth_values"])
  File "/home/lixudong/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/f/code/MVSNet_pytorch/models/mvsnet.py", line 122, in forward
    volume_sq_sum = volume_sq_sum + warped_volume ** 2
RuntimeError: CUDA error: unknown error

When I changed volume_sum = volume_sum + warped_volume and volume_sq_sum = volume_sq_sum + warped_volume ** 2 to volume_sum += warped_volume and volume_sq_sum += warped_volume ** 2, another error appeared:

Traceback (most recent call last):
  File "/mnt/f/code/MVSNet_pytorch/train.py", line 276, in <module>
    train()
  File "/mnt/f/code/MVSNet_pytorch/train.py", line 131, in train
    loss, scalar_outputs, image_outputs = train_sample(sample, detailed_summary=do_summary)
  File "/mnt/f/code/MVSNet_pytorch/train.py", line 198, in train_sample
    loss.backward()
  File "/home/lixudong/.local/lib/python3.10/site-packages/torch/_tensor.py", line 482, in backward
    torch.autograd.backward(
  File "/home/lixudong/.local/lib/python3.10/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA error: unknown error

However, there are no errors when I run ./eval.sh.

My environment:

  • os: WSL 2, Ubuntu 22.04
  • python: 3.10
  • cuda: 11.7
  • pytorch: 1.13.0.dev20221001+cu117
  • torchvision: 0.14.0.dev20221001+cu117

I would really appreciate it if you could kindly provide some advice.

The correct metrics

I ran the evaluation code using your pretrained model.
The correct metrics should be the following: (I added the last row)

                 Acc.    Comp.   Overall
MVSNet (D=256)   0.396   0.527   0.462
Your readme      0.4492  0.3796  0.4144
Your pretrained  0.5229  0.4514  0.4871

which is slightly worse than the original MVSNet; that is reasonable because you use a smaller D=192.
I used the same evaluation code to evaluate R-MVSNet, and its result is consistent with the author's provided model, so the evaluation code has no problem.

Could you explain how you got the numbers in the readme? Otherwise it seems misleading that a smaller D produces a better result.

Large memory costs during evaluation

Hi,

Thanks for the excellent project! I use multiple RTX 2080s to run the code. However, the code causes OOM during evaluation (eval.sh). Since the batch size is 1, it only uses a single GPU. Yet I could not figure out why it doesn't cause OOM during training.

Can you give an example of which kind of GPU is enough for testing? Thanks!

The best model

How many training iterations produce the best model with a batch_size of 2?

How to visualize the depth file?

How do I visualize the depth files, such as depth_map_0002.pfm in the DTU training data? Is there software for this, or code to visualize them?
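A few lines of Python are enough, since PFM is a simple format: a 'Pf' (grayscale) or 'PF' (color) header, the width and height, a scale factor whose sign gives the endianness, then raw float32 rows stored bottom-to-top. A minimal sketch using numpy and matplotlib:

import re
import numpy as np
import matplotlib.pyplot as plt

def read_pfm(path):
    with open(path, "rb") as f:
        color = f.readline().decode().rstrip() == "PF"   # 'PF' = color, 'Pf' = gray
        m = re.match(r"^(\d+)\s+(\d+)\s*$", f.readline().decode())
        width, height = int(m.group(1)), int(m.group(2))
        scale = float(f.readline().decode().rstrip())
        data = np.fromfile(f, ("<" if scale < 0 else ">") + "f")  # endianness from sign
    shape = (height, width, 3) if color else (height, width)
    return np.flipud(data.reshape(shape))  # rows are stored bottom-to-top

depth = read_pfm("depth_map_0002.pfm")
plt.imshow(depth, cmap="viridis")
plt.colorbar(label="depth")
plt.show()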

can't find pfm in eval.py

When I run eval.sh, an error occurs:
No such file or directory: './outputs/scan1/depth_est/00000000.pfm'
I'm wondering: isn't this .pfm a result of eval.py, so that I don't need to prepare it myself? Why can't it find this file?

preprocessing problems?

Hi!
I am trying to do inference on my own dataset. I wonder why you do intrinsics[:2, :] /= 4 in the dataset class?
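If I read the code correctly, this is because the feature extraction network outputs feature maps at 1/4 of the input resolution, so the intrinsics must be rescaled to describe those smaller images; dividing the first two rows of K by 4 scales fx, fy, cx and cy together while leaving the homogeneous row untouched. A small numeric illustration with made-up intrinsics:

import numpy as np

# made-up full-resolution intrinsics
K = np.array([[1440.0,    0.0, 800.0],
              [   0.0, 1440.0, 600.0],
              [   0.0,    0.0,   1.0]])

K[:2, :] /= 4  # feature maps are H/4 x W/4, so fx, fy, cx, cy shrink by 4
print(K)
# [[360.   0. 200.]
#  [  0. 360. 150.]
#  [  0.   0.   1.]]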

accuracy

May I ask how this accuracy rate is calculated?
Hoping for your reply.

Does your test loss decline regularly?

[TensorBoard screenshot of the test curves]
I re-trained your model using the same settings as yours, and tested it by evaluating the depth map error on the test dataset.
I find the depth map error is 89 using either your provided model or my re-trained one, while the original MVSNet's depth map error is only 45.
From the TensorBoard visualization, it seems that your model does not train well.

what's the reason about the large loss?

I used your pre-trained model, but from the beginning the loss was very large. I checked the code, but no problem was found. What is the reason for this? My learning rate was set to 0.0001. Eagerly looking forward to your reply. Thanks.

[screenshot of the reported loss]

Question about homo_warping

    rot_depth_xyz = rot_xyz.unsqueeze(2).repeat(1, 1, num_depth, 1) * depth_values.view(batch, 1, num_depth, 1)  # [B, 3, N_depth, H*W]
    proj_xyz = rot_depth_xyz + trans.view(batch, 3, 1, 1)  # [B, 3, N_depth, H*W]

Is the translation missing a multiplication with the depth values here, according to the plane-induced homography?
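If I read homo_warping correctly, no: the code back-projects and reprojects rather than applying a homography matrix directly. Each homogeneous reference pixel is rotated, scaled by the candidate depth, and only then translated, i.e. x_src ~ K_src (R (d K_ref^-1 x_ref) + t), which matches the plane-induced homography K_src (R + t n^T / d) K_ref^-1 up to scale for a fronto-parallel plane at depth d, so the translation is correctly added once and not multiplied by depth. A tiny numeric check with made-up intrinsics and pose:

import numpy as np

K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.array([10.0, 0.0, 0.0])  # made-up reference-to-source pose
x = np.array([100.0, 50.0, 1.0])              # homogeneous reference pixel
n = np.array([[0.0, 0.0, 1.0]])               # fronto-parallel plane normal

for d in (500.0, 900.0):
    # back-project at depth d, transform, reproject (what homo_warping does)
    p1 = K @ (R @ (d * np.linalg.inv(K) @ x) + t)
    # plane-induced homography for the plane n^T X = d in the reference frame
    p2 = K @ (R + t[:, None] @ n / d) @ np.linalg.inv(K) @ x
    print(d, p1[:2] / p1[2], p2[:2] / p2[2])  # same pixel coordinates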

About loss

What is the normal range for the loss value during training? After 15 epochs of training, I got:
avg_test_scalars: {'loss': 7.96386517003913, 'abs_depth_error': 8.387600596960601, 'thres2mm_error': 0.3122527413139298, 'thres4mm_error': 0.16783424718643547, 'thres8mm_error': 0.10430021953216903}

Some questions about depth_visual_xxx.png.

Hi, this is Zander. Recently I ran your code on my server, and I found that it uses depth_visual_xxx.png as the mask to constrain the depth map:

model_loss = mvsnet_loss
model.train()
optimizer.zero_grad()

sample_cuda = tocuda(sample)
depth_gt = sample_cuda["depth"]
mask = sample_cuda["mask"]

outputs = model(sample_cuda["imgs"], sample_cuda["proj_matrices"], sample_cuda["depth_values"])
depth_est = outputs["depth"]

loss = model_loss(depth_est, depth_gt, mask)

In your mvsnet_loss, I see you use the mask to split the foreground and background:

def mvsnet_loss(depth_est, depth_gt, mask):
    mask = mask > 0.5
    return F.smooth_l1_loss(depth_est[mask], depth_gt[mask], size_average=True)

But in the source code of the original MVSNet, I don't see any code that uses depth_visual.png, so I wonder whether you wrote some new code to use these masks.
Please tell me, I'm hoping for your answer. :-)
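For what it's worth, the masks most likely enter through this repo's DTU dataloader, which builds sample_cuda["mask"] from those PNGs rather than through anything inherited from the original MVSNet. A minimal sketch of the idea; the path layout and threshold here are assumptions, not the repo's exact code:

import numpy as np
from PIL import Image

def read_mask(path, thresh=0.5):
    # Load depth_visual_xxxx.png and threshold it into the 0/1 mask that
    # mvsnet_loss uses to select valid ground-truth depth pixels.
    img = np.array(Image.open(path), dtype=np.float32) / 255.0
    return (img > thresh).astype(np.float32)

# hypothetical usage; the actual filenames come from the dataset list files
# mask = read_mask("Depths/scan2_train/depth_visual_0000.png")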

fusion

After fusibile is compiled, do I change the fusibile path in eval, or what should I do? How do I then generate the point cloud?

Matlab code

Hello, how do I run the MATLAB code to obtain the accuracy and the overall metrics?

How to reproduce evaluation metrics?

Thank you for sharing this code. However, I have some questions about how to reproduce the evaluation metrics.
(1) How many GPUs did you use?
(2) What is the batch size per GPU?
(3) Did you use refinement in MVSNet?

I tried 4 GPUs (batch size = 1) and 4 GPUs (batch size = 2); the results are below, and there is still a gap to your results. (PS: I didn't use refinement.)

       Acc.    Comp.   Overall
4x2:   0.5636  0.5393  0.5514
4x1:   0.6434  0.6790  0.6612

I would really appreciate it if you could kindly provide some advice.
