
et-mvsnet's Introduction

ET-MVSNet

When Epipolar Constraint Meets Non-local Operators in Multi-View Stereo
Authors: Tianqi Liu, Xinyi Ye, Weiyue Zhao, Zhiyu Pan, Min Shi*, Zhiguo Cao
Institute: Huazhong University of Science and Technology
ICCV 2023

Abstract

Learning-based multi-view stereo (MVS) methods heavily rely on feature matching, which requires distinctive and descriptive representations. An effective solution is to apply non-local feature aggregation, e.g., Transformer. Albeit useful, these techniques introduce heavy computation overheads for MVS: each pixel densely attends to the whole image. In contrast, we propose to constrain non-local feature augmentation within a pair of lines: each point only attends to its corresponding pair of epipolar lines. Our idea takes inspiration from classic epipolar geometry, which shows that one point with different depth hypotheses will be projected onto the epipolar line on the other view. This constraint reduces the 2D search space to the epipolar line in stereo matching. Similarly, it suggests that matching in MVS amounts to distinguishing a series of points lying on the same line. Inspired by this point-to-line search, we devise a line-to-point non-local augmentation strategy. We first devise an optimized searching algorithm to split the 2D feature maps into epipolar line pairs. Then, an Epipolar Transformer (ET) performs non-local feature augmentation among epipolar line pairs. We incorporate the ET into a learning-based MVS baseline, named ET-MVSNet. ET-MVSNet achieves state-of-the-art reconstruction performance on both the DTU and Tanks-and-Temples benchmarks with high efficiency.
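To make the geometric constraint concrete: given the relative pose between two views, a reference pixel maps to a single epipolar line in the source view, so its match only needs to be searched along that line. The toy sketch below is not the authors' code; the intrinsics and pose are made-up values used purely for illustration.

```python
# Toy sketch (not the repo's code): map a reference pixel to its epipolar
# line in a source view, which is the 1D search space the paper exploits.
import numpy as np

def fundamental_matrix(K1, K2, R, t):
    """F such that x2^T F x1 = 0, for relative pose (R, t) from view 1 to view 2."""
    tx = np.array([[0, -t[2], t[1]],
                   [t[2], 0, -t[0]],
                   [-t[1], t[0], 0]])
    E = tx @ R                           # essential matrix
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)

# Hypothetical intrinsics and pose, purely for illustration.
K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])
R = np.eye(3)
t = np.array([0.2, 0.0, 0.0])            # small horizontal baseline

F = fundamental_matrix(K, K, R, t)
x1 = np.array([300., 200., 1.])          # a pixel in the reference view
line = F @ x1                            # epipolar line a*u + b*v + c = 0 in the source view
print(line / np.linalg.norm(line[:2]))
```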

Installation

conda create -n etmvsnet python=3.10.8
conda activate etmvsnet
pip install -r requirements.txt
pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 -f https://download.pytorch.org/whl/torch_stable.html
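After installation, a quick optional check that the pinned PyTorch build can see the GPU (assuming a CUDA 11.6-capable machine):

```python
# Optional sanity check for the environment created above.
import torch

print(torch.__version__)             # expected: 1.13.1+cu116
print(torch.cuda.is_available())     # should print True on a CUDA-capable machine
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))   # the GPU used for training/testing
```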

Data Preparation

1. DTU Dataset

Training data. We use the same DTU training data as mentioned in MVSNet and CasMVSNet. Download DTU training data and Depth raw. Unzip and organize them as:

dtu_training                     
    ├── Cameras                
    ├── Depths   
    ├── Depths_raw
    └── Rectified
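The Cameras folder uses the MVSNet-style cam.txt layout (a 4x4 extrinsic matrix, a 3x3 intrinsic matrix, then the depth range). If you want to inspect the data, here is a minimal reader sketch under that assumption:

```python
# Minimal sketch for reading one MVSNet-style cam.txt from the Cameras folder.
# Assumed layout: "extrinsic" + 4 rows, "intrinsic" + 3 rows, then a line with
# DEPTH_MIN and DEPTH_INTERVAL. Adjust if your copy of the data differs.
import numpy as np

def read_cam_file(path):
    with open(path) as f:
        lines = [line.strip() for line in f if line.strip()]
    extrinsic = np.array(' '.join(lines[1:5]).split(), dtype=np.float32).reshape(4, 4)
    intrinsic = np.array(' '.join(lines[6:9]).split(), dtype=np.float32).reshape(3, 3)
    depth_min, depth_interval = map(float, lines[9].split()[:2])
    return intrinsic, extrinsic, depth_min, depth_interval
```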

Testing Data. Download DTU testing data. Unzip it as:

dtu_testing                                       
    ├── scan1   
    ├── scan4
    ├── ...

2. BlendedMVS Dataset

Download BlendedMVS and unzip it as:

blendedmvs                          
    ├── 5a0271884e62597cdee0d0eb                
    ├── 5a3ca9cb270f0e3f14d0eddb   
    ├── ...
    ├── training_list.txt
    ├── ...

3. Tanks and Temples Dataset

Download Tanks and Temples and unzip it as:

tanksandtemples                          
       ├── advanced                 
       │   ├── Auditorium       
       │   ├── ...  
       └── intermediate
           ├── Family       
           ├── ...          

We use the camera parameters of the short depth range version (included in your download); you should manually replace the cams folder inside the intermediate folder with the short depth range version.

Or you can download our processed data here.

We recommend the latter.
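If you do the replacement yourself rather than using the processed data, a rough helper along these lines may work; the location of the short-depth-range camera files (`short_range_cams` below) is a placeholder, so adjust the paths to your download's layout.

```python
# Hypothetical helper for swapping in the short-depth-range camera parameters.
# Folder names are placeholders; adapt them to how your download is organized.
import shutil
from pathlib import Path

tnt_root = Path('tanksandtemples/intermediate')
short_range_root = Path('short_range_cams')       # placeholder: unzipped short-range cams

for scene_dir in sorted(tnt_root.iterdir()):
    if not scene_dir.is_dir():
        continue
    src = short_range_root / scene_dir.name / 'cams'
    dst = scene_dir / 'cams'
    if src.is_dir():
        shutil.rmtree(dst, ignore_errors=True)    # drop the default cams folder
        shutil.copytree(src, dst)                 # replace with the short-range version
        print(f'replaced cams for {scene_dir.name}')
```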

Training

Training on DTU

To train the model from scratch on DTU, specify DTU_TRAINING in ./scripts/train_dtu.sh first and then run:

bash scripts/train_dtu.sh exp_name

After training, you will get model checkpoints in ./checkpoints/dtu/exp_name.

Finetune on BlendedMVS

To fine-tune the model on BlendedMVS, you need to specify BLD_TRAINING and BLD_CKPT_FILE in ./scripts/train_bld.sh first, then run:

bash scripts/train_bld.sh exp_name

Testing

Testing on DTU

For DTU testing, we use the model (pretrained model) trained on the DTU training set. Specify DTU_TESTPATH and DTU_CKPT_FILE in ./scripts/test_dtu.sh first, then run the following command to generate the point cloud results.

bash scripts/test_dtu.sh exp_name
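To quickly eyeball a generated point cloud before running the MATLAB evaluation, something like the following works; open3d is an extra dependency, not required by this repo, and the output path is only a hypothetical example.

```python
# Optional: inspect a fused point cloud produced by the test script.
# Requires `pip install open3d`; the path below is a hypothetical example.
import open3d as o3d

pcd = o3d.io.read_point_cloud('outputs/dtu/scan1.ply')   # hypothetical output path
print(pcd)                                               # point count summary
print(pcd.get_axis_aligned_bounding_box())               # spatial extent
o3d.visualization.draw_geometries([pcd])                 # interactive viewer (needs a display)
```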

For quantitative evaluation, download SampleSet and Points from DTU's website. Unzip them and place the Points folder in SampleSet/MVS Data/. The structure should look like:

SampleSet
    └── MVS Data
        └── Points

Specify datapath, plyPath, resultsPath in evaluations/dtu/BaseEvalMain_web.m and datapath, resultsPath in evaluations/dtu/ComputeStat_web.m, then run the following commands to obtain the quantitative metrics.

cd evaluations/dtu
matlab -nodisplay
BaseEvalMain_web 
ComputeStat_web

Testing on Tanks and Temples

We recommend using the finetuned model (pretrained model) to test on the Tanks and Temples benchmark. Similarly, specify TNT_TESTPATH and TNT_CKPT_FILE in scripts/test_tnt_inter.sh and scripts/test_tnt_adv.sh. To generate point cloud results, just run:

bash scripts/test_tnt_inter.sh exp_name
bash scripts/test_tnt_adv.sh exp_name

For quantitative evaluation, you can upload your point clouds to Tanks and Temples benchmark.

Citation

@InProceedings{Liu_2023_ICCV,
    author    = {Liu, Tianqi and Ye, Xinyi and Zhao, Weiyue and Pan, Zhiyu and Shi, Min and Cao, Zhiguo},
    title     = {When Epipolar Constraint Meets Non-Local Operators in Multi-View Stereo},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {18088-18097}
}

Acknowledgements

Our work is partially based on these open-source works: MVSNet, cascade-stereo, MVSTER.

We appreciate their contributions to the MVS community.


et-mvsnet's Issues

About results

Hello, I modified the conf parameter as you suggested, and the final acc, comp, and overall metrics are very close to your paper, but the qualitative results still differ noticeably from those in your paper, no matter how small I set it. I haven't changed your code and don't know what the reason is; can you help? The GPU I use is an A5000. Also, does the DTU evaluation need to run test_pcd_dtu.sh instead of test_dtu.sh? When I set the conf parameter to 0.1, the result is as shown in the attached screenshots.

Testing on Tanks and Temples

Hello, thank you very much for your excellent work. I would like to know, for the TNT dataset, how did you obtain your pretrained model? Is it simply trained on the DTU dataset and then tested on the other datasets?

About DTU dataset

Hello, I have some questions from reproducing your work, and I hope you can help answer them.

  1. After completing training on the DTU dataset, I tested on the test set, and the acc, comp, and overall metrics matched the paper, but the resulting point clouds contained noise. Following your reply to others, I changed the relevant part of test_dypcd.py to photometric_confidence = confidence_list[0]*confidence_list[1]*confidence_list[2]*confidence_list[3] (see the sketch after this list). After evaluation, the point clouds are much less noisy, but the metrics are much worse than in the paper. Can you help with this?
  2. For the numbers in your paper, was the evaluation done on point clouds with the noise removed? (screenshots attached)
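For reference, the modification discussed in item 1 amounts to multiplying the per-stage confidence maps and thresholding the product; below is a minimal sketch of that filtering step, with a made-up threshold value.

```python
# Sketch of the confidence filtering discussed above (not a drop-in patch):
# multiply the per-stage confidence maps, then keep only high-confidence pixels.
import numpy as np

def fuse_confidence(confidence_list, threshold=0.1):
    """confidence_list: per-stage confidence maps, all of the same H x W shape."""
    photometric_confidence = np.ones_like(confidence_list[0])
    for conf in confidence_list:
        photometric_confidence *= conf           # product over all stages
    mask = photometric_confidence > threshold    # pixels kept during fusion
    return photometric_confidence, mask
```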

LICENSE

Thanks for this good work! I think you could add a license file to your repo?

Trained checkpoints and feature enhancement approaches

Hi, I have two questions.
1. After completing the training process, I have 15 checkpoints ranging from finalmodel_0.ckpt to finalmodel_14.ckpt. I want to select the best checkpoint, but the evaluation process, particularly the MATLAB execution time, is quite long. Could you give me some suggestions?
2. Transformer-based MVS methods, including your work and TransMVSNet, aim to enhance feature representation. However, pre-trained large models can also be used to enhance features. What's your opinion on this?

different depth samples when testing on TnT dataset

Thank you for open-sourcing your work!
I noticed that the scripts use different numbers of depth samples when testing on the intermediate (8,8,4,4) and advanced (16,8,4,4) sets of the TnT dataset. Is there any theory or assumption behind using different settings?

Fairness of Ablation Experiments

Thank you for your response in the issue "Modification of point cloud fusion method"; however, it does not address my concerns. Is it fair to attribute the DTU score improvement brought about by changing the fusion method to the ET module?

In the baseline results, it seems that you utilized a default fusion method, but for your proposed ET module, there appears to have been a switch to your updated fusion method. This change seems to have led to performance improvements being attributed to the ET module, which may not entirely reflect its true impact. The MVSTER method achieves comparable scores using the updated fusion method alone on DTU, without the need for the ET module. Therefore, attributing the performance gains solely to the Epipolar Transformer in the ablation study seems unfair.

I appreciate your time and attention to these inquiries. I believe that addressing these concerns will not only enhance the credibility of this research but also contribute to the advancement of our understanding of the MVS field.

I met an error when testing data

Traceback (most recent call last):
File "C:\Users\dell\Desktop\ET-MVSNet-main\ET-MVSNet-main\test_dtu_dypcd.py", line 511, in
save_depth(testlist)
File "C:\Users\dell\Desktop\ET-MVSNet-main\ET-MVSNet-main\test_dtu_dypcd.py", line 392, in save_depth
torch.cuda.reset_peak_memory_stats()
File "D:\software\Anaconda\envs\etmvsnet\lib\site-packages\torch\cuda\memory.py", line 260, in reset_peak_memory_stats
return torch._C._cuda_resetPeakMemoryStats(device)
RuntimeError: invalid argument to reset_peak_memory_stats

Quantitative results of TNT dataset

Hello author, here are my reproduction results. It turns out the earlier problem was on my side, but the final metrics are still a bit behind your paper. What could the reason be, and is there a way to improve them? (screenshot attached)

About Tanks and Temples

Hello, thank you very much for your excellent work. I recently reproduced it following the tutorial you posted, but there is a small problem and I need your help.

  1. The quantitative results on the DTU dataset are basically the same as those presented in your paper, but the evaluation results on the TNT dataset are very poor, not even half of your reported numbers. Is there a good solution? I would appreciate it if you could tell me how to fix this.
  2. Is it possible that there is a problem with my steps? I will describe them roughly; please check whether they are appropriate. First, the DTU dataset is used for training; the trained model is then fine-tuned on the BlendedMVS dataset; finally, the fine-tuned model is used as the checkpoint to test on the TNT dataset. (screenshot attached)

Test doesn't work on Tanks and Temples

I followed the instructions in the README and get good outputs for the DTU test set, but when I test on TNT, I just get distorted depth maps (screenshot attached).

I downloaded the linked TNT dataset and the two trained checkpoints (DTU and finetune on BLD). Both checkpoints fail for me on TNT.


Modification of point cloud fusion method

First of all, I want to thank you for your great work and express my gratitude for your open-source code.

The paper mentions that the baseline is MVSTER, and that adding the epipolar transformer greatly improves the score on the DTU dataset. However, I found that the dypcd script used in your code differs from MVSTER's. We know that even for the same depth maps, different fusion methods or fusion parameters can give very different results. In your ablation experiments, did you use exactly the same fusion script as MVSTER? Would such an experiment be fair if different fusion scripts were used?

I tried deleting the ET module, and as long as I used the "updated" fusion script, I could still achieve similar scores. Here are some of my results. (ET means using fusion scripts from your code repository, and regular means using scripts from MVSTER and other similar methods.)

experiment                  fusion method   accuracy   completeness   overall
MVSTER (original paper)     regular         0.350      0.276          0.313
MVSTER (ET paper)           regular         0.351      0.284          0.318
ET-MVSNet (ET paper)        ET              0.329      0.253          0.291
ET-MVSNet (my PC)           ET              0.3291     0.2567         0.2929
ET-MVSNet but delete ET     ET              0.3272     0.2590         0.2931
ET-MVSNet but delete ET     regular         0.3514     0.2832         0.3173

The changes to the fusion script played a big role in the model's performance on the DTU dataset.

I'm looking forward to your explanation about it, thank you!

about tnt dataset

Hello, thank you very much for your excellent work. I have recently reproduced it, but during evaluation I found a lot of noise in the point cloud files. Can I reduce the noise by modifying parameters in the sh files? If so, please let me know which parameter; thank you very much. (screenshot attached)

An error occurred during code testing

Thank you for your excellent work. I encountered the following error while testing on the DTU dataset. I hope you can answer it; I look forward to your reply.

save_depth(testlist)
File "/home/featurize/work/ET-MVSNet-main/test_dtu_dypcd.py", line 392, in save_depth
torch.cuda.reset_peak_memory_stats()
File "/environment/miniconda3/envs/etmvsnet/lib/python3.10/site-packages/torch/cuda/memory.py", line 309, in reset_peak_memory_stats
return torch._C._cuda_resetPeakMemoryStats(device)
RuntimeError: invalid argument to reset_peak_memory_stats

Training loss

Hello, I reproduced your results on the DTU dataset, but I found that the final training loss is 1.55, and I don't know whether this is correct. Your paper does not mention the final training loss on the DTU dataset, so it would be helpful if you could share it. In addition, I am planning to use your work as a baseline; I would like to know whether that is feasible.

Question about the visualization of epipolar pairs.

Great job! I am interested in how you visualize the epipolar pairs in the figures of your paper (Fig. 1(a), Fig. 3, and the center of Fig. 4). Would you mind sharing the visualization code? I would appreciate it very much.

results

Hello author, I recently studied your code and came to the following conclusions:

  1. Your work mainly focuses on feature enhancement, integrating the ET module into the FPN network.
  2. Cost volume construction, regularization, and depth estimation are not the focus of your work. Can I make improvements in these aspects in the future?

Quantitative results

Hello, I'm sorry to bother you again; it may be that my current level is limited. When reading your paper, I looked at the quantitative results you report on the DTU and TNT datasets. For the TNT results, can I obtain them simply by running the two test scripts you provide, or are additional steps required? And for the numbers in your paper, are they taken from TensorBoard?

something about the code

Hi, what an excellent work! After reading your paper, there are still some problems that confuse me. Would you mind releasing the code sooner or later so that I could gain a more comprehensive understanding? Thanks a lot!

Testing on the TNT dataset

Hello, I have two questions to ask.

  1. For testing on the TNT dataset, is the process very time-consuming?
  2. If I replicate your work without making any changes, will there be any differences in the quantitative results?

Why Epipolar When we have Homography?

Thanks for the work! I have read several works about the Epipolar Transformer, e.g., MVSTER. The homography matrix can directly project a point p to its corresponding p'. Then why apply epipolar geometry to find p's corresponding epipolar line? I've read the MVSTER code; they still construct a cost volume from homography warping.
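For context, the homography mentioned here is the plane-induced homography used in plane-sweep cost volumes: under the convention X_src = R·X_ref + t and a fronto-parallel plane at depth d, H(d) = K_src·(R + t·nᵀ/d)·K_ref⁻¹. Sweeping d moves the warped pixel along exactly the epipolar line that the paper restricts attention to. A small sketch of that standard formulation (not code from this repository):

```python
# Standard plane-induced homography for plane-sweep MVS (illustrative only).
# Convention: X_src = R @ X_ref + t, plane n^T X_ref = depth (fronto-parallel
# for n = [0, 0, 1]). Warping the same reference pixel over several depths
# traces points along its epipolar line in the source view.
import numpy as np

def plane_homography(K_ref, K_src, R, t, depth, n=np.array([0.0, 0.0, 1.0])):
    """H such that x_src ~ H @ x_ref for reference-view points at the given depth."""
    return K_src @ (R + np.outer(t, n) / depth) @ np.linalg.inv(K_ref)
```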
