Comments (17)

cdb0y511 commented on August 14, 2024

Great work! I will try it.
The illegal memory access error seems to be related to the resolution. If the fix_vgicp branch also produces this error, I will send some data along with the resolution parameter.
Thanks,

from fast_gicp.

kisung-kim commented on August 14, 2024

Above all, thank you for sharing this code.

I also "occasionally" got

GaussianVoxelMap::create_voxelmap 4
terminate called after throwing an instance of 'thrust::system::system_error'
what():  parallel_for failed: cudaErrorIllegalAddress: an illegal memory access was encountered
Aborted (core dumped)

error at
https://github.com/SMRT-AIST/fast_gicp/blob/master/src/fast_gicp/cuda/gaussian_voxelmap.cu#L242

I added some printf lines as below.

printf("GaussianVoxelMap::create_voxelmap 4\n");
thrust::for_each(
  thrust::cuda::par.on(stream),
  thrust::make_zip_iterator(thrust::make_tuple(points.begin(), covariances.begin())),
  thrust::make_zip_iterator(thrust::make_tuple(points.end(), covariances.end())),
  accumulate_points_kernel(voxelmap_info_ptr.data(), buckets, num_points, voxel_means, voxel_covs));
printf("GaussianVoxelMap::create_voxelmap 5\n");

With the printf calls I can find the line where the error occurs (marker 4 is printed, marker 5 is not),
but I'm not sure what causes it.

my registration settings

number of points

  • target : 200000
  • source : 10000~50000

method : "VGICP_CUDA"
downsample_resolution : 0.05
k_correspondences : 30
voxel_resolution : 0.8
max_correspondence_distance : 0.4
neighbor_search_method : "DIRECT_RADIUS"
neighbor_search_radius : 0.3

Could you help me with this problem?
Thank you.

fls1995 commented on August 14, 2024

The problem seems to happen because the check "bucket.second >= 0" is missing here:
https://github.com/SMRT-AIST/fast_gicp/blob/master/src/fast_gicp/cuda/gaussian_voxelmap.cu#L128
When coord = (0,0,0), it can occasionally be mistakenly matched to thrust::pair((0,0,0),-1), so inside the if, bucket.second = -1. In GPU memory, voxel_covs is occasionally placed directly behind buckets (in my code this happened when bucket.size() == 32768). Accumulating into voxel_covs then incorrectly overwrites the last 9 int values of buckets and makes them very large. Later, when coord matches bucket[bucket.size() - 3], bucket.second is a huge value, the computed voxel_covs address is out of range, and cudaErrorIllegalAddress is raised.
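
The overwrite described above comes down to index arithmetic. Here is a minimal sketch, assuming each covariance is stored as a 3x3 single-precision matrix (9 four-byte values, which would match the "9 int values" observation). The helper names are hypothetical and not part of fast_gicp.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: one covariance = 3x3 float matrix = 9 four-byte values.
constexpr std::ptrdiff_t kValuesPerCov = 9;

// Offset (in 4-byte values, relative to the start of voxel_covs) of the first
// value that an access to voxel_covs[voxel_index] touches.
constexpr std::ptrdiff_t cov_value_offset(int voxel_index) {
  return static_cast<std::ptrdiff_t>(voxel_index) * kValuesPerCov;
}

// The kind of guard suggested in this thread: only non-negative, in-range
// indices may be used to address voxel_means / voxel_covs.
constexpr bool is_valid_voxel_index(int voxel_index, int num_voxels) {
  return voxel_index >= 0 && voxel_index < num_voxels;
}

// Index -1 starts 9 values *before* voxel_covs, i.e. inside whatever buffer
// happens to be allocated directly in front of it.
static_assert(cov_value_offset(-1) == -9, "index -1 lands before voxel_covs");

// Returns the offset for a validated index; asserts on invalid input.
inline std::ptrdiff_t checked_cov_value_offset(int voxel_index, int num_voxels) {
  assert(is_valid_voxel_index(voxel_index, num_voxels));
  return cov_value_offset(voxel_index);
}
```

With this model, a bucket whose second member is -1 addresses the 9 values immediately in front of voxel_covs, which is consistent with the corrupted tail of buckets reported above.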

koide3 commented on August 14, 2024

Ah, you are right. It can access invalid memory regions when coord == (0, 0, 0). I didn't notice this bug because I usually filter out very close points. I'll push a fix soon. Thanks a lot @fls1995 !

cdb0y511 commented on August 14, 2024

I suspect it is related to the voxel resolution, because after I increased the resolution the error seems to disappear.

koide3 commented on August 14, 2024

Thanks for reporting the issue. I'll check if voxelization has some memory problems soon.

cdb0y511 commented on August 14, 2024

> Thanks for reporting the issue. I'll check if voxelization has some memory problems soon.

I strongly suggest adding more NeighborSearchMethod options to the CUDA version,
for example setNeighborSearchMethod(fast_gicp::NeighborSearchMethod::DIRECT7);
I find DIRECT7 more robust than DIRECT1 in some scenarios.
I think the resolution parameter becomes easier to choose when neighbor voxels are taken into account,
because a smaller resolution results in fewer points in each voxel, and considering the neighbor voxels is fair compensation.
I am not familiar with thrust, but I think fast_vgicp_voxel.hpp is a good example. I hope you can make the CUDA version more like it.
Thanks,

koide3 commented on August 14, 2024

Yep, it would be good to support more NeighborSearchMethod options in the CUDA version. I'm quite busy writing some papers, but I'll try to find spare time to update the VGICP_CUDA code in a few weeks.

koide3 commented on August 14, 2024

I updated VGICP_CUDA so it supports DIRECT7 and DIRECT27. I also added a new neighbor voxel search method, DIRECT_RADIUS, which uses the voxels within a radius so that the GPU version can take more voxels into account for registration stability. Please take a look at the "fix_vgicp" branch.
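
For readers unfamiliar with the mode names, here is a minimal sketch of what the fixed-size neighborhoods are commonly understood to cover: the center voxel only (DIRECT1), the center plus its 6 face neighbors (DIRECT7), and the full 3x3x3 block (DIRECT27). The enum and function below are hypothetical stand-ins, not the fast_gicp API, and DIRECT_RADIUS is left out because its exact definition is not given in this thread.

```cpp
#include <array>
#include <cassert>
#include <cstdlib>
#include <vector>

using Offset = std::array<int, 3>;

// Hypothetical stand-in for the search modes named in this thread.
enum class NeighborMode { Direct1, Direct7, Direct27 };

// Returns the integer voxel offsets that are visited around a query voxel.
std::vector<Offset> neighbor_offsets(NeighborMode mode) {
  std::vector<Offset> offsets;
  for (int dx = -1; dx <= 1; ++dx) {
    for (int dy = -1; dy <= 1; ++dy) {
      for (int dz = -1; dz <= 1; ++dz) {
        // 0 = center, 1 = face neighbor, 2 = edge neighbor, 3 = corner.
        const int manhattan = std::abs(dx) + std::abs(dy) + std::abs(dz);
        const bool keep = (mode == NeighborMode::Direct27) ||
                          (mode == NeighborMode::Direct7 && manhattan <= 1) ||
                          (mode == NeighborMode::Direct1 && manhattan == 0);
        if (keep) offsets.push_back(Offset{{dx, dy, dz}});
      }
    }
  }
  assert(!offsets.empty());  // every mode keeps at least the center voxel
  return offsets;
}
```

A source point would then be compared against the voxel at its own coordinate plus each returned offset, which is why the larger modes tolerate a smaller voxel resolution.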

Regarding the illegal memory access error, I couldn't reproduce the error on my PC. Can you provide some data so I can check the problem?

kisung-kim commented on August 14, 2024

If the target point cloud is very large, should init_num_buckets in the GaussianVoxelMap constructor
(int init_num_buckets=8192) be much higher?

koide3 commented on August 14, 2024

Hi @kisung-kim ,

Maybe the input point cloud was too large, and it ran out of the GPU memory. What GPU are you using and how much memory does it have?

By increasing init_num_buckets, you can avoid GPU memory fragmentation and slightly reduce memory usage. Another workaround is to use a larger voxel resolution, which would substantially decrease memory consumption.
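
To get a feel for how the resolution affects the voxel map size, here is a minimal host-side sketch that counts occupied voxels for a given resolution. It assumes a point's voxel coordinate is floor(p / resolution) and that memory grows roughly with the number of occupied voxels; both are assumptions for illustration, and the function name is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <set>
#include <vector>

using Point = std::array<float, 3>;
using Coord = std::array<int, 3>;

// Counts how many distinct voxels a point cloud occupies at a given resolution.
std::size_t count_occupied_voxels(const std::vector<Point>& points, float resolution) {
  assert(resolution > 0.0f);
  std::set<Coord> voxels;
  for (const Point& p : points) {
    // Assumed voxel coordinate: component-wise floor(p / resolution).
    voxels.insert(Coord{{static_cast<int>(std::floor(p[0] / resolution)),
                         static_cast<int>(std::floor(p[1] / resolution)),
                         static_cast<int>(std::floor(p[2] / resolution))}});
  }
  return voxels.size();
}
```

Running this on a sample of the target cloud at a few candidate resolutions gives a rough lower bound for a sensible bucket count before anything is allocated on the GPU.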

Honestly, I haven't tested the code with such large point clouds, and it may have some memory-related bugs. It would help a lot in improving the code if you could provide some example data.

whuzs commented on August 14, 2024

Has this issue been fixed?

koide3 commented on August 14, 2024

I just pushed a fix, and it will be merged into the main branch soon.
#97

fls1995 commented on August 14, 2024

I think coord (0,0,0) gets matched to thrust::pair((0,0,0),-1) because we don't return to end the loop after adding to voxel_means. After I added a return under line 144, the problem disappeared. We can also add a (bucket.second < 0) check to guard against unexpected cases.
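
The two changes suggested here can be sketched on the host as follows. This is a simplified model under stated assumptions, not the actual kernel: the bucket table is a plain std::vector, empty slots hold coord (0,0,0) with index -1, and a per-voxel point counter stands in for voxel_means / voxel_covs. All names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Coord = std::array<int, 3>;
// second < 0 marks an empty slot; its coord is left at the default (0,0,0).
using Bucket = std::pair<Coord, int>;

// Adds one point to the voxel whose bucket matches coord.
// Returns the voxel index used, or -1 if no occupied bucket matched.
int accumulate_point(const std::vector<Bucket>& buckets, std::size_t hash,
                     int max_scan_count, const Coord& coord,
                     std::vector<int>& points_per_voxel) {
  assert(!buckets.empty());
  for (int i = 0; i < max_scan_count; ++i) {
    const Bucket& bucket = buckets[(hash + static_cast<std::size_t>(i)) % buckets.size()];
    if (bucket.second < 0) continue;      // empty slot: never use it as an index
    if (bucket.first != coord) continue;  // slot belongs to another voxel
    assert(static_cast<std::size_t>(bucket.second) < points_per_voxel.size());
    ++points_per_voxel[bucket.second];
    return bucket.second;                 // stop scanning after the first match
  }
  return -1;
}
```

In this model a query for coord (0,0,0) skips every empty slot instead of accumulating into index -1, and a matched point is counted exactly once because the loop ends at the first hit.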

whuzs commented on August 14, 2024

thanks!

ZFcvYes commented on August 14, 2024

target:204761[pts] source:59051[pts]
--- ndt_cuda (P2D) ---
terminate called after throwing an instance of 'thrust::system::system_error'
what(): reduce failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
Aborted (core dumped)

Is there any solution? Thanks

NirvanaDragon commented on August 14, 2024

> target:204761[pts] source:59051[pts]
> --- ndt_cuda (P2D) ---
> terminate called after throwing an instance of 'thrust::system::system_error'
> what(): reduce failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
> Aborted (core dumped)
>
> Is there any solution? Thanks

I ran into this problem as well. Have you solved it?
