Git Product home page Git Product logo

cunsearch's Introduction

cuNSearch

A C++/CUDA library to efficiently compute neighborhood information on the GPU for 3D point clouds within a fixed radius. Suitable for many applications, e.g. neighborhood search for SPH fluid simulations.

The library interface is similar to the CPU neighborhood search CompactNSearch.

Libraries using cuNSearch

  • SPlisHSPlasH - A C++ library for the physically-based simulation of fluids using Smoothed Particle Hydrodynamics (see screenshot)

Build Instructions

This project is based on CMake. Simply generate project, Makefiles, etc. using CMake and compile the project with the compiler of your choice.

Requirements:

  • CMake 3.12
  • CUDA SDK 9.0 or newer
  • C++ 11

The code was tested with the following configurations:

  • Windows 10 64-bit, CMake 3.12.3, Visual Studio 2017, CUDA SDK 10.1
  • Debian 9 64-bit, CMake 3.12.3, GCC 6.3.0, CUDA SDK 9.2

Usage

A data structure to perform a neighborhood search can be created by calling the constructor given a fixed search radius r.

cuNSearch::NeighborhoodSearch nsearch(r);

An arbitrary number of point clouds can then be added to the data structure using the method add_point_set. The library expects the point positions to be contiguously stored in an array-like structure. The method will return a unique id associated with the initialized point set.

std::vector<std::array<Real, 3>> positions;
// ... Fill array with 3 * n real numbers representing three-dimensional point positions.
unsigned int point_set_id = nsearch.add_point_set(positions.front().data(), positions.size());
nsearch.find_neighbors();

In order to generate the neighborhood information simply execute the following command

nsearch.find_neighbors();

Finally, the neighborhood information can be accessed as follows

PointSet const& ps = nsearch.point_set(point_set_id);
for (int i = 0; i < ps.n_points(); ++i)
{
	for (int j = 0; j < ps.n_neighbors(i); ++j)
	{
    	// Return PointID of the jth neighbor of the ith particle in the 0th point set.
	    PointID const& pid = ps.neighbor(0, i, j);
	    // ...
	    // Do whatever you want with the point id. The id contains two indices.
	    // The first field pid.point_set_id represents the unique point set id returnd by add_point_set.
	    // The second field pid.point_id stands for the index of the neighboring particle within
	    // the containing point set.
	    // ...
	}
}

Besides the basic functionality the library offers to compute a rule for reordering the points according to a space-filling Z curve. The reordering will improve the performance of future neighborhood queries and accesses. The rule can be computed via

nsearch.z_sort();

Please note that the actual reordering must be invoked by the user by

ps.sort_field(positions.data());

Assuming that there is additional information stored per-point (e.g. velocity, color, mass etc.) the information must also be reorded using the same method to maintain consistency. Subsequently, the find_neighbors function has to be invoked again to update the neighborhood information.

Another self-explaining (benchmark) demo is contained in the project.

Activation Table

When maintaining multiple it is sometimes desired that only certain point sets can find points from other point sets. Therefore an activation table is implemented where the user can specify whether a point set i searches points in another point set j. When nothing else is specified all point sets will search points in all other point sets. The activation table can be modified with e.g.

nsearch.set_active(i, j, false)

Common mistakes and issues

Visual Studio may not detect changes in ".cu" files.

Use of thrust library in cpp files: Some thrust classes can only be used when the file is compiled by the nvidia compiler nvcc. This is usually solved by change the file ending to .cu to mark the file for the nvcc compiler.

References

  • R. Hoetzlein, 2014. "Fast Fixed-Radius Nearest Neighbors: Interactive Million-Particle Fluids", GPU Technology Conference (GTC), Santa Clara, CA.

cunsearch's People

Contributors

janbender avatar louisforestier avatar thehamsta avatar weilermarcel avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

cunsearch's Issues

crash while updating query points

Hi, i am actually using cunsearch to reference a point cloud buffer ( 7 millions points ) radius is: 0.01f

  • creating a second pointset for the query ( around 1 millions points )
  • then searching neighbors
  • updating that query poinset with new point etc, and making a search again

it works well for a few loops ( around 3/4 ) but then after for more searches it crashes...
is there any buffer to be released in cunsearch to repetitively search neighbors ?
i figured updating the pointset resizes on the buffer, but i doesnt seem to be a memory issue...

Trouble trying to use the library, not an issue but I need help

Hi,
First of all, I understand this is not an issue. I just spent a few hours trying to make it work, and I seem to not be able to. I'd say I have a basic knowledge of cmake having built other libraries to use on my projects. But there is something I am missing here, and if you have time I'd appreciate your help.
I am trying to build this just to check the library is linked properly:

#include "cuNSearch.h"

int main()
{
    cuNSearch::NeighborhoodSearch nsearch(2.1);
    return 0;
}

And I get this error:

[main] Building folder: cuda_knn 
[build] Starting build
[proc] Executing command: /usr/local/bin/cmake --build /mnt/E/dev/cuda_knn/build --config Release --target all --
[build] [1/2  50% :: 1.409] Building CUDA object CMakeFiles/cuda_knn.dir/main.cu.o
[build] [2/2 100% :: 1.624] Linking CUDA executable cuda_knn
[build] FAILED: cuda_knn 
[build] : && /usr/bin/g++  CMakeFiles/cuda_knn.dir/main.cu.o -o cuda_knn  ../cuNSearch/build/libcuNSearch.a  -lcudadevrt  -lcudart_static  -lrt  -lpthread  -ldl -L"/usr/local/cuda-11.3/targets/x86_64-linux/lib/stubs" -L"/usr/local/cuda-11.3/targets/x86_64-linux/lib" && :
[build] /usr/bin/ld: CMakeFiles/cuda_knn.dir/main.cu.o: in function `main':
[build] tmpxft_0000dd43_00000000-6_main.cudafe1.cpp:(.text.startup+0x28): undefined reference to `cuNSearch::NeighborhoodSearch::NeighborhoodSearch(float)'
[build] collect2: error: ld returned 1 exit status
[build] ninja: build stopped: subcommand failed.
[proc] The command: /usr/local/bin/cmake --build /mnt/E/dev/cuda_knn/build --config Release --target all -- exited with code: 1 and signal: null
[build] Build finished with exit code 1

It looks to me that or the library is not properly linked or there is a problem with header files. This is my CMakeLists.txt, with the path to the library hardcoded to make sure it finds it:

cmake_minimum_required(VERSION 3.0.0)
project(cuda_knn VERSION 0.1.0 LANGUAGES CUDA CXX)

add_executable(cuda_knn main.cu)
target_include_directories(cuda_knn PRIVATE "include/")
target_link_libraries(cuda_knn PRIVATE /mnt/E/dev/cuda_knn/cuNSearch/build/libcuNSearch.a)
set_target_properties(cuda_knn PROPERTIES CUDA_ARCHITECTURES "86")

This is the layout of the Visual Studio Code project (for some reason when I paste the screengrab it scales it up ridicously, sorry):
image

Thanks in advance for your time,
Daniel

get closest points from query point

Hi, i have a query point set and a second point set,
for each points in the query, i am asking the closest points of the second set, in the radius of 1.0
i can see i am limited to 70 neighbours, but what if they are more :

  • are the 70 points classified by euclidian distance ?
  • are they the 70 closest, or 70 points randomly taken ?

what would be the best strategy to recover ALL Neighbors from the radius ( 50k ? ) and => classify them by distance for the given radius => in order to have the 70 closest points ? ( CUDA_MAX_NEIGHBORS )

i understand there is a theory of sorting the points by "space-filling Z curve" how does it differ from euclidian distance ?

by reading "FAST FIXED-RADIUS NEAREST NEIGHBORS: INTERACTIVE MILLION-PARTICLE FLUIDS"
i understand there a way of "Finding all neighbors in a fixed radius R" or is there a limitation ?

thank you !

Sanity check failed when query set and point set are different

Hi! I wrote a small sample code, where I added two point sets, and used one as the query set and the other as the point set. I then did a sanity check, but it failed. The relevant code is below. Am I doing anything wrong?

  // read_pc_data is my unity function to read point cloud files
  read_pc_data(data_file, &points); 
  read_pc_data(query_file, &queries);

  unsigned int numPoints = static_cast<int>(points.size());
  printf("Number of points: %d \n", numPoints);

  unsigned int numQueries = static_cast<int>(queries.size());
  printf("Number of queries: %d \n", numQueries);
  
  //Create neighborhood search instance
  NeighborhoodSearch nsearch(radius);
  printf("Radius: %lf \n", radius);
  
  //Add point sets
  auto pointIndex = nsearch.add_point_set(points.front().data(), points.size(), true, true);
  auto queryIndex = nsearch.add_point_set(queries.front().data(), queries.size(), true, true);
  nsearch.set_active(pointIndex, pointIndex, false);
  nsearch.set_active(pointIndex, queryIndex, false);
  nsearch.set_active(queryIndex, queryIndex, false);
  
  nsearch.find_neighbors();
  
  auto ps = nsearch.point_set(queryIndex);
  auto r_queries = ps.GetPoints();
  std::cout << ps.n_points() << std::endl;
  for (int i = 0; i < ps.n_points(); ++i)
  {
    Real3 query = ((Real3*)r_queries)[i];
    for (int j = 0; j < ps.n_neighbors(pointIndex, i); ++j)
    {
      auto neighbor = ps.neighbor(pointIndex, i, j);
      auto diff = query - ((Real3*)r_queries)[neighbor];

      float squaredLength = diff[0] * diff[0] + diff[1] * diff[1] + diff[2] * diff[2];
      float distance = sqrt(squaredLength);

      if (distance > radius)
      {
        throw std::runtime_error("Not a neighbor");  <<<---- this fails for certain queries
      }
    }
  }

The point file (contains the points in the search space) is

0.0,0.0,0.0
9.72500038147,2.72699999809,73.8320007324
5.0,0.0,0.0
9.95300006866,2.6970000267,72.9100036621
6.0,0.0,0.0
9.91199970245,2.71600008011,73.466003418
4.99,4.99,0
9.56000041962,2.54600000381,68.4359970093

And the query file is

0.0,0.0,0.0
5.0,0.0,0.0
6.0,0.0,0.0
4.99,4.99,0

I was wondering if I was doing anything wrong, and could you reproduce the (incorrect) result?

Why first count the neighbors when there is a pre-set CUDA_MAX_NEIGHBORS?

Hi!

Seems like the algorithm first counts the number of neighbors to allocate just enough memory for the actual neighbor search. I understand that this might save a bit of GPU memory, but since there is a pre-set CUDA_MAX_NEIGHBORS beyond which neighbors won't be recorded anyways, is it worth the effort to first calculate the neighbor counts, which are also bounded by CUDA_MAX_NEIGHBORS? Wouldn't it be easier to simply allocate CUDA_MAX_NEIGHBORS slots for each query?

The reason I am asking about this is because kNeighborCount (the neighbor count kernel) is as slow as kNeighborhoodQueryWithCounts (the actual search kernel that records the neighbor indices), essentially doubling the search time.

Sorting points doesn't actually sort points

Real3 diff = particles[reversedSortIndices[neighborIndex]] - particle;

I might be wrong about this, but I was under the impression that the point of sorting points is to minimize scattered global memory access and thus facilitating memory coalescing. If so, then we need to actually sort and reorder points in the actual global memory, but the referenced code above shows that we don't actually reorder the point layout in the memory; rather, we simply use the indices in reversedSortIndices to index into the original memory layout.

I would think to actually reorder points in the memory, you would have to do something like:

thrust::sort_by_key(pointSetImpl->d_ReversedSortIndices.begin(), pointSetImpl->d_ReversedSortIndices.end(), pointSetImpl->d_Particles.begin())

Am I correct in this understanding?

Redundant gather in computeCellInformation

https://github.com/InteractiveComputerGraphics/cuNSearch/blob/master/src/cuNSearchDeviceData.cu#L160-#L167

These lines seem to be redundant. Since tempSequence.begin is a unique sequence that starts from 0 and d_SortIndices is also a unique sequence that starts from 0, d_ReversedSortIndices is guaranteed to be exactly the same as d_SortIndices after gather. They why not directly copy d_SortIndices to d_ReversedSortIndices? I validated this on a few examples, but maybe there are cases where this is not the case?

Search time progressively increases?

I am using the demo code which runs the neighbor search 5 times. It's interesting that the 3rd, 4th, and 5th runs are always much longer than the first two. Experimenting with 10M points and queries, the last three times take double the time than the first two. I can probably do a bit of debugging to figure out why, but was curious if you had any hunch why this might be?

lack of file "cuNSearch_export.h"

I add these files to my nisight project, but when I build it remind me that can't find file "cuNSearch_export.h" in file "Timing.h"

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.