Comments (9)
I can reproduce this with an RTX 2080, NVIDIA driver 440.33.01, and CUDA 10.2.
Repro case:
- I built the popsift-demo
- Ran it with the following arguments (image from here):
./popsift-demo -i apollonian_gasket.ascii.pgm
- It crashes with these errors:
/home/rdousse/Téléchargements/casablanca.ascii.pgm
Warning: no descriptors extracted
/home/rdousse/dev/eesep_2/thirdParty/build/popsift/src/popsift/features.cu:89 Runtime warning:
Failed to register feature memory in CUDA.
Features count: 1121
Memory size requested: 80712
an illegal memory access was encountered
/home/rdousse/dev/eesep_2/thirdParty/build/popsift/src/popsift/features.cu:97 Runtime warning:
Failed to register descriptor memory in CUDA.
Descriptors count: 0
Memory size requested: 0
an illegal memory access was encountered
/home/rdousse/dev/eesep_2/thirdParty/build/popsift/src/popsift/sift_pyramid.cu:307
Failed to copy device-to-host: an illegal memory access was encountered
src ptr=7f32b7800000
dst ptr=7f329c021000
The code was also tested on a GeForce 1080 with the same NVIDIA driver and CUDA version, and everything went well.
Hope this helps.
from popsift.
@Rodousse I have now made it work on RTX cards. Here is my solution.
- As pointed out by the author @griwodz, the problem lies in the function "BitonicSort", which is used when estimating the orientation. I couldn't identify what exactly goes wrong in it, but I re-implemented it as a simple bubble sort, and I can now get past the orientation step.
- Similarly, the program crashes at the descriptor step. This can be worked around by setting "--desc-mode grid".
Overall, I suspect that the problem lies in the use of CUDA warp-level primitives, e.g., __shfl_sync. It looks like an out-of-bounds issue when those functions are called on RTX cards.
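To illustrate the kind of replacement described above, here is a minimal host-side C++ sketch of a bubble sort that needs no warp-level primitives. It is not the code from popsift: the OrientationPeak record, its fields, and the function name are hypothetical, and the real workaround would live in device code.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical record for one orientation candidate (not a popsift type).
struct OrientationPeak {
    float angle;   // candidate orientation
    float weight;  // strength used as the sort key
};

// Sorts the candidates by descending weight with a plain bubble sort.
// Each pass moves the smallest remaining weight to the end of the range.
void bubbleSortByWeight(std::vector<OrientationPeak>& peaks)
{
    if (peaks.size() < 2) {
        return;
    }
    for (std::size_t pass = 0; pass + 1 < peaks.size(); ++pass) {
        bool swapped = false;
        for (std::size_t i = 0; i + 1 < peaks.size() - pass; ++i) {
            if (peaks[i].weight < peaks[i + 1].weight) {
                std::swap(peaks[i], peaks[i + 1]);
                swapped = true;
            }
        }
        if (!swapped) {
            break;  // already sorted, stop early
        }
    }
}
```

Bubble sort is quadratic, so it is only reasonable here on the assumption that the list being sorted is short. Its advantage is that it involves no cross-lane communication at all.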
Sorry for not replying for an eternity!
Unfortunately, I don't have any RTX card and I don't know what has changed.
My guess is that one of the intrinsic functions (__ballot, __popc) doesn't return because a thread has died after a memory violation. And that's probably in BitonicSort, but that is pure guesswork.
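For readers unfamiliar with the two intrinsics named above, the following host-side C++ sketch emulates what they compute for one 32-lane warp. It is only an emulation with hypothetical function names: it does not model the suspected hang, and it is not code from popsift.

```cpp
#include <bitset>
#include <cstdint>

constexpr int kLanes = 32;  // lanes in one warp

// Stand-in for __ballot: bit i of the result is set when the predicate
// of lane i is true.
std::uint32_t emulateBallot(const bool (&pred)[kLanes])
{
    std::uint32_t bits = 0;
    for (int lane = 0; lane < kLanes; ++lane) {
        if (pred[lane]) {
            bits |= (1u << lane);
        }
    }
    return bits;
}

// Stand-in for __popc: number of set bits in a 32-bit word.
int emulatePopc(std::uint32_t x)
{
    return static_cast<int>(std::bitset<32>(x).count());
}
```

On the device, the vote collects one bit from every participating lane, which is why a lane that never reaches the call is a plausible suspect.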
Do you have any time to search for the problem?
Thank you for your reply! I will have time to diagnose this issue in the next few weeks, and will keep you informed if it is resolved.
@lzx551402 Thanks for the workaround!
I think I'll just wait for the fix to be released.
@lzx551402 Thank you very much for finding the origin of the crashes!
I'll probably get a login for a machine with a new Tesla card in the next few days, and I hope that the problem is the same on that one.
@lzx551402 Thanks for discovering the locations of the RTX 20-series bugs. I managed to borrow an eGPU with a 2080 and fixed the problem in BitonicSort (just one line, using a conditional partial-warp xor-shuffle).
If you have the time, you could test the branch dev/fixNewWarp (PR #66).
I'm not seeing the crash in the descriptor step in loop mode. Can you tell me which parameters you used when it crashed? Perhaps you have an example image?
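As background for the xor-shuffle mentioned above, here is a host-side C++ sketch of a bitonic sort in which each lane exchanges values with the lane whose index differs by an xor mask, and only the lanes named in a participation mask take part. This is an emulation under stated assumptions, not the change in PR #66: emulateShflXor and bitonicSortLanes are hypothetical names, and the emulation asserts where reading from a non-participating lane would give an undefined result on the device.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kWarpSize = 32;
using Warp = std::array<int, kWarpSize>;

// Host-side stand-in for __shfl_xor_sync: every lane named in `mask` reads
// the value held by lane (lane ^ laneMask). Lanes outside `mask` keep their
// own value.
Warp emulateShflXor(std::uint32_t mask, const Warp& in, int laneMask)
{
    Warp out = in;
    for (int lane = 0; lane < kWarpSize; ++lane) {
        const int partner = lane ^ laneMask;
        if ((mask >> lane) & 1u) {
            // The partner must participate too, otherwise the read is invalid.
            assert((mask >> partner) & 1u);
            out[lane] = in[partner];
        }
    }
    return out;
}

// Ascending bitonic sort of the first `n` lanes; n must be a power of two
// no larger than the warp size. Lanes at index n and above are untouched.
void bitonicSortLanes(Warp& v, int n)
{
    assert(n >= 1 && n <= kWarpSize && (n & (n - 1)) == 0);
    const std::uint32_t mask =
        (n == kWarpSize) ? 0xFFFFFFFFu : ((1u << n) - 1u);
    for (int k = 2; k <= n; k <<= 1) {          // size of the bitonic runs
        for (int j = k >> 1; j > 0; j >>= 1) {  // xor distance to the partner
            const Warp other = emulateShflXor(mask, v, j);
            for (int lane = 0; lane < n; ++lane) {
                const bool ascending = (lane & k) == 0;
                const bool isLower = (lane & j) == 0;
                const bool keepMin = (ascending == isLower);
                v[lane] = keepMin ? std::min(v[lane], other[lane])
                                  : std::max(v[lane], other[lane]);
            }
        }
    }
}
```

Because n is a power of two, the partner index lane ^ j always stays below n, so a partial warp of 16 lanes with mask 0xFFFF never reads from a lane outside the mask.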
Hmm, I did actually move a bad __syncthreads() out of a loop in s_desc_loop, so that may have fixed the other problem without actually looking for it.
@griwodz Thank you so much for fixing the problem! I have just tested both modifications in src/popsift/s_desc_loop.cu and src/popsift/common/warp_bitonic_sort.h, and they have indeed resolved the two issues I encountered.