Git Product home page Git Product logo

gpgpu-raytrace-the-rainbow's Introduction

Raytrace the rainbow on the GPU

My solution for the the Raytrace the rainbow on the GPU assignment, which is part of the GPGPU course at Eötvös Loránd University.

Prerequisities

The following dependencies are required:

Installation of these files can be done with the following command on Arch Linux:

sudo pacman -S --needed base-devel cuda cmake

How to build

GPU

Simply run build.sh if you are on Linux, the resulting executable will be located in the build directory, named gpgpu_raytrace_rainbow

CPU

Switch over to the cpu-cpp branch and copy over the CMakeLists.txt file or change the following lines in the main branch's CMakeLists.txt:

- 2 project(gpgpu_raytrace_rainbow CUDA)
+ 2 project(gpgpu_raytrace_rainbow)
...
- 8 add_executable(gpgpu_raytrace_rainbow rainbow.cu stb_image_write.h)
+ 8 add_executable(gpgpu_raytrace_rainbow rainbow.cpp stb_image_write.h)
...
- 12 set_target_properties(gpgpu_raytrace_rainbow PROPERTIES CUDA_SEPARABLE_COMPILATION ON)

Image display

Running the GPU implementation results in a picture called rainbow.png to be placed alongside the source files. This image is a slice of the simulated world at z = -3, with x having a range of -1.90 to -2.00 and y having a range of 1.9 to 2.1. The explanation of these specific values are, that given the sphere ((x - 2)^2 + (y + 2)^2 + (z - 1)^2 = 9, so a center of (2, -2, 1) and a radius of 3) and the initial light ((3, 2, -3) + t * (0, -1, 1), so an initial position of (3, 2, -3) with a direction vector of (0, -1, 1)) after refraction, reflection & refraction results in these specific values that intersect the z = -3 plane at a range from -1.94345 (Ultraviolet light) to -1.9854 (Red light).

For a more visual representation about the text, check out the following GeoGebra link: https://www.geogebra.org/m/awdhpswq

Benchmarks & Optimizations

The following table shows the time it takes on different hardware to calculate the refraction & reflection of 300 light vectors (380 nm to 680 nm)

Hardware Execution time Notes
CPU (Single Threaded) 835 μs g++, no debug symbols, simple for loop
GPU (Non-Vectorized) 26340 μs nvcc, no debug symbols, simple for loop
GPU (Vectorized) 86 μs nvcc, no debug symbols, parallel computation

As we can see, the non-vectorized GPU version is much slower than a simple CPU implementation, this is due to the fact that every for loop also calls a cudaMemcpy instruction, further slowing down the process.

Calculating the block size and grid size for the 300 vectors, therefore passing everything all data at once reduces the run time by about 99.7% (or in other words, 1/300th of the original runtime, about the same as the amount of data).

A further optimization is removing unnecessary calls to the normalize function, as per Khronos's recommendation, reflect (and refract) should use a normalized vector, if we normalize the vector in these functions, we introduce a huge overhead.

Tested On

The following generous people have aided me in testing that the program runs on a variety of different hardware configurations:

Person CPU GPU OS
Zoltán Balázs Intel Core i7-8700K Nvidia GeForce RTX 2080 8GB Arch Linux (6.3.5-arch1-1)
Dóra Gregorics AMD Ryzen 5 2600X Nvidia GeForce RTX 2060 6GB Windows 10 (22H2)
Márton Petes AMD Ryzen 5 3600 Nvidia GeForce GTX 1060 6GB NixOS 23.11 (6.3.4)
Márton Petes Intel Core i5-5200U Nvidia GeForce 840M 2GB NixOS 23.05 (6.0.10-zen2)

gpgpu-raytrace-the-rainbow's People

Contributors

zoltan-balazs avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.