ii) Refer to sumArraysOnGPU-timer.cu and set block.x = 256. Write a new kernel in which each thread handles two elements. Compare the results with those of other execution configurations.
To perform vector addition on host and device.
Hardware: PCs with an NVIDIA GPU and CUDA NVCC, or Google Colab with the NVCC compiler
- Initialize the device and set the device properties.
- Allocate memory on the host for input and output arrays.
- Initialize input arrays with random values on the host.
- Allocate memory on the device for input and output arrays, and copy input data from host to device.
- Launch a CUDA kernel to perform vector addition on the device.
- Copy output data from the device to the host and verify the results against the host's sequential vector addition.
- Free memory on the host and the device.
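The steps above, together with the two-elements-per-thread kernel the exercise asks for, can be sketched as follows. This is a minimal illustration and not the contents of sumArraysOnGPU-timer.cu: the kernel name `sumArraysTwoPerThread`, the `CHECK` macro, the array size of 1 << 24 elements, and the use of CUDA events for timing are all assumptions made for the example. Because each thread covers two elements, the grid needs only half as many blocks as a one-element-per-thread launch.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cmath>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err = (call);                                    \
        if (err != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s at %s:%d\n",             \
                    cudaGetErrorString(err), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)

// Each thread adds two elements: index i and index i + blockDim.x.
__global__ void sumArraysTwoPerThread(const float *a, const float *b,
                                      float *c, int n)
{
    int i = blockIdx.x * blockDim.x * 2 + threadIdx.x;
    int j = i + blockDim.x;
    if (i < n) c[i] = a[i] + b[i];
    if (j < n) c[j] = a[j] + b[j];
}

// Sequential reference used to verify the device result.
void sumArraysOnHost(const float *a, const float *b, float *c, int n)
{
    for (int i = 0; i < n; ++i) c[i] = a[i] + b[i];
}

int main()
{
    // Select the device and query its properties.
    int dev = 0;
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, dev));
    CHECK(cudaSetDevice(dev));
    printf("Using device %d: %s\n", dev, prop.name);

    const int n = 1 << 24;  // assumed problem size
    const size_t bytes = n * sizeof(float);

    // Allocate host arrays and fill the inputs with random values.
    float *hA = (float *)malloc(bytes), *hB = (float *)malloc(bytes);
    float *hRef = (float *)malloc(bytes), *hGpu = (float *)malloc(bytes);
    if (!hA || !hB || !hRef || !hGpu) { fprintf(stderr, "malloc failed\n"); return 1; }
    srand(0);
    for (int i = 0; i < n; ++i) {
        hA[i] = (rand() & 0xFF) / 10.0f;
        hB[i] = (rand() & 0xFF) / 10.0f;
    }

    // Allocate device arrays and copy the inputs to the device.
    float *dA, *dB, *dC;
    CHECK(cudaMalloc(&dA, bytes));
    CHECK(cudaMalloc(&dB, bytes));
    CHECK(cudaMalloc(&dC, bytes));
    CHECK(cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice));

    // block.x = 256; each block covers 2 * 256 elements.
    dim3 block(256);
    dim3 grid((n + block.x * 2 - 1) / (block.x * 2));

    // Time the kernel with CUDA events.
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));
    CHECK(cudaEventRecord(start));
    sumArraysTwoPerThread<<<grid, block>>>(dA, dB, dC, n);
    CHECK(cudaGetLastError());
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));
    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    printf("<<<%u, %u>>> elapsed: %f ms\n", grid.x, block.x, ms);

    // Copy the result back and verify it against the host computation.
    CHECK(cudaMemcpy(hGpu, dC, bytes, cudaMemcpyDeviceToHost));
    sumArraysOnHost(hA, hB, hRef, n);
    bool match = true;
    for (int i = 0; i < n; ++i) {
        if (fabsf(hRef[i] - hGpu[i]) > 1e-5f) { match = false; break; }
    }
    printf("%s\n", match ? "Arrays match." : "Arrays do not match!");

    // Free device and host memory.
    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(dA)); CHECK(cudaFree(dB)); CHECK(cudaFree(dC));
    free(hA); free(hB); free(hRef); free(hGpu);
    return match ? 0 : 1;
}
```

To compare execution configurations, one option is to rebuild with a different value in `dim3 block(...)` (for example 128, 512, or 1024) and note the reported time for each run. Compiling with something like `nvcc -o sumArrays sumArrays.cu` should be sufficient, where the file name is only a placeholder.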
TYPE YOUR CODE HERE
SHOW YOUR OUTPUT HERE
Thus, the summation of arrays on the host and the device was implemented in CUDA with nvcc, using randomly initialized input arrays.