exercises-solutions's People

Contributors

agitoz, benelgar, bmerry, gallagth, jrprice, raoulchartreuse, simonmcs, tomdeakin

exercises-solutions's Issues

Can we make it easier to use Mac OSX?

For the Exercises and Solutions, it doesn't take much to get them all compiled and running on a Mac. All we have to do is modify two lines in the Makefiles from something that looks like this:

CCFLAGS=-O3 -lm -std=c99 -ffast-math

LIBS = -fopenmp -lOpenCL

To:

CCFLAGS=-O3 -lm -std=c99 -ffast-math -DAPPLE

LIBS = -fopenmp -framework OpenCL

There are two main ways we could do this:

  1. Use a condition inside the Makefile itself that looks for APPLE

  2. Use a make.def which we modify for each platform.

For previous versions of the course we used 2) with great effect, and I still have make.def files for Nvidia, AMD, Intel and Mac OSX.

C helper function has incomplete list of error numbers

The C helper function we provide, int err_code (cl_int err_in) in err_code.c, has an incomplete list of error codes it will recognise. In particular, it doesn't know about CL_DEVICE_NOT_FOUND, which is quite an important one.

This has already bitten me: one solution expected a GPU, but my MacBook Air doesn't expose one.

It would be worth updating the list in err_code() against the latest OpenCL v1.1 header file and making it a complete set.

In fact, a simple script that would take the appropriate chunk from cl.h and turn it into err_code() would be useful as we migrate this to support v1.2 and 2.0 etc.
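As a rough illustration of what such a generated function could look like, here is a minimal table-driven sketch. The names err_name, err_entry and ERR_ENTRY are hypothetical and not part of the existing helper. The numeric values are copied from the OpenCL 1.1 cl.h so that the sketch builds without the OpenCL headers; a real version would take the constants from cl.h and cover the complete list rather than this subset.

```c
#include <stddef.h>

/* One row per error code. The values mirror the OpenCL 1.1 cl.h and are
 * written out by hand here (assumption) so no OpenCL header is needed. */
struct err_entry {
    int code;
    const char *name;
};

/* Hypothetical helper macro: stringifies the constant's name. */
#define ERR_ENTRY(name, value) { (value), #name }

static const struct err_entry err_table[] = {
    ERR_ENTRY(CL_SUCCESS, 0),
    ERR_ENTRY(CL_DEVICE_NOT_FOUND, -1),
    ERR_ENTRY(CL_DEVICE_NOT_AVAILABLE, -2),
    ERR_ENTRY(CL_COMPILER_NOT_AVAILABLE, -3),
    ERR_ENTRY(CL_MEM_OBJECT_ALLOCATION_FAILURE, -4),
    ERR_ENTRY(CL_OUT_OF_RESOURCES, -5),
    ERR_ENTRY(CL_OUT_OF_HOST_MEMORY, -6),
    ERR_ENTRY(CL_BUILD_PROGRAM_FAILURE, -11),
    ERR_ENTRY(CL_INVALID_VALUE, -30),
    ERR_ENTRY(CL_INVALID_DEVICE_TYPE, -31),
    ERR_ENTRY(CL_INVALID_PLATFORM, -32),
    ERR_ENTRY(CL_INVALID_DEVICE, -33),
    ERR_ENTRY(CL_INVALID_CONTEXT, -34),
    ERR_ENTRY(CL_INVALID_WORK_GROUP_SIZE, -54)
};

/* Look up the symbolic name for an error code; unknown codes get a
 * fixed fallback string instead of being silently ignored. */
const char *err_name(int err_in)
{
    size_t n = sizeof err_table / sizeof err_table[0];
    for (size_t i = 0; i < n; i++) {
        if (err_table[i].code == err_in)
            return err_table[i].name;
    }
    return "UNKNOWN_OPENCL_ERROR";
}
```

With a table like this, a script only has to emit one ERR_ENTRY line per constant it finds in the header, which should make the move to v1.2 and 2.0 mostly mechanical.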

Header file dependencies are missing from some Makefiles

I just noticed that not all the dependencies on header files are correctly captured in the Makefiles. This can lead to erroneous behaviour when recompiling, because a changed header does not trigger a rebuild. The matrix multiply examples and solutions are one specific set that suffers from this bug.

Solution for Exercise06 assumes a GPU in C, but anything in Python

I have just been trying the solutions on my Apple MacBook Air. After changing the Makefiles to use -framework OpenCL and -DAPPLE, they compile OK. But the C code assumes it will find a GPU in the following code:

// Set up OpenCL context. queue, kernel, etc.
cl_uint numPlatforms;
// Find number of platforms
err = clGetPlatformIDs(0, NULL, &numPlatforms);
if (err != CL_SUCCESS || numPlatforms <= 0)
{
    printf("Error: Failed to find a platform!\n",err_code(err));
    return EXIT_FAILURE;
}
// Get all platforms
cl_platform_id Platform[numPlatforms];
err = clGetPlatformIDs(numPlatforms, Platform, NULL);
if (err != CL_SUCCESS || numPlatforms <= 0)
{
    printf("Error: Failed to get the platform!\n",err_code(err));
    return EXIT_FAILURE;
}
// Secure a device
for (int i = 0; i < numPlatforms; i++)
{
    err = clGetDeviceIDs(Platform[i], DEVICE, 1, &device_id, NULL);
    if (err == CL_SUCCESS)
        break;
}
if (device_id == NULL)
{
    printf("Error: Failed to create a device group!\n",err_code(err));
    return EXIT_FAILURE;
}

DEVICE is defined in matmul.h to be CL_DEVICE_TYPE_GPU.

This means the program exits with "Error: Failed to create a device group!".

The Python solution, by contrast, accepts any valid OpenCL device.

So, what do we want this to do? Make it CL_DEVICE_TYPE_DEFAULT in the C code?

Differences between Exercise06 solution in slides and code

In what is now slide 82, we list the solution for Exercise06, where the student should have written their own kernel for the first time by converting the sequential C code into a simple matrix multiply kernel.

The solution in the slides has a body that looks like this:

{
    int k;
    int i = get_global_id(0);
    int j = get_global_id(1);
    float tmp = 0.0f;
    for (k = 0; k < Pdim; k++) {
        tmp += A[i*Ndim+k] * B[k*Pdim+j];
    }
    C[i*Ndim+j] += tmp;
}

Whereas in the sequential C code solution provided in source form inside matrix_lib.c, its body looks like this:

for (i=0; i<Ndim; i++){
    for (j=0; j<Mdim; j++){
        tmp = 0.0;
        for(k=0;k<Pdim;k++){
             /* C(i,j) = sum(over k) A(i,k) * B(k,j) */
             tmp += *(A+(i*Ndim+k)) *  *(B+(k*Pdim+j));
         }
         *(C+(i*Ndim+j)) = tmp;
      }
}

This is a very different style of array addressing and could confuse the students. We should change the sequential C code inside matrix_lib.c in both the Exercise and the Solution so that the body looks like this:

for (i=0; i<Ndim; i++) {
    for (j=0; j<Mdim; j++) {
        tmp = 0.0f;
        for (k=0; k<Pdim; k++) {
             /* C(i,j) = sum(over k) A(i,k) * B(k,j) */
             tmp += A[i*Ndim+k] * B[k*Pdim+j];
         }
         C[i*Ndim+j] += tmp;
      }
}

Notice I've also added a few spaces inside the "for" statements, and changed the initial value of tmp from 0.0 to 0.0f (just good practice!).

Note that in the actual OpenCL kernel solution for Exercise06 the code is as we would want it, i.e. consistent with the above, except that the 0.0 also needs to be changed to 0.0f.
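To make the indexing concrete, here is a small sketch of the sequential multiply in plain array notation. It assumes row-major storage with A of size Ndim x Pdim, B of size Pdim x Mdim and C of size Ndim x Mdim, which the issue does not state. Under that assumption the row strides are Pdim, Mdim and Mdim respectively, and they coincide with the strides in the listings above only when the matrices are square. The names seq_mat_mul_sketch and seq_mat_mul_selfcheck are hypothetical. The sketch assigns to C, as the original matrix_lib.c does, rather than accumulating into it.

```c
/* Sequential C = A * B with array indexing.
 * Assumed layout (row-major): A is Ndim x Pdim, B is Pdim x Mdim,
 * C is Ndim x Mdim. */
void seq_mat_mul_sketch(int Ndim, int Mdim, int Pdim,
                        const float *A, const float *B, float *C)
{
    for (int i = 0; i < Ndim; i++) {
        for (int j = 0; j < Mdim; j++) {
            float tmp = 0.0f;
            for (int k = 0; k < Pdim; k++) {
                /* C(i,j) = sum(over k) A(i,k) * B(k,j) */
                tmp += A[i*Pdim + k] * B[k*Mdim + j];
            }
            C[i*Mdim + j] = tmp;
        }
    }
}

/* Tiny non-square check: a 2x3 matrix times a 3x2 matrix.
 * Returns 1 if the product matches the hand-computed result. */
int seq_mat_mul_selfcheck(void)
{
    const float A[6] = {1, 2, 3, 4, 5, 6};     /* 2 x 3 */
    const float B[6] = {7, 8, 9, 10, 11, 12};  /* 3 x 2 */
    float C[4] = {-1, -1, -1, -1};             /* 2 x 2, pre-filled with junk */

    seq_mat_mul_sketch(2, 2, 3, A, B, C);
    return C[0] == 58.0f && C[1] == 64.0f &&
           C[2] == 139.0f && C[3] == 154.0f;
}
```

Because C is pre-filled with junk in the check, a version that used += instead of = would fail it unless the caller zeroed C first.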

Inconsistent use of timers

In the C++ code, some examples (matmul) use wtime() and some examples (pi_ocl) use the util::Timer.

They should probably all use the same timer.

Python solution of Exercise08: problem with the C_block_form.cl part

I had no problem launching most of the examples, but with the solution to exercise 9 I get this error message:
===== OpenCL, A and B in block form in local memory, order 1024 ======

Traceback (most recent call last):
  File "matmul.py", line 187, in <module>
    d_a, d_b, d_c, localmem1, localmem2)
  File "/usr/local/lib/python2.7/dist-packages/pyopencl/__init__.py", line 466, in kernel_call
    global_offset, wait_for, g_times_l=g_times_l)
pyopencl.LogicError: clEnqueueNDRangeKernel failed: invalid work group size
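The issue does not identify the cause. In OpenCL 1.x this error is returned when an explicit local size does not evenly divide the global size, or when the work-group is larger than the device or kernel allows. The sketch below shows those two checks for a 2D range. The function names are hypothetical, and max_wg stands for whatever limit the runtime reports (for example CL_DEVICE_MAX_WORK_GROUP_SIZE or CL_KERNEL_WORK_GROUP_SIZE).

```c
#include <stddef.h>

/* Return 1 if a 2D local size (lx, ly) is acceptable for the global
 * size (gx, gy) under OpenCL 1.x rules, 0 otherwise. max_wg is the
 * maximum number of work-items per work-group. */
int local_size_ok_2d(size_t gx, size_t gy, size_t lx, size_t ly,
                     size_t max_wg)
{
    if (lx == 0 || ly == 0)
        return 0;
    /* Each global dimension must be a multiple of the local one. */
    if (gx % lx != 0 || gy % ly != 0)
        return 0;
    /* lx * ly must not exceed max_wg; written as a division so the
     * product cannot overflow. */
    if (lx > max_wg / ly)
        return 0;
    return 1;
}

/* Round a global size up to the next multiple of the local size.
 * The kernel then needs a bounds check for the padded work-items. */
size_t round_up_global(size_t global, size_t local)
{
    size_t rem = global % local;
    return rem == 0 ? global : global + (local - rem);
}
```

For an order-1024 multiply with 16 x 16 blocks, the check passes on a device that allows 256 work-items per group and fails on one that allows fewer, which would be one possible explanation for the failure above.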

Exercise 06 Python solution code takes too long on the host

When running the Python solution code on Blue Crystal, the initial CPU code is so slow, it feels like it's hung. For example, on my Nehalem test machine (a GPU node in Blue Crystal phase 1), I get:

===== Sequential, matrix mult (dot prod), order 1024 on host CPU ======

1256.22704506 seconds at 1.70947095626 MFLOPS

20 minutes is a long time to wait, especially when the C version only takes about 10 seconds on the same machine:

===== Sequential, matrix mult (dot prod), order 1024 on host CPU ======
10.31 seconds at 208.2 MFLOPS

I think this is too long to wait; users will think something is wrong.

Either we need to make the Python faster on the CPU, or leave the CPU version commented out by default!

Gameoflife example won't build on Mac OS X

If you use a recent Xcode on Mac OS X, it won't build gameoflife from Exercise13:

cc gameoflife.c -O3 -std=c99 -o gameoflife
gameoflife.c:102:5: error: second parameter of 'main' (argument array) must be of type 'char **'
int main(int argc, void **argv)
^

This is with the following version of the tools:

$ cc --version
Apple clang version 4.1 (tags/Apple/clang-421.11.66) (based on LLVM 3.1svn)
Target: x86_64-apple-darwin12.5.0
Thread model: posix

The fix is obvious: declare the second parameter of main as char **argv.

Exercise 6 C and Cpp solutions are incorrect

If I run make ; ./mult in Solutions/Exercise06/C or Solutions/Exercise06/Cpp I get the following output:

===== Sequential, matrix mult (dot prod), order 1024 on host CPU ======
 7.67 seconds at 279.9 MFLOPS 

===== OpenCL, matrix mult, C(i,j) per work item, order 1024 ======
 5.01 seconds at 428.9 MFLOPS 

 Errors in multiplication: 168394460495872.000000

This is the output from the C executable, although the Cpp one gives similar results.

Am I correct in thinking that the error should be somewhat smaller?
Is this a known bug?

I'm running OS X 10.9.5, Core i7, Intel HD Graphics 4000, NVIDIA GeForce GT 650M 1024 MB.
I believe the OpenCL kernel runs on the GeForce in these examples.

Warning with float initialization

When float variables are initialized with a double-precision constant, I get OpenCL warnings like this:
"/tmp/OCL9395T11.cl", line 19: warning: double-precision constant is
represented as single-precision constant because double is not
enabled
tmp = 0.0;
^

warn(text, CompilerWarning)

Changing every 0.0 to 0.0f will remove this kind of warning.
I know this is a very minor issue. Would you like me to make the change?

Exercise 06 matrix mul doesn't report error

The exercise 6 code with the kernel deleted doesn't report an error when the buffer is read back. As no kernel is running, the buffer shouldn't contain the correct result!

Python improvements

Suggestions from Andreas Kloeckner, creator of PyOpenCL:

From a brief look at the slides, the only feedback would be that

prg.kernel(...)

reexecutes clCreateKernel() on every launch, so storing a reference to
the kernel may be more efficient. In addition, the issue of having to
cast arguments to numpy types can be alleviated by

http://documen.tician.de/pyopencl/runtime.html#pyopencl.Kernel.set_scalar_arg_dtypes

I'm not suggesting that you include this information (it might well be
that you left it out on purpose), I'm just trying to make sure you're
aware of it. :)

Tests for Game of Life

Saved as a TODO.

We should include a sanity checking test suite for the game of life - provide some simple inputs and outputs and check the final states. This will be helpful when completing the exercise too.
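A minimal sketch of what one such test could look like, using a blinker, which should flip between a vertical and a horizontal bar and return to its start state after two generations. Everything here is an assumption for illustration: the 5 x 5 grid, the wrap-around edges, the one-byte-per-cell layout, and the names gol_step and gol_check. In the real suite gol_step would be replaced by the exercise's own update routine.

```c
#include <string.h>

#define GOL_W 5
#define GOL_H 5
#define GOL_CELLS (GOL_W * GOL_H)

/* Vertical blinker and the horizontal phase it becomes after one step. */
static const unsigned char GOL_BLINKER_V[GOL_CELLS] = {
    0,0,0,0,0,
    0,0,1,0,0,
    0,0,1,0,0,
    0,0,1,0,0,
    0,0,0,0,0
};
static const unsigned char GOL_BLINKER_H[GOL_CELLS] = {
    0,0,0,0,0,
    0,0,0,0,0,
    0,1,1,1,0,
    0,0,0,0,0,
    0,0,0,0,0
};

/* Count live neighbours of (x, y), wrapping at the edges (assumption). */
static int gol_neighbours(const unsigned char *g, int x, int y)
{
    int n = 0;
    for (int dy = -1; dy <= 1; dy++) {
        for (int dx = -1; dx <= 1; dx++) {
            if (dx == 0 && dy == 0)
                continue;
            int nx = (x + dx + GOL_W) % GOL_W;
            int ny = (y + dy + GOL_H) % GOL_H;
            n += g[ny * GOL_W + nx];
        }
    }
    return n;
}

/* Reference step with the standard rules: a cell is alive in the next
 * generation if it has 3 live neighbours, or is alive and has 2. */
void gol_step(const unsigned char *in, unsigned char *out)
{
    for (int y = 0; y < GOL_H; y++) {
        for (int x = 0; x < GOL_W; x++) {
            int n = gol_neighbours(in, x, y);
            int alive = in[y * GOL_W + x];
            out[y * GOL_W + x] = (unsigned char)(n == 3 || (alive && n == 2));
        }
    }
}

/* Run `steps` generations from `start`; return 1 if the final state
 * equals `expected`, 0 otherwise. */
int gol_check(const unsigned char *start, const unsigned char *expected,
              int steps)
{
    unsigned char a[GOL_CELLS], b[GOL_CELLS];
    memcpy(a, start, GOL_CELLS);
    for (int s = 0; s < steps; s++) {
        gol_step(a, b);
        memcpy(a, b, GOL_CELLS);
    }
    return memcmp(a, expected, GOL_CELLS) == 0;
}
```

The same gol_check call works for other small known patterns, such as a 2 x 2 block that should never change.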

Change compiler for OS X

Saved as TODO.

  1. You need to add '-stdlib=libc++' to the compiler flags when building for C++11 with clang++, so I guess this should be added to the Makefiles for OS X.
  2. I was building with 'make CPPC=clang++', but I guess you could also have the Makefiles automatically select clang++ if on OS X
  3. Clang doesn't support OpenMP, so this is a problem for Exercise06

Problem with -fopenmp flag not being recognised with Xcode 5

Apple has moved to using the LLVM framework by default for Xcode 5, from gcc in previous versions of Xcode. The new Xcode doesn't have support for OpenMP by default, and so the -fopenmp flag in many of the Makefiles causes an error on Mac OS X.

This flag isn't strictly needed, so the fix is simply to remove it, especially on Mac OS X platforms.

C++ timer not working?

The timer in Cpp_common/util.cpp might not work on some Apple systems. Trying to use the C++ timer program on some Mac OS X laptops can give absurd times (186302452924.23423 seconds for a simple vadd, for example).
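The cause is not identified in the issue. One way to rule out the clock source is to time the same code with a plain monotonic wall-clock reading, as in the sketch below. It assumes a platform that provides POSIX clock_gettime with CLOCK_MONOTONIC (older OS X releases did not), and the names wtime_sketch and elapsed_since_sketch are hypothetical, not the existing wtime() or util::Timer.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* expose clock_gettime in strict modes */
#endif
#include <time.h>

/* Seconds from an arbitrary fixed point, read from the monotonic clock.
 * Returns -1.0 if the clock cannot be read. */
double wtime_sketch(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1.0;
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1.0e-9;
}

/* Seconds elapsed since an earlier wtime_sketch() reading. */
double elapsed_since_sketch(double start)
{
    return wtime_sketch() - start;
}
```

Only the difference between two readings is meaningful, so a timer built this way should always subtract a start value taken from the same clock.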
