Comments (11)

tromp commented on August 28, 2024

I'm not sure how to interpret the output above.
Are the 9 lines from 9 different contexts?
Are you giving each one a different graph to work on (by varying the arguments to run_solver()) ?
It's also puzzling why the trim took 0 ms in each case, and why there are always 46 edges left.
Where is your code calling run_solver() ?

from cuckoo.

tianchaijz commented on August 28, 2024

Hi John,

Thanks for your reply.

I just created a simple function named find_solution:

 CALL_CONVENTION int find_solution(SolverCtx* ctx,
                                   char* header,
                                   int header_length,
                                   SolverSolutions *solutions)
 {
   if (ctx == NULL || !ctx->trimmer.initsuccess){
     print_log("Error initialising trimmer. Aborting.\n");
     print_log("Reason: %s\n", LAST_ERROR_REASON);
     return -__LINE__;
   }

   ctx->setheadernonce(header, header_length, 0);
   u32 nsols = ctx->solve();

   for (unsigned s = 0; s < nsols; s++) {
     print_log("Solution");
     u32* prf = &ctx->sols[s * PROOFSIZE];
     if (solutions != NULL){
       solutions->edge_bits = EDGEBITS;
       solutions->num_sols++;
       for (u32 i = 0; i < PROOFSIZE; i++)
         solutions->sols[s].proof[i] = (u64) prf[i];
     }

     int pow_rc = verify(prf, &ctx->trimmer.sipkeys);
     if (pow_rc != POW_OK) {
         return -__LINE__;
     }
   }

   return nsols;
 }

I give each context a different header and call the solver this way:

var solverSols SolverSolutions
n := C.find_solution(ctx, header, headerLen, (*C.SolverSolutions)(unsafe.Pointer(&solverSols)))
if n > 0 {
    sols := make([]uint32, 42)
    for i := 0; i < int(n); i++ {
        for j := 0; j < 42; j++ {
            sols[j] = uint32(solverSols.sols[i].proof[j])
        }
    }
} else if n < 0 {
    return fmt.Errorf("cgo return code: %d", int(n))
}
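
Note that the loop above reuses a single sols slice, so only the last proof survives when several solutions are returned. A minimal sketch of keeping every proof in its own slice follows. It uses a plain Go stand-in for the C struct so that it runs without cgo; the names solverSolutions, solution, maxSols and collectProofs are hypothetical and only mirror the fields used in the snippet above.

```go
package main

import "fmt"

const (
	proofSize = 42 // cycle length, matching the literal 42 in the snippet above
	maxSols   = 4  // hypothetical capacity of the solutions array
)

// solution and solverSolutions are stand-ins for the C structs (assumed layout).
type solution struct {
	proof [proofSize]uint64
}

type solverSolutions struct {
	edgeBits uint32
	numSols  uint32
	sols     [maxSols]solution
}

// collectProofs copies the first n proofs into separate slices so that
// a later solution does not overwrite an earlier one.
func collectProofs(s *solverSolutions, n int) [][]uint32 {
	if n < 0 {
		n = 0
	}
	if n > maxSols {
		n = maxSols
	}
	out := make([][]uint32, 0, n)
	for i := 0; i < n; i++ {
		p := make([]uint32, proofSize)
		for j := range p {
			p[j] = uint32(s.sols[i].proof[j])
		}
		out = append(out, p)
	}
	return out
}

func main() {
	var s solverSolutions
	s.sols[0].proof[0] = 11
	s.sols[1].proof[0] = 22
	for i, p := range collectProofs(&s, 2) {
		fmt.Println("solution", i, "first edge", p[0])
	}
}
```

In the real program the argument would be the struct filled in by find_solution, with n being its (positive) return value.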

tianchaijz commented on August 28, 2024

The full log context:

 GPU1 finding solution
 GPU4 finding solution
 GPU2 finding solution
 GPU0 finding solution
 GPU1 finding solution
 GPU4 finding solution
 GPU2 finding solution
 GPU0 finding solution
 eted in 63 + 46 ms
 Seeding completed in 63 + 46 ms
 Seeding completed in 63 + 46 ms
 Seeding completed in 63 + 46 ms
 Seeding completed in 63 + 46 ms
    2-cycle found
    8-cycle found
   54-cycle found
 findcycles edges 63077 time 56 ms total 390 ms
   12-cycle found
    4-cycle found
   20-cycle found
    4-cycle found
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=RgAAACQPgXA=
    6-cycle found
  100-cycle found
  406-cycle found
  594-cycle found
  340-cycle found
 findcycles edges 71378 time 49 ms total 386 ms
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=RwAAACQPgXA=
   38-cycle found
   10-cycle found
   66-cycle found
   12-cycle found
 2466-cycle found
   70-cycle found
 findcycles edges 74681 time 58 ms total 383 ms
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=QwAAACQPgXA=
  232-cycle found
  306-cycle found
 findcycles edges 61895 time 53 ms total 378 ms
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=RAAAACQPgXA=
   20-cycle found
   74-cycle found
   28-cycle found
  332-cycle found
 findcycles edges 63438 time 51 ms total 379 ms
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=SAAAACQPgXA=
   38-cycle found
 findcycles edges 64257 time 74 ms total 400 ms
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=RQAAACQPgXA=
 findcycles edges 77 time 2 ms total 106 ms
 findcycles edges 77 time 2 ms total 137 ms
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=TAAAACQPgXA=
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=TgAAACQPgXA=
 findcycles edges 77 time 1 ms total 189 ms
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=SwAAACQPgXA=
 findcycles edges 77 time 1 ms total 280 ms
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=SQAAACQPgXA=
 findcycles edges 77 time 0 ms total 365 ms
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=TQAAACQPgXA=
 findcycles edges 77 time 3 ms total 432 ms
 findcycles edges 77 time 4 ms total 319 ms
 findcycles edges 46 time 4 ms total 234 ms
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=SgAAACQPgXA=
 findcycles edges 77 time 4 ms total 319 ms
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=UQAAACQPgXA=
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=VAAAACQPgXA=
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=UgAAACQPgXA=
 findcycles edges 46 time 2 ms total 2 ms
 findcycles edges 46 time 2 ms total 2 ms
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=VwAAACQPgXA=
 setheader: sKrQQF/FfgmOD8PO0OeVJSNq0ssZM7xJ/ANzAPWt9sQ=UAAAACQPgXA=

tromp commented on August 28, 2024

The earlier runs with edges in the tens of thousands look fine, but the ones with fewer than 100 edges are wrong and spent 0 ms on trimming. Perhaps you can add more diagnostic output to the trimming routine to figure out why they skipped nearly all the work. For instance, SeedA should compute siphashes for 2^29 nodes, which necessarily takes a lot of time.
I think with all data encapsulated in the SolverCtx, the solver should be thread safe.

tianchaijz commented on August 28, 2024

Thanks John!

I'm still debugging and will let you know if the problem gets solved!

tianchaijz commented on August 28, 2024

It seems that the edgetrimmer's dt pointer points to an invalid memory area.

651       int solve() {
(gdb) n
653         auto time0 = std::chrono::high_resolution_clock::now();
(gdb) n
655         trimmer.abort = false;
(gdb) n
656         u32 nedges = trimmer.trim();
(gdb) s
edgetrimmer::trim (this=0x7f3510000c00) at mean.cu:416
416       u32 trim() {
(gdb) n
417         cudaMemcpy(dt, this, sizeof(edgetrimmer), cudaMemcpyHostToDevice);
(gdb) p *dt
Cannot access memory at address 0x7f34cf400000
(gdb) p *this
$1 = {tp = {expand = 0, ntrims = 176, genA = {blocks = 4096, tpb = 256}, genB = {blocks = 4096,
      tpb = 128}, trim = {blocks = 4096, tpb = 512}, tail = {blocks = 4096, tpb = 1024}, recover = {
      blocks = 1024, tpb = 1024}}, dt = 0x7f34cf400000, sizeA = 4462739456, sizeB = 2852126720,
  indexesSize = 16384, bufferA = 0x7f331a600000, bufferB = 0x7f3424600000,
  bufferAB = 0x7f33c4600000, indexesE = 0x7f34cec00000, indexesE2 = 0x7f34cec04000, hostA = {72644,
    0 <repeats 4095 times>}, uvnodes = 0x7f34cf000000, sipkeys = {k0 = 3288269178733044074,
    k1 = 2872931202106657597, k2 = 11375450941595734051, k3 = 5158531013351406122},
  dipkeys = 0x7f34cf000200, abort = false, initsuccess = true}
(gdb) n
420         checkCudaErrors(cudaEventCreate(&startall)); checkCudaErrors(cudaEventCreate(&stopall));
(gdb) p *dt
Cannot access memory at address 0x7f34cf400000

I'm still debugging ...

tromp commented on August 28, 2024

tianchaijz commented on August 28, 2024

Hi John,

Still not solved, which is really weird. I'll try a newer NVIDIA driver ...

tianchaijz commented on August 28, 2024

Hi John,

This problem is solved.

The cause was that a goroutine may be scheduled onto different OS threads over its lifetime, so successive calls into the solver could run on different threads. After I added runtime.LockOSThread(), my program has been running for hours and everything looks fine.
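
A minimal sketch of this pattern follows, assuming one long-lived goroutine per GPU that is pinned to its OS thread and runs every solver call itself. The gpuWorker type and its startWorker, run and stop methods are hypothetical names, and the cgo call is replaced by a placeholder so the example runs on its own. One plausible explanation, not confirmed in this thread, is that the CUDA runtime tracks the current device per host thread, so a call arriving on a different thread could see device pointers such as dt as invalid.

```go
package main

import (
	"fmt"
	"runtime"
)

// gpuWorker (hypothetical) owns one OS thread for its whole lifetime.
// Every job sent to it runs on that same thread.
type gpuWorker struct {
	jobs chan func()
	done chan struct{}
}

// startWorker launches a goroutine and pins it to its current OS thread
// before it processes any job.
func startWorker() *gpuWorker {
	w := &gpuWorker{jobs: make(chan func()), done: make(chan struct{})}
	go func() {
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()
		defer close(w.done)
		for job := range w.jobs {
			job()
		}
	}()
	return w
}

// run executes fn on the pinned thread and waits for its result.
func (w *gpuWorker) run(fn func() int) int {
	res := make(chan int, 1)
	w.jobs <- func() { res <- fn() }
	return <-res
}

// stop closes the job queue and waits for the worker goroutine to exit.
func (w *gpuWorker) stop() {
	close(w.jobs)
	<-w.done
}

func main() {
	w := startWorker()
	defer w.stop()

	n := w.run(func() int {
		// Placeholder for the real cgo call, for example
		// int(C.find_solution(ctx, header, headerLen, ...)).
		return 0
	})
	fmt.Println("solutions:", n)
}
```

Under this assumption, the solver context should also be created inside the worker (through run), so that initialisation and solving share one thread.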

tromp commented on August 28, 2024

I still don't understand how running on different OS threads explains the faulty behaviour.
But I'm happy to hear you found a solution!

tianchaijz commented on August 28, 2024

Thank you again, John!
