
Comments (6)

chrisrichardson commented on August 16, 2024

This is due to the serial implementation using the same code as in parallel. For example, it calculates the dual graph for partitioning the mesh (not needed in serial).
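For readers unfamiliar with the term, the dual graph of a mesh has one node per cell and an edge between two cells whenever they share a facet. The following is a minimal pure-Python sketch of that idea for simplicial cells; the function name `dual_graph` is made up for illustration, and this is not how DOLFINx computes it.

```python
from collections import defaultdict
from itertools import combinations


def dual_graph(cells):
    """Return {cell index: sorted neighbour indices} for a simplicial mesh.

    Two cells are neighbours when they share a facet, i.e. all but one
    of their vertices.
    """
    facet_to_cells = defaultdict(list)
    for c, vertices in enumerate(cells):
        # Each facet of a simplex omits exactly one vertex
        for facet in combinations(sorted(vertices), len(vertices) - 1):
            facet_to_cells[facet].append(c)

    graph = {c: set() for c in range(len(cells))}
    for sharing in facet_to_cells.values():
        # Connect every pair of cells attached to the same facet
        for a, b in combinations(sharing, 2):
            graph[a].add(b)
            graph[b].add(a)
    return {c: sorted(neighbours) for c, neighbours in graph.items()}


# A square split into two triangles along the diagonal between vertices 0 and 2
print(dual_graph([(0, 1, 2), (0, 2, 3)]))
```

Even in this toy form, every facet of every cell has to be visited, which hints at why the step is noticeable on a 100 x 100 x 100 mesh.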

from dolfinx.

jorgensd commented on August 16, 2024

Confirming that this is still an issue.

DOLFINx:

time python3 -c "import dolfinx; from mpi4py import MPI; dolfinx.UnitCubeMesh(MPI.COMM_WORLD, 100, 100, 100)"

real	0m11.179s
user	0m9.546s
sys	0m1.611s

DOLFIN:

fenics@f596c6c6e934:/root/shared$ time python3 -c"from dolfin import *; UnitCubeMesh(MPI.comm_world, 100, 100, 100)"

real	0m1.046s
user	0m0.907s
sys	0m0.140s


IgorBaratta commented on August 16, 2024

To avoid computing the dual graph in serial we can call a custom partitioner, which sets the destination of all cells to process 0 (in serial):

from mpi4py import MPI
import dolfinx
import numpy


def serial_partitioner(mpi_comm, nparts, tdim, cells, ghost_mode):
    dest = numpy.zeros(cells.num_nodes, dtype=numpy.int32)
    return dolfinx.cpp.graph.AdjacencyList_int32(dest)


mesh = dolfinx.UnitCubeMesh(
    MPI.COMM_WORLD, 100, 100, 100, partitioner=serial_partitioner)

dolfinx.list_timings(MPI.COMM_WORLD, [dolfinx.TimingType.wall])

with custom partitioner:

real 0m6.091s
user 0m4.249s
sys 0m2.458s

[MPI_AVG] Summary of timings                                   |  reps  wall avg  wall tot
------------------------------------------------------------------------------------------
Build BoxMesh                                                  |     1  4.254135  4.254135
Build dofmap data                                              |     1  1.060930  1.060930
Compute SCOTCH graph re-ordering                               |     1  0.141512  0.141512
Compute dof reordering map                                     |     1  0.666798  0.666798
Compute local-to-local map                                     |     1  0.068058  0.068058
Compute-local-to-global links for global/local adjacency list  |     1  0.042887  0.042887
Distribute in graph creation AdjacencyList                     |     1  0.509226  0.509226
Fetch float data from remote processes                         |     1  0.029057  0.029057
Init dofmap from element dofmap                                |     1  0.343332  0.343332
SCOTCH: call SCOTCH_graphBuild                                 |     1  0.000490  0.000490
SCOTCH: call SCOTCH_graphOrder                                 |     1  0.121626  0.121626
TOPOLOGY: Create sets                                          |     1  0.735610  0.735610

with standard partitioner:

real 0m9.116s
user 0m6.854s
sys 0m2.900s

[MPI_AVG] Summary of timings                                   |  reps  wall avg  wall tot
------------------------------------------------------------------------------------------
Build BoxMesh                                                  |     1  7.280152  7.280152
Build dofmap data                                              |     1  1.074936  1.074936
Compute SCOTCH graph re-ordering                               |     1  0.140453  0.140453
Compute dof reordering map                                     |     1  0.675271  0.675271
Compute graph partition (SCOTCH)                               |     1  0.338212  0.338212
Compute local part of mesh dual graph                          |     1  2.617081  2.617081
Compute local-to-local map                                     |     1  0.069505  0.069505
Compute non-local part of mesh dual graph                      |     1  0.047709  0.047709
Compute-local-to-global links for global/local adjacency list  |     1  0.044120  0.044120
Distribute in graph creation AdjacencyList                     |     1  0.515640  0.515640
Extract partition boundaries from SCOTCH graph                 |     1  0.029006  0.029006
Fetch float data from remote processes                         |     1  0.032896  0.032896
Get SCOTCH graph data                                          |     1  0.000000  0.000000
Init dofmap from element dofmap                                |     1  0.348799  0.348799
SCOTCH: call SCOTCH_dgraphBuild                                |     1  0.003080  0.003080
SCOTCH: call SCOTCH_dgraphHalo                                 |     1  0.035761  0.035761
SCOTCH: call SCOTCH_dgraphPart                                 |     1  0.190264  0.190264
SCOTCH: call SCOTCH_graphBuild                                 |     1  0.000497  0.000497
SCOTCH: call SCOTCH_graphOrder                                 |     1  0.120793  0.120793
TOPOLOGY: Create sets                                          |     1  0.739001  0.739001
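As a rough reading aid, the two tables can be compared with a few lines of arithmetic. The numbers below are copied from the tables above; the helper `saving` is only illustrative. The timers may be nested inside one another, so the sketch compares only `real`, `Build BoxMesh`, and the local dual graph entry rather than summing every row.

```python
# Wall-clock figures copied from the two timing tables above (seconds)
custom = {"real": 6.091, "Build BoxMesh": 4.254135}
standard = {
    "real": 9.116,
    "Build BoxMesh": 7.280152,
    "Compute local part of mesh dual graph": 2.617081,
}


def saving(before, after):
    """Return (absolute saving, saving as a fraction of `before`)."""
    diff = before - after
    return diff, diff / before


real_saved, real_fraction = saving(standard["real"], custom["real"])
mesh_saved, mesh_fraction = saving(
    standard["Build BoxMesh"], custom["Build BoxMesh"]
)
# Share of the Build BoxMesh saving explained by the local dual graph alone
dual_share = standard["Compute local part of mesh dual graph"] / mesh_saved

print(f"real time saved:     {real_saved:.3f} s ({real_fraction:.0%})")
print(f"Build BoxMesh saved: {mesh_saved:.3f} s ({mesh_fraction:.0%})")
print(f"local dual graph share of that saving: {dual_share:.0%}")
```

On these figures the custom partitioner saves about 3 s, roughly a third of the total run time, and most of that saving corresponds to the local dual graph computation that no longer appears in the first table.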


garth-wells commented on August 16, 2024

Updated syntax:

time python -c "from dolfinx.mesh import create_unit_cube, CellType; from mpi4py import MPI; create_unit_cube(MPI.COMM_WORLD, 100, 100, 100, cell_type=CellType.tetrahedron)"


jorgensd commented on August 16, 2024

Old syntax:

def serial_partitioner(mpi_comm, nparts, tdim, cells, ghost_mode):
    dest = numpy.zeros(cells.num_nodes, dtype=numpy.int32)
    return dolfinx.cpp.graph.AdjacencyList_int32(dest)

New syntax:

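import numpy as np
import dolfinx


def serial_partitioner(comm, n, m, topo):
    # Send every cell to process 0, so no graph partitioning is needed
    dest = np.zeros(topo.num_nodes, dtype=np.int32)
    return dolfinx.cpp.graph.AdjacencyList_int32(dest)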


chrisrichardson commented on August 16, 2024

OK, so we could (automatically) just call this "null" partitioner when running in serial. It knocks off about 25% of the time. On my Mac, it goes down from about 11s to 8s.
However, if we look at the timings, with dolfinx.list_timings, we see:

Compute local part of mesh dual graph                                       |     1  2.886924  2.886924
Topology: create                                                            |     1  3.307674  3.307674

The local dual graph is still computed, because it is used for reordering. Probably this didn't happen in old dolfin, which is why it is so fast to create a simple mesh. I really wonder if we shouldn't just close this issue as "won't fix"...
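The "automatic" selection suggested in this comment could look something like the sketch below. It is deliberately library-free so that it runs anywhere: `null_partitioner`, `choose_partitioner`, and the `round_robin` stand-in are hypothetical names, not DOLFINx functions, and the real partitioner signature differs (see the snippets earlier in the thread).

```python
import numpy as np


def null_partitioner(num_cells):
    """Assign every cell to rank 0 without building any graph."""
    return np.zeros(num_cells, dtype=np.int32)


def choose_partitioner(comm_size, graph_partitioner):
    """Use the null partitioner on a single process, else the given one."""
    if comm_size == 1:
        return null_partitioner
    return graph_partitioner


def round_robin(num_cells, nparts=4):
    # Hypothetical stand-in for a real graph partitioner such as SCOTCH
    return np.arange(num_cells, dtype=np.int32) % nparts


# In an MPI program the size would come from the communicator,
# e.g. MPI.COMM_WORLD.size with mpi4py; here it is hard-coded to 1.
partitioner = choose_partitioner(1, round_robin)
print(partitioner(6))
```

As the timings in this comment show, such a switch would only remove the partitioning cost; the local dual graph would still be computed for reordering.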
