
Comments (14)

dellaert commented on August 28, 2024

@ProfFan you do not have time for this :-)
@acxz if you're motivated, this particular benchmark might not be the best one to use; the SolverComparer benchmark is probably a better choice.

from gtsam.

dellaert commented on August 28, 2024

@MandyXie could you try to reproduce?

MandyXie commented on August 28, 2024

I ran the example and got the same issue you mentioned. I will look into it and try to figure out what is going on.

ProfFan commented on August 28, 2024

Side note: we could try to integrate a flamegraph library into GTSAM, possibly replacing the gttic/toc machinery.

ProfFan commented on August 28, 2024

#121

ProfFan commented on August 28, 2024

My results on macOS 10.14:

numberOfProblems = 1000000
problemSize = 4
With 1 threads:
Without memory allocation, grain size = 1, time = 0.150485
Without memory allocation, grain size = 10, time = 0.15183
Without memory allocation, grain size = 100, time = 0.149489
Without memory allocation, grain size = 1000, time = 0.152419
With memory allocation, grain size = 1, time = 0.351757
With memory allocation, grain size = 10, time = 0.320499
With memory allocation, grain size = 100, time = 0.314284
With memory allocation, grain size = 1000, time = 0.323573

With 4 threads:
Without memory allocation, grain size = 1, time = 0.162687
Without memory allocation, grain size = 10, time = 0.162498
Without memory allocation, grain size = 100, time = 0.146438
Without memory allocation, grain size = 1000, time = 0.150557
With memory allocation, grain size = 1, time = 0.192916
With memory allocation, grain size = 10, time = 0.200336
With memory allocation, grain size = 100, time = 0.196882
With memory allocation, grain size = 1000, time = 0.195918

With 8 threads:
Without memory allocation, grain size = 1, time = 0.160153
Without memory allocation, grain size = 10, time = 0.160778
Without memory allocation, grain size = 100, time = 0.161141
Without memory allocation, grain size = 1000, time = 0.161196
With memory allocation, grain size = 1, time = 0.198829
With memory allocation, grain size = 10, time = 0.199491
With memory allocation, grain size = 100, time = 0.199772
With memory allocation, grain size = 1000, time = 0.201396

Summary of results:
4 threads, without allocation, grain size = 1, speedup = 0.924997
4 threads, without allocation, grain size = 10, speedup = 0.93435
4 threads, without allocation, grain size = 100, speedup = 1.02083
4 threads, without allocation, grain size = 1000, speedup = 1.01237
4 threads, with allocation, grain size = 1, speedup = 1.82337
4 threads, with allocation, grain size = 10, speedup = 1.59981
4 threads, with allocation, grain size = 100, speedup = 1.59631
4 threads, with allocation, grain size = 1000, speedup = 1.65157
8 threads, without allocation, grain size = 1, speedup = 0.939633
8 threads, without allocation, grain size = 10, speedup = 0.944346
8 threads, without allocation, grain size = 100, speedup = 0.927691
8 threads, without allocation, grain size = 1000, speedup = 0.945551
8 threads, with allocation, grain size = 1, speedup = 1.76914
8 threads, with allocation, grain size = 10, speedup = 1.60658
8 threads, with allocation, grain size = 100, speedup = 1.57321
8 threads, with allocation, grain size = 1000, speedup = 1.60665

dellaert commented on August 28, 2024

I wonder whether this is something we can fix by looking at where we lose time. Also, the amount of parallelism depends on a good ordering, so we should investigate whether using Metis, for example, gives us better bang for the buck. Finally, we could share this in the docs and possibly a blog post, reminding people about parallelism in the Bayes tree, and possibly provide a flag to choose whether or not to use the parallel branch...

ProfFan commented on August 28, 2024

Note that the FindTBB.cmake in GTSAM is also out of date (cannot find TBB 2019.U0). Replacing the file from the VTK repo works flawlessly.

ProfFan commented on August 28, 2024

Note that the previous result is wrong. On my Mac it actually works, with up to a 4x improvement with TBB. I assume the bug is specific to the environment (Ubuntu 16.04).

I have no time for this currently.

> $ ninja TimeTBB.run
[2/2] cd /Users/proffan/Projects/Development/VISION/gtsam_...n/Projects/Development/VISION/gtsam_build/examples/TimeTBB
/Users/proffan/Projects/Development/VISION/GTSAM/gtsam/3rdparty/Eigen/Eigen/src/Core/functors/UnaryFunctors.h:576:88: runtime error: division by zero
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /Users/proffan/Projects/Development/VISION/GTSAM/gtsam/3rdparty/Eigen/Eigen/src/Core/functors/UnaryFunctors.h:576:88 in
numberOfProblems = 1000000
problemSize = 4
With 1 threads:
/usr/local/include/tbb/internal/../task.h:779:30: runtime error: member call on address 0x000116be3e00 which does not point to an object of type 'tbb::internal::scheduler'
0x000116be3e00: note: object is of type 'tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>'
 00 00 00 00  e8 1e 92 12 01 00 00 00  00 00 00 00 00 00 00 00  60 76 bf 16 01 00 00 00  60 76 bf 16
              ^~~~~~~~~~~~~~~~~~~~~~~
              vptr for 'tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /usr/local/include/tbb/internal/../task.h:779:30 in
/usr/local/include/tbb/internal/../task.h:1046:23: runtime error: member call on address 0x000116be3e00 which does not point to an object of type 'tbb::internal::scheduler'
0x000116be3e00: note: object is of type 'tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>'
 00 00 00 00  e8 1e 92 12 01 00 00 00  00 00 00 00 00 00 00 00  60 76 bf 16 01 00 00 00  60 76 bf 16
              ^~~~~~~~~~~~~~~~~~~~~~~
              vptr for 'tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /usr/local/include/tbb/internal/../task.h:1046:23 in
Without memory allocation, grain size = 1, time = 5.46733
Without memory allocation, grain size = 10, time = 5.59286
Without memory allocation, grain size = 100, time = 5.64539
Without memory allocation, grain size = 1000, time = 5.51933
With memory allocation, grain size = 1, time = 8.55949
With memory allocation, grain size = 10, time = 9.07178
With memory allocation, grain size = 100, time = 8.79069
With memory allocation, grain size = 1000, time = 8.66558

With 4 threads:
Without memory allocation, grain size = 1, time = 1.69261
Without memory allocation, grain size = 10, time = 1.68709
Without memory allocation, grain size = 100, time = 1.73469
Without memory allocation, grain size = 1000, time = 1.7691
With memory allocation, grain size = 1, time = 2.58719
With memory allocation, grain size = 10, time = 2.65104
With memory allocation, grain size = 100, time = 2.62247
With memory allocation, grain size = 1000, time = 2.74432

With 8 threads:
Without memory allocation, grain size = 1, time = 1.37712
Without memory allocation, grain size = 10, time = 1.46636
Without memory allocation, grain size = 100, time = 1.46375
Without memory allocation, grain size = 1000, time = 1.45783
With memory allocation, grain size = 1, time = 1.80873
With memory allocation, grain size = 10, time = 1.81393
With memory allocation, grain size = 100, time = 1.8269
With memory allocation, grain size = 1000, time = 1.84683

Summary of results:
4 threads, without allocation, grain size = 1, speedup = 3.23012
4 threads, without allocation, grain size = 10, speedup = 3.31508
4 threads, without allocation, grain size = 100, speedup = 3.25442
4 threads, without allocation, grain size = 1000, speedup = 3.11986
4 threads, with allocation, grain size = 1, speedup = 3.30841
4 threads, with allocation, grain size = 10, speedup = 3.42198
4 threads, with allocation, grain size = 100, speedup = 3.35207
4 threads, with allocation, grain size = 1000, speedup = 3.15765
8 threads, without allocation, grain size = 1, speedup = 3.97013
8 threads, without allocation, grain size = 10, speedup = 3.81411
8 threads, without allocation, grain size = 100, speedup = 3.8568
8 threads, without allocation, grain size = 1000, speedup = 3.78598
8 threads, with allocation, grain size = 1, speedup = 4.73232
8 threads, with allocation, grain size = 10, speedup = 5.00116
8 threads, with allocation, grain size = 100, speedup = 4.81182
8 threads, with allocation, grain size = 1000, speedup = 4.69213

ProfFan commented on August 28, 2024

The UBSan errors here are a problem with TBB; see RcppCore/RcppParallel#36.

In light of the code quality, I strongly believe it is an issue with the TBB supplied by Ubuntu 16.04.

@izzys Could you help reproduce this bug on our side? We need your environment, TBB version, compile command line, etc. Many thanks!

acxz commented on August 28, 2024

I can reproduce the issue:
mkdir build && cd build && cmake .. && make TimeTBB

gist

Hardware: Intel i7-7500U (2) @ 3.5GHz (having only two cores probably affects the times at higher thread counts)
OS: Arch Linux
TBB: 2020.2
GCC: 9.3
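
As a quick sanity check before reading timings at higher thread counts, one can compare the requested thread counts against what the machine reports. This is only a sketch: `oversubscribed` is a hypothetical helper, not part of GTSAM or TBB, and `os.cpu_count()` returns logical CPUs (hyper-threads included), not physical cores.

```python
import os


def oversubscribed(thread_counts, available=None):
    """Return the requested thread counts that exceed the available logical CPUs.

    `available` defaults to os.cpu_count(), which counts logical CPUs
    (hyper-threads included), not physical cores.
    """
    if available is None:
        available = os.cpu_count() or 1  # cpu_count() may return None
    return [n for n in thread_counts if n > available]


# The benchmark above runs with 1, 4 and 8 threads. On a machine that
# reports 4 logical CPUs, only the 8-thread run is oversubscribed.
print(oversubscribed([1, 4, 8], available=4))  # prints [8]
```

Timings for any thread count this returns say more about scheduling overhead than about parallel speedup.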

ProfFan commented on August 28, 2024

I'll add this to my todo list, but I'm not sure I'll really have time for it.

zzodo commented on August 28, 2024

Any updates on this issue?
I can still reproduce this on Ubuntu 22.04 LTS with the system-default TBB (2021.5), on both the 4.2.0 and develop branches.
The test below was run on the develop branch.

$ ./examples/TimeTBB 
numberOfProblems = 1000000
problemSize = 4
With 1 threads:
Without memory allocation, grain size = 1, time = 0.332967
Without memory allocation, grain size = 10, time = 0.328845
Without memory allocation, grain size = 100, time = 0.328481
Without memory allocation, grain size = 1000, time = 0.328192
With memory allocation, grain size = 1, time = 0.369558
With memory allocation, grain size = 10, time = 0.369653
With memory allocation, grain size = 100, time = 0.368168
With memory allocation, grain size = 1000, time = 0.368071

With 4 threads:
Without memory allocation, grain size = 1, time = 2.13116
Without memory allocation, grain size = 10, time = 2.10212
Without memory allocation, grain size = 100, time = 2.11296
Without memory allocation, grain size = 1000, time = 2.11572
With memory allocation, grain size = 1, time = 2.39639
With memory allocation, grain size = 10, time = 2.40664
With memory allocation, grain size = 100, time = 2.43013
With memory allocation, grain size = 1000, time = 2.43989

With 8 threads:
Without memory allocation, grain size = 1, time = 3.15854
Without memory allocation, grain size = 10, time = 3.17693
Without memory allocation, grain size = 100, time = 3.17387
Without memory allocation, grain size = 1000, time = 3.17985
With memory allocation, grain size = 1, time = 3.45604
With memory allocation, grain size = 10, time = 3.50903
With memory allocation, grain size = 100, time = 3.51825
With memory allocation, grain size = 1000, time = 3.52622

Summary of results:
4 threads, without allocation, grain size = 1, speedup = 0.156237
4 threads, without allocation, grain size = 10, speedup = 0.156435
4 threads, without allocation, grain size = 100, speedup = 0.15546
4 threads, without allocation, grain size = 1000, speedup = 0.155121
4 threads, with allocation, grain size = 1, speedup = 0.154214
4 threads, with allocation, grain size = 10, speedup = 0.153597
4 threads, with allocation, grain size = 100, speedup = 0.151501
4 threads, with allocation, grain size = 1000, speedup = 0.150855
8 threads, without allocation, grain size = 1, speedup = 0.105418
8 threads, without allocation, grain size = 10, speedup = 0.10351
8 threads, without allocation, grain size = 100, speedup = 0.103496
8 threads, without allocation, grain size = 1000, speedup = 0.10321
8 threads, with allocation, grain size = 1, speedup = 0.106931
8 threads, with allocation, grain size = 10, speedup = 0.105343
8 threads, with allocation, grain size = 100, speedup = 0.104645
8 threads, with allocation, grain size = 1000, speedup = 0.104381
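
For reference, the speedup figures in the summary appear to be the 1-thread time divided by the N-thread time for the same configuration (for example 0.332967 / 2.13116 ≈ 0.156), so values below 1 mean the multi-threaded run was slower. A minimal Python sketch for recomputing them from pasted output, assuming the line format shown above; `parse_times` and `speedups` are hypothetical helper names, not part of GTSAM:

```python
import re


def parse_times(output):
    """Map (threads, with_allocation, grain_size) -> seconds from TimeTBB-style output."""
    times = {}
    threads = None
    for raw in output.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"With (\d+) threads:", line)
        if header:
            threads = int(header.group(1))
            continue
        row = re.fullmatch(
            r"(With|Without) memory allocation, grain size = (\d+), "
            r"time = ([\d.eE+-]+)",
            line,
        )
        if row and threads is not None:
            key = (threads, row.group(1) == "With", int(row.group(2)))
            times[key] = float(row.group(3))
    return times


def speedups(times):
    """Speedup = single-thread time / multi-thread time for the same configuration."""
    result = {}
    for (threads, alloc, grain), seconds in times.items():
        baseline = times.get((1, alloc, grain))
        if threads != 1 and baseline is not None:
            result[(threads, alloc, grain)] = baseline / seconds
    return result


# Two rows copied from the run above.
sample = """With 1 threads:
Without memory allocation, grain size = 1, time = 0.332967
With 4 threads:
Without memory allocation, grain size = 1, time = 2.13116
"""
for key, value in speedups(parse_times(sample)).items():
    print(key, round(value, 3))
```

A value of roughly 0.156 for 4 threads corresponds to the 4-thread run taking about 6.4 times as long as the single-thread run.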

GTSAM build information:

$ sudo cmake ..
-- GTSAM is a shared library due to GTSAM_FORCE_SHARED_LIB
-- GTSAM_POSE3_EXPMAP=ON, enabling GTSAM_ROT3_EXPMAP as well
-- Found Eigen version: 3.3.7
-- checking for thread-local storage - found
-- Could NOT find MKL (missing: MKL_INCLUDE_DIR MKL_LIBRARIES) 
-- Found Google perftools: 
-- Building 3rdparty
-- Could NOT find GeographicLib (missing: GeographicLib_LIBRARY_DIRS GeographicLib_LIBRARIES GeographicLib_INCLUDE_DIRS) 
-- Building base
-- Building basis
-- Building geometry
-- Building inference
-- Building symbolic
-- Building discrete
-- Building hybrid
-- Building linear
-- Building nonlinear
-- Building sam
-- Building sfm
-- Building slam
-- Building navigation
-- GTSAM Version: 4.3a0
-- Install prefix: /usr/local
-- Building GTSAM - as a SHARED library
-- Wrote /opt/gtsam/build/GTSAMConfig.cmake
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE) 
-- ===============================================================
-- ================  Configuration Options  ======================
--  CMAKE_CXX_COMPILER_ID type                       : GNU
--  CMAKE_CXX_COMPILER_VERSION                       : 11.4.0
--  CMake version                                    : 3.22.1
--  CMake generator                                  : Unix Makefiles
--  CMake build tool                                 : /usr/bin/gmake
-- Build flags                                               
--  Build Tests                                      : Disabled
--  Build examples with 'make all'                   : Disabled
--  Build timing scripts with 'make all'             : Disabled
--  Build shared GTSAM libraries                     : Enabled
--  Put build type in library name                   : Enabled
--  Build libgtsam_unstable                          : Disabled
--  Build GTSAM unstable Python                      : Disabled
--  Build MATLAB Toolbox for unstable                : Disabled
--  Build for native architecture                    : Disabled
--  Build type                                       : Release
--  C compilation flags                              :  -O3 -DNDEBUG
--  C++ compilation flags                            :  -O3 -DNDEBUG
--  Enable Boost serialization                       : ON
--  GTSAM_COMPILE_FEATURES_PUBLIC                    : cxx_std_17
--  GTSAM_COMPILE_OPTIONS_PUBLIC                     : 
--  GTSAM_COMPILE_DEFINITIONS_PUBLIC                 : 
--  GTSAM_COMPILE_OPTIONS_PUBLIC_RELEASE             : 
--  GTSAM_COMPILE_DEFINITIONS_PUBLIC_RELEASE         : 
--  Use System Eigen                                 : ON (Using version: 3.3.7)
--  Use System Metis                                 : OFF
--  Using Boost version                              : 1.74.0
--  Use Intel TBB                                    : Yes (Version: 2021.5.0)
--  Eigen will use MKL                               : MKL not found
--  Eigen will use MKL and OpenMP                    : OpenMP found but GTSAM_WITH_EIGEN_MKL is disabled
--  Default allocator                                : TBB
--  Cheirality exceptions enabled                    : YES
--  Build with ccache                                : No
-- Packaging flags
--  CPack Source Generator                           : TGZ
--  CPack Generator                                  : TGZ
-- GTSAM flags                                               
--  Quaternions as default Rot3                      : Disabled
--  Runtime consistency checking                     : Disabled
--  Build with Memory Sanitizer                      : Disabled
--  Rot3 retract is full ExpMap                      : Enabled
--  Pose3 retract is full ExpMap                     : Enabled
--  Enable branch merging in DecisionTree            : Enabled
--  Allow features deprecated in GTSAM 4.3           : Enabled
--  Metis-based Nested Dissection                    : Enabled
--  Use tangent-space preintegration                 : Enabled
-- MATLAB toolbox flags
--  Install MATLAB toolbox                           : Disabled
-- Python toolbox flags                                      
--  Build Python module with pybind                  : Disabled
-- ===============================================================
-- Configuring done
-- Generating done
-- Build files have been written to: /opt/gtsam/build

zzodo commented on August 28, 2024

Another example, using the SolverComparer benchmark mentioned above:

$ ./examples/SolverComparer --incremental -d w10000 -o w_inc --threads 8
Loading dataset w10000
Using 8 threads
Looking for first measurement from step 0
Looks like 0 is the first time step, so adding a prior on it
Playing forward time steps...
chi2 = -nan
Step 0
-Total: 0 CPU (0 times, 0 wall, 0 children, min: 0 max: 0)
|   -Collect measurements: 0 CPU (1 times, 2e-06 wall, 0 children, min: 0 max: 0)
|   -Update ISAM2: 0 CPU (1 times, 2e-06 wall, 0 children, min: 0 max: 0)
|   -chi2: 0 CPU (1 times, 3.4e-05 wall, 0 children, min: 0 max: 0)
chi2 = 0.00172843
Step 1000
-Total: 0 CPU (0 times, 0 wall, 0.71 children, min: 0 max: 0)
|   -Collect measurements: 0.08 CPU (1001 times, 0.030624 wall, 0.08 children, min: 0 max: 0.01)
|   -Update ISAM2: 0.63 CPU (1001 times, 0.11614 wall, 0.63 children, min: 0 max: 0.01)
|   -chi2: 0 CPU (2 times, 0.000611 wall, 0 children, min: 0 max: 0)
chi2 = 0.00175299
Step 2000
-Total: 0 CPU (0 times, 0 wall, 1.85 children, min: 0 max: 0)
|   -Collect measurements: 0.18 CPU (2001 times, 0.093793 wall, 0.18 children, min: 0 max: 0.01)
|   -Update ISAM2: 1.67 CPU (2001 times, 0.334617 wall, 1.67 children, min: 0 max: 0.02)
|   -chi2: 0 CPU (3 times, 0.001946 wall, 0 children, min: 0 max: 0)
chi2 = 0.00177148
Step 3000
-Total: 0 CPU (0 times, 0 wall, 4.29 children, min: 0 max: 0)
|   -Collect measurements: 0.52 CPU (3001 times, 0.358602 wall, 0.52 children, min: 0 max: 0.01)
|   -Update ISAM2: 3.77 CPU (3001 times, 0.948901 wall, 3.77 children, min: 0 max: 0.02)
|   -chi2: 0 CPU (4 times, 0.005088 wall, 0 children, min: 0 max: 0)
chi2 = 0.00177683
Step 4000
-Total: 0 CPU (0 times, 0 wall, 7.88 children, min: 0 max: 0)
|   -Collect measurements: 1.09 CPU (4001 times, 0.882309 wall, 1.09 children, min: 0 max: 0.02)
|   -Update ISAM2: 6.78 CPU (4001 times, 1.9246 wall, 6.78 children, min: 0 max: 0.05)
|   -chi2: 0.01 CPU (5 times, 0.008837 wall, 0.01 children, min: 0.01 max: 0.01)
chi2 = 0.00177331
Step 5000
-Total: 0 CPU (0 times, 0 wall, 11.41 children, min: 0 max: 0)
|   -Collect measurements: 1.47 CPU (5001 times, 1.19369 wall, 1.47 children, min: 0 max: 0.02)
|   -Update ISAM2: 9.93 CPU (5001 times, 2.90427 wall, 9.93 children, min: 0.01 max: 0.05)
|   -chi2: 0.01 CPU (6 times, 0.014129 wall, 0.01 children, min: 0 max: 0.01)
chi2 = 0.00178298
Step 6000
-Total: 0 CPU (0 times, 0 wall, 16.1 children, min: 0 max: 0)
|   -Collect measurements: 2.55 CPU (6001 times, 2.11069 wall, 2.55 children, min: 0 max: 0.02)
|   -Update ISAM2: 13.54 CPU (6001 times, 4.45979 wall, 13.54 children, min: 0.01 max: 0.09)
|   -chi2: 0.01 CPU (7 times, 0.022692 wall, 0.01 children, min: 0 max: 0.01)
chi2 = 0.00177962
Step 7000
-Total: 0 CPU (0 times, 0 wall, 19.68 children, min: 0 max: 0)
|   -Collect measurements: 3.37 CPU (7001 times, 2.9156 wall, 3.37 children, min: 0 max: 0.02)
|   -Update ISAM2: 16.29 CPU (7001 times, 5.65427 wall, 16.29 children, min: 0 max: 0.11)
|   -chi2: 0.02 CPU (8 times, 0.029358 wall, 0.02 children, min: 0.01 max: 0.01)
chi2 = 0.00177708
Step 8000
-Total: 0 CPU (0 times, 0 wall, 23.28 children, min: 0 max: 0)
|   -Collect measurements: 4.16 CPU (8001 times, 3.72301 wall, 4.16 children, min: 0 max: 0.02)
|   -Update ISAM2: 19.09 CPU (8001 times, 6.79453 wall, 19.09 children, min: 0.01 max: 0.11)
|   -chi2: 0.03 CPU (9 times, 0.041096 wall, 0.03 children, min: 0.01 max: 0.01)
chi2 = 0.00177835
Step 9000
-Total: 0 CPU (0 times, 0 wall, 29.51 children, min: 0 max: 0)
|   -Collect measurements: 6.08 CPU (9001 times, 5.58775 wall, 6.08 children, min: 0 max: 0.02)
|   -Update ISAM2: 23.38 CPU (9001 times, 9.15137 wall, 23.38 children, min: 0 max: 0.17)
|   -chi2: 0.05 CPU (10 times, 0.055059 wall, 0.05 children, min: 0.02 max: 0.02)
Writing output file w_inc
unregistered class - derived class not registered or exported
$ ./examples/SolverComparer --incremental -d w10000 -o w_inc --threads 4
Loading dataset w10000
Using 4 threads
Looking for first measurement from step 0
Looks like 0 is the first time step, so adding a prior on it
Playing forward time steps...
chi2 = -nan
Step 0
-Total: 0 CPU (0 times, 0 wall, 0 children, min: 0 max: 0)
|   -Collect measurements: 0 CPU (1 times, 1e-06 wall, 0 children, min: 0 max: 0)
|   -Update ISAM2: 0 CPU (1 times, 1e-06 wall, 0 children, min: 0 max: 0)
|   -chi2: 0 CPU (1 times, 3.3e-05 wall, 0 children, min: 0 max: 0)
chi2 = 0.00172843
Step 1000
-Total: 0 CPU (0 times, 0 wall, 0.36 children, min: 0 max: 0)
|   -Collect measurements: 0.07 CPU (1001 times, 0.030023 wall, 0.07 children, min: 0 max: 0.01)
|   -Update ISAM2: 0.29 CPU (1001 times, 0.108681 wall, 0.29 children, min: 0 max: 0.01)
|   -chi2: 0 CPU (2 times, 0.00063 wall, 0 children, min: 0 max: 0)
chi2 = 0.00175299
Step 2000
-Total: 0 CPU (0 times, 0 wall, 0.98 children, min: 0 max: 0)
|   -Collect measurements: 0.15 CPU (2001 times, 0.091805 wall, 0.15 children, min: 0 max: 0.01)
|   -Update ISAM2: 0.82 CPU (2001 times, 0.31337 wall, 0.82 children, min: 0 max: 0.01)
|   -chi2: 0.01 CPU (3 times, 0.001879 wall, 0.01 children, min: 0.01 max: 0.01)
chi2 = 0.00177148
Step 3000
-Total: 0 CPU (0 times, 0 wall, 2.5 children, min: 0 max: 0)
|   -Collect measurements: 0.37 CPU (3001 times, 0.355865 wall, 0.37 children, min: 0 max: 0.01)
|   -Update ISAM2: 2.11 CPU (3001 times, 0.910435 wall, 2.11 children, min: 0 max: 0.02)
|   -chi2: 0.02 CPU (4 times, 0.00504 wall, 0.02 children, min: 0.01 max: 0.01)
chi2 = 0.00177683
Step 4000
-Total: 0 CPU (0 times, 0 wall, 4.89 children, min: 0 max: 0)
|   -Collect measurements: 0.92 CPU (4001 times, 0.877368 wall, 0.92 children, min: 0 max: 0.01)
|   -Update ISAM2: 3.94 CPU (4001 times, 1.85701 wall, 3.94 children, min: 0 max: 0.04)
|   -chi2: 0.03 CPU (5 times, 0.008958 wall, 0.03 children, min: 0.01 max: 0.01)
chi2 = 0.00177331
Step 5000
-Total: 0 CPU (0 times, 0 wall, 7.08 children, min: 0 max: 0)
|   -Collect measurements: 1.16 CPU (5001 times, 1.18555 wall, 1.16 children, min: 0 max: 0.01)
|   -Update ISAM2: 5.88 CPU (5001 times, 2.80783 wall, 5.88 children, min: 0 max: 0.04)
|   -chi2: 0.04 CPU (6 times, 0.014088 wall, 0.04 children, min: 0.01 max: 0.01)
chi2 = 0.00178298
Step 6000
-Total: 0 CPU (0 times, 0 wall, 10.64 children, min: 0 max: 0)
|   -Collect measurements: 2.02 CPU (6001 times, 2.10637 wall, 2.02 children, min: 0 max: 0.01)
|   -Update ISAM2: 8.57 CPU (6001 times, 4.3782 wall, 8.57 children, min: 0.01 max: 0.09)
|   -chi2: 0.05 CPU (7 times, 0.023237 wall, 0.05 children, min: 0.01 max: 0.01)
chi2 = 0.00177962
Step 7000
-Total: 0 CPU (0 times, 0 wall, 13.63 children, min: 0 max: 0)
|   -Collect measurements: 2.9 CPU (7001 times, 2.91521 wall, 2.9 children, min: 0 max: 0.01)
|   -Update ISAM2: 10.67 CPU (7001 times, 5.61952 wall, 10.67 children, min: 0 max: 0.09)
|   -chi2: 0.06 CPU (8 times, 0.030003 wall, 0.06 children, min: 0.01 max: 0.01)
chi2 = 0.00177708
Step 8000
-Total: 0 CPU (0 times, 0 wall, 16.33 children, min: 0 max: 0)
|   -Collect measurements: 3.77 CPU (8001 times, 3.71799 wall, 3.77 children, min: 0 max: 0.02)
|   -Update ISAM2: 12.49 CPU (8001 times, 6.74208 wall, 12.49 children, min: 0.01 max: 0.09)
|   -chi2: 0.07 CPU (9 times, 0.041187 wall, 0.07 children, min: 0.01 max: 0.01)
chi2 = 0.00177835
Step 9000
-Total: 0 CPU (0 times, 0 wall, 21.79 children, min: 0 max: 0)
|   -Collect measurements: 5.86 CPU (9001 times, 5.61159 wall, 5.86 children, min: 0 max: 0.02)
|   -Update ISAM2: 15.85 CPU (9001 times, 9.07511 wall, 15.85 children, min: 0.01 max: 0.11)
|   -chi2: 0.08 CPU (10 times, 0.055581 wall, 0.08 children, min: 0.01 max: 0.01)
Writing output file w_inc
unregistered class - derived class not registered or exported
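
Comparing the final "Update ISAM2" lines of the two runs: the wall time is nearly the same (9.15137 s with 8 threads vs 9.07511 s with 4 threads), while the CPU time grows from 15.85 to 23.38. A small Python sketch for pulling those numbers out of the timing lines, assuming the format printed above; `parse_timing` is a hypothetical helper, not a GTSAM function:

```python
import re

# Matches e.g. "|   -Update ISAM2: 23.38 CPU (9001 times, 9.15137 wall, ..."
TIMING = re.compile(
    r"-(?P<label>[^:]+): (?P<cpu>[\d.eE+-]+) CPU "
    r"\((?P<count>\d+) times, (?P<wall>[\d.eE+-]+) wall"
)


def parse_timing(line):
    """Extract label, CPU seconds, call count and wall seconds from one timing line."""
    match = TIMING.search(line)
    if match is None:
        return None
    return {
        "label": match["label"],
        "cpu": float(match["cpu"]),
        "count": int(match["count"]),
        "wall": float(match["wall"]),
    }


# Step 9000 lines copied from the 8-thread and 4-thread runs above.
eight = parse_timing(
    "|   -Update ISAM2: 23.38 CPU (9001 times, 9.15137 wall, 23.38 children, min: 0 max: 0.17)"
)
four = parse_timing(
    "|   -Update ISAM2: 15.85 CPU (9001 times, 9.07511 wall, 15.85 children, min: 0.01 max: 0.11)"
)

# Ratio of wall times; a value near 1 means no wall-clock gain from 8 threads.
print("wall, 4 threads / 8 threads:", round(four["wall"] / eight["wall"], 3))
# CPU seconds spent per wall second in each run.
print("CPU/wall, 8 threads:", round(eight["cpu"] / eight["wall"], 2))
print("CPU/wall, 4 threads:", round(four["cpu"] / four["wall"], 2))
```

If the wall ratio stays close to 1 while CPU/wall increases, the extra threads are consuming CPU without shortening the run, at least for this dataset and configuration.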

My laptop has a hybrid CPU (Intel Core i7-13700H). I also tried TBB version 2021.12, which is newer than v2021.9.0, the version announced as compatible with hybrid CPUs.
