
numactl's People

Contributors

aegl, andikleen, arndb, bgray-at-rh, bkuhls, burz, cpwickman, danfai, dannf, davidkorczynski, ffontaine, filbranden, ftang1, gabibguti, harish-24, honggyukim, joaomlneto, kraj, luochenglcs, marv, mspiegel, nealef, nhorman, panicgh, patman-cp, pmundt, seeteena, vapier, vishalc-ibm, zevweiss

numactl's Issues

status from numa_move_pages() not correct?

I am trying to move a page to NUMA node 0:

if (0 != numa_move_pages(getpid(), nPages, &ptrToSHM, nodes, status, MPOL_MF_MOVE_ALL)) {
        std::cout << "failed to move pages for /dev/shm/" << name << " to NUMA " << numaNodeID << " because " << strerror(errno) << std::endl;
    }

It returns zero, and then I check where those pages are located after the move_pages() call:

if (0 != numa_move_pages(0, nPages, &ptrToSHM, nullptr, status, 0)) {
        std::cout << "failed to inquiry pages for /dev/shm/" << name << " because " << strerror(errno) << std::endl;
    }
    else {
        for (uint32_t i = 0; i < nPages; i++) {
            std::cout << "/dev/shm/" << name << "'s page # " << i << " locate at numa node " << status[i] << std::endl;
        }
    }

And it prints:

/dev/shm/test's page # 0 locate at numa node -2
/dev/shm/test's page # 1 locate at numa node -14

I looked at numactl/libnuma.c; numa_move_pages() just calls move_pages() directly.

The move_pages() manpage states: nodes is an array of integers that specify the desired location for each page. Each element in the array is a node number. nodes can also be NULL, in which case move_pages() does not move any pages but instead will return the node where each page currently resides, in the status array. Obtaining the status of each page may be necessary to determine pages that need to be moved.

I wonder why it prints negative values although both moving pages and querying return success.

Thanks!

P.S. I am using libnuma 2.0.9 from CentOS 7's repos.
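
For reference, the move_pages(2) man page documents that each entry of the status array can hold a negative errno value for a page that could not be moved or queried, even when the call itself returns 0; -2 and -14 correspond to -ENOENT (page is not present) and -EFAULT (zero page or unmapped address). A minimal sketch of decoding the per-page status, reusing the nPages/status names from the snippets above:

#include <stdio.h>
#include <string.h>

/* Sketch: decode the per-page status array after numa_move_pages()/
 * move_pages() returns 0.  nPages and status are the variables from the
 * snippets above.  Negative entries are -errno codes documented in
 * move_pages(2), e.g. -2 = -ENOENT (page not present) and
 * -14 = -EFAULT (zero page or address not mapped). */
static void report_page_status(unsigned long nPages, const int *status)
{
    for (unsigned long i = 0; i < nPages; i++) {
        if (status[i] >= 0)
            printf("page %lu is on node %d\n", i, status[i]);
        else
            printf("page %lu could not be moved/queried: %s\n",
                   i, strerror(-status[i]));
    }
}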

numactl/test/move_pages: nr_nodes calculated wrongly

The move_pages test application fails with "A minimum of 2 nodes is required for this test." on a NUMA system.

It looks like nr_nodes should be computed as below:

--- a/test/move_pages.c
+++ b/test/move_pages.c
@@ -28,7 +28,7 @@ int main(int argc, char **argv)

 	pagesize = getpagesize();
-	nr_nodes = numa_max_node();
+	nr_nodes = numa_max_node() + 1;
 
 	if (nr_nodes < 2) {
 		printf("A minimum of 2 nodes is required for this test.\n");

Symbol versioning incompatibility with gcc 4.8.5

In 2.0.14, the SYMVER macro escapes quotes. For instance

SYMVER("numa_sched_getaffinity_v2", "numa_sched_getaffinity@@libnuma_1.2")
becomes
__asm__ (".symver " "\"numa_sched_getaffinity_v2\"" "," "\"numa_sched_getaffinity@@libnuma_1.2\"");

gcc 4.8.5, the default compiler on RHEL/CentOS 7, does not handle the escaped quotes:

/tmp/cckm8DwN.s: Assembler messages:
/tmp/cckm8DwN.s:3: Error: Missing symbol name in directive
/tmp/cckm8DwN.s:3: Error: expected comma after name in .symver

More recent compilers do not seem to have this limitation; gcc 8.3.1, for example, handles the escaped quotes fine.

This change works for me on both 4.8.5 and 8.3.1:

--- a/util.h    2020-10-08 10:08:40.517167202 -0700
+++ b/util.h    2020-10-08 10:08:55.523301155 -0700
@@ -22,5 +22,5 @@
 #if HAVE_ATTRIBUTE_SYMVER
 #define SYMVER(a,b) __attribute__ ((symver (b)))
 #else
-#define SYMVER(a,b) __asm__ (".symver " #a "," #b);
+#define SYMVER(a,b) __asm__ (".symver " a "," b);
 #endif

Unused variable 'loose' in memhog.c

In memhog.c, the variable loose is initialized to 0. It is then set to 1 in an else statement on line 105. However, the connected if statement includes an exit, so whenever the program does not exit, loose is always set to 1. Therefore the uses of loose on lines 132 and 136 are redundant. Should loose be completely removed, or does it serve some purpose that is now broken?

should avoid using malloc for initialization

It's currently impossible to write a malloc implementation that uses libnuma, because libnuma itself uses malloc during initialization for the CPU masks. It should use some simple mmap/brk-based allocator to break this dependency.
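
A minimal sketch of the kind of allocator the report asks for, assuming a tiny mmap-backed bump allocator is enough for the masks allocated during initialization (names and sizes here are hypothetical, not part of libnuma):

#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical bump allocator backed directly by mmap(2), so it never
 * touches malloc().  Memory is never returned to the system; it is only
 * meant for the handful of masks allocated at library initialization. */
static char *init_pool;
static size_t init_used, init_size;

static void *init_alloc(size_t len)
{
    len = (len + 15) & ~(size_t)15;          /* keep 16-byte alignment */
    if (!init_pool || init_used + len > init_size) {
        size_t sz = len > 65536 ? len : 65536;
        void *p = mmap(NULL, sz, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;                     /* caller must handle failure */
        init_pool = p;
        init_size = sz;
        init_used = 0;
    }
    void *ret = init_pool + init_used;       /* zero-filled by the kernel */
    init_used += len;
    return ret;
}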

nodemask test

Trying to compile the nodemask test as a standalone test against an install of numactl gives me this error:

ld.lld: error: undefined symbol: nodemask_isset
>>> referenced by nodemask.c

Using the clang-10 compiler.

If I change:

                printf("numa_get_run_node_mask nodemask_isset returns=0x%lx\n", nodemask_isset(&nodemask, i));

to

                printf("numa_get_run_node_mask nodemask_isset returns=0x%lx\n", nodemask_isset_compat(&nodemask, i));

It compiles fine. Is this a problem with Clang doing something non-standard? Or does the test need updating? Or is it just not designed to be built outside the numactl source tree?

Linker errors with lld

lld is the LLVM linker.

I attempted to compile numactl at f567a26 with lld:

$ ./autogen.sh
$ ./configure
$ make

and encountered the following errors:

ld: error: duplicate symbol 'set_mempolicy' in version script
ld: error: duplicate symbol 'get_mempolicy' in version script
ld: error: duplicate symbol 'mbind' in version script
ld: error: duplicate symbol 'numa_alloc' in version script
ld: error: duplicate symbol 'numa_alloc_interleaved' in version script
ld: error: duplicate symbol 'numa_alloc_local' in version script
ld: error: duplicate symbol 'numa_alloc_onnode' in version script
ld: error: duplicate symbol 'numa_available' in version script
ld: error: duplicate symbol 'numa_distance' in version script
ld: error: duplicate symbol 'numa_error' in version script
ld: error: duplicate symbol 'numa_exit_on_error' in version script
ld: error: duplicate symbol 'numa_free' in version script
ld: error: duplicate symbol 'numa_get_interleave_node' in version script
ld: error: duplicate symbol 'numa_max_node' in version script
ld: error: duplicate symbol 'numa_migrate_pages' in version script
ld: error: duplicate symbol 'numa_node_size64' in version script
ld: error: duplicate symbol 'numa_node_size' in version script
ld: error: duplicate symbol 'numa_pagesize' in version script
ld: error: duplicate symbol 'numa_police_memory' in version script
ld: error: duplicate symbol 'numa_preferred' in version script
ld: error: too many errors emitted, stopping now (use -error-limit=0 to see all errors)
collect2: error: ld returned 1 exit status

I am on Fedora 29 and I'm using lld at this version:

$ ld.lld --version
LLD 7.0.0 (compatible with GNU linkers)

libnuma: Warning: node argument 1 is out of range

$ numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 62 64 66 68 70
node 0 size: 128702 MB
node 0 free: 85302 MB
node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71
node 1 size: 0 MB
node 1 free: 0 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10
$ numactl --cpunodebind=1 ./a.out
libnuma: Warning: node argument 1 is out of range
...

There are 36 CPUs on node 1, but I fail to bind to it. I believe binding should be possible even though there is no memory attached to that node.

Also, I noticed that explicitly using --membind=0 greatly reduces minor page faults. What could cause this?

Is there any potential risk in numa_police_memory()

void numa_police_memory(void *mem, size_t size)
{
	int pagesize = numa_pagesize_int();
	unsigned long i;
	for (i = 0; i < size; i += pagesize)
		((volatile char*)mem)[i] = ((volatile char*)mem)[i];
}

This looks unsafe: it rewrites the shared memory area, and the read followed by the write-back can interleave with store operations performed by other processes or threads while numa_police_memory() is executing, potentially losing their updates.
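
As an illustration only, here is a hedged sketch of one way the concern could be addressed (an assumption, not libnuma's code): an atomic add of zero still stores to every page, so the pages are still touched and allocated, but a concurrent writer's update cannot be lost.

#include <stddef.h>
#include <unistd.h>

/* Hypothetical race-free variant (not libnuma's code): an atomic add of
 * zero still performs a store to every page, so it still forces the
 * pages to be allocated/dirtied, but it cannot drop a concurrent
 * writer's update the way a plain read-then-write-back can. */
static void police_memory_atomic(void *mem, size_t size)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    volatile unsigned char *p = mem;
    size_t i;

    for (i = 0; i < size; i += (size_t)pagesize)
        __atomic_fetch_add(&p[i], 0, __ATOMIC_RELAXED);
}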

test/bind_range fails when used with cgroups

The bind_range test fails whenever cgroups are used to confine processes to certain CPUs. More specifically, the test fails on the command:

numactl --physcpubind=$HIGHESTCPU ls

The variable $HIGHESTCPU is obtained from the last entry in /proc/cpuinfo. However, if the last CPU does not belong to the cgroup, then numactl fails with sched_setaffinity: Invalid argument.

A possible fix: if cgroups are used, have the test select the highest CPU number in the cgroup's cpuset (a sketch of this idea follows the examples below).

For example, for the particular system I am using, the number of CPUs is currently obtained from /proc/cpuinfo as follows:

$ cat /proc/cpuinfo | grep 'processor' | tail -n1
processor	: 79

The CPUs in the current cpuset can probably be obtained with something like this:

$ cat /sys/fs/cgroup/cpuset/$(cat /proc/$$/task/$$/cpuset)/cpuset.cpus
16-31,56-71
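
A sketch of the proposed fix, assuming the test were allowed to query its own affinity mask instead of /proc/cpuinfo (the mask already reflects cgroup/cpuset restrictions):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/* Sketch: print the highest CPU number in the calling process's affinity
 * mask, which respects cgroup/cpuset limits, unlike /proc/cpuinfo. */
int main(void)
{
    cpu_set_t set;
    int cpu, highest = -1;

    if (sched_getaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_getaffinity");
        return 1;
    }
    for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            highest = cpu;
    printf("%d\n", highest);
    return 0;
}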

memory usage information misleading

Hello,
I noticed that running numastat -cm on a system with HugePages allocated manually per NUMA node provides somewhat misleading information.

Given we have a host with two NUMA nodes:

$ numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0
node 0 size: 1007 MB
node 0 free: 953 MB
node 1 cpus: 1
node 1 size: 987 MB
node 1 free: 900 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 

Allocate 10 pages for each node:

echo 10 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
echo 10 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
$ numastat -cm
Per-node system memory usage (in MBs):
Token Node not in hash table.
Token Node not in hash table.
Token Node not in hash table.
Token Node not in hash table.
                 Node 0 Node 1 Total
                 ------ ------ -----
MemTotal           1007    988  1995
MemFree             953    900  1854
MemUsed              54     87   141
Active                1     19    20
Inactive              3     17    20
Active(anon)          0      3     3
Inactive(anon)        0      2     3
Active(file)          1     17    17
Inactive(file)        2     14    17
Unevictable           0      0     0
Mlocked               0      0     0
Dirty                 0      0     0
Writeback             0      0     0
FilePages             3     33    37
Mapped                1      8    10
AnonPages             0      4     4
Shmem                 0      0     1
KernelStack           1      1     2
PageTables            0      0     1
NFS_Unstable          0      0     0
Bounce                0      0     0
WritebackTmp          0      0     0
Slab                 18     19    38
SReclaimable          7      8    15
SUnreclaim           12     11    23
AnonHugePages         0      0     0
HugePages_Total      20     20    40
HugePages_Free       20     20    40
HugePages_Surp        0      0     0

Notice that HugePages_Total is reported as 20 for each node, which is misleading.

$ cat /proc/sys/vm/nr_hugepages
20
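
For comparison, the real per-node counts are available from the same sysfs files written above; a small sketch (assuming the two-node layout shown here) that prints them:

#include <stdio.h>

/* Sketch: read the actual per-node 2 MB hugepage totals from sysfs,
 * the same files written in the echo commands above. */
int main(void)
{
    int node;

    for (node = 0; node < 2; node++) {
        char path[128];
        unsigned long pages = 0;
        FILE *f;

        snprintf(path, sizeof(path),
                 "/sys/devices/system/node/node%d/hugepages/"
                 "hugepages-2048kB/nr_hugepages", node);
        f = fopen(path, "r");
        if (!f)
            continue;
        if (fscanf(f, "%lu", &pages) == 1)
            printf("node%d HugePages_Total: %lu\n", node, pages);
        fclose(f);
    }
    return 0;
}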

numa_set_interleave_mask(numa_all_nodes_ptr) fails with set_mempolicy: Invalid argument

This only started happening recently:

    if (numa_available() != -1)
    {
        numa_set_interleave_mask(numa_all_nodes_ptr);
    }

It fails to set the interleave mask, printing set_mempolicy: Invalid argument.

System: CentOS Linux release 7.5.1804 (Core)
Toolchain: gcc 8.2.0
packages:

numactl-devel-2.0.9-7.el7.x86_64
numactl-2.0.9-7.el7.x86_64
numactl-libs-2.0.9-7.el7.x86_64

MPOL_LOCAL "not declared" when compiling C program

Hi!

I've been trying to use MPOL_LOCAL as the mode for mbind, but the symbol is not defined.

#include <numaif.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>    

#define N 134217728

int main() {
    uint64_t *a = (uint64_t*) malloc(N*sizeof(uint64_t));
    mbind(a, N, MPOL_LOCAL, 0, 0, MPOL_MF_STRICT | MPOL_MF_MOVE);
    printf("Hello world!\n");
    return 0;
}
$ gcc-8 -lnuma example.c
example.c: In function 'main':
example.c:10:14: error: 'MPOL_LOCAL' undeclared (first use in this function); did you mean 'MPOL_MAX'?
  mbind(a, N, MPOL_LOCAL, 0, 0, MPOL_MF_STRICT | MPOL_MF_MOVE);
              ^~~~~~~~~~
              MPOL_MAX
example.c:10:14: note: each undeclared identifier is reported only once for each function it appears in

https://stackoverflow.com/q/52632754/4288486

libnuma: compiler warnings in numa.h

As a user of libnuma, I enable compiler warnings and I get many of them after including numa.h (v2.0.12). The strict-prototypes warning in particular appears many times. The compiler used is gcc 7 with -Wall, -Wextra and -Wstrict-prototypes.

Poor diagnostic message when kernel adds new fields to meminfo

When the node-specific meminfo files have fields that numastat doesn't know about, it prints the cryptic message:

"Token Node not in hash table."

It would be more helpful to print the token that is missing, rather than the first token on the line (typically the word "Node").

$ diff orig-numastat.c numastat.c
827c827
<                           printf("Token %s not in hash table.\n", tok[0]);
---
>                           printf("Token %s not in hash table.\n", tok[0 + tok_offset]);

How to show numerical differences in two numastat log files?

Hello,

I capture numastat logs before and after running a command like this:

numastat > numastat.start ; COMMAND; numastat > numastat.end

Is there a tool to show the numerical differences between the numastat.end and numastat.start files? I want to subtract the start values from the end values while keeping the file format.

I'm thinking about writing a small tool in Python for that, but I don't want to reinvent the wheel. I think I'm not the first one who is trying to solve this:-)

Any hint or advice?

Below are the example files for which I'm trying to compute the numerical differences. I'm using Google Sheets for it right now, but I'm looking for an automated way.

Thanks a lot
Jirka

$ more numastat.0.start
                           node0           node1           node2           node3           node4           node5           node6           node7
numa_hit                 2941595         1849807         6654372         4838353        12229828         2819605         2426858         3311826
numa_miss                      0               0               0               0               0               0               0               0
numa_foreign                   0               0               0               0               0               0               0               0
interleave_hit             19833           19801           19837           19809           19825           19812           19835           19805
local_node               2932819         1822473         6632791         4811102        12191273         2792358         2399298         3278575
other_node                  8776           27334           21581           27251           38555           27247           27560           33251

$ more numastat.0.end
                           node0           node1           node2           node3           node4           node5           node6           node7
numa_hit                 2954256         1849828         6665977         4838617        12264020         2819647         2426858         3311826
numa_miss                      0               0               0               0               0               0               0               0
numa_foreign                   0               0               0               0               0               0               0               0
interleave_hit             19833           19801           19837           19809           19825           19812           19835           19805
local_node               2945472         1822494         6644392         4811366        12225461         2792400         2399298         3278575
other_node                  8784           27334           21585           27251           38559           27247           27560           33251

$ more numastat.0.diff (produced with help of Google Sheets, but I'm looking for a command-line utility to produce the same result. The side effect is that formatting is broken.)
	node0	node1	node2	node3	node4	node5	node6	node7
numa_hit	12661	21	11605	264	34192	42	0	0
numa_miss	0	0	0	0	0	0	0	0
numa_foreign	0	0	0	0	0	0	0	0
interleave_hit	0	0	0	0	0	0	0	0
local_node	12653	21	11601	264	34188	42	0	0
other_node	8	0	4	0	4	0	0	0

constructors / destructors execute twice

(By reading the code, this looks like known behavior. But I figured I could re-open and ask a few questions about it.)

While helping @sanskriti-s debug some of the valgrind complaints #43, we noticed strange behavior of the libnuma constructor and destructor routines.

After adding some printf debugging:

  diff --git a/libnuma.c b/libnuma.c
  index eb995678ad54..a80f851f65c6 100644
  --- a/libnuma.c
  +++ b/libnuma.c
  @@ -92,6 +92,7 @@ numa_init(void)
   {
          int max,i;
   
  +printf("DEBUG: %s\n", __func__);
          if (sizes_set)
                  return;
   
  @@ -111,6 +112,7 @@ numa_init(void)
   void __attribute__((destructor))
   numa_fini(void)
   {
  +printf("DEBUG: %s\n", __func__);
          FREE_AND_ZERO(numa_all_cpus_ptr);
          FREE_AND_ZERO(numa_possible_cpus_ptr);
          FREE_AND_ZERO(numa_all_nodes_ptr);

Things appear as expected when building a static libnuma.a:

  % make clean 
  % ./autogen.sh
  % ./configure --enable-static --enable-shared=no
  % make
  % ./numademo | grep DEBUG
  DEBUG: numa_init
  DEBUG: numa_fini

However, when building a dynamic shared libnuma.so, both are executed twice:

  % make clean 
  % ./autogen.sh
  % ./configure --enable-static=no --enable-shared=yes
  % make
  % LD_LIBRARY_PATH=$(pwd)/.libs .libs/numademo | grep DEBUG
  DEBUG: numa_init
  DEBUG: numa_init
  DEBUG: numa_fini
  DEBUG: numa_fini

Note that this 2x execution appears to be accounted for by commit ec6e455 "Use constructors for numa_init/exit".

However, if we strip the explicit linker directive to mark the -init and -fini routines from the Makefile.am:

  diff --git a/Makefile.am b/Makefile.am
  index 1c4266d43f26..c16a50780002 100644
  --- a/Makefile.am
  +++ b/Makefile.am
  @@ -40,7 +40,7 @@ memhog_SOURCES = memhog.c util.c
   memhog_LDADD = libnuma.la
   
   libnuma_la_SOURCES = libnuma.c syscall.c distance.c affinity.c affinity.h sysfs.c sysfs.h rtnetlink.c rtnetlink.h versions.ldscript
  -libnuma_la_LDFLAGS = -version-info 1:0:0 -Wl,--version-script,$(srcdir)/versions.ldscript -Wl,-init,numa_init -Wl,-fini,numa_fini
  +libnuma_la_LDFLAGS = -version-info 1:0:0 -Wl,--version-script,$(srcdir)/versions.ldscript
   
   check_PROGRAMS = \
          test/distance 

both libnuma.a and libnuma.so test cases only execute the constructor/destructors once:

  % make clean 
  % ./autogen.sh
  % ./configure --enable-static --enable-shared=no
  % make
  % ./numademo | grep DEBUG
  DEBUG: numa_init
  DEBUG: numa_fini

  % ./autogen.sh
  % ./configure --enable-static=no --enable-shared=yes
  % make
  % LD_LIBRARY_PATH=$(pwd)/.libs .libs/numademo | grep DEBUG
  DEBUG: numa_init
  DEBUG: numa_fini

I did some testing with a minimal libmin.a and libmin.so to verify this behavior and I ended up with these results:

  • libmin.so = init/fini always executed
  • libmin.a - main program doesn't reference any libmin code
    = no init/fini
  • libmin.a - main program references libmin code
    = init/fini called

We were curious as to what the intended libnuma behavior should be. Can we safely remove the init/fini linker directives from Makefile.am? We weren't sure, so we coded any #43 libnuma fixes with this double execution in mind.

-- Joe

cpunodebind fails with "numa_sched_setaffinity_v2_int() failed: Invalid argument" error message.

I am getting the following error while trying to do cpunodebind using numactl:

localhost:~ # numactl --cpunodebind=0 --membind=1 dd if=/dev/zero of=/dev/mapper/360050768108001b3a80000000000010b bs=1M count=1024
numa_sched_setaffinity_v2_int() failed: Invalid argument
sched_setaffinity: Invalid argument
localhost:~ #

logs

localhost:~ # numactl -H
available: 4 nodes (0-3)
node 0 cpus:
node 0 size: 8664 MB
node 0 free: 2650 MB
node 1 cpus:
node 1 size: 5602 MB
node 1 free: 5587 MB
node 2 cpus: 0 1 2 3 4 5 6 7
node 2 size: 0 MB
node 2 free: 0 MB
node 3 cpus:
node 3 size: 1023 MB
node 3 free: 123 MB
node distances:
node   0   1   2   3
  0:  10  20  20  20
  1:  20  10  20  20
  2:  20  20  10  20
  3:  20  20  20  10
localhost:~ #

localhost:~ # numactl --cpunodebind=2 dd if=/dev/zero of=/dev/mapper/360050768108001b3a80000000000010b bs=1M count=1024
libnuma: Warning: node argument 2 is out of range
<2> is invalid
localhost:~ #

For the other nodes it gives the following errors:

localhost:~ # numactl --cpunodebind=0 dd if=/dev/zero of=/dev/mapper/360050768108001b3a80000000000010b bs=1M count=1024
numa_sched_setaffinity_v2_int() failed: Invalid argument
sched_setaffinity: Invalid argument
localhost:~ # numactl --cpunodebind=1 dd if=/dev/zero of=/dev/mapper/360050768108001b3a80000000000010b bs=1M count=1024
numa_sched_setaffinity_v2_int() failed: Invalid argument
sched_setaffinity: Invalid argument

numactl: failed to map segment from shared object

I have tested it on an Arm64 platform and found the error below:
/tmp # whoami
root
/ # numactl
numactl: error while loading shared libraries: libnuma.so.1: failed to map segment from shared object

Using the strace command I captured this log:
1528 execve("/bin/numactl", ["numactl"], 0x7fdff72d00 /* 15 vars */) = 0
1528 brk(NULL) = 0x55ab640000
1528 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f94393000
1528 faccessat(AT_FDCWD, "/etc/ld.so.preload", R_OK) = 0
1528 openat(AT_FDCWD, "/etc/ld.so.preload", O_RDONLY|O_CLOEXEC) = 3
1528 fstat(3, {st_mode=S_IFREG|0644, st_size=60, ...}) = 0
1528 mmap(NULL, 60, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0) = 0x7f94392000
1528 close(3) = 0
1528 openat(AT_FDCWD, "/lib/libc_stubs.so", O_RDONLY|O_CLOEXEC) = 3
1528 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 832) = 832
1528 fstat(3, {st_mode=S_IFREG|0644, st_size=67736, ...}) = 0
1528 mmap(NULL, 131096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f94349000
1528 mprotect(0x7f9434c000, 114688, PROT_NONE) = 0
1528 mmap(0x7f94368000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xf000) = 0x7f94368000
1528 close(3) = 0
1528 openat(AT_FDCWD, "/lib/libc++.so.1.0", O_RDONLY|O_CLOEXEC) = 3
1528 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 832) = 832
1528 fstat(3, {st_mode=S_IFREG|0644, st_size=723248, ...}) = 0
1528 mmap(NULL, 799080, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f94285000
1528 mprotect(0x7f9432e000, 65536, PROT_NONE) = 0
1528 mmap(0x7f9433e000, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xa9000) = 0x7f9433e000
1528 mmap(0x7f94346000, 8552, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f94346000
1528 close(3) = 0
1528 openat(AT_FDCWD, "/lib/libc++abi.so.1.0", O_RDONLY|O_CLOEXEC) = 3
1528 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 832) = 832
1528 fstat(3, {st_mode=S_IFREG|0644, st_size=198872, ...}) = 0
1528 mmap(NULL, 263464, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f94244000
1528 mprotect(0x7f94270000, 65536, PROT_NONE) = 0
1528 mmap(0x7f94280000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2c000) = 0x7f94280000
1528 close(3) = 0
1528 munmap(0x7f94392000, 60) = 0
1528 openat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/tls/aarch64/cpuid/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/tls/aarch64/cpuid", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/tls/aarch64/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/tls/aarch64", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/tls/cpuid/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/tls/cpuid", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/tls/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/tls", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/aarch64/cpuid/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/aarch64/cpuid", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/aarch64/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/aarch64", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/cpuid/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/cpuid", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/tmp/build/lib/tls/aarch64/cpuid/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/tmp/build/lib/tls/aarch64/cpuid", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/tmp/build/lib/tls/aarch64/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/tmp/build/lib/tls/aarch64", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/tmp/build/lib/tls/cpuid/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/tmp/build/lib/tls/cpuid", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/tmp/build/lib/tls/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/tmp/build/lib/tls", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/tmp/build/lib/aarch64/cpuid/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/tmp/build/lib/aarch64/cpuid", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/tmp/build/lib/aarch64/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/tmp/build/lib/aarch64", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/tmp/build/lib/cpuid/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/tmp/build/lib/cpuid", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/tmp/build/lib/libnuma.so.1", O_RDONLY|O_CLOEXEC) = 3
1528 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0P3\0\0\0\0\0\0"..., 832) = 832
1528 fstat(3, {st_mode=S_IFREG|0777, st_size=192304, ...}) = 0
1528 mmap(NULL, 123752, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = -1 EPERM (Operation not permitted)
1528 close(3) = 0
1528 writev(2, [{iov_base="numactl", iov_len=7}, {iov_base=": ", iov_len=2}, {iov_base="error while loading shared libra"..., iov_len=36}, {iov_base=": ", iov_len=2}, {iov_base="libnuma.so.1", iov_len=12}, {iov_base=": ", iov_len=2}, {iov_base="failed to map segment from share"..., iov_len=40}, {iov_base="", iov_len=0}, {iov_base="", iov_len=0}, {iov_base="\n", iov_len=1}], 10) = 102
1528 exit_group(127) = ?
1528 +++ exited with 127 +++
/tmp #

This error comes from mmap(NULL, 123752, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = -1 EPERM (Operation not permitted), even though it was executed as the root user.

numactl man page has incorrect information about --localalloc option

The numactl man page has incorrect information about the --localalloc option.

Current text:
--localalloc, -l
Always allocate on the current node.

It should be

   --localalloc, -l
          Try to allocate on the current node of the process, but if memory cannot be allocated there fall back to other nodes.

function numa_sched_setaffinity

In the file numa.h, line 319, there is:

int numa_sched_setaffinity(pid_t, struct bitmask *);

but there is no implementation:

int numa_sched_setaffinity(pid_t, struct bitmask *)
{
//how do you do ?
}

libnuma.so undefined reference to `find_first'

I compiled libnuma.so with aarch64-linux-gnu-gcc and it built fine, but when loading libnuma.so it reports:
/opt/sysroot-aarch64/usr/lib/libnuma.so: undefined reference to `find_first'
Can anyone help me figure out what I did wrong?

How to disable NUMA (have only 4 cores)

Hi!

After some package updates and a kernel upgrade, only 4 cores are available even though my machine has 8 cores.

CPU cores 1, 3, 5 and 7 are offline.

I can't bring them online.

numactl -H

Output:
available: 1 nodes (0)
node 0 cpus: 0 2 4 6
node 0 size: 32142 MB
node 0 free: 29784 MB
node distances:
node   0
  0:  10

How can I remove node 0 cpus: 0 2 4 6?

Memory leak in libnuma.c: node_cpu_mask_v2 is never freed

Hello,

$ cat numa.cc
#include <numa.h>
#include <stdio.h>

int
main(int args, char** argv)
{
    if (numa_available() == -1) {
        return(1);
    }

    printf("%d\n", numa_node_of_cpu(0));

    return(0);
}
$ g++ -g numa.cc -o numa -lnuma
$ valgrind --leak-check=full --show-reachable=yes ./numa
==4859== Memcheck, a memory error detector
==4859== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==4859== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==4859== Command: ./numa
==4859== 
0
==4859== 
==4859== HEAP SUMMARY:
==4859==     in use at exit: 8,720 bytes in 3 blocks
==4859==   total heap usage: 35 allocs, 32 frees, 81,614 bytes allocated
==4859== 
==4859== 16 bytes in 1 blocks are still reachable in loss record 1 of 3
==4859==    at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4859==    by 0x4E38A5E: numa_bitmask_alloc (libnuma.c:206)
==4859==    by 0x4E39846: numa_allocate_cpumask (libnuma.c:703)
==4859==    by 0x4E3ACA1: numa_node_to_cpus@@libnuma_1.2 (libnuma.c:1352)
==4859==    by 0x4E3AE99: numa_node_of_cpu (libnuma.c:1412)
==4859==    by 0x400786: main (numa.cc:11)
==4859== 
==4859== 512 bytes in 1 blocks are still reachable in loss record 2 of 3
==4859==    at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4859==    by 0x4E38A8D: numa_bitmask_alloc (libnuma.c:210)
==4859==    by 0x4E39846: numa_allocate_cpumask (libnuma.c:703)
==4859==    by 0x4E3ACA1: numa_node_to_cpus@@libnuma_1.2 (libnuma.c:1352)
==4859==    by 0x4E3AE99: numa_node_of_cpu (libnuma.c:1412)
==4859==    by 0x400786: main (numa.cc:11)
==4859== 
==4859== 8,192 bytes in 1 blocks are still reachable in loss record 3 of 3
==4859==    at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4859==    by 0x4E3A885: init_node_cpu_mask_v2 (libnuma.c:1240)
==4859==    by 0x4E3ABE6: numa_node_to_cpus@@libnuma_1.2 (libnuma.c:1333)
==4859==    by 0x4E3AE99: numa_node_of_cpu (libnuma.c:1412)
==4859==    by 0x400786: main (numa.cc:11)
==4859== 
==4859== LEAK SUMMARY:
==4859==    definitely lost: 0 bytes in 0 blocks
==4859==    indirectly lost: 0 bytes in 0 blocks
==4859==      possibly lost: 0 bytes in 0 blocks
==4859==    still reachable: 8,720 bytes in 3 blocks
==4859==         suppressed: 0 bytes in 0 blocks
==4859== 
==4859== For counts of detected and suppressed errors, rerun with: -v
==4859== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

The problems are in libnuma.c, numa_node_to_cpus_v2():

1322 int
1323 numa_node_to_cpus_v2(int node, struct bitmask *buffer)
1324 {
...
1352         mask = numa_allocate_cpumask();
...
1380         if (node_cpu_mask_v2[node]) {
1381                 /* how could this be? */
1382                 if (mask != buffer)
1383                         numa_bitmask_free(mask);
1384         } else {
1385                 /* we don't want to cache faulty result */
1386                 if (!err)
1387                         node_cpu_mask_v2[node] = mask;
1388                 else
1389                         numa_bitmask_free(mask);
1390         }
...

A code path via line 1387 is problematic because 'mask' is saved for
later use but the elements of node_cpu_mask_v2[] are never freed.
Also, the array node_cpu_mask_v2[] itself is never freed either; it is allocated at:

1237 static init_node_cpu_mask_v2(void)
1238 {
...
1240         node_cpu_mask_v2 = calloc (nnodes, sizeof(struct bitmask *));
...

Missing changelog

Hi,
I cannot find any changelog for libnuma... I am using an older version and I wonder what has changed and what I should change in my program if I upgrade...

Thanks

2.0.12: numactl isn't LTO ready

Without LTO optimisation everything is OK.

/bin/sh ./libtool  --tag=CC   --mode=link gcc -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -flto   -Wl,-z,relro -Wl,--as-needed  -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -flto -fuse-linker-plugin -o numactl numactl.o util.o shm.o libnuma.la 
libtool: link: gcc -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -flto -Wl,-z -Wl,relro -Wl,--as-needed -Wl,-z -Wl,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -flto -fuse-linker-plugin -o .libs/numactl numactl.o util.o shm.o  ./.libs/libnuma.so
/usr/bin/ld: /tmp/numactl.ikwiBY.ltrans0.ltrans.o: in function `show_physcpubind':
/home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:115: undefined reference to `numa_sched_getaffinity'
/usr/bin/ld: /tmp/numactl.ikwiBY.ltrans0.ltrans.o: in function `main':
/home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:140: undefined reference to `numa_get_run_node_mask'
/usr/bin/ld: /home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:143: undefined reference to `numa_get_interleave_mask'
/usr/bin/ld: /home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:144: undefined reference to `numa_get_membind'
/usr/bin/ld: /home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:509: undefined reference to `numa_tonodemask_memory'
/usr/bin/ld: /home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:449: undefined reference to `numa_interleave_memory'
/usr/bin/ld: /home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:489: undefined reference to `numa_sched_setaffinity'
/usr/bin/ld: /home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:228: undefined reference to `numa_node_to_cpus'
/usr/bin/ld: /home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:511: undefined reference to `numa_set_membind'
/usr/bin/ld: /home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:451: undefined reference to `numa_set_interleave_mask'
collect2: error: ld returned 1 exit status

Invalid writes on Power 9

Hi, I have recently noticed that on P9 running Red Hat Enterprise Linux Server 7.4 Beta (Pegas)
there are some invalid writes occurring in numa_distance().

It can be reproduced fairly easily with the distance.c testcase in the tests directory.

==54619== Invalid write of size 4
==54619== at 0x40C9210: numa_distance (in /usr/lib64/libnuma.so.1)
==54619== by 0x100009A3: main (distance.c:17)
==54619== Address 0x4372860 is 0 bytes after a block of size 262,144 alloc'd
==54619== at 0x4086640: calloc (vg_replace_malloc.c:711)
==54619== by 0x40C92EF: numa_distance (in /usr/lib64/libnuma.so.1)
==54619== by 0x100009A3: main (distance.c:17)
==54619==

I cannot reproduce on Power 8 hardware.

test/distance uses wrong node names

The test uses the wrong node names: it just iterates over the number of nodes instead of using the actual node IDs.

For example, a system with nodes

node 0 8

is treated as having node 0 and node 1, and the test fails with the following error:

cd test && ./regress2
./../test/distance
000: 010 000 
001: 1: self distance is not 10 (0)
./../test/distance FAILED!!!!
Makefile:1850: recipe for target 'regress2' failed
make: *** [regress2] Error 1

The actual node is 008, hence the test fails on the node value.

what is the return value of numa_get_run_node_mask()?

Hi,

When reading the man page, I found that it says:

numa_get_run_node_mask() returns a mask of CPUs on which the current task is allowed to run.

But after reading the header file /usr/include/numa.h it says:

/* Return current mask of nodes the task can run on */
struct bitmask * numa_get_run_node_mask(void);

I wrote a demo for checking the output of this function:

#include <stdio.h>
#include <numa.h>

int main() {
    int available;
    int max_node;
    struct bitmask *bm;
    int cpus;
    int i;

    if ((available = numa_available()) < 0) {
        printf("numa not supported.\n");
        return -1;
    }
    max_node = numa_max_node();
    printf("max_node = %d\n", max_node);

    bm = numa_get_run_node_mask();
    for (i = 0; i < bm->size; i++) {
        if (numa_bitmask_isbitset(bm, i)) {
            printf("bit %d is set.\n", i);
        }
    }
    return 0;
}

The output of this demo:

max_node = 1
bit 0 is set.
bit 1 is set.

According to the output, I believe the return value of this function is the current mask of nodes the task can run on, as described in /usr/include/numa.h.

But I'm not sure whether I've misunderstood, since I found there was a patch for the man page:

From: Cliff Wickman cpw@xxxxxxx
Correct the man page for numa_get_run_node_mask().
It returns a mask of cpus, not nodes.
Signed-off-by: Cliff Wickman cpw@xxxxxxx

So what exactly is the return value of this function?

Incorrect system call numbers on s390x

syscall.c contains this:

#elif defined(__s390x__)

#define __NR_mbind 235
#define __NR_get_mempolicy 236
#define __NR_set_mempolicy 237
#define __NR_migrate_pages 238
#define __NR_move_pages    239

Those numbers are incorrect.

sysdeps/unix/sysv/linux/s390/s390-64/arch-syscall.h in the glibc source tree contains a table of system call numbers and lists:

#define __NR_mbind 268
#define __NR_get_mempolicy 269
#define __NR_set_mempolicy 270
#define __NR_migrate_pages 287
#define __NR_move_pages 310

These numbers are regularly verified using cross compilers against the current sources and therefore known to be correct.

Adding the license file

It appears the license file (or files) is missing from this repo. Could we please include it? This is important both for people auditing their dependencies and for package repos relaying this information to their users.

Need help to bind CPU for running hybrid code

Dear friends
The problem I face is that I have not yet been able to bind CPUs to test my hybrid code.
I tried various methods that I found on the internet, but because there are different versions of Linux and MPI, there is no general solution; each solution is specific to a particular machine and operating system.
The most promising solution I found is the following:

module load gcc/5.2.1
module load openmpi-x86_64

export OMP_SCHEDULE="dynamic,200"
export OMP_NUM_THREADS=32
export OMP_PLACES=threads
export OMP_PROC_BIND=spread

numactl --all
numactl -N 0,1 > dbind.txt
numactl -C 0-15,32-47
numactl -C 1-31,48-63
numactl --show > dcpu.txt
mpirun -np 2 --map-by ppr:32ockete=2 ./pjet.gfortran > output.txt

I am using the Open MPI 1.8.1 module. Right now I do not hit any errors. I changed the NUMA settings and played with mpirun flags (as you can see above),
but it seems that OpenMP is not working in this configuration, as the computational time does not vary between cases
(it did not decrease, or increase in the virtual-thread cases, even when setting export OMP_NUM_THREADS=16 or export OMP_NUM_THREADS=1).

These are some results I got:
OpenMPI 1.8
mpirun -np 4 -x OMP_NUM_THREADS=1 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 4.475E+02
mpirun -np 4 -x OMP_NUM_THREADS=8 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 4.525E+02
mpirun -np 4 -x OMP_NUM_THREADS=16 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 4.611E+02

mvapich
mpirun -np 4 -genv OMP_NUM_THREADS 1 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 4.441E+02
mpirun -np 4 -genv OMP_NUM_THREADS 4 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 4.535E+02
mpirun -np 4 -genv OMP_NUM_THREADS 8 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 4.552E+02
mpirun -np 4 -genv OMP_NUM_THREADS 16 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 4.591E+02

mvapich2
mpirun -np 4 -genv OMP_NUM_THREADS 1 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 4.935E+02
mpirun -np 4 -genv OMP_NUM_THREADS 4 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 5.562E+02
mpirun -np 4 -genv OMP_NUM_THREADS 8 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 6.392E+02
mpirun -np 4 -genv OMP_NUM_THREADS 16 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 8.170E+02
As you can see in the "Total time" results above, I varied the number of threads across several cases while running my main code with OpenMPI, MVAPICH and MVAPICH2.
With OpenMPI and MVAPICH it seems that OpenMP did not work at all, as the total time did not change considerably. With MVAPICH2 the total computational time increased as the number of threads increased, which could be because virtual threads are used instead of physical threads.

Can you please tell me what else I can do to solve this?
Am I heading in the right direction? Do you have any recommendations?
Best regards

Compile problem; not finding .libs/libnuma.so

The installation (make) cannot find libnuma.so. Technically the file does exist, but it is a broken symlink pointing to nothing.

...
make  all-am
make[1]: Entering directory `/scratch/myoder96/Downloads/SeisSol/numactl'
  CC       numactl.o
  CC       util.o
  CC       shm.o
  CC       libnuma.lo
  CC       syscall.lo
  CC       distance.lo
  CC       affinity.lo
  CC       sysfs.lo
  CC       rtnetlink.lo
  CCLD     libnuma.la
  CCLD     numactl
icc: error #10236: File not found:  './.libs/libnuma.so'
make[1]: *** [numactl] Error 1
make[1]: Leaving directory `/scratch/

checktopology fails on sparc

cd test && ./regress2
./../test/distance
000: 010

./../test/nodemap
0: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63

./../test/checkaffinity

./../test/checktopology
numactl --hardware cpus look bogus
./../test/checktopology FAILED!!!!
make: *** [Makefile:1997: regress2] Error 1

The problem in all this is the expected output of the test case: it wants to count CPUs by calling "grep -c processor /proc/cpuinfo", which doesn't work on sparc because the layout of that file is different there:

cpu             : UltraSparc T2 (Niagara2)
fpu             : UltraSparc T2 integrated FPU
pmu             : niagara2
prom            : OBP 4.33.6 2012/03/14 08:07
type            : sun4v
ncpus probed    : 64
ncpus active    : 64
D$ parity tl1   : 0
I$ parity tl1   : 0
cpucaps         : flush,stbar,swap,muldiv,v9,blkinit,n2,mul32,div32,v8plus,popc,vis,vis2,ASIBlkInit
Cpu0ClkTck      : 000000005458c3a0
Cpu1ClkTck      : 000000005458c3a0
Cpu2ClkTck      : 000000005458c3a0
...
Cpu62ClkTck     : 000000005458c3a0
Cpu63ClkTck     : 000000005458c3a0
MMU Type        : Hypervisor (sun4v)
MMU PGSZs       : 8K,64K,4MB,256MB
State:
CPU0:           online
CPU1:           online
CPU2:           online
...

General question on `numa_run_on_node`

I was reading:

/* Run current task only on node */
int numa_run_on_node(int node);

I would like to make sure that I understand this correctly. What does "task" refer to in this case? Does it also apply to threads?

My goal is to split different threads (from the tbb arena) of the same process across different NUMA domains. Currently what I do is:

  1. Spawn one thread for each NUMA domain.
  2. For each thread, restrict it to a different NUMA domain with:

numa_run_on_node(i);
numa_run_on_node_mask(numa_all_nodes_ptr);

  3. From each thread, spawn X threads, where X is the number of cores inside that domain.

Does that make sense?
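
For what it's worth, in Linux a "task" is a single thread, and numa_run_on_node() applies sched_setaffinity() to the calling thread only, so it has to be called from inside each worker. A minimal sketch of the pattern described above, assuming plain pthreads and one worker per node (hypothetical names, error handling trimmed):

#include <numa.h>
#include <pthread.h>
#include <stdio.h>

/* Sketch: one worker thread per NUMA node, each binding itself with
 * numa_run_on_node(), which affects only the calling thread. */
static void *worker(void *arg)
{
    int node = (int)(long)arg;

    if (numa_run_on_node(node) != 0)
        perror("numa_run_on_node");
    /* ... create or run the per-node work (e.g. a tbb arena) here ... */
    return NULL;
}

int main(void)
{
    int n, nodes;

    if (numa_available() == -1)
        return 1;
    nodes = numa_max_node() + 1;

    pthread_t tid[nodes];
    for (n = 0; n < nodes; n++)
        pthread_create(&tid[n], NULL, worker, (void *)(long)n);
    for (n = 0; n < nodes; n++)
        pthread_join(tid[n], NULL);
    return 0;
}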

Why do I always get this error when I try to run numactl?

Hi, All:

I built numactl from source and everything went fine.

But after that, when I try to use the command, it always shows the following error:

numactl: /usr/lib/x86_64-linux-gnu/libnuma.so.1: version `libnuma_1.5' not found (required by numactl).

Could anyone tell me the reason?
Thanks in advance.

numactl --membind not working?

Hi,

I am new to NUMA, so I hope I'm not raising a bogus bug.

I started my script with taskset and numactl:

$ taskset -c 19 numactl --membind=1 bash spin.sh &
[1] 98621

However, numastat indicated that some of the memory wasn't allocated on node 1:

$ sudo numastat -p 98621

Per-node process memory usage (in MBs) for PID 98621 (bash)
                           Node 0          Node 1           Total
                  --------------- --------------- ---------------
Huge                         0.00            0.00            0.00
Heap                         0.00            0.03            0.03
Stack                        0.00            0.02            0.02
Private                      1.14            0.13            1.27
----------------  --------------- --------------- ---------------
Total                        1.14            0.18            1.32

Is this expected? What does "Private" mean? Private to the process, right?

spin.sh is very simple:

i=0
while true; do
        i=i+1
done

Thanks!

`numa_sched_getaffinity` always returns -1

I am trying to read, inside my code, the CPU binding that was set externally with numactl. For example:

numactl -C 0-1 -m 0 ./main

The user sets this process to run on cores 0-1.

Inside the main application, we want to get this core affinity information, so I use libnuma as follows:

struct bitmask *bm;
int ncpus = numa_num_configured_cpus();
bm = numa_bitmask_alloc(ncpus);
auto result = numa_sched_getaffinity(0, bm);
numa_bitmask_free(bm);

Here I find that numa_sched_getaffinity always returns -1. Is there anything I misunderstand about the API?
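
One thing that may help narrow this down (an assumption on my part, not a confirmed cause): print errno when the call fails and let libnuma size the CPU mask itself with numa_allocate_cpumask(). A minimal diagnostic sketch:

#include <errno.h>
#include <numa.h>
#include <stdio.h>
#include <string.h>

/* Diagnostic sketch (an assumption, not a confirmed fix): report errno
 * on failure and let libnuma size the mask via numa_allocate_cpumask(). */
int main(void)
{
    struct bitmask *bm;
    unsigned long i;

    if (numa_available() == -1) {
        fprintf(stderr, "NUMA is not available\n");
        return 1;
    }
    bm = numa_allocate_cpumask();
    errno = 0;
    if (numa_sched_getaffinity(0, bm) < 0) {
        fprintf(stderr, "numa_sched_getaffinity: %s\n", strerror(errno));
    } else {
        for (i = 0; i < bm->size; i++)
            if (numa_bitmask_isbitset(bm, i))
                printf("cpu %lu allowed\n", i);
    }
    numa_free_cpumask(bm);
    return 0;
}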
