numactl / numactl
NUMA support for Linux
License: GNU General Public License v2.0
I am trying to move pages to NUMA node 0:
if (0 != numa_move_pages(getpid(), nPages, &ptrToSHM, nodes, status, MPOL_MF_MOVE_ALL)) {
std::cout << "failed to move pages for /dev/shm/" << name << " to NUMA " << numaNodeID << " because " << strerror(errno) << std::endl;
}
It returns zero. I then check where those pages reside after the move_pages() call:
if (0 != numa_move_pages(0, nPages, &ptrToSHM, nullptr, status, 0)) {
std::cout << "failed to inquiry pages for /dev/shm/" << name << " because " << strerror(errno) << std::endl;
}
else {
for (uint32_t i = 0; i < nPages; i++) {
std::cout << "/dev/shm/" << name << "'s page # " << i << " locate at numa node " << status[i] << std::endl;
}
}
And it prints:
/dev/shm/test's page # 0 locate at numa node -2
/dev/shm/test's page # 1 locate at numa node -14
I looked at numactl/libnuma.c, numa_move_pages() just calls move_pages() directly.
According to move_pages()'s manpage, it states: nodes is an array of integers that specify the desired location for each page. Each element in the array is a node number. nodes can also be NULL, in which case move_pages() does not move any pages but instead will return the node where each page currently resides, in the status array. Obtaining the status of each page may be necessary to determine pages that need to be moved.
I wonder why it prints negative values although both moving pages and querying return success.
Thanks!
P.S. I am using libnuma 2.0.9 from CentOS 7's repos.
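A note on reading those values: per the move_pages() man page, each status entry is either a node number or a negated errno, and a return value of 0 only means the call itself was accepted. On Linux, -2 is -ENOENT (the page is not present) and -14 is -EFAULT (zero page, or the address is not mapped by the process). The pages argument also has to be an array with one address per page, so passing the address of a single pointer is only valid for nPages == 1; that may explain the -14 for page 1. A minimal sketch of both points follows. The helper names fill_page_addrs and describe_page_status are hypothetical and not part of libnuma.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Fill pages[i] with the address of the i-th page of a mapping.
 * move_pages() expects one pointer per page. */
static void fill_page_addrs(void *base, size_t page_size,
                            void **pages, size_t n_pages)
{
    assert(base != NULL && pages != NULL && page_size > 0);
    for (size_t i = 0; i < n_pages; i++)
        pages[i] = (char *)base + i * page_size;
}

/* Render one status[] entry: >= 0 is a node number, < 0 is -errno. */
static void describe_page_status(int status, char *buf, size_t len)
{
    if (status >= 0)
        snprintf(buf, len, "node %d", status);
    else
        snprintf(buf, len, "error %d (%s)", -status, strerror(-status));
}
```

With these, the pages array would be filled once from ptrToSHM and the page size, passed to numa_move_pages(), and every status[i] printed through describe_page_status() instead of being printed as a raw integer.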
The move_pages test application fails with "A minimum of 2 nodes is required for this test." on a NUMA system.
numa_max_node() returns the highest node number, not the node count, so it looks like nr_nodes should be computed as below.
--- a/test/move_pages.c
+++ b/test/move_pages.c
@@ -28,7 +28,7 @@ int main(int argc, char **argv)
 pagesize = getpagesize();
-nr_nodes = numa_max_node();
+nr_nodes = numa_max_node() + 1;
 if (nr_nodes < 2) {
 printf("A minimum of 2 nodes is required for this test.\n");
In 2.0.14, the SYMVER macro escapes quotes. For instance,
SYMVER("numa_sched_getaffinity_v2", "numa_sched_getaffinity@@libnuma_1.2")
becomes
__asm__ (".symver " "\"numa_sched_getaffinity_v2\"" "," "\"numa_sched_getaffinity@@libnuma_1.2\"");
gcc 4.8.5, the default compiler on RHEL/CentOS 7, does not handle the escaped quotes:
/tmp/cckm8DwN.s: Assembler messages:
/tmp/cckm8DwN.s:3: Error: Missing symbol name in directive
/tmp/cckm8DwN.s:3: Error: expected comma after name in .symver
More recent compilers, e.g., gcc 8.3.1, handle the escaped quotes fine.
This change works for me on both 4.8.5 and 8.3.1:
--- a/util.h 2020-10-08 10:08:40.517167202 -0700
+++ b/util.h 2020-10-08 10:08:55.523301155 -0700
@@ -22,5 +22,5 @@
#if HAVE_ATTRIBUTE_SYMVER
#define SYMVER(a,b) __attribute__ ((symver (b)))
#else
-#define SYMVER(a,b) __asm__ (".symver " #a "," #b);
+#define SYMVER(a,b) __asm__ (".symver " a "," b);
#endif
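To make the difference between the two macro bodies concrete, here is a small sketch that builds only the directive text, without the __asm__ wrapper. The macro and function names are hypothetical; the point is just that the preprocessor's # operator, applied to an argument that is already a string literal, keeps the quotes as escaped characters, while plain adjacent-literal concatenation does not.

```c
#include <assert.h>
#include <string.h>

/* Old form: '#' stringifies the argument tokens. When the caller already
 * passes a string literal, its quotes become part of the resulting string. */
#define SYMVER_TEXT_STRINGIFIED(a, b) ".symver " #a "," #b

/* Proposed form: the arguments are string literals and are concatenated. */
#define SYMVER_TEXT_PLAIN(a, b) ".symver " a "," b

static const char *symver_stringified(void)
{
    return SYMVER_TEXT_STRINGIFIED("numa_sched_getaffinity_v2",
                                   "numa_sched_getaffinity@@libnuma_1.2");
}

static const char *symver_plain(void)
{
    return SYMVER_TEXT_PLAIN("numa_sched_getaffinity_v2",
                             "numa_sched_getaffinity@@libnuma_1.2");
}

/* Self-check: only the stringified form carries quote characters. */
static void check_symver_forms(void)
{
    assert(strchr(symver_stringified(), '"') != NULL);
    assert(strchr(symver_plain(), '"') == NULL);
}
```

The stringified form is the one with quoted symbol names, which is what the older toolchain in this report rejects.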
In memhog.c, the variable loose is initialized to 0. It is then set to 1 in an else branch on line 105. However, the matching if branch ends in an exit, so whenever the program keeps running, loose is 1. The checks of loose on lines 132 and 136 are therefore redundant. Should loose be removed completely, or does it serve some purpose that is now broken?
If I understand the code correctly, this function https://github.com/numactl/numactl/blob/master/numactl.c#L228 leaks the cpus bitmask.
Would it be possible to make a new release?
The URL http://oss.sgi.com/projects/libnuma/ from the GitHub description redirects to https://www.hpe.com/us/en/solutions/hpc-high-performance-computing.html. Is that wanted?
It's currently impossible to write a malloc that uses libnuma, because libnuma itself calls malloc during initialization to allocate the CPU masks. libnuma should use a simple mmap/brk-based allocator there to break this dependency.
It looks like the project website and the ftp server are down: http://oss.sgi.com/projects/libnuma/ and ftp://oss.sgi.com/www/projects/libnuma/download/.
Trying to compile the nodemask test as a standalone test against an install of numactl gives me this error:
ld.lld: error: undefined symbol: nodemask_isset
>>> referenced by nodemask.c
Using the clang-10 compiler.
If I change:
printf("numa_get_run_node_mask nodemask_isset returns=0x%lx\n", nodemask_isset(&nodemask, i));
to
printf("numa_get_run_node_mask nodemask_isset returns=0x%lx\n", nodemask_isset_compat(&nodemask, i));
it compiles fine. Is this a problem with Clang doing something non-standard? Does the test need to be updated? Or is it just not designed to be built outside the numactl source tree?
I would like to use this tool on Red Hat Linux. Can I build and run it directly from the code on GitHub?
lld is the LLVM linker.
I attempted to compile numactl at f567a26 with lld:
$ ./autogen.sh
$ ./configure
$ make
and encountered the following errors:
ld: error: duplicate symbol 'set_mempolicy' in version script
ld: error: duplicate symbol 'get_mempolicy' in version script
ld: error: duplicate symbol 'mbind' in version script
ld: error: duplicate symbol 'numa_alloc' in version script
ld: error: duplicate symbol 'numa_alloc_interleaved' in version script
ld: error: duplicate symbol 'numa_alloc_local' in version script
ld: error: duplicate symbol 'numa_alloc_onnode' in version script
ld: error: duplicate symbol 'numa_available' in version script
ld: error: duplicate symbol 'numa_distance' in version script
ld: error: duplicate symbol 'numa_error' in version script
ld: error: duplicate symbol 'numa_exit_on_error' in version script
ld: error: duplicate symbol 'numa_free' in version script
ld: error: duplicate symbol 'numa_get_interleave_node' in version script
ld: error: duplicate symbol 'numa_max_node' in version script
ld: error: duplicate symbol 'numa_migrate_pages' in version script
ld: error: duplicate symbol 'numa_node_size64' in version script
ld: error: duplicate symbol 'numa_node_size' in version script
ld: error: duplicate symbol 'numa_pagesize' in version script
ld: error: duplicate symbol 'numa_police_memory' in version script
ld: error: duplicate symbol 'numa_preferred' in version script
ld: error: too many errors emitted, stopping now (use -error-limit=0 to see all errors)
collect2: error: ld returned 1 exit status
I am on Fedora 29 and I'm using lld at this version:
$ ld.lld --version
LLD 7.0.0 (compatible with GNU linkers)
$ numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 62 64 66 68 70
node 0 size: 128702 MB
node 0 free: 85302 MB
node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71
node 1 size: 0 MB
node 1 free: 0 MB
node distances:
node 0 1
0: 10 20
1: 20 10
$ numactl --cpunodebind=1 ./a.out
libnuma: Warning: node argument 1 is out of range
...
There are 36 CPUs on node 1, but I fail to bind to it. I believe binding to it is reasonable even though no memory is attached to it.
Also, I noticed that using --membind=0 explicitly greatly reduces minor page faults. What could cause this?
void numa_police_memory(void *mem, size_t size)
{
int pagesize = numa_pagesize_int();
unsigned long i;
for (i = 0; i < size; i += pagesize)
((volatile char*)mem)[i] = ((volatile char*)mem)[i];
}
is unsafe as it modifies the shared memory area which might interleave with other store operations, performed by other processes or threads, at the same time as numa_police_memory is executing.
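One possible direction, sketched here as an assumption rather than as the library's fix: replace the plain load followed by a plain store with a compare-and-swap that writes the byte back only if it still holds the value just read. If another thread or process stores to the byte in between, the swap fails and the concurrent store is kept. The function name touch_pages_cas is hypothetical, the page size is passed in instead of calling numa_pagesize_int(), and the __atomic builtins assume GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Touch one byte per page without risking a lost update. */
static void touch_pages_cas(void *mem, size_t size, size_t page_size)
{
    volatile unsigned char *p = mem;

    assert(page_size > 0);
    for (size_t i = 0; i < size; i += page_size) {
        unsigned char seen = __atomic_load_n(&p[i], __ATOMIC_RELAXED);
        /* Store 'seen' back only if the byte still equals 'seen'.
         * On failure nothing is written, so a concurrent store wins. */
        (void)__atomic_compare_exchange_n(&p[i], &seen, seen, 0,
                                          __ATOMIC_RELAXED,
                                          __ATOMIC_RELAXED);
    }
}
```

Whether this touches pages as effectively as the original loop on every architecture is something that would need checking; the sketch only addresses the lost-update race described above.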
It's commonly installed, so should have some documentation
The bind_range test fails whenever cgroups are used to confine processes to certain CPUs. More specifically, the test fails on the command:
numactl --physcpubind=$HIGHESTCPU ls
The variable $HIGHESTCPU is obtained from the last entry in /proc/cpuinfo. However, if the last CPU does not belong to the cgroup, then numactl fails with sched_setaffinity: Invalid argument.
A possible fix could be, if cgroups are used, then have the test select the highest CPU number in the cgroup's cpuset.
For example, for the particular system I am using, the number of CPUs is currently obtained from /proc/cpuinfo as follows:
$ cat /proc/cpuinfo | grep 'processor' | tail -n1
processor : 79
The CPUs in the current cpuset can probably be obtained with something like this:
$ cat /sys/fs/cgroup/cpuset/$(cat /proc/$$/task/$$/cpuset)/cpuset.cpus
16-31,56-71
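Picking the highest CPU out of a cpuset list like the one above is a small parsing job. The following is a sketch in C, assuming the list uses only decimal numbers separated by '-' and ','; the function name highest_cpu_in_list is hypothetical, and reading the cpuset.cpus file itself is left to the caller.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Return the highest CPU number in a cpuset list such as "16-31,56-71",
 * or -1 if the string contains no number. */
static long highest_cpu_in_list(const char *list)
{
    long highest = -1;
    const char *p = list;

    assert(list != NULL);
    while (*p != '\0') {
        /* Skip separators by hand so that the '-' of a range is never
         * parsed by strtol() as a minus sign. */
        if (!isdigit((unsigned char)*p)) {
            p++;
            continue;
        }
        char *end;
        long value = strtol(p, &end, 10);
        if (value > highest)
            highest = value;
        p = end;
    }
    return highest;
}
```

For the list shown above this yields 71, which the test could then use in place of the last processor entry from /proc/cpuinfo.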
Hello,
I noticed that running numastat -cm on a system with HugePages allocated manually per NUMA node provides somewhat misleading information.
Given a host with two NUMA nodes:
$ numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0
node 0 size: 1007 MB
node 0 free: 953 MB
node 1 cpus: 1
node 1 size: 987 MB
node 1 free: 900 MB
node distances:
node 0 1
0: 10 20
1: 20 10
Allocate 10 pages for each node:
echo 10 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
echo 10 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
$ numastat -cm
Per-node system memory usage (in MBs):
Token Node not in hash table.
Token Node not in hash table.
Token Node not in hash table.
Token Node not in hash table.
                 Node 0 Node 1 Total
                 ------ ------ -----
MemTotal           1007    988  1995
MemFree             953    900  1854
MemUsed              54     87   141
Active                1     19    20
Inactive              3     17    20
Active(anon)          0      3     3
Inactive(anon)        0      2     3
Active(file)          1     17    17
Inactive(file)        2     14    17
Unevictable           0      0     0
Mlocked               0      0     0
Dirty                 0      0     0
Writeback             0      0     0
FilePages             3     33    37
Mapped                1      8    10
AnonPages             0      4     4
Shmem                 0      0     1
KernelStack           1      1     2
PageTables            0      0     1
NFS_Unstable          0      0     0
Bounce                0      0     0
WritebackTmp          0      0     0
Slab                 18     19    38
SReclaimable          7      8    15
SUnreclaim           12     11    23
AnonHugePages         0      0     0
HugePages_Total      20     20    40
HugePages_Free       20     20    40
HugePages_Surp        0      0     0
Notice that HugePages_Total is reported as 20 for each node (40 in total), which is misleading, since the system-wide count is 20:
$ cat /proc/sys/vm/nr_hugepages
20
This only started to happen recently:
if (numa_available() != -1)
{
numa_set_interleave_mask(numa_all_nodes_ptr);
}
It fails to set the interleave mask and reports set_mempolicy: Invalid argument
System: CentOS Linux release 7.5.1804 (Core)
Toolchain: gcc 8.2.0
packages:
numactl-devel-2.0.9-7.el7.x86_64
numactl-2.0.9-7.el7.x86_64
numactl-libs-2.0.9-7.el7.x86_64
Is there a mailing list?
On GitHub, I see 0 pull requests, but https://travis-ci.org/numactl/numactl/pull_requests is active.
I don't know how to send a patch so that it shows up at https://travis-ci.org/numactl/numactl/pull_requests.
Thanks,
--Hongzhi
Hi!
I've been trying to use MPOL_LOCAL as the mode to mbind but the symbol is not defined.
#include <numaif.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#define N 134217728
int main() {
uint64_t *a = (uint64_t*) malloc(N*sizeof(uint64_t));
mbind(a, N, MPOL_LOCAL, 0, 0, MPOL_MF_STRICT | MPOL_MF_MOVE);
printf("Hello world!\n");
return 0;
}
$ gcc-8 -lnuma example.c
example.c: In function ‘main’:
example.c:10:14: error: ‘MPOL_LOCAL’ undeclared (first use in this function); did you mean ‘MPOL_MAX’?
mbind(a, N, MPOL_LOCAL, 0, 0, MPOL_MF_STRICT | MPOL_MF_MOVE);
^~~~~~~~~~
MPOL_MAX
example.c:10:14: note: each undeclared identifier is reported only once for each function it appears in
As a user of libnuma, I enable compiler warnings and get many of them after including numa.h (v2.0.12). The strict-prototypes warning in particular appears many times. The compiler used is gcc 7 with -Wall -Wextra and -Wstrict-prototypes.
When the node-specific meminfo files have fields that numastat doesn't know about, it prints the cryptic message:
"Token Node not in hash table."
It would be more helpful to print the token that is missing, rather than the first token on the line (typically the word "Node").
printf("Token %s not in hash table.\n", tok[0 + tok_offset]);
Hello,
I capture numastat logs before and after running a command like this:
numastat > numastat.start ; COMMAND; numastat > numastat.end
Is there a tool to show numerical differences between the numastat.end and numastat.start files? I want to subtract the start values from the end values while keeping the file format.
I'm thinking about writing a small tool in Python for that, but I don't want to reinvent the wheel. I think I'm not the first one who is trying to solve this:-)
Any hint or advice?
Below are the example files, for which I'm trying to compute the numerical differences. I'm using Google Sheet for it right now, but I'm looking for an automated way.
Thanks a lot
Jirka
$ more numastat.0.start
                   node0     node1     node2     node3     node4     node5     node6     node7
numa_hit         2941595   1849807   6654372   4838353  12229828   2819605   2426858   3311826
numa_miss              0         0         0         0         0         0         0         0
numa_foreign           0         0         0         0         0         0         0         0
interleave_hit     19833     19801     19837     19809     19825     19812     19835     19805
local_node       2932819   1822473   6632791   4811102  12191273   2792358   2399298   3278575
other_node          8776     27334     21581     27251     38555     27247     27560     33251
$ more numastat.0.end
                   node0     node1     node2     node3     node4     node5     node6     node7
numa_hit         2954256   1849828   6665977   4838617  12264020   2819647   2426858   3311826
numa_miss              0         0         0         0         0         0         0         0
numa_foreign           0         0         0         0         0         0         0         0
interleave_hit     19833     19801     19837     19809     19825     19812     19835     19805
local_node       2945472   1822494   6644392   4811366  12225461   2792400   2399298   3278575
other_node          8784     27334     21585     27251     38559     27247     27560     33251
$ more numastat.0.diff (produced with help of Google Sheets, but I'm looking for a command-line utility to produce the same result. The side effect is that formatting is broken.)
node0 node1 node2 node3 node4 node5 node6 node7
numa_hit 12661 21 11605 264 34192 42 0 0
numa_miss 0 0 0 0 0 0 0 0
numa_foreign 0 0 0 0 0 0 0 0
interleave_hit 0 0 0 0 0 0 0 0
local_node 12653 21 11601 264 34188 42 0 0
other_node 8 0 4 0 4 0 0 0
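I am not aware of an existing tool for this, so here is a sketch of the per-line arithmetic, written in C rather than Python to match the rest of the project. It assumes both files have the same rows in the same order, each row being a label followed by whitespace-separated integer counters. The function name diff_numastat_line is hypothetical. The header row (node0 node1 ...) contains no counters and would be copied through unchanged by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write "label d0 d1 ..." to out, where d_i = end_i - start_i.
 * Returns 0 on success, -1 if the labels or column counts differ
 * or if out is too small. */
static int diff_numastat_line(const char *start, const char *end,
                              char *out, size_t out_len)
{
    char label_s[64], label_e[64];
    int off_s = 0, off_e = 0;

    assert(start != NULL && end != NULL && out != NULL && out_len > 0);
    if (sscanf(start, "%63s%n", label_s, &off_s) != 1 ||
        sscanf(end, "%63s%n", label_e, &off_e) != 1 ||
        strcmp(label_s, label_e) != 0)
        return -1;

    int used = snprintf(out, out_len, "%s", label_s);
    if (used < 0 || (size_t)used >= out_len)
        return -1;

    const char *ps = start + off_s, *pe = end + off_e;
    for (;;) {
        char *ns, *ne;
        long long vs = strtoll(ps, &ns, 10);
        long long ve = strtoll(pe, &ne, 10);
        if (ns == ps && ne == pe)
            break;              /* both lines exhausted */
        if (ns == ps || ne == pe)
            return -1;          /* column count differs */
        int n = snprintf(out + used, out_len - (size_t)used,
                         " %lld", ve - vs);
        if (n < 0 || (size_t)n >= out_len - (size_t)used)
            return -1;
        used += n;
        ps = ns;
        pe = ne;
    }
    return 0;
}
```

A driver would read numastat.start and numastat.end line by line with fgets(), call this function on each pair, and print the result. Column alignment is not preserved in this sketch; each value is separated by a single space.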
(By reading the code, this looks like known behavior. But I figured I could re-open and ask a few questions about it.)
While helping @sanskriti-s debug some of the valgrind complaints #43, we noticed strange behavior of the libnuma constructor and destructor routines.
After adding some printf debugging:
diff --git a/libnuma.c b/libnuma.c
index eb995678ad54..a80f851f65c6 100644
--- a/libnuma.c
+++ b/libnuma.c
@@ -92,6 +92,7 @@ numa_init(void)
{
int max,i;
+printf("DEBUG: %s\n", __func__);
if (sizes_set)
return;
@@ -111,6 +112,7 @@ numa_init(void)
void __attribute__((destructor))
numa_fini(void)
{
+printf("DEBUG: %s\n", __func__);
FREE_AND_ZERO(numa_all_cpus_ptr);
FREE_AND_ZERO(numa_possible_cpus_ptr);
FREE_AND_ZERO(numa_all_nodes_ptr);
Things appear as expected when building a static libnuma.a:
% make clean
% ./autogen.sh
% ./configure --enable-static --enable-shared=no
% make
% ./numademo | grep DEBUG
DEBUG: numa_init
DEBUG: numa_fini
However, when building a dynamic shared libnuma.so, both are executed twice:
% make clean
% ./autogen.sh
% ./configure --enable-static=no --enable-shared=yes
% make
% LD_LIBRARY_PATH=$(pwd)/.libs .libs/numademo | grep DEBUG
DEBUG: numa_init
DEBUG: numa_init
DEBUG: numa_fini
DEBUG: numa_fini
Note that this 2x execution appears to be accounted for by commit ec6e455 "Use constructors for numa_init/exit".
However, if we strip the explicit linker directive to mark the -init and -fini routines from the Makefile.am:
diff --git a/Makefile.am b/Makefile.am
index 1c4266d43f26..c16a50780002 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -40,7 +40,7 @@ memhog_SOURCES = memhog.c util.c
memhog_LDADD = libnuma.la
libnuma_la_SOURCES = libnuma.c syscall.c distance.c affinity.c affinity.h sysfs.c sysfs.h rtnetlink.c rtnetlink.h versions.ldscript
-libnuma_la_LDFLAGS = -version-info 1:0:0 -Wl,--version-script,$(srcdir)/versions.ldscript -Wl,-init,numa_init -Wl,-fini,numa_fini
+libnuma_la_LDFLAGS = -version-info 1:0:0 -Wl,--version-script,$(srcdir)/versions.ldscript
check_PROGRAMS = \
test/distance
both libnuma.a and libnuma.so test cases only execute the constructor/destructors once:
% make clean
% ./autogen.sh
% ./configure --enable-static --enable-shared=no
% make
% ./numademo | grep DEBUG
DEBUG: numa_init
DEBUG: numa_fini
% ./autogen.sh
% ./configure --enable-static=no --enable-shared=yes
% make
% LD_LIBRARY_PATH=$(pwd)/.libs .libs/numademo | grep DEBUG
DEBUG: numa_init
DEBUG: numa_fini
I did some testing with a minimal libmin.a and libmin.so to verify this behavior and I ended up with these results:
We were curious as to what the intended libnuma behavior should be. Can we safely remove the init/fini linker directives from Makefile.am? We weren't sure, so we coded any #43 libnuma fixes with this double execution in mind.
-- Joe
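For reference, the usual way to make a constructor/destructor pair tolerate being run twice is a guard flag, similar in spirit to the sizes_set check visible in numa_init() above. The sketch below is generic and does not use libnuma's actual state; all names in it (lib_init, lib_fini, lib_table and the counters) are hypothetical and exist only for the demonstration.

```c
#include <assert.h>
#include <stdlib.h>

static int lib_initialized;      /* guard flag */
static int *lib_table;           /* some resource owned by the library */
static int init_runs, fini_runs; /* counters, for the demonstration only */

/* Does real work only on the first call. */
static void lib_init(void)
{
    if (lib_initialized)
        return;
    lib_initialized = 1;
    lib_table = calloc(16, sizeof *lib_table);
    init_runs++;
}

/* Releases resources once; a second call is a no-op. */
static void lib_fini(void)
{
    if (!lib_initialized)
        return;
    free(lib_table);
    lib_table = NULL;
    lib_initialized = 0;
    fini_runs++;
}
```

With such guards in place the double execution is harmless, though it does not answer whether the -init/-fini linker directives are still needed.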
I get the following error while trying to use --cpunodebind with numactl:
localhost:~ # numactl --cpunodebind=0 --membind=1 dd if=/dev/zero of=/dev/mapper/360050768108001b3a80000000000010b bs=1M count=1024
numa_sched_setaffinity_v2_int() failed: Invalid argument
sched_setaffinity: Invalid argument
localhost:~ #
localhost:~ # numactl -H
available: 4 nodes (0-3)
node 0 cpus:
node 0 size: 8664 MB
node 0 free: 2650 MB
node 1 cpus:
node 1 size: 5602 MB
node 1 free: 5587 MB
node 2 cpus: 0 1 2 3 4 5 6 7
node 2 size: 0 MB
node 2 free: 0 MB
node 3 cpus:
node 3 size: 1023 MB
node 3 free: 123 MB
node distances:
node 0 1 2 3
0: 10 20 20 20
1: 20 10 20 20
2: 20 20 10 20
3: 20 20 20 10
localhost:~ #
localhost:~ # numactl --cpunodebind=2 dd if=/dev/zero of=/dev/mapper/360050768108001b3a80000000000010b bs=1M count=1024
libnuma: Warning: node argument 2 is out of range
<2> is invalid
localhost:~ #
For the other nodes it gives the following errors:
localhost:~ # numactl --cpunodebind=0 dd if=/dev/zero of=/dev/mapper/360050768108001b3a80000000000010b bs=1M count=1024
numa_sched_setaffinity_v2_int() failed: Invalid argument
sched_setaffinity: Invalid argument
localhost:~ # numactl --cpunodebind=1 dd if=/dev/zero of=/dev/mapper/360050768108001b3a80000000000010b bs=1M count=1024
numa_sched_setaffinity_v2_int() failed: Invalid argument
sched_setaffinity: Invalid argument
I tested it on an Arm64 platform and got the error below:
/tmp # whoami
root
/ # numactl
numactl: error while loading shared libraries: libnuma.so.1: failed to map segment from shared object
I captured this log with strace:
1528 execve("/bin/numactl", ["numactl"], 0x7fdff72d00 /* 15 vars */) = 0
1528 brk(NULL) = 0x55ab640000
1528 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f94393000
1528 faccessat(AT_FDCWD, "/etc/ld.so.preload", R_OK) = 0
1528 openat(AT_FDCWD, "/etc/ld.so.preload", O_RDONLY|O_CLOEXEC) = 3
1528 fstat(3, {st_mode=S_IFREG|0644, st_size=60, ...}) = 0
1528 mmap(NULL, 60, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0) = 0x7f94392000
1528 close(3) = 0
1528 openat(AT_FDCWD, "/lib/libc_stubs.so", O_RDONLY|O_CLOEXEC) = 3
1528 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 832) = 832
1528 fstat(3, {st_mode=S_IFREG|0644, st_size=67736, ...}) = 0
1528 mmap(NULL, 131096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f94349000
1528 mprotect(0x7f9434c000, 114688, PROT_NONE) = 0
1528 mmap(0x7f94368000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xf000) = 0x7f94368000
1528 close(3) = 0
1528 openat(AT_FDCWD, "/lib/libc++.so.1.0", O_RDONLY|O_CLOEXEC) = 3
1528 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 832) = 832
1528 fstat(3, {st_mode=S_IFREG|0644, st_size=723248, ...}) = 0
1528 mmap(NULL, 799080, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f94285000
1528 mprotect(0x7f9432e000, 65536, PROT_NONE) = 0
1528 mmap(0x7f9433e000, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xa9000) = 0x7f9433e000
1528 mmap(0x7f94346000, 8552, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f94346000
1528 close(3) = 0
1528 openat(AT_FDCWD, "/lib/libc++abi.so.1.0", O_RDONLY|O_CLOEXEC) = 3
1528 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 832) = 832
1528 fstat(3, {st_mode=S_IFREG|0644, st_size=198872, ...}) = 0
1528 mmap(NULL, 263464, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f94244000
1528 mprotect(0x7f94270000, 65536, PROT_NONE) = 0
1528 mmap(0x7f94280000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2c000) = 0x7f94280000
1528 close(3) = 0
1528 munmap(0x7f94392000, 60) = 0
1528 openat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/tls/aarch64/cpuid/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/tls/aarch64/cpuid", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/tls/aarch64/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/tls/aarch64", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/tls/cpuid/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/tls/cpuid", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/tls/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/tls", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/aarch64/cpuid/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/aarch64/cpuid", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/aarch64/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/aarch64", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/cpuid/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/cpuid", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/mnt/fileroot/chunguo.feng/opensource/numactl/build/lib", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/tmp/build/lib/tls/aarch64/cpuid/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/tmp/build/lib/tls/aarch64/cpuid", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/tmp/build/lib/tls/aarch64/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/tmp/build/lib/tls/aarch64", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/tmp/build/lib/tls/cpuid/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/tmp/build/lib/tls/cpuid", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/tmp/build/lib/tls/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/tmp/build/lib/tls", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/tmp/build/lib/aarch64/cpuid/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/tmp/build/lib/aarch64/cpuid", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/tmp/build/lib/aarch64/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/tmp/build/lib/aarch64", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/tmp/build/lib/cpuid/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
1528 newfstatat(AT_FDCWD, "/tmp/build/lib/cpuid", 0x7fe0c59620, 0) = -1 ENOENT (No such file or directory)
1528 openat(AT_FDCWD, "/tmp/build/lib/libnuma.so.1", O_RDONLY|O_CLOEXEC) = 3
1528 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0P3\0\0\0\0\0\0"..., 832) = 832
1528 fstat(3, {st_mode=S_IFREG|0777, st_size=192304, ...}) = 0
1528 mmap(NULL, 123752, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = -1 EPERM (Operation not permitted)
1528 close(3) = 0
1528 writev(2, [{iov_base="numactl", iov_len=7}, {iov_base=": ", iov_len=2}, {iov_base="error while loading shared libra"..., iov_len=36}, {iov_base=": ", iov_len=2}, {iov_base="libnuma.so.1", iov_len=12}, {iov_base=": ", iov_len=2}, {iov_base="failed to map segment from share"..., iov_len=40}, {iov_base="", iov_len=0}, {iov_base="", iov_len=0}, {iov_base="\n", iov_len=1}], 10) = 102
1528 exit_group(127) = ?
1528 +++ exited with 127 +++
/tmp #
The error comes from mmap(NULL, 123752, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = -1 EPERM (Operation not permitted), even though the command was run as root.
The numactl man page has incorrect information about the --localalloc option.
Current text:
--localalloc, -l
Always allocate on the current node.
It should be:
--localalloc, -l
Try to allocate on the current node of the process, but if memory cannot be allocated there fall back to other nodes.
Has anyone added a package for OpenWrt?
In the file numa.h, line 319 declares:
int numa_sched_setaffinity(pid_t, struct bitmask *);
but I cannot find an implementation:
int numa_sched_setaffinity(pid_t, struct bitmask *)
{
//how do you do ?
}
I compiled libnuma.so with aarch64-linux-gnu-gcc and it built fine, but when linking against libnuma.so I get:
/opt/sysroot-aarch64/usr/lib/libnuma.so: undefined reference to `find_first'
Can anyone help me figure out what I did wrong?
Hi!
After some package updates and a kernel upgrade, only 4 of the 8 cores on my machine are available.
CPU cores 1, 3, 5 and 7 are offline.
I can't bring them online.
Output:
available: 1 nodes (0)
node 0 cpus: 0 2 4 6
node 0 size: 32142 MB
node 0 free: 29784 MB
node distances:
node 0
0: 10
How can I remove node 0 cpus: 0 2 4 6?
Hello,
$ cat numa.cc
#include <numa.h>
#include <stdio.h>
int
main(int args, char** argv)
{
if (numa_available() == -1) {
return(1);
}
printf("%d\n", numa_node_of_cpu(0));
return(0);
}
$ g++ -g numa.cc -o numa -lnuma
$ valgrind --leak-check=full --show-reachable=yes ./numa
==4859== Memcheck, a memory error detector
==4859== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==4859== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==4859== Command: ./numa
==4859==
0
==4859==
==4859== HEAP SUMMARY:
==4859== in use at exit: 8,720 bytes in 3 blocks
==4859== total heap usage: 35 allocs, 32 frees, 81,614 bytes allocated
==4859==
==4859== 16 bytes in 1 blocks are still reachable in loss record 1 of 3
==4859== at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4859== by 0x4E38A5E: numa_bitmask_alloc (libnuma.c:206)
==4859== by 0x4E39846: numa_allocate_cpumask (libnuma.c:703)
==4859== by 0x4E3ACA1: numa_node_to_cpus@@libnuma_1.2 (libnuma.c:1352)
==4859== by 0x4E3AE99: numa_node_of_cpu (libnuma.c:1412)
==4859== by 0x400786: main (numa.cc:11)
==4859==
==4859== 512 bytes in 1 blocks are still reachable in loss record 2 of 3
==4859== at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4859== by 0x4E38A8D: numa_bitmask_alloc (libnuma.c:210)
==4859== by 0x4E39846: numa_allocate_cpumask (libnuma.c:703)
==4859== by 0x4E3ACA1: numa_node_to_cpus@@libnuma_1.2 (libnuma.c:1352)
==4859== by 0x4E3AE99: numa_node_of_cpu (libnuma.c:1412)
==4859== by 0x400786: main (numa.cc:11)
==4859==
==4859== 8,192 bytes in 1 blocks are still reachable in loss record 3 of 3
==4859== at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4859== by 0x4E3A885: init_node_cpu_mask_v2 (libnuma.c:1240)
==4859== by 0x4E3ABE6: numa_node_to_cpus@@libnuma_1.2 (libnuma.c:1333)
==4859== by 0x4E3AE99: numa_node_of_cpu (libnuma.c:1412)
==4859== by 0x400786: main (numa.cc:11)
==4859==
==4859== LEAK SUMMARY:
==4859== definitely lost: 0 bytes in 0 blocks
==4859== indirectly lost: 0 bytes in 0 blocks
==4859== possibly lost: 0 bytes in 0 blocks
==4859== still reachable: 8,720 bytes in 3 blocks
==4859== suppressed: 0 bytes in 0 blocks
==4859==
==4859== For counts of detected and suppressed errors, rerun with: -v
==4859== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
The problems are in libnuma.c, numa_node_to_cpus_v2():
1322 int
1323 numa_node_to_cpus_v2(int node, struct bitmask *buffer)
1324 {
...
1352 mask = numa_allocate_cpumask();
...
1380 if (node_cpu_mask_v2[node]) {
1381 /* how could this be? */
1382 if (mask != buffer)
1383 numa_bitmask_free(mask);
1384 } else {
1385 /* we don't want to cache faulty result */
1386 if (!err)
1387 node_cpu_mask_v2[node] = mask;
1388 else
1389 numa_bitmask_free(mask);
1390 }
...
A code path via line 1387 is problematic because 'mask' is saved for
later use but the elements of node_cpu_mask_v2[] are never freed.
Also the array node_cpu_mask_v2[] itself is never freed either,
allocated at:
1237 static init_node_cpu_mask_v2(void)
1238 {
...
1240 node_cpu_mask_v2 = calloc (nnodes, sizeof(struct bitmask *));
...
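A cleanup routine that releases both the cached masks and the array holding them could look roughly like the sketch below. It is not libnuma code: struct mask is a hypothetical stand-in for struct bitmask, mask_cache plays the role of node_cpu_mask_v2, and where such a routine would be called from (for example the library destructor) is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for libnuma's struct bitmask. */
struct mask {
    unsigned long size;
    unsigned long *bits;
};

static struct mask **mask_cache;  /* plays the role of node_cpu_mask_v2 */
static int mask_cache_len;

static struct mask *mask_new(unsigned long nbits)
{
    const unsigned long word_bits = 8 * sizeof(unsigned long);
    struct mask *m = malloc(sizeof *m);

    if (m == NULL)
        return NULL;
    m->size = nbits;
    m->bits = calloc((nbits + word_bits - 1) / word_bits,
                     sizeof(unsigned long));
    if (m->bits == NULL) {
        free(m);
        return NULL;
    }
    return m;
}

static int mask_cache_init(int nnodes)
{
    assert(nnodes > 0);
    mask_cache = calloc((size_t)nnodes, sizeof *mask_cache);
    if (mask_cache == NULL)
        return -1;
    mask_cache_len = nnodes;
    return 0;
}

/* Free every cached mask, then the array itself; safe to call twice. */
static void mask_cache_cleanup(void)
{
    for (int i = 0; i < mask_cache_len; i++) {
        if (mask_cache[i] != NULL) {
            free(mask_cache[i]->bits);
            free(mask_cache[i]);
        }
    }
    free(mask_cache);
    mask_cache = NULL;
    mask_cache_len = 0;
}
```

Resetting the pointer and length at the end keeps a second call harmless, which matters if the destructor can run more than once.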
Hi,
I cannot find any changelog for libnuma. I am using an older version, and I wonder what has changed and what I would need to change in my program if I upgrade.
Thanks
set_mempolicy will return -1 on error, but functions like numa_set_membind return void. It looks like there is no way for the caller to know whether numa_set_membind succeeded.
Without LTO everything is OK. With -flto, linking numactl fails with undefined references to libnuma symbols:
/bin/sh ./libtool --tag=CC --mode=link gcc -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -flto -Wl,-z,relro -Wl,--as-needed -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -flto -fuse-linker-plugin -o numactl numactl.o util.o shm.o libnuma.la
libtool: link: gcc -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -flto -Wl,-z -Wl,relro -Wl,--as-needed -Wl,-z -Wl,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -flto -fuse-linker-plugin -o .libs/numactl numactl.o util.o shm.o ./.libs/libnuma.so
/usr/bin/ld: /tmp/numactl.ikwiBY.ltrans0.ltrans.o: in function `show_physcpubind':
/home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:115: undefined reference to `numa_sched_getaffinity'
/usr/bin/ld: /tmp/numactl.ikwiBY.ltrans0.ltrans.o: in function `main':
/home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:140: undefined reference to `numa_get_run_node_mask'
/usr/bin/ld: /home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:143: undefined reference to `numa_get_interleave_mask'
/usr/bin/ld: /home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:144: undefined reference to `numa_get_membind'
/usr/bin/ld: /home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:509: undefined reference to `numa_tonodemask_memory'
/usr/bin/ld: /home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:449: undefined reference to `numa_interleave_memory'
/usr/bin/ld: /home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:489: undefined reference to `numa_sched_setaffinity'
/usr/bin/ld: /home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:228: undefined reference to `numa_node_to_cpus'
/usr/bin/ld: /home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:511: undefined reference to `numa_set_membind'
/usr/bin/ld: /home/tkloczko/rpmbuild/BUILD/numactl-2.0.12/numactl.c:451: undefined reference to `numa_set_interleave_mask'
collect2: error: ld returned 1 exit status
Hi, I have recently noticed that on P9 running Red Hat Enterprise Linux Server 7.4 Beta (Pegas)
there are some invalid writes occurring in numa_distance().
It can be reproduced fairly easily with the distance.c testcase in the tests directory.
==54619== Invalid write of size 4
==54619== at 0x40C9210: numa_distance (in /usr/lib64/libnuma.so.1)
==54619== by 0x100009A3: main (distance.c:17)
==54619== Address 0x4372860 is 0 bytes after a block of size 262,144 alloc'd
==54619== at 0x4086640: calloc (vg_replace_malloc.c:711)
==54619== by 0x40C92EF: numa_distance (in /usr/lib64/libnuma.so.1)
==54619== by 0x100009A3: main (distance.c:17)
==54619==
I cannot reproduce on Power 8 hardware.
The test uses the wrong node IDs: it iterates from 0 up to the number of nodes instead of using the actual node IDs.
For example, on a system whose nodes are
node 0 8
the test treats them as node 0 and node 1 and fails with the following error:
cd test && ./regress2
./../test/distance
000: 010 000
001: 1: self distance is not 10 (0)
./../test/distance FAILED!!!!
Makefile:1850: recipe for target 'regress2' failed
make: *** [regress2] Error 1
The actual second node is 008, not 001, so the test fails on the node value.
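A minimal sketch of the difference, assuming the node IDs are available as a sysfs-style list string such as "0,8" (the format used by /sys/devices/system/node/online). The helper name `parse_node_list` is hypothetical and not part of libnuma or the test suite; it only illustrates iterating over real node IDs rather than over 0..count-1.

```c
#include <stdlib.h>

/*
 * Parse a node list such as "0,8" or "0-1,8" into an array of node IDs.
 * Returns the number of IDs stored, or -1 on malformed input or when
 * more than max_ids entries would be needed.
 * Hypothetical helper for illustration only.
 */
int parse_node_list(const char *list, int *ids, int max_ids)
{
    int count = 0;
    const char *p = list;

    while (*p != '\0' && *p != '\n') {
        char *end;
        long first = strtol(p, &end, 10);
        long last;

        if (end == p || first < 0)
            return -1;
        last = first;
        p = end;
        if (*p == '-') {                 /* a range such as "0-3" */
            p++;
            last = strtol(p, &end, 10);
            if (end == p || last < first)
                return -1;
            p = end;
        }
        for (long id = first; id <= last; id++) {
            if (count >= max_ids)
                return -1;
            ids[count++] = (int)id;
        }
        if (*p == ',')
            p++;
        else if (*p != '\0' && *p != '\n')
            return -1;
    }
    return count;
}
```

With the list "0,8" this yields two entries, 0 and 8, so a loop over `ids[0..count-1]` would query node 8 rather than the nonexistent node 1.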
Hi,
While reading the man page, I found that it says:
numa_get_run_node_mask() returns a mask of CPUs on which the current task is allowed to run.
But the header file /usr/include/numa.h says:
/* Return current mask of nodes the task can run on */
struct bitmask * numa_get_run_node_mask(void);
I wrote a demo for checking the output of this function:
#include <stdio.h>
#include <numa.h>

int main() {
    int max_node;
    struct bitmask *bm;
    unsigned long i;

    if (numa_available() < 0) {
        printf("numa not supported.\n");
        return -1;
    }
    max_node = numa_max_node();
    printf("max_node = %d\n", max_node);
    /* Print every bit that is set in the mask returned by the function. */
    bm = numa_get_run_node_mask();
    for (i = 0; i < bm->size; i++) {
        if (numa_bitmask_isbitset(bm, i)) {
            printf("bit %lu is set.\n", i);
        }
    }
    return 0;
}
The output of this demo:
max_node = 1
bit 0 is set.
bit 1 is set.
According to the output, I believe this function returns the current mask of nodes the task can run on, as stated in /usr/include/numa.h.
But I'm not sure whether I misunderstand, since I found there was a patch for man page:
From: Cliff Wickman cpw@xxxxxxx
Correct the man page for numa_get_run_node_mask().
It returns a mask of cpus, not nodes.
Signed-off-by: Cliff Wickman cpw@xxxxxxx
So what is exactly the return value of this function?
syscall.c contains this:
#elif defined(__s390x__)
#define __NR_mbind 235
#define __NR_get_mempolicy 236
#define __NR_set_mempolicy 237
#define __NR_migrate_pages 238
#define __NR_move_pages 239
Those numbers are incorrect.
sysdeps/unix/sysv/linux/s390/s390-64/arch-syscall.h in the glibc source tree contains a table of system call numbers and lists:
#define __NR_mbind 268
#define __NR_get_mempolicy 269
#define __NR_set_mempolicy 270
#define __NR_migrate_pages 287
#define __NR_move_pages 310
These numbers are regularly verified using cross compilers against the current sources and therefore known to be correct.
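One way to avoid hardcoded numbers going stale is to take them from the system headers and keep a per-architecture table only as a fallback. The sketch below assumes that `<sys/syscall.h>` provides the `__NR_*` macros on the build host; the fallback values are the ones quoted above from glibc, and the helper `numa_syscall_numbers` is a hypothetical name used only for illustration, not numactl's actual code.

```c
#include <sys/syscall.h>   /* normally defines the __NR_* macros */

/* Fallback for s390x, used only if the headers do not define the numbers.
 * Values as quoted from glibc's arch-syscall.h in the report above. */
#if defined(__s390x__) && !defined(__NR_mbind)
#define __NR_mbind          268
#define __NR_get_mempolicy  269
#define __NR_set_mempolicy  270
#define __NR_migrate_pages  287
#define __NR_move_pages     310
#endif

/* Hypothetical helper: copy the five NUMA syscall numbers into out[],
 * in the order mbind, get_mempolicy, set_mempolicy, migrate_pages,
 * move_pages. */
void numa_syscall_numbers(long out[5])
{
    out[0] = __NR_mbind;
    out[1] = __NR_get_mempolicy;
    out[2] = __NR_set_mempolicy;
    out[3] = __NR_migrate_pages;
    out[4] = __NR_move_pages;
}
```

Printing the five values on the target machine gives a quick cross-check against any table kept in the source.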
It appears the license file (or files) are missing from this repo. Could we please include them? This is important both for people auditing their dependencies and package repos relaying this information to their users.
Dear friends
The problem I am facing is that I have not yet been able to bind CPUs in order to test my hybrid code.
I tried various methods that I found on the internet, but the main
problem is that, with so many different versions of Linux and MPI, there is no general solution.
Each solution is specific to a particular computer and operating system.
The latest and most suitable solution I found is the following:
module load gcc/5.2.1
module load openmpi-x86_64
export OMP_SCHEDULE="dynamic,200"
export OMP_NUM_THREADS=32
export OMP_PLACES=threads
export OMP_PROC_BIND=spread
numactl --all
numactl -N 0,1 > dbind.txt
numactl -C 0-15,32-47
numactl -C 1-31,48-63
numactl --show > dcpu.txt
mpirun -np 2 --map-by ppr:32ockete=2 ./pjet.gfortran > output.txt
I am using Open MPI 1.8.1 (loaded as a module). Right now I do not get any errors. I changed the NUMA settings and played with the mpirun flags (as you can see above),
but OpenMP does not seem to be working in this configuration, because the computational time does not vary between cases.
(It did not decrease, or increase in the virtual-thread cases, at all, even with export OMP_NUM_THREADS=16 or export OMP_NUM_THREADS=1.)
These are some of the results I got.
OpenMPI 1.8
mpirun -np 4 -x OMP_NUM_THREADS=1 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 4.475E+02
mpirun -np 4 -x OMP_NUM_THREADS=8 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 4.525E+02
mpirun -np 4 -x OMP_NUM_THREADS=16 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 4.611E+02
mvapich
mpirun -np 4 -genv OMP_NUM_THREADS 1 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 4.441E+02
mpirun -np 4 -genv OMP_NUM_THREADS 4 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 4.535E+02
mpirun -np 4 -genv OMP_NUM_THREADS 8 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 4.552E+02
mpirun -np 4 -genv OMP_NUM_THREADS 16 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 4.591E+02
mvapich2
mpirun -np 4 -genv OMP_NUM_THREADS 1 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 4.935E+02
mpirun -np 4 -genv OMP_NUM_THREADS 4 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 5.562E+02
mpirun -np 4 -genv OMP_NUM_THREADS 8 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 6.392E+02
mpirun -np 4 -genv OMP_NUM_THREADS 16 -bind-to socket -map-by socket ./pjet.gfortran > output.txt
Total time = 8.170E+02
As you can see in "Total time result.txt", I varied the number of threads while running my main code with each MPI implementation (Openmpi, Mpich, Mpich2).
With OpenMPi and Mpich it seems that OpenMP did not work at all, as the total time did not change considerably. With Mpich2 the total computational time increased as the number of threads increased, possibly because virtual threads were used instead of physical threads.
Can you please tell me what else I can do to solve this?
Am I heading in the right direction? Do you have any recommendations?
Best regards
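When run time does not react to OMP_NUM_THREADS at all, one thing worth ruling out is that the binary was built without OpenMP support (for GCC and gfortran that means the -fopenmp flag). The sketch below is a C analogue of such a check, not the reporter's Fortran code; the function names are hypothetical. It compiles with or without -fopenmp and reports how many threads a parallel region actually used.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Returns 1 if this translation unit was compiled with OpenMP enabled. */
int built_with_openmp(void)
{
#ifdef _OPENMP
    return 1;
#else
    return 0;
#endif
}

/* Returns the number of threads a parallel region really ran with.
 * Without OpenMP the pragmas are skipped and the result is 1. */
int parallel_threads_used(void)
{
    int used = 1;
#ifdef _OPENMP
#pragma omp parallel
    {
#pragma omp single
        used = omp_get_num_threads();
    }
#endif
    return used;
}
```

If `parallel_threads_used()` returns 1 regardless of OMP_NUM_THREADS, the threading layer is not active, and CPU binding options are unlikely to change the timings.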
...I'm dumb please ignore! :-P
Looking at v2.0.14...master, I think it would be good to ship all those changes as a new release.
Installation (make) cannot find libnuma.so. Technically the file does exist, but it is a broken link that points to nothing.
...
make all-am
make[1]: Entering directory `/scratch/myoder96/Downloads/SeisSol/numactl'
CC numactl.o
CC util.o
CC shm.o
CC libnuma.lo
CC syscall.lo
CC distance.lo
CC affinity.lo
CC sysfs.lo
CC rtnetlink.lo
CCLD libnuma.la
CCLD numactl
icc: error #10236: File not found: './.libs/libnuma.so'
make[1]: *** [numactl] Error 1
make[1]: Leaving directory `/scratch/
cd test && ./regress2
./../test/distance
000: 010
./../test/nodemap
0: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
./../test/checkaffinity
./../test/checktopology
numactl --hardware cpus look bogus
./../test/checktopology FAILED!!!!
make: *** [Makefile:1997: regress2] Error 1
The problem is how the testcase computes its expected output. It counts CPUs by calling "grep -c processor /proc/cpuinfo", which does not work on sparc because the layout of this file is different there:
cpu : UltraSparc T2 (Niagara2)
fpu : UltraSparc T2 integrated FPU
pmu : niagara2
prom : OBP 4.33.6 2012/03/14 08:07
type : sun4v
ncpus probed : 64
ncpus active : 64
D$ parity tl1 : 0
I$ parity tl1 : 0
cpucaps : flush,stbar,swap,muldiv,v9,blkinit,n2,mul32,div32,v8plus,popc,vis,vis2,ASIBlkInit
Cpu0ClkTck : 000000005458c3a0
Cpu1ClkTck : 000000005458c3a0
Cpu2ClkTck : 000000005458c3a0
...
Cpu62ClkTck : 000000005458c3a0
Cpu63ClkTck : 000000005458c3a0
MMU Type : Hypervisor (sun4v)
MMU PGSZs : 8K,64K,4MB,256MB
State:
CPU0: online
CPU1: online
CPU2: online
...
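A layout-independent way to obtain the CPU count is to ask the C library instead of parsing /proc/cpuinfo. The sketch below assumes glibc's `sysconf` with `_SC_NPROCESSORS_ONLN`; the function name is hypothetical. The test itself is a shell script, so the equivalent there would be something like `getconf _NPROCESSORS_ONLN`, which is offered as a suggestion and not as the project's actual fix.

```c
#include <unistd.h>

/* Number of CPUs currently online according to the C library,
 * or -1 if the value cannot be determined. */
long count_online_cpus(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}
```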
I was reading:
/* Run current task only on node */
int numa_run_on_node(int node);
I would like to make sure that I understand this correctly. What does "task" refer to in this case: does it also cover threads?
My goal is to split different threads (from the tbb arena) of the same process across different NUMA domains. Currently what I do is:
numa_run_on_node(i);
numa_run_on_node_mask(numa_all_nodes_ptr);
Does that make sense?
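On Linux the scheduler treats each thread as its own task, and sched_setaffinity(2) with pid 0 applies to the calling thread only. The sketch below uses the raw affinity system calls as a stand-in to show that per-thread scope; it does not call libnuma, and the function name and the fixed mask size are assumptions made for illustration. Note also that, going by the documented meaning of the calls, following `numa_run_on_node(i)` with `numa_run_on_node_mask(numa_all_nodes_ptr)` would widen the allowed nodes back to all of them.

```c
#define _DEFAULT_SOURCE
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#define MASK_LONGS 128   /* assumed upper bound: 8192 CPUs on 64-bit */

/*
 * Restrict the calling thread (not the whole process) to the lowest CPU
 * it is currently allowed to use. Returns that CPU number, or -1 on error.
 * Each worker thread would call this, or a node-aware variant, itself.
 */
int pin_calling_thread_to_first_cpu(void)
{
    unsigned long mask[MASK_LONGS];
    const int bits = (int)(8 * sizeof(unsigned long));

    memset(mask, 0, sizeof(mask));
    /* pid 0 selects the calling thread. */
    if (syscall(SYS_sched_getaffinity, 0, sizeof(mask), mask) < 0)
        return -1;

    for (int cpu = 0; cpu < MASK_LONGS * bits; cpu++) {
        if (mask[cpu / bits] & (1UL << (cpu % bits))) {
            unsigned long one[MASK_LONGS];

            memset(one, 0, sizeof(one));
            one[cpu / bits] = 1UL << (cpu % bits);
            if (syscall(SYS_sched_setaffinity, 0, sizeof(one), one) < 0)
                return -1;
            return cpu;
        }
    }
    return -1;
}
```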
Hi, All:
I built numactl from source, and the build went fine.
But when I then try to run the numactl command, it always shows the following error:
numactl: /usr/lib/x86_64-linux-gnu/libnuma.so.1: version `libnuma_1.5' not found (required by numactl).
Could anyone tell me the reason?
Thanks in advance.
Hi,
I am new to NUMA, so I hope this is not a bogus bug report.
I started my script with taskset and numactl:
$ taskset -c 19 numactl --membind=1 bash spin.sh &
[1] 98621
However, numastat indicated that some of the memory wasn't allocated on node 1:
$ sudo numastat -p 98621
Per-node process memory usage (in MBs) for PID 98621 (bash)
                          Node 0          Node 1           Total
                 --------------- --------------- ---------------
Huge                        0.00            0.00            0.00
Heap                        0.00            0.03            0.03
Stack                       0.00            0.02            0.02
Private                     1.14            0.13            1.27
---------------- --------------- --------------- ---------------
Total                       1.14            0.18            1.32
Is this expected? What does "Private" mean? Does it mean private to the process?
spin.sh is very simple:
i=0
while true; do
i=i+1
done
Thanks!
I am trying to read the externally applied numactl
settings from inside my code. For example:
numactl -C 0-1 -m 0 ./main
Here the user restricts the process to cores 0-1.
Inside the main application, we want to read this core affinity information, so I use libnuma as follows:
struct bitmask *bm;
int ncpus = numa_num_configured_cpus();
bm = numa_bitmask_alloc(ncpus);
auto result = numa_sched_getaffinity(0, bm);
numa_bitmask_free(bm);
However, numa_sched_getaffinity always returns -1. Is there anything I misunderstand about the API?
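One possible explanation, offered as an assumption and not a confirmed diagnosis: sched_getaffinity(2) fails with EINVAL when the buffer passed in is smaller than the kernel's own CPU mask, and a bitmask sized by numa_num_configured_cpus() can be too small for that. libnuma provides numa_allocate_cpumask() to allocate a mask of the right size. The sketch below shows the same effect with the raw system call and no libnuma dependency; the function name is hypothetical, and `__builtin_popcountl` assumes GCC or Clang.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Count the CPUs in the calling thread's affinity mask.
 * The buffer is doubled whenever the kernel reports EINVAL, which is the
 * error returned when the buffer is smaller than the kernel's CPU mask.
 * Returns -1 on any other error.
 */
int count_allowed_cpus(void)
{
    for (size_t nlongs = 1; nlongs <= 1024; nlongs *= 2) {
        size_t bytes = nlongs * sizeof(unsigned long);
        unsigned long *mask = calloc(nlongs, sizeof(unsigned long));
        long ret;
        int saved_errno;

        if (mask == NULL)
            return -1;
        /* pid 0 selects the calling thread. */
        ret = syscall(SYS_sched_getaffinity, 0, bytes, mask);
        saved_errno = errno;
        if (ret >= 0) {
            int count = 0;

            for (size_t i = 0; i < nlongs; i++)
                count += __builtin_popcountl(mask[i]);
            free(mask);
            return count;
        }
        free(mask);
        if (saved_errno != EINVAL)
            return -1;
    }
    return -1;
}
```

Under `numactl -C 0-1 ./main` this would be expected to report 2, provided both cores are available to the process. Checking errno after the failing numa_sched_getaffinity call would help confirm or rule out the size explanation.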
Is there a bazelized version?