This is the first time I have encountered the problem, but it is puzzling.
From the serial log:
2019/06/05 06:08:35 dhcp4d.go:148: DHCPACK &{Num:25 Addr:10.0.0.27 HardwareAddr:(removed) Hostname:scan2drive Expiry:0001-01-01 00:00:00 +0000 UTC}
[843041.359054] neighbour: arp_cache: neighbor table overflow!
[843041.364702] neighbour: arp_cache: neighbor table overflow!
These messages keep repeating multiple times per second.
tcpdump shows no suspicious traffic on either uplink0 or lan0.
The neighbor table garbage collection settings are unchanged from the default:
# sysctl -a | grep neigh
[…]
net.ipv4.neigh.default.gc_interval = 30
net.ipv4.neigh.default.gc_stale_time = 60
net.ipv4.neigh.default.gc_thresh1 = 128
net.ipv4.neigh.default.gc_thresh2 = 512
net.ipv4.neigh.default.gc_thresh3 = 1024
[…]
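To make the comparison against these thresholds explicit, here is a minimal Python sketch. It assumes the commonly documented meaning of the three values (gc_thresh1: below this the garbage collector leaves the table alone; gc_thresh2: soft limit above which collection gets more aggressive; gc_thresh3: hard limit on entries). The names parse_sysctl and gc_pressure are made up for illustration, and the kernel's real accounting may differ in detail, for example in which entries it counts.

```python
# Sample text copied from the sysctl output above.
SAMPLE_SYSCTL = """\
[…]
net.ipv4.neigh.default.gc_interval = 30
net.ipv4.neigh.default.gc_stale_time = 60
net.ipv4.neigh.default.gc_thresh1 = 128
net.ipv4.neigh.default.gc_thresh2 = 512
net.ipv4.neigh.default.gc_thresh3 = 1024
[…]
"""


def parse_sysctl(text):
    """Parse 'key = value' lines as printed by sysctl into a dict of ints."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        # Skip elided lines and anything that is not a plain integer setting.
        if sep and value.strip().isdigit():
            values[key.strip()] = int(value)
    return values


def gc_pressure(entries, settings, prefix="net.ipv4.neigh.default."):
    """Classify an entry count against the three thresholds (simplified)."""
    t1 = settings[prefix + "gc_thresh1"]
    t2 = settings[prefix + "gc_thresh2"]
    t3 = settings[prefix + "gc_thresh3"]
    if entries >= t3:
        return "at or above gc_thresh3"
    if entries >= t2:
        return "between gc_thresh2 and gc_thresh3"
    if entries >= t1:
        return "between gc_thresh1 and gc_thresh2"
    return "below gc_thresh1"


if __name__ == "__main__":
    settings = parse_sysctl(SAMPLE_SYSCTL)
    # A table with a single entry is far below every threshold.
    print(gc_pressure(1, settings))
```

Under this simplified reading, a table with a single entry should be nowhere near the overflow condition.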
The error message mentions arp_cache rather than ndisc_cache, which leads me to believe that the problem is IPv4-related. That said, the IPv6 neighbor table contains only FAILED, INCOMPLETE and NOARP entries for lan0 (maybe a symptom caused by the IPv4 issue?).
Anyway, the IPv4 neighbor table seems to contain only one entry:
# ./ip -4 neigh show nud all
212.51.156.1 dev uplink0 lladdr 00:24:14:ef:72:ff REACHABLE
(In normal operation, it contains only one entry on uplink0, but a whole bunch of entries on lan0.)
I also checked /proc/net/stat/arp_cache:
entries allocs destroys hash_grows lookups hits res_failed rcv_probes_mcast rcv_probes_ucast periodic_gc_runs forced_gc_runs unresolved_discards table_fulls
00000001 00001e12 000029f2 00000000 00338bff 001a39c2 000003e5 00000000 00000000 00000000 000015e8 00000000 00000b60
00000001 0000128d 00001860 00000000 00000000 00000000 000002a2 00000000 00000000 00000000 00000d85 00000000 00000b09
00000001 00002952 00002910 00000000 00000000 00000000 000003fe 00000000 00000000 0000d635 000017b0 00000000 00000b41
00000001 00005498 00004326 00000002 0012915b 000eef8d 000001b0 00000000 00000000 00000000 00004f53 00000071 00002e2a
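The counters above are hexadecimal, with one row per CPU, so they are hard to read at a glance. The following Python sketch decodes the sample and sums the per-CPU counters. It assumes that the entries column is a table-wide value repeated on every row rather than a per-CPU count; parse_neigh_stat and summarize are hypothetical helper names.

```python
# Sample text copied from /proc/net/stat/arp_cache above.
SAMPLE_STAT = """\
entries allocs destroys hash_grows lookups hits res_failed rcv_probes_mcast rcv_probes_ucast periodic_gc_runs forced_gc_runs unresolved_discards table_fulls
00000001 00001e12 000029f2 00000000 00338bff 001a39c2 000003e5 00000000 00000000 00000000 000015e8 00000000 00000b60
00000001 0000128d 00001860 00000000 00000000 00000000 000002a2 00000000 00000000 00000000 00000d85 00000000 00000b09
00000001 00002952 00002910 00000000 00000000 00000000 000003fe 00000000 00000000 0000d635 000017b0 00000000 00000b41
00000001 00005498 00004326 00000002 0012915b 000eef8d 000001b0 00000000 00000000 00000000 00004f53 00000071 00002e2a
"""


def parse_neigh_stat(text):
    """Return one dict per CPU row, with the hex counters decoded to ints."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, (int(v, 16) for v in row))) for row in rows]


def summarize(per_cpu):
    """Sum per-CPU counters; 'entries' is taken once (assumed table-wide)."""
    totals = {name: sum(cpu[name] for cpu in per_cpu) for name in per_cpu[0]}
    totals["entries"] = per_cpu[0]["entries"]
    return totals


if __name__ == "__main__":
    totals = summarize(parse_neigh_stat(SAMPLE_STAT))
    for name in ("entries", "allocs", "destroys", "forced_gc_runs", "table_fulls"):
        print(f"{name:>16}: {totals[name]}")
```

Running the same function on two snapshots taken a few seconds apart would show whether forced_gc_runs and table_fulls are still increasing while entries stays at 1.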
I tried inserting a new entry into the neighbor table:
execve("./arp", ["./arp", "-s", "10.0.0.76", "(removed)"], 0x7ffd220b0438 /* 7 vars */) = 0
brk(NULL) = 0x21ce000
brk(0x21cf200) = 0x21cf200
arch_prctl(ARCH_SET_FS, 0x21ce8c0) = 0
uname({sysname="Linux", nodename="router7", ...}) = 0
readlink("/proc/self/exe", "/perm/sh", 4096) = 8
brk(0x21f0200) = 0x21f0200
brk(0x21f1000) = 0x21f1000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
getuid() = 0
socket(AF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
ioctl(3, SIOCSARP, 0x7ffe5607713c) = -1 ENOBUFS (No buffer space available)
write(2, "arp: SIOCSARP: No buffer space a"..., 41arp: SIOCSARP: No buffer space available
) = 41
exit_group(1) = ?
+++ exited with 1 +++
I also checked free memory:
               total       used       free     shared    buffers
Mem:         4020136     561812    3458324      38352     101844
-/+ buffers:             459968    3560168
Swap:              0          0          0
It’s a mystery to me how the neighbor table can be considered full with only one entry in it.
This is with Linux 5.1.1.