Comments (50)
Thanks a lot, will check this version and let you know about the results.
from sysdig.
In my logs I get the following error:
libscap: bpf_load_program() event=raw_tracepoint/filler/sys_procexit_e: Operation not permitted
But in file2.txt there is also an error:
llc -march=bpf -filetype=obj -o /usr/src/scap-6.0.1+driver/bpf/probe.o /usr/src/scap-6.0.1+driver/bpf/probe.ll
MODPOST /usr/src/scap-6.0.1+driver/bpf/Module.symvers
/bin/sh: scripts/mod/modpost: cannot execute binary file: Exec format error
I think these two problems are related.
from sysdig.
This is a ticket for the Falco libs team:
falcosecurity/libs#1639
from sysdig.
Hi @albe19029! Could you provide more context on why it doesn't run?
from sysdig.
@therealbobo is there any information required to reproduce the issue?
from sysdig.
Hey @albe19029! Thank you for the issue! We are investigating it! Just out of curiosity: why don't you try the modern ebpf probe? It doesn't require any additional compilation :)
from sysdig.
To be honest, I didn't think about it. For x64 we needed to support older kernels. But for arm64 the version with which everything works stably is 5.8. So it makes sense. I'll try and let you know the results.
from sysdig.
Are you encountering the same problem on x64?
from sysdig.
No, on x64 everything is working perfectly.
For arm64, bugs like this one block us from using scap in production:
https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.5
- for-next/double-page-fault:
: Avoid a double page fault in __copy_from_user_inatomic() if hw does not support auto Access Flag
mm: fix double page fault on arm64 if PTE_AF is cleared
x86/mm: implement arch_faults_on_old_pte() stub on x86
arm64: mm: implement arch_faults_on_old_pte() on arm64
arm64: cpufeature: introduce helper cpu_has_hw_af()
When kernel code tries to read valid user-space memory (bpf_probe_read*), it sometimes reports the memory as invalid. It works reliably only starting from kernel 5.8. I didn't find which commit in 5.8 fully fixed the issue, but only from that version on does the arm64 user-space check logic work correctly for valid cases.
from sysdig.
That's strange! You could open an issue on https://github.com/falcosecurity/libs: sysdig just uses libscap from there as a building block :) BTW, please let me know if the modern bpf works smoothly!
from sysdig.
Well, I saw there was a workaround for clone and execve on their side (falcosecurity/libs#1605), and those changes helped us a lot. But since the memory access fixes were not in bpf (even the scap kernel module driver fails) but in the arm64 kernel code, I thought it would be hard to fix on the https://github.com/falcosecurity/libs side as well.
from sysdig.
Looking around, the header issue seems to be related to arm64 only.
from sysdig.
Correct, we faced this issue only on arm64, and only on GKE (Azure and AWS work correctly).
from sysdig.
There is a variable for the bpf driver, SYSDIG_BPF_PROBE, but how can I enable modern bpf?
from sysdig.
Just use the --modern-bpf CLI flag :)
from sysdig.
And what if I use scap-driver-loader to build the driver and then use the resulting file in my code?
from sysdig.
You don't need it! The modern bpf probe is already compiled and bundled inside the sysdig binary :)
from sysdig.
Sorry for the delay, it took me some time to build modern bpf for our project. Unfortunately, when I ran our project's tests I saw event loss errors. It will take time to debug these errors, but the behavior of modern bpf differs from that of the old probe.
from sysdig.
Is there an update on the bpf error on GKE? Could you reproduce the issue, and do you know how to fix it? I'm just trying to understand whether there will be a fix in 1-2 weeks or whether we should wait a bit longer. Thanks.
We do plan to migrate to modern bpf. Since we hit errors with it, we will investigate them and create an issue with a description at https://github.com/falcosecurity/libs. That will probably happen a bit later (I will discuss the timing with the team).
from sysdig.
Could you please check whether you have the div64.h header somewhere? 🤔
from sysdig.
No, we don't. The only div64.h we have is from this archive https://storage.googleapis.com/cos-tools/17412.156.23/kernel-headers.tgz
from sysdig.
If I use this link: https://storage.googleapis.com/cos-tools/17412.156.23/lakitu-arm64/kernel-headers.tgz
I get the following div64.h files:
./include/asm-generic/div64.h
./arch/arm64/include/generated/asm/div64.h
./arch/arm/include/asm/div64.h
./arch/m68k/include/asm/div64.h
./arch/alpha/include/asm/div64.h
./arch/x86/include/asm/div64.h
./arch/ia64/include/asm/div64.h
./arch/mips/include/asm/div64.h
If I use https://storage.googleapis.com/cos-tools/17412.156.23/kernel-headers.tgz:
./include/asm-generic/div64.h
./arch/arm/include/asm/div64.h
./arch/m68k/include/asm/div64.h
./arch/alpha/include/asm/div64.h
./arch/x86/include/asm/div64.h
./arch/ia64/include/asm/div64.h
./arch/mips/include/asm/div64.h
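Listings like the two above can be reproduced with a small helper. This is only a sketch: `list_header` is a hypothetical name, and it assumes the archive has already been extracted into the directory passed as the first argument.

```shell
# Hypothetical helper: list every copy of a header inside an extracted
# kernel-headers tree, with paths relative to the tree root, sorted.
list_header() {
    tree=$1   # directory the archive was extracted into
    name=$2   # header file name, e.g. div64.h
    # Run find from inside the tree so the output starts with "./",
    # matching the listings in this thread.
    (cd "$tree" && find . -name "$name" | sort)
}
```

For example, `list_header /usr/src/linux-headers-5.15.120+ div64.h` would show whether an arm64 copy of the header is present in the extracted tree.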
from sysdig.
It might be enough to do something like sudo ln -s /usr/include/asm-generic /usr/include/asm 🤔
from sysdig.
And which version of the kernel headers should I use: lakitu-arm64 or the current one?
from sysdig.
I'd bet on the current one, but a quick uname -a will probably give you the correct answer :)
from sysdig.
uname -a
Linux gke-qa-dec-2028-18-12--8-default-pool-f35026d3-k2tn 5.15.120+ #1 SMP Sat Aug 19 11:17:43 UTC 2023 aarch64 GNU/Linux
cat /etc/os-release
NAME="Container-Optimized OS"
ID=cos
PRETTY_NAME="Container-Optimized OS from Google"
HOME_URL="https://cloud.google.com/container-optimized-os/docs"
BUG_REPORT_URL="https://cloud.google.com/container-optimized-os/docs/resources/support-policy#contact_us"
GOOGLE_METRICS_PRODUCT_ID=26
GOOGLE_CRASH_ID=Lakitu-arm
KERNEL_COMMIT_ID=f0d6dcd5188bababf189e3aede8360342859fcb8
VERSION=105
VERSION_ID=105
BUILD_ID=17412.156.23
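The BUILD_ID shown above is the same value that appears in the cos-tools URLs quoted in this thread. A minimal sketch of deriving the headers URL from an os-release file follows. `cos_headers_url` is a hypothetical helper, and the `lakitu-arm64/` path segment is taken only from the URLs quoted in this thread, not from any official layout description.

```shell
# Hypothetical helper: build the cos-tools kernel headers URL for a COS node.
# Arguments: path to an os-release file (use a path containing a slash, such
# as /etc/os-release) and the machine architecture as printed by `uname -m`.
cos_headers_url() {
    os_release=$1
    arch=$2
    # Source the file inside a command substitution (a subshell) so that
    # its variables do not leak into the calling shell.
    build_id=$(unset BUILD_ID; . "$os_release" && printf '%s' "$BUILD_ID")
    [ -n "$build_id" ] || return 1
    base="https://storage.googleapis.com/cos-tools/${build_id}"
    if [ "$arch" = "aarch64" ]; then
        # Assumption from this thread: arm64 headers live under lakitu-arm64/.
        printf '%s\n' "${base}/lakitu-arm64/kernel-headers.tgz"
    else
        printf '%s\n' "${base}/kernel-headers.tgz"
    fi
}
```

On a node, `cos_headers_url /etc/os-release "$(uname -m)"` would print the URL to try for that build and architecture.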
from sysdig.
No luck there. Could you please check the /usr/include directory? Please keep an eye out for any symbolic links present there.
from sysdig.
On the host system there is no /usr/include directory.
In the container, /usr/include contains files from Red Hat Enterprise Linux 8.
What should I check there?
from sysdig.
To reproduce the issue I use the following yaml (the file is saved as txt):
scap.txt
Then I apply this file on any GKE Kubernetes cluster (arm64):
kubectl apply -f scap.yaml
Then I attach to the pod:
kubectl exec --stdin --tty sysdig-0341 -- /bin/bash
and run scap-driver-loader, which fails with the div64.h error.
After editing /usr/bin/scap-driver-loader (to link to the arm kernel headers) I run scap-driver-loader again and hit the second problem.
As you can see, I share only /etc and /boot from the host, so there can't be any conflict, as I use the docker.io/sysdig/sysdig:0.34.1 image.
from sysdig.
I have checked scripts/mod/modpost from both kernel header archives and got the following information.
For https://storage.googleapis.com/cos-tools/17412.156.23/lakitu-arm64/kernel-headers.tgz I get:
file modpost
modpost: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[xxHash]=96cdb1cdfa76c1f3, not stripped
For https://storage.googleapis.com/cos-tools/17412.156.23/kernel-headers.tgz I get:
file modpost
modpost: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[xxHash]=96cdb1cdfa76c1f3, not stripped
So for arm64 there is an invalid modpost binary. Will continue to investigate why.
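When the `file` utility is not available in the container, the same check can be approximated by reading the ELF header directly. The sketch below uses hypothetical helper names (`elf_machine`, `elf_matches_arch`). It relies on the ELF layout, where the two-byte `e_machine` field sits at offset 18, and it assumes a little-endian binary, which holds for both x86-64 and aarch64 Linux builds.

```shell
# Print the ELF e_machine value of a file as a decimal number.
# 62 is EM_X86_64 and 183 is EM_AARCH64.
elf_machine() {
    # Read the two bytes at offset 18 as unsigned decimal values.
    bytes=$(od -An -tu1 -j18 -N2 "$1") || return 1
    set -- $bytes
    [ $# -eq 2 ] || return 1
    # Combine them in little-endian order.
    echo $(( $1 + $2 * 256 ))
}

# Succeed if the binary was built for the given architecture
# (defaults to the architecture reported by `uname -m`).
elf_matches_arch() {
    file=$1
    arch=${2:-$(uname -m)}
    case "$arch" in
        x86_64)  want=62 ;;
        aarch64) want=183 ;;
        *) return 2 ;;   # architecture not covered by this sketch
    esac
    [ "$(elf_machine "$file")" = "$want" ]
}
```

Running `elf_matches_arch scripts/mod/modpost` on the arm64 node would then fail for an x86-64 modpost like the one shown above, which matches the Exec format error.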
from sysdig.
So, after I added an arm64 modpost (borrowed from an AWS kernel):
if [ "${TARGET_ID}" == "cos" ] && [ "${ARCH}" == "aarch64" ]; then
cp /modpost "$KERNELDIR/scripts/mod"
fi
The compilation finished successfully, but the code still does not run. So there are probably errors in lakitu-arm64/kernel-headers.tgz.
For now I have no idea how to fix it or how to investigate further.
from sysdig.
I have filed a bug with the ChromeOS team: https://issuetracker.google.com/issues/321501036
from sysdig.
That's not the first time I've encountered this:
modpost: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[xxHash]=96cdb1cdfa76c1f3, not stripped
In my opinion, you should create a symlink at ./include/asm that points to ./include/asm-generic. Other than that, I'm out of ideas too :/
from sysdig.
Well, the file /usr/src/linux-headers-5.15.120+/arch/arm64/include/generated/asm/div64.h has the following content:
#include <asm-generic/div64.h>
Is that OK?
from sysdig.
Oops, I missed the part where you said the compilation was successful. Can you attach the logs of the build?
from sysdig.
These are the logs I get after changing the link to https://storage.googleapis.com/cos-tools/17412.156.23/lakitu-arm64/kernel-headers.tgz and replacing modpost with a valid one.
I also added --trace to make and removed > /dev/null, changing:
make -C "/usr/src/${DRIVER_NAME}-${DRIVER_VERSION}/bpf" > /dev/null
to
make -C "/usr/src/${DRIVER_NAME}-${DRIVER_VERSION}/bpf" --trace
from sysdig.
That's great! But how is sysdig failing? Could you share that log? 🤔
from sysdig.
Yes, sysdig failed. Here is the log.
sysdig_log.txt
from sysdig.
I have found that on x64, starting from sysdig 0.33.1, sysdig does not work either, with the same error:
libscap: bpf_load_program() event=raw_tracepoint/filler/sys_procexit_e: Operation not permitted
On sysdig 0.32.1 everything works. So maybe the error is not specific to arm64 but common to both architectures.
from sysdig.
I have checked sysdig 0.32.1 on arm with the link fix, and sysdig works correctly. So something is definitely broken on COS starting from sysdig 0.33.1.
from sysdig.
- So this fix is correct:
if [ "${ARCH}" == "aarch64" ]; then
    BPF_KERNEL_SOURCES_URL="https://storage.googleapis.com/cos-tools/${BUILD_ID}/lakitu-arm64/kernel-headers.tgz"
else
    BPF_KERNEL_SOURCES_URL="https://storage.googleapis.com/cos-tools/${BUILD_ID}/kernel-headers.tgz"
fi
- Even with the corrupted GKE binary, sysdig 0.32.1 works correctly on both arm64 and x64 with the link fix.
- Starting from sysdig 0.33.1, COS does not work on either arm64 or x64.
from sysdig.
Good day, I have found the problem. This commit leads to problems on GKE.
There is a loop with two maximum values:
#define MAX_THREADS_GROUPS 30
#define MAX_HIERARCHY_TRAVERSE 60
For the COS kernel these are too big, which leads to these errors:
processed 40396 insns (limit 1000000) max_states_per_insn 1 total_states 4057 peak_states 4057 mark_read 73
-- END PROG LOAD LOG --
libscap: bpf_load_program() event=raw_tracepoint/filler/sys_procexit_e: Operation not permitted
Now this message is clear: from the BPF verifier's point of view, the function sys_procexit_e has more than 1M instructions.
I tested a bit and found that with the values:
#define MAX_THREADS_GROUPS 25
#define MAX_HIERARCHY_TRAVERSE 35
this code also works on both arm64 and x64. So I will create an issue for the Falco libs team.
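When comparing builds with different MAX_THREADS_GROUPS and MAX_HIERARCHY_TRAVERSE values, it can help to pull the instruction count and the limit out of the verifier log. The sketch below assumes the summary line has the same shape as the one quoted above. `verifier_insns` is a hypothetical helper name.

```shell
# Hypothetical helper: read a BPF verifier log on stdin and print
# "<processed> <limit>" for each summary line of the form
# "processed N insns (limit M) ...".
verifier_insns() {
    sed -n 's/^processed \([0-9][0-9]*\) insns (limit \([0-9][0-9]*\)).*/\1 \2/p'
}
```

For example, `verifier_insns < sysdig_log.txt` would print one pair of numbers per loaded program that reported a summary line, which makes it easier to see how the counts move as the two constants change.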
from sysdig.
As I understand it, from your side I only need the link fix for arm64:
https://storage.googleapis.com/cos-tools/17412.156.23/lakitu-arm64/kernel-headers.tgz
Then the ticket can be closed.
from sysdig.
Hey @albe19029! Thank you so much for the in-depth investigation! Great catch!
from sysdig.
@therealbobo Sorry, you closed the ticket, but what about the invalid link? As far as I can see, it is still not fixed.
from sysdig.
The fix for falcosecurity/libs#1639 is ready. Do you know when the next sysdig release will be, and whether this fix can be included in it?
The current 0.34.1 is broken on GKE.
from sysdig.
The next sysdig release is coming in the next few days. I have to double-check, but I think we can apply this patch :)
from sysdig.
That would be great, as this bug is blocking us badly.
Thanks in advance.
from sysdig.
@therealbobo will there be a fix for the invalid link in scap-driver-loader.in?
Also, 0.35.0 was released but without the COS driver fix. Do you know when there will be a patch release?
from sysdig.
Hey @albe19029! I just released 0.35.1 with all the fixes! Please let me know if you encounter any problems! :)
from sysdig.