
Comments (50)

albe19029 avatar albe19029 commented on September 15, 2024 2

Thanks a lot, will check this version and let you know about the results.

from sysdig.

albe19029 avatar albe19029 commented on September 15, 2024 1

In my logs I get the following error:
libscap: bpf_load_program() event=raw_tracepoint/filler/sys_procexit_e: Operation not permitted

But in file2.txt there is also an error:
llc -march=bpf -filetype=obj -o /usr/src/scap-6.0.1+driver/bpf/probe.o /usr/src/scap-6.0.1+driver/bpf/probe.ll
MODPOST /usr/src/scap-6.0.1+driver/bpf/Module.symvers
/bin/sh: scripts/mod/modpost: cannot execute binary file: Exec format error

I think these two problems are related.


albe19029 avatar albe19029 commented on September 15, 2024 1

This is a ticket for the Falco libs team.
falcosecurity/libs#1639


therealbobo avatar therealbobo commented on September 15, 2024

Hi @albe19029! Could you provide more context on why it doesn't run?


albe19029 avatar albe19029 commented on September 15, 2024

@therealbobo is there any information required to reproduce the issue?


therealbobo avatar therealbobo commented on September 15, 2024

Hey @albe19029! Thank you for the issue! We are investigating it! Just out of curiosity: why don't you try the modern ebpf probe? It doesn't require any additional compilation :)


albe19029 avatar albe19029 commented on September 15, 2024

To be honest, I didn't think about it. For x64 we needed to support older kernels. But for arm64 the version with which everything works stably is 5.8. So it makes sense. I'll try and let you know the results.


therealbobo avatar therealbobo commented on September 15, 2024

Are you encountering the same problem on x64?


albe19029 avatar albe19029 commented on September 15, 2024

No, on x64 everything works perfectly.

For arm64, bugs like this one block us from using scap in production:

https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.5

  • for-next/double-page-fault:
    : Avoid a double page fault in __copy_from_user_inatomic() if hw does not support auto Access Flag
    mm: fix double page fault on arm64 if PTE_AF is cleared
    x86/mm: implement arch_faults_on_old_pte() stub on x86
    arm64: mm: implement arch_faults_on_old_pte() on arm64
    arm64: cpufeature: introduce helper cpu_has_hw_af()

When kernel code (bpf_probe_read*) tries to read valid user-space memory, it sometimes reports the memory as invalid. It works stably only from kernel 5.8 onward. I didn't find which commit in 5.8 fully fixed the issue, but only from that version on does the arm64 user-space check logic work correctly for valid addresses.


therealbobo avatar therealbobo commented on September 15, 2024

That's strange! You could open an issue on https://github.com/falcosecurity/libs : sysdig just uses libscap from there as a building block :) BTW please let me know if the modern bpf works smoothly!


albe19029 avatar albe19029 commented on September 15, 2024

Well, I saw there was a workaround for clone and execve on their side (falcosecurity/libs#1605), and those changes helped us a lot. But since the memory-access fixes were not in bpf but in the arm64 kernel code (even the scap kernel module driver fails), I thought it would be hard to fix on the https://github.com/falcosecurity/libs side as well.


therealbobo avatar therealbobo commented on September 15, 2024

Looking around, the header issue seems to be related to arm64 only.


albe19029 avatar albe19029 commented on September 15, 2024

Correct, we faced this issue only on arm64, and only on GKE servers (Azure and AWS work correctly).


albe19029 avatar albe19029 commented on September 15, 2024

There is a variable for the bpf driver, SYSDIG_BPF_PROBE, but how can I enable modern bpf?


therealbobo avatar therealbobo commented on September 15, 2024

just use the --modern-bpf cli flag :)


albe19029 avatar albe19029 commented on September 15, 2024

And what if I use scap-driver-loader to build the driver and then use the resulting file in my code?


therealbobo avatar therealbobo commented on September 15, 2024

You don't need it! The modern bpf probe is already compiled and bundled inside the sysdig binary :)
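
For anyone choosing between the two probes from a script, here is a minimal sketch. It assumes the 5.8 threshold mentioned earlier in this thread is the right cut-off for your platform; kernel_at_least is a hypothetical helper name, not part of sysdig or scap-driver-loader.

```shell
# Hypothetical helper: succeeds when kernel release $1 is at least version $2.
kernel_at_least() (
    have=${1%%[-+]*}   # strip suffixes such as "+" or "-generic"
    want=$2
    # sort -V orders version strings; the lowest one must be the wanted version.
    lowest=$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
)

# Assumption: 5.8 is the minimum kernel for the modern probe on this platform.
if kernel_at_least "$(uname -r)" 5.8; then
    echo "modern probe candidate: sysdig --modern-bpf"
else
    echo "older kernel: keep using the compiled probe (SYSDIG_BPF_PROBE)"
fi
```

The check only looks at the version number; it does not verify any other kernel feature the modern probe may need.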


albe19029 avatar albe19029 commented on September 15, 2024

Sorry for the delay, it took me some time to build modern bpf for our project. Unfortunately, when I ran our project's tests, I saw event loss errors. It will take time to debug these errors, but the modern bpf probe and the old one clearly behave differently.


albe19029 avatar albe19029 commented on September 15, 2024

Is there any update on the bpf error on GKE? Were you able to reproduce the issue, and do you know how to fix it? I'd just like to understand whether a fix will arrive in 1-2 weeks or whether we should wait a bit longer. Thanks.

We do plan to migrate to modern bpf. Since we hit errors with it, we will investigate them and create an issue with a description at https://github.com/falcosecurity/libs, but probably a bit later (I will discuss the timing with the team).


therealbobo avatar therealbobo commented on September 15, 2024

Could you please check whether you have the div64.h header somewhere? 🤔


albe19029 avatar albe19029 commented on September 15, 2024

No, we don't. The only div64.h we have is from this archive https://storage.googleapis.com/cos-tools/17412.156.23/kernel-headers.tgz


albe19029 avatar albe19029 commented on September 15, 2024

If I use this link: https://storage.googleapis.com/cos-tools/17412.156.23/lakitu-arm64/kernel-headers.tgz
I get the following div64.h files:
./include/asm-generic/div64.h
./arch/arm64/include/generated/asm/div64.h
./arch/arm/include/asm/div64.h
./arch/m68k/include/asm/div64.h
./arch/alpha/include/asm/div64.h
./arch/x86/include/asm/div64.h
./arch/ia64/include/asm/div64.h
./arch/mips/include/asm/div64.h

If I use https://storage.googleapis.com/cos-tools/17412.156.23/kernel-headers.tgz:
./include/asm-generic/div64.h
./arch/arm/include/asm/div64.h
./arch/m68k/include/asm/div64.h
./arch/alpha/include/asm/div64.h
./arch/x86/include/asm/div64.h
./arch/ia64/include/asm/div64.h
./arch/mips/include/asm/div64.h
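
A minimal sketch of how the two listings above can be produced and compared. list_div64 is a hypothetical helper, and the extraction directories in the usage comment are assumptions, not paths used by scap-driver-loader.

```shell
# Hypothetical helper: list every div64.h under directory $1, sorted.
list_div64() (
    cd "$1" && find . -name div64.h | sort
)

# Usage, assuming both archives were extracted to these hypothetical paths:
#   list_div64 /tmp/headers-generic > generic.txt
#   list_div64 /tmp/headers-arm64   > arm64.txt
#   comm -13 generic.txt arm64.txt   # paths present only in the arm64 archive
```

With the two listings above, the only difference is ./arch/arm64/include/generated/asm/div64.h, which exists only in the lakitu-arm64 archive.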


therealbobo avatar therealbobo commented on September 15, 2024

It might be enough doing something like sudo ln -s /usr/include/asm-generic /usr/include/asm 🤔


albe19029 avatar albe19029 commented on September 15, 2024

And which version of the kernel headers should I use, lakitu-arm64 or the current one?


therealbobo avatar therealbobo commented on September 15, 2024

I'd bet on the current one but a quick uname -a will probably give you the correct answer :)


albe19029 avatar albe19029 commented on September 15, 2024

uname -a
Linux gke-qa-dec-2028-18-12--8-default-pool-f35026d3-k2tn 5.15.120+ #1 SMP Sat Aug 19 11:17:43 UTC 2023 aarch64 GNU/Linux

cat /etc/os-release
NAME="Container-Optimized OS"
ID=cos
PRETTY_NAME="Container-Optimized OS from Google"
HOME_URL="https://cloud.google.com/container-optimized-os/docs"
BUG_REPORT_URL="https://cloud.google.com/container-optimized-os/docs/resources/support-policy#contact_us"
GOOGLE_METRICS_PRODUCT_ID=26
GOOGLE_CRASH_ID=Lakitu-arm
KERNEL_COMMIT_ID=f0d6dcd5188bababf189e3aede8360342859fcb8
VERSION=105
VERSION_ID=105
BUILD_ID=17412.156.23
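
The ID and BUILD_ID fields above are the ones that end up in the kernel-headers URL. A minimal sketch of reading them from a script; os_release_value is a hypothetical helper name, and this is not necessarily how scap-driver-loader itself reads the file.

```shell
# Hypothetical helper: print the value of key $2 from the os-release style
# file $1, with surrounding double quotes removed.
os_release_value() (
    sed -n "s/^$2=//p" "$1" | head -n 1 | tr -d '"'
)

os_release=/etc/os-release
if [ -r "$os_release" ]; then
    echo "ID=$(os_release_value "$os_release" ID)"
    echo "BUILD_ID=$(os_release_value "$os_release" BUILD_ID)"
fi
```

The pattern is anchored at the start of the line, so asking for ID does not match VERSION_ID or BUILD_ID.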


therealbobo avatar therealbobo commented on September 15, 2024

No luck there. Could you please check the /usr/include directory? Please keep an eye out for any symbolic links present there.


albe19029 avatar albe19029 commented on September 15, 2024

On the host system there is no /usr/include directory.
In the container, /usr/include contains files from Red Hat Enterprise Linux 8.

What should I check there?


albe19029 avatar albe19029 commented on September 15, 2024

To reproduce the issue I use the following YAML (the file is saved as .txt):
scap.txt

Then I apply it on any arm64 GKE Kubernetes cluster:
kubectl apply -f scap.yaml

Then I attach to the pod:
kubectl exec --stdin --tty sysdig-0341 -- /bin/bash

There I run scap-driver-loader and get the div64.h error.

After editing /usr/bin/scap-driver-loader (pointing the link at the arm kernel headers), I run scap-driver-loader again and get the second problem.

As you can see, I share only /etc and /boot from the host, so there can't be any conflict, since I use the docker.io/sysdig/sysdig:0.34.1 image.


albe19029 avatar albe19029 commented on September 15, 2024

I have checked scripts/mod/modpost from both kernel header archives and got the following information.
For https://storage.googleapis.com/cos-tools/17412.156.23/lakitu-arm64/kernel-headers.tgz I get:

file modpost
modpost: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[xxHash]=96cdb1cdfa76c1f3, not stripped

For https://storage.googleapis.com/cos-tools/17412.156.23/kernel-headers.tgz I get:

file modpost
modpost: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[xxHash]=96cdb1cdfa76c1f3, not stripped

So the arm64 archive ships an invalid (x86-64) modpost binary. I will continue to investigate why.
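
The same check can be scripted without the file utility by reading the e_machine field of the ELF header. This is a minimal sketch that assumes a little-endian ELF and only knows the two machine codes relevant here (62 for x86-64, 183 for AArch64); elf_machine is a hypothetical helper name.

```shell
# Hypothetical helper: print the machine type of ELF file $1.
elf_machine() (
    # e_machine is the 16-bit field at byte offset 18 of the ELF header.
    set -- $(od -An -tu1 -j18 -N2 "$1")
    [ "$#" -eq 2 ] || { echo "unknown"; exit 1; }
    code=$(( $1 + $2 * 256 ))
    case $code in
        62)  echo x86_64 ;;
        183) echo aarch64 ;;
        *)   echo "unknown($code)" ;;
    esac
)

# Compare the modpost shipped with the headers against the running machine.
modpost="${KERNELDIR:-}/scripts/mod/modpost"
if [ -f "$modpost" ] && [ "$(elf_machine "$modpost")" != "$(uname -m)" ]; then
    echo "modpost was built for $(elf_machine "$modpost"), host is $(uname -m)"
fi
```

A mismatch here is what produces the "Exec format error" when the build tries to run modpost.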


albe19029 avatar albe19029 commented on September 15, 2024

So, after I added an arm64 modpost (borrowed from an AWS kernel):

if [ "${TARGET_ID}" == "cos" ] && [ "${ARCH}" == "aarch64" ]; then
    cp /modpost "$KERNELDIR/scripts/mod"
fi

the compilation finished successfully. But the code is still not running, so there are probably other errors in lakitu-arm64/kernel-headers.tgz.

For now I have no ideas on how to fix it or investigate further.


albe19029 avatar albe19029 commented on September 15, 2024

I have created a bug for the ChromeOS team: https://issuetracker.google.com/issues/321501036


therealbobo avatar therealbobo commented on September 15, 2024

That's not the first time I've encountered this:

modpost: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[xxHash]=96cdb1cdfa76c1f3, not stripped

In my opinion, you should create a symlink at ./include/asm that points to ./include/asm-generic. Other than that, I'm out of ideas too :/


albe19029 avatar albe19029 commented on September 15, 2024

Well, the file /usr/src/linux-headers-5.15.120+/arch/arm64/include/generated/asm/div64.h has the following content:
#include <asm-generic/div64.h>

Is that OK?


therealbobo avatar therealbobo commented on September 15, 2024

Oops, I missed the part where you said the compilation was successful. Can you attach the logs of the build?


albe19029 avatar albe19029 commented on September 15, 2024

success_logs.txt

I got these logs after changing the link to https://storage.googleapis.com/cos-tools/17412.156.23/lakitu-arm64/kernel-headers.tgz
and replacing modpost with a valid one.

I also added --trace to make and removed > /dev/null, changing

make -C "/usr/src/${DRIVER_NAME}-${DRIVER_VERSION}/bpf" > /dev/null

to

make -C "/usr/src/${DRIVER_NAME}-${DRIVER_VERSION}/bpf" --trace


therealbobo avatar therealbobo commented on September 15, 2024

That's great! But how is sysdig failing? Could you share that log? 🤔


albe19029 avatar albe19029 commented on September 15, 2024

Yes, sysdig failed. Here is the log.
sysdig_log.txt


albe19029 avatar albe19029 commented on September 15, 2024

I have found that on x64, starting from sysdig 0.33.1, sysdig does not work either. It fails with the same error:

libscap: bpf_load_program() event=raw_tracepoint/filler/sys_procexit_e: Operation not permitted

On sysdig 0.32.1 everything works. So maybe the error is not arm64-specific but common to both architectures.


albe19029 avatar albe19029 commented on September 15, 2024

I have checked sysdig 0.32.1 on arm with the link fix, and sysdig works correctly. So there is definitely a regression on COS starting from sysdig 0.33.1.


albe19029 avatar albe19029 commented on September 15, 2024

  1. So this fix is correct:

     if [ "${ARCH}" == "aarch64" ]; then
         BPF_KERNEL_SOURCES_URL="https://storage.googleapis.com/cos-tools/${BUILD_ID}/lakitu-arm64/kernel-headers.tgz"
     else
         BPF_KERNEL_SOURCES_URL="https://storage.googleapis.com/cos-tools/${BUILD_ID}/kernel-headers.tgz"
     fi

  2. Even with the broken GKE binary, sysdig 0.32.1 works correctly on both arm64 and x64 once the link is fixed.

  3. Starting from sysdig 0.33.1, COS does not work on either arm64 or x64.
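
The link selection from point 1 can be tried in isolation. A minimal sketch: cos_headers_url is a hypothetical wrapper around the same two URL patterns and is not a function in scap-driver-loader.

```shell
# Hypothetical helper: print the COS kernel-headers URL for build id $1 and
# architecture $2 (as reported by `uname -m`).
cos_headers_url() (
    build_id=$1
    arch=$2
    base="https://storage.googleapis.com/cos-tools/${build_id}"
    if [ "$arch" = "aarch64" ]; then
        # arm64 images publish their headers under the lakitu-arm64 directory.
        echo "${base}/lakitu-arm64/kernel-headers.tgz"
    else
        echo "${base}/kernel-headers.tgz"
    fi
)

cos_headers_url 17412.156.23 aarch64
```

For the build discussed in this thread, this prints the lakitu-arm64 link quoted earlier.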


albe19029 avatar albe19029 commented on September 15, 2024

Good day, I have found the problem. This commit causes the failure on GKE:

falcosecurity/libs@1e06bd3

There is a loop bounded by two maximum values:
#define MAX_THREADS_GROUPS 30
#define MAX_HIERARCHY_TRAVERSE 60

For the COS kernel these are too big, which leads to this error:

processed 40396 insns (limit 1000000) max_states_per_insn 1 total_states 4057 peak_states 4057 mark_read 73
-- END PROG LOAD LOG --
libscap: bpf_load_program() event=raw_tracepoint/filler/sys_procexit_e: Operation not permitted

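Now the message makes more sense: with these bounds, sys_procexit_e is too big for the BPF verifier on the COS kernel, even though the log above reports only 40396 processed instructions against the 1000000 limit.
I tested a bit and found that with these values: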

#define MAX_THREADS_GROUPS 25
#define MAX_HIERARCHY_TRAVERSE 35

the code works on both arm64 and x64. So I will create an issue for the Falco libs team.
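
To compare verifier cost between builds with different bounds, the summary line can be pulled out of the load log. A minimal sketch: verifier_insns is a hypothetical helper, and it relies only on the "processed N insns (limit M)" format shown in the log above.

```shell
# Hypothetical helper: print "<processed> <limit>" for every verifier summary
# line found in the given files, or on stdin when no file is given.
verifier_insns() (
    sed -n 's/^processed \([0-9][0-9]*\) insns (limit \([0-9][0-9]*\)).*/\1 \2/p' "$@"
)

# Usage with a saved log, for example:
#   verifier_insns sysdig_log.txt
```

Running it on logs from builds with different MAX_THREADS_GROUPS and MAX_HIERARCHY_TRAVERSE values shows how much each setting changes the processed instruction count.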


albe19029 avatar albe19029 commented on September 15, 2024

As I understand it, the only thing I need from your side is the link fix for arm64:

https://storage.googleapis.com/cos-tools/17412.156.23/lakitu-arm64/kernel-headers.tgz

Then the ticket can be closed.


therealbobo avatar therealbobo commented on September 15, 2024

Hey @albe19029! Thank you so much for the in-depth investigation! Great catch!


albe19029 avatar albe19029 commented on September 15, 2024

@therealbobo Sorry, you closed the ticket, but what about the invalid link? As far as I can see, it is still not fixed.


albe19029 avatar albe19029 commented on September 15, 2024

The fix for falcosecurity/libs#1639 is ready. Do you know when the next sysdig release will be, and is it possible to include this fix in it?
The current 0.34.1 is broken on GKE.


therealbobo avatar therealbobo commented on September 15, 2024

The next sysdig release is coming in the next few days. I have to double-check, but I think we can apply this patch :)


albe19029 avatar albe19029 commented on September 15, 2024

That would be great, as this bug is a serious blocker for us.
Thanks in advance.


albe19029 avatar albe19029 commented on September 15, 2024

@therealbobo Will there be a fix for the invalid link in scap-driver-loader.in?

Also, 0.35.0 was released but without the COS driver fix. Do you know when there will be a patch release?


therealbobo avatar therealbobo commented on September 15, 2024

Hey @albe19029! I just released 0.35.1 with all the fixes! Please let me know if you encounter any problem! :)

