debarshiray commented on September 13, 2024

The daily CI runs, linked from README.md, include package-based Fedora 40, and they are passing.

I wonder if this is specific to Fedora Silverblue.

What happens if you do this on the host:

# touch /tmp/machine-id
# mount --rbind /etc/machine-id /tmp/machine-id

A6GibKm commented on September 13, 2024

This is not my only silverblue 40 machine, the other seems to be able to run the toolbox just fine.

$ touch /tmp/machine-id
$ mount --rbind /etc/machine-id /tmp/machine-id
mount: /tmp/machine-id: must be superuser to use mount.
       dmesg(1) may have more information after failed mount system call.
$ sudo mount --rbind /etc/machine-id /tmp/machine-id
$ toolbox enter
Error: failed to initialize container fedora-toolbox-40

I also tried the touch with sudo.

A6GibKm commented on September 13, 2024

For more context, this is the second time it has happened this month. I recreated the toolbox only a few days ago. I saw another report in a GNOME Matrix channel.

debarshiray commented on September 13, 2024

> This is not my only silverblue 40 machine, the other seems to be able to run the toolbox just fine.
>
> $ touch /tmp/machine-id
> $ mount --rbind /etc/machine-id /tmp/machine-id
> mount: /tmp/machine-id: must be superuser to use mount.
>        dmesg(1) may have more information after failed mount system call.
> $ sudo mount --rbind /etc/machine-id /tmp/machine-id

The mount(8) call has to be run as root. That's why I used a # prompt in my example.

Looking at the code, /etc/machine-id is the first bind mount that the container's entry point attempts to do, and then there's this from the container's logs:

level=debug msg="Running as real user ID 0"
...
level=debug msg="Binding /etc/machine-id to /run/host/etc/machine-id"
mount: /etc/machine-id: must be superuser to use mount.
       dmesg(1) may have more information after failed mount system call.

Those two things can't be true at the same time. So, I am beginning to wonder if there's something going wrong inside mount(8). It would be revealing to prepend a call to strace(1) and then try it with a Toolbx container that includes the strace(1) binary. Something like:

$ git diff
diff --git a/src/cmd/initContainer.go b/src/cmd/initContainer.go
index de7bcfcc5302..c6108edc4135 100644
--- a/src/cmd/initContainer.go
+++ b/src/cmd/initContainer.go
@@ -724,6 +724,7 @@ func mountBind(containerPath, source, flags string) error {
        logrus.Debugf("Binding %s to %s", containerPath, source)
 
        args := []string{
+               "mount",
                "--rbind",
        }
 
@@ -733,7 +734,7 @@ func mountBind(containerPath, source, flags string) error {
 
        args = append(args, []string{source, containerPath}...)
 
-       if err := shell.Run("mount", nil, nil, nil, args...); err != nil {
+       if err := shell.Run("strace", nil, nil, nil, args...); err != nil {
                return fmt.Errorf("failed to bind %s to %s", containerPath, source)
        }

However, you need Toolbx to build Toolbx on Fedora Silverblue. So, I suppose I should put together a debug RPM.
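
For reference, a rough shell equivalent of what the patched call would end up spawning. This is only a sketch: the function name is hypothetical, and the optional mount flags are left out because the diff does not show how they are appended.

```shell
#!/bin/sh
# Hypothetical helper: print the command line that the patched mountBind()
# would run for a given source and container path. Optional mount flags are
# omitted because the diff above does not show how they are handled.
straced_bind_cmd() {
    source_path=$1
    container_path=$2
    printf 'strace mount --rbind %s %s\n' "$source_path" "$container_path"
}

# Paths taken from the "Binding ..." log line quoted earlier in this comment.
straced_bind_cmd /run/host/etc/machine-id /etc/machine-id
```

With the debug build, the strace(1) output for this command should then show up in the container's logs rather than on the host terminal.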

A6GibKm commented on September 13, 2024

I just upgraded my other machine and its toolbox still works. This is very weird considering the machines are configured the same (afaik). If you prepare an RPM or a binary, I can try that. Thanks!

debarshiray commented on September 13, 2024

Just tried it with this Fedora 40 Silverblue deployment and couldn't reproduce:

Deployments:
● fedora:fedora/40/x86_64/silverblue
                  Version: 40.20240618.0 (2024-06-18T00:52:57Z)
               BaseCommit: fa68d62df2fae64e52bbfe15784915c78ab2914767cacded8c5de2f5b7ddab62
             GPGSignature: Valid signature by 115DF9AEF857853EE8445D0A0727707EA15B79CC

Just to be sure, do you have the same deployment on both your machines?
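
One quick way to compare is to pull the version string out of the status output on each machine. A minimal sketch, assuming the output has the same shape as the excerpt above; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the first "Version:" value found on standard
# input. Assumes the input looks like the deployment excerpt above.
deployment_version() {
    awk '$1 == "Version:" { print $2; exit }'
}

# Example using the excerpt from this comment. On a real host one might pipe
# the output of `rpm-ostree status` into the function instead.
deployment_version <<'EOF'
● fedora:fedora/40/x86_64/silverblue
                  Version: 40.20240618.0 (2024-06-18T00:52:57Z)
               BaseCommit: fa68d62df2fae64e52bbfe15784915c78ab2914767cacded8c5de2f5b7ddab62
EOF
```

Running it on both machines and comparing the two strings is enough to tell whether the deployments differ.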

debarshiray commented on September 13, 2024

I submitted a Fedora 40 build for a debug RPM:
https://koji.fedoraproject.org/koji/taskinfo?taskID=119303969

A6GibKm commented on September 13, 2024

Nah, the broken one is on yesterday's deployment (40.20240618.0 (2024-06-18T00:52:57Z)) and the other machine, which is working, has today's. I am upgrading right now, but I don't think this is it.

A6GibKm commented on September 13, 2024

Attached the output of

strace toolbox enter &> strace.txt

with the debug build. Is that enough?

strace.txt

EDIT: Note that the error is different this time? I still see

jun 19 20:47:28 alpha fedora-toolbox-40[8291]: Error: failed to bind /etc/machine-id to /run/host/etc/machine-id

in journalctl -b.

debarshiray commented on September 13, 2024

> Attached the output of
>
> strace toolbox enter &> strace.txt

We don't need to run strace(1) against toolbox enter. For that we wouldn't need a debug build.

The debug build adjusts the toolbox(1) binary so that the container's entry point runs mount(8) under strace(1) inside the container. So we need to look at the strace(1) output from podman start --attach or podman logs.

A6GibKm commented on September 13, 2024

Sorry, I am not sure how to get the strace output from inside the container. Do you mean

$ strace podman start --attach fedora-toolbox-40 &> podman-attach.txt

? If so, it is attached below.

podman-attach.txt

debarshiray commented on September 13, 2024

No need to manually attach strace(1) anywhere.

Before you install the debug build of toolbox, ensure that you have a Toolbx container with strace(1) in it.

Then, install the debug build of toolbox, stop all your containers with podman stop --all, then try to enter the one that has strace(1) in it. If the error reproduces, then share with us what you have in podman start --attach ... or podman logs ....
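
The sequence could be sketched roughly as below. The wrapper, its dry-run switch and the container name are hypothetical; only podman stop --all, toolbox enter and podman logs come from the steps above. By default the script just prints the commands, so it can be read through before anything is actually run.

```shell
#!/bin/sh
# Hypothetical wrapper around the steps above. With DRY_RUN=1 (the default
# here) each command is only printed; set DRY_RUN=0 to really run them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

collect_entry_point_logs() {
    container=$1
    run podman stop --all
    # Entering is expected to fail when the bug reproduces, so its exit
    # status is ignored and the logs are fetched regardless.
    run toolbox enter --container "$container" || true
    run podman logs "$container"
}

# "strace-toolbox-40" is a placeholder for a container that ships strace(1).
collect_entry_point_logs strace-toolbox-40
```

Redirecting the output of the last step to a file gives something that can be attached here.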

nielsdg commented on September 13, 2024

FWIW, I can reliably reproduce my toolbox containers breaking after doing a reboot

nielsdg commented on September 13, 2024

Maybe this is related? https://discussion.fedoraproject.org/t/rpm-ostree-update-breaks-toolbox-fedora-40/120095/4

A6GibKm commented on September 13, 2024

I did a reset of my config. Here is the diff of the prior and newer output of podman system info:

--- a	2024-06-19 23:28:25.883686898 +0200
+++ b	2024-06-19 23:28:38.536401465 +0200
@@ -13,17 +13,17 @@
     path: /usr/bin/conmon
     version: 'conmon version 2.1.10, commit: '
   cpuUtilization:
-    idlePercent: 91.02
-    systemPercent: 3.79
-    userPercent: 5.2
+    idlePercent: 92.44
+    systemPercent: 3.63
+    userPercent: 3.93
   cpus: 16
-  databaseBackend: boltdb
+  databaseBackend: sqlite
   distribution:
     distribution: fedora
     variant: silverblue
     version: "40"
   eventLogger: journald
-  freeLocks: 2047
+  freeLocks: 2048
   hostname: alpha
   idMappings:
     gidmap:
@@ -43,7 +43,7 @@
   kernel: 6.9.4-200.fc40.x86_64
   linkmode: dynamic
   logDriver: journald
-  memFree: 11161944064
+  memFree: 10079854592
   memTotal: 16673759232
   networkBackend: netavark
   networkBackendInfo:
@@ -99,7 +99,7 @@
       libseccomp: 2.5.5
   swapFree: 8589930496
   swapTotal: 8589930496
-  uptime: 0h 2m 32.00s
+  uptime: 0h 2m 18.00s
   variant: ""
 plugins:
   authorization: null
@@ -122,25 +122,25 @@
 store:
   configFile: /var/home/deathwish/.config/containers/storage.conf
   containerStore:
-    number: 1
+    number: 0
     paused: 0
     running: 0
-    stopped: 1
+    stopped: 0
   graphDriverName: overlay
   graphOptions: {}
   graphRoot: /var/home/deathwish/.local/share/containers/storage
   graphRootAllocated: 1000204886016
-  graphRootUsed: 952381968384
+  graphRootUsed: 949989421056
   graphStatus:
     Backing Filesystem: btrfs
-    Native Overlay Diff: "false"
+    Native Overlay Diff: "true"
     Supports d_type: "true"
-    Supports shifting: "true"
+    Supports shifting: "false"
     Supports volatile: "true"
     Using metacopy: "false"
   imageCopyTmpDir: /var/tmp
   imageStore:
-    number: 1
+    number: 0
   runRoot: /run/user/1000/containers
   transientStore: false
   volumePath: /var/home/deathwish/.local/share/containers/storage/volumes
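
Several of the changed lines above (CPU utilization, free memory, uptime) differ between any two runs. A small filter like the following could hide them so that only configuration changes such as databaseBackend stand out. It is only a sketch: the function name is hypothetical and the field list is taken from the diff above.

```shell
#!/bin/sh
# Hypothetical filter: drop fields that change on every invocation so that a
# later diff shows only configuration differences. The field names are the
# volatile ones seen in the diff above.
stable_info() {
    grep -v -E '^[[:space:]]*(idlePercent|systemPercent|userPercent|memFree|uptime):'
}

# Example on a few lines from the newer output. On a real host one might run
# `podman system info | stable_info > before.txt` before the reset and do the
# same into after.txt afterwards.
stable_info <<'EOF'
  databaseBackend: sqlite
  memFree: 10079854592
  uptime: 0h 2m 18.00s
EOF
```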

debarshiray commented on September 13, 2024

> Maybe this is related? https://discussion.fedoraproject.org/t/rpm-ostree-update-breaks-toolbox-fedora-40/120095/4

I quickly skimmed through it. On the surface it doesn't seem related to why mount(8) thinks that it's not running as root.
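
One way to narrow that down, assuming a shell can be started in an environment where the problem reproduces, would be to print the real and effective user IDs next to the effective capability mask. The mount(2) system call needs CAP_SYS_ADMIN, so being UID 0 is not sufficient on its own. The function name below is hypothetical; id(1) and /proc/self/status are standard on Linux.

```shell
#!/bin/sh
# Hypothetical diagnostic: show which user the process runs as and which
# capabilities it holds.
show_privileges() {
    printf 'real uid: %s\n' "$(id -ru)"
    printf 'effective uid: %s\n' "$(id -u)"
    # CapEff is the effective capability bitmask in hexadecimal. It is read
    # from the grep process itself, which normally inherits it from this shell.
    grep '^CapEff:' /proc/self/status
}

show_privileges
```

If libcap's capsh(1) is installed, capsh --decode=<mask> turns the hexadecimal value into capability names, which makes it easy to see whether cap_sys_admin is present.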

debarshiray commented on September 13, 2024

> I did a reset of my config. Here is the diff of the prior and newer output of podman system info

Did resetting the Podman configuration reliably fix this problem?

debarshiray commented on September 13, 2024

> FWIW, I can reliably reproduce my toolbox containers breaking after doing a reboot

Okay, that's great. Are you in a position to get the strace(1) logs using the debug build, like I described above? If things are really badly broken, then I can come up with other steps. :)

A6GibKm commented on September 13, 2024

Not at home atm, but no. I was not able to create new toolboxes. I will check in more detail later today

debarshiray commented on September 13, 2024

> I was not able to create new toolboxes.

Why? What was the exact problem?

If you can't enter a container to install strace, then you can create a custom image using a Container/Dockerfile like this:

FROM registry.fedoraproject.org/fedora:40
RUN dnf --assumeyes install strace

... followed by:

$ podman build --squash --tag localhost/strace-toolbox:40 /path/to/dir/with/Containerfile

Then you can create a container from this image:

$ toolbox create --image localhost/strace-toolbox:40

Then you can try to enter it with the debug toolbox RPM above and see what shows up in podman start --attach or podman logs.

A6GibKm commented on September 13, 2024

I was able to enter that container without any issues, so there was nothing to strace :(. By the way, after removing the debug build of toolbox I am able to create and enter new toolboxes (after the podman system reset).

alistair23 commented on September 13, 2024

> For more context, this is the second time it has happened this month. I recreated the toolbox only a few days ago. I saw another report in a GNOME Matrix channel.

The exact same thing happened to me. Just re-built all the containers and now can't enter them again

alistair23 commented on September 13, 2024

> I was not able to create new toolboxes.
>
> Why? What was the exact problem?
>
> If you can't enter a container to install strace, then you can create a custom image using a Container/Dockerfile like this:
>
> FROM registry.fedoraproject.org/fedora:40
> RUN dnf --assumeyes install strace
>
> ... followed by:
>
> $ podman build --squash --tag localhost/strace-toolbox:40 /path/to/dir/with/Containerfile
>
> Then you can create a container from this image:
>
> $ toolbox create --image localhost/strace-toolbox:40
>
> Then you can try to enter it with the debug toolbox RPM above and see what shows up in podman start --attach or podman logs.

I tried to follow this, but newly created images work. It's just existing ones I can't enter.

For the last month it seems like the container images need to be rebuilt after every reboot on my Kinoite system

nielsdg commented on September 13, 2024

(Un)fortunately, I can't reproduce this anymore after doing a Silverblue update and resetting my containers as recommended in that previous link. So I can't really help with this anymore, but hey, at least things work again :-)
