Today we are moving bytes back and forth between std streams and master pty. The stdou

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

but but but I would love to be Go somehow <p dir="auto"

conmon: Handle logs,about cri-o/cri-o

Comments (20)

mrunalp commented on June 2, 2024 2

Logging plugins will come at kube level. We just have to write to the file specified in CRI.

On Oct 7, 2016, at 9:57 AM, Antonio Murdaca [email protected] wrote:

I don't want to make ocid depend on systemd.

but we will no? I mean, as docker has an option we would likely have the same since we want journald

—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or mute the thread.

from cri-o.

cyphar commented on June 2, 2024

Is this something that I need to consider as part of the runC console rewrite? Or do you think the --console-socket API won't make a difference in regards to this problem? Or do you just want to switch what fd conmon will pipe data to?

from cri-o.

mrunalp commented on June 2, 2024

@cyphar I think it could be a separate PR to pass additional fds to runc that it can dup stdout/stderr to. What do you think? Else, we will have to change the fds that we write to in conmon as you said.
The 5 problems that we need to solve in conmon are:

exit code: Capture the container exit code so we can pass it back as pass of status. (this kinda works but I want to look into signalfd).
tty handling: Holding the master pty that you pass back via console-socket.
attach: Allow remote clients to connect to the container stdio possibly over http2.
exec: Make exec work with tty support.
logs: Store the container logs so they could be accessed by the kubelet.

from cri-o.

cyphar commented on June 2, 2024

As we discussed in the --console-socket PR, I think that having an API to specify what fds to dup over stdio would be incredibly useful. We could use the same API to specify what fd runC should copy data to/from (if you're not detaching). The only downside is that you couldn't really use runc create...

from cri-o.

cyphar commented on June 2, 2024

FWIW, I'd be happy to help work on this. Though, I don't think that exec should be handled inside the same conmon process as the one actually running the container -- we can spin up a new conmon for exec. Though, I really want to avoid the "garbage collection" code that exists in Docker when doing docker exec. Actually, handling exec as part of a HTTP API might end up being cleaner -- a single conmon to manage a single container. 😛

I'll see if I can come up with a decent solution for logging before looking into attach / exec. As for the tty handling, this won't be massively changed from the current situation inside conmon. The actual handling code should be the same.

The only problem I see coming up is that if we end up making a HTTP API, we'll need to write the majority of the code in something more safe than C. But at least we're not doing anything dodgy that requires C anyway (AFAICS). 😉

from cri-o.

mrunalp commented on June 2, 2024

@cyphar Awesome. Logging, I think should be straight-forward. We need to pass -l parameter to conmon and pass the log config from CRI down and add it as an additional output to the epoll code. HTTP API is going to be tricky with C 😉 I think that libuv might be a good candidate if we want to keep this code in C. I have no objections to going away from C but whatever we choose will most likely need to have good support for epoll.

from cri-o.

cyphar commented on June 2, 2024

We need to pass -l parameter to conmon and pass the log config from CRI down and add it as an additional output to the epoll code.

Is log rolling meant to be handled by us? Or is that all part of CRI so we just have to implement that? I remember that Docker had some lovely issues with disk exhaustion due to lack of log rolling. Also, I'd like to make sure that we always provide a plaintext file-based log directory so administrators can have their own logging setup on minions.

I think that libuv might be a good candidate if we want to keep this code in C. I have no objections to going away from C but whatever we choose will most likely need to have good support for epoll.

I'll think about it. The main concern I have is that we'd have to multithread all the things, and I don't like threading in C. Go has kinda spoiled me in that regard. 😛 But we'll see how things go (pun not intended) -- since the kubelet will be running locally on the machine we can come up with some other ways of handling the problem.

from cri-o.

runcom commented on June 2, 2024

I think that libuv might be a good candidate if we want to keep this code in C. I have no objections to going away from C but whatever we choose will most likely need to have good support for epoll.

libuv sounds fine to me (used that in my nodejs days lol) - but but but I would love to be Go somehow 😢

from cri-o.

cyphar commented on June 2, 2024

but but but I would love to be Go somehow

My initial feeling was the same. But from my experience with Docker and how many memory issues it's had, I really don't want to get into the business of having long-running services written in Go that we cannot restart because they're tightly bound to containers -- containerd has the same problem from my testing. C is the "right" answer, the only question is how will we make the API work so that it's the least trouble.

libuv seems like a decent idea, but I'll have to read up on it (my only experience comes from NeoVim) after I've implemented the logging. We'll see how it goes!

from cri-o.

rhatdan commented on June 2, 2024

Is this also going to plugin to journald?

from cri-o.

cyphar commented on June 2, 2024

It will output text log files that kubernetes will then suck up. You could presumably add some way to read the logs using journald, but tbh it's not a personal priority for me to get journald integration -- I don't want to make ocid depend on systemd.

from cri-o.

runcom commented on June 2, 2024

I don't want to make ocid depend on systemd.

but we will no? I mean, as docker has an option we would likely have the same since we want journald

from cri-o.

rhatdan commented on June 2, 2024

From a security point of view having these logs recorded permanently on the host, is something we are interested in. I don't care if this is optional, but I believe we should allow administrators to know what the containers were doing on the host. Similarly we might need to add some kind of auditing eventually, even if it is of limited use,since the only one ever doing activity is k8s.

from cri-o.

cyphar commented on June 2, 2024

Anyway, I'm currently working on this in a branch of mine: https://github.com/cyphar/cri-o/tree/conmon-logging. But it looks like we're not using the CRI API for logs properly (logPath is relative to logDirectory which is set for the sandbox). I'm trying to fix it, but for some reason conmon isn't being cooperative. Namely I keep getting EISDIR, presumably because there's something wrong with logPath even after I "fixed" it.

from cri-o.

mrunalp commented on June 2, 2024

@cyphar Hmm, I will take a look at your branch.

from cri-o.

laijs commented on June 2, 2024

how about remove the --console[-socket] and --detach from runc
and add --monitor monitor_cmd_path instead?

the runc should launch the monitor via monitor [--pid pid] [--ptymaster fd] [--stdpipe{in,out,err} fd]

the runc has its builtin/default monitor, which just copies&forwards the stdio and watches the pid.
the runc waits the monitor instead of the container/exec. the detach mode is done via "the monitor fork() and the parent quits"

the non-default monitors can do all the things such as (or more)

exit code: Capture the container exit code

tty handling: Holding the master pty

attach: Allow remote clients to connect to the container stdio possibly over http2.

exec: Make exec work with tty support.

logs: Store the container logs so they could be accessed by the kubelet.

the monitor for containerd is (modified) containerd-shim.
the monitor for ocid is (modified) cri-o/conmon.

Benefit:

more flexible
move the current default monitor out from runc. (there are still in the same project.)
the way to handle the ptymaster is simplified.

from cri-o.

cyphar commented on June 2, 2024

@laijs Can you please stop commenting on random PRs with comments about --console-socket? It makes it difficult for everyone to keep track of the discussion and also derails threads that have nothing to do with --console-socket. The one and only correct place to discuss this is opencontainers/runtime-spec#513. That's it.

To answer your point about --monitor, it would require making runC a daemon. Or runC would have to spawn a daemon. I was reading through the runV code and from my perspective runV should see whether it is possible to implement the console handling without forcing everyone else to implement things as a daemon -- or at the very least come up with some API that doesn't require the existence of a daemon.

Also "watching the pid" is not possible with the POSIX process model unless you're the parent. Which might work but then is also quite ugly because you'd have to get co-operation from the monitor to set PRCTL_SET_CHILD_SUBREAPER and then you can double-fork underneath it inside a PID namespace. But it would just be super ugly.

EDIT: And I just realised that it would probably break exec.

from cri-o.

cyphar commented on June 2, 2024

@laijs

Also,

more flexible

I disagree because of the POSIX process model and the fact that you'd have to mandate how the runtime ran the monitor (which would almost certainly break things). Not to mention that it would be uncontained-RCE-as-a-service.

move the current default monitor out from runc. (there are still in the same project.)

runC doesn't have a monitor with runc create. Or run --detach. So you're asking us to add a monitor (which is actually a daemon).

the way to handle the ptymaster is simplified.

Not really, it just moves who has to deal with the ptymaster. To be fair, it is flexible enough that you could now have your runV implementation and ours both work with the API -- but it would break a lot of other things as a result.

from cri-o.

laijs commented on June 2, 2024

@laijs Can you please stop commenting on random PRs with comments about --console-socket?

I know the console-proxy can handle it in runv. but I was finding a "proper" way to handle it. I guess you are still finding a way to pass the console back, different purposes but they have some kinds of similar base.

I hope the cri-o use the same process to handle runc&runv until we reach some invincible obstacles(many obstacles did occur). but we should reduce the different processes when runtime is different. runc&runv are worth to be considered at the same time, it might also help for the windows hyperv container.

I had been finding a "proper" way to handle it, I thought many and killed many. any solution has its bad site. this one are the same.

I chose this issue to reply just because this issue discuss "The 5 problems that we need to solve in conmon".

Also "watching the pid" is not possible with the POSIX process model unless you're the parent.

only "waitpid(container_pid)" in my mind when I wrote the previous comment.
I know nothing about CHILD_SUBREAPER.
I thought the the init process of the infra-container(also the init process of the pod pid ns) have the responsibility to reap all the other processes in the containers/execs. it is the final reaper chosen by the kernel and is the same as CHILD_SUBREAPER.
I did forgot that the containerd don't have infra-container and the reaper is missing in this solution.

EDIT: And I just realised that it would probably break exec.

exec also have its own monitor in my mind when I wrote the previous comment.

please ignore this and I will stop post any un-tried solution.

from cri-o.

cyphar commented on June 2, 2024

@laijs

only "waitpid(container_pid)" in my mind when I wrote the previous comment.
I know nothing about CHILD_SUBREAPER.

waitpid(2) does not work with non-children (see the man page), this is a part of POSIX. You can try to get around it by using CHILD_SUBREAPER and then reparenting yourself to the process doing waitpid(2) but that's ugly, Linux-specific and won't work in most cases. From the man page:

All of these system calls are used to wait for state changes in a child of the calling process

So you'll need to figure out a way to deal with this. The big benefit (and trust me, I spent a long time considering different methods of solving this problem before writing a PR about it) of --console-socket is that it doesn't tie the lifetime of the runtime to the lifetime of the pty, and it doesn't add any extra restrictions on the way you have to implement a runtime. Yes, runV has problems but those problems are unique to VMs where you don't have a real pty to interact with in the host.

I would suggest writing kernel patches to make this simpler (because IMO inside runV you're always going to have limitations with any API you come up with since you haven't solved the core problem of "users want full control of the pty master but we can't just give them the fd"). But I understand that adding more VM hacks to the kernel is probably not a good thing overall. I'm willing to listen and discuss solutions, I just want to have a discussion that isn't based around the idea that what runC is (planning on) doing is "fundamentally broken" which is how your first message was written.

I chose this issue to reply just because this issue discuss "The 5 problems that we need to solve in conmon".

The reason I don't think this is the right place to discuss this is that conmon doesn't decide what the API is -- we are just trying to get kubelets to work with whatever API works. I understand that most of the people who work on runC are also here (so it might seem logical to split things), but it's a bit annoying to have to switch between repos to have discussions about one topic.

Not to mention that the problems conmon needs to solve (or rather, the features it needs to implement) are not a runC problem. We just haven't written the code to implement these things yet.

I hope the cri-o use the same process to handle runc&runv until we reach some invincible obstacles(many obstacles did occur).

I agree, however right now there is an obstacle: runV doesn't seem to implement a console API that is compatible with runC. While this can almost certainly be solved (my suggestions were examples there are certainly others) right now we're focused on getting Kubelet to work before focusing on making sure all runtimes work together. Specifically right now we are writing conmon around the runC API because there isn't a runtime-spec console API that we can use yet. If you want to have your opinions heard when the spec is being designed please comment there (I will still read it, it'll just be in the right context).

from cri-o.

conmon: Handle logs about cri-o HOT 20 CLOSED

Comments (20)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent