
automaxprocs's People

Contributors

abhinav, akshayjshah, albertyw, dependabot[bot], dprotaso, emadolsky, jasonlai, jcorbin, jkotas, maciej, manjari25, mtneug, mway, prashantv, r-hang, rayzyar, rphillips, savetherbtz, shirchen, shvc, stevenh, sywhang, tchung1118, wallee94, yutaroyamanaka, zamony


automaxprocs's Issues

Kubernetes VPA not working with fractional quotas

I found an issue when using automaxprocs in Kubernetes pods managed and autoscaled by a VPA.

For containers with a fractional CPU limit between 1 and 2 cores, the current implementation rounds GOMAXPROCS down to 1. This means the container will never use more than one core's worth of CPU, because it has a single active thread, and the VPA won't scale it up because the container still appears to have resources available.

For some components that depend on automaxprocs, like Prometheus, it would be helpful to have an option to round the quota up instead of down, so the container can reach the VPA threshold and trigger a scale-up.

Technically, any fractional number of CPUs has the same problem, but at higher core counts it becomes more likely that the VPA threshold sits below the rounded-down quota. The problem is most frequent in pods with fractional CPU limits between 1 and 2.

NewMountPointFromLine fails under WSL

If you use NewMountPointFromLine on WSL, it fails because the super options field can contain paths with spaces.

   --- FAIL: TestLogger/default (0.00s)
        uber-go/automaxprocs/maxprocs/maxprocs_test.go:69: 
            	Error Trace:	maxprocs_test.go:69
            	Error:      	Received unexpected error:
            	            	invalid format for MountPoint: "560 77 0:138 / /Docker/host rw,noatime - 9p drvfs rw,dirsync,aname=drvfs;path=C:\\Program Files\\Docker\\Docker\\resources;symlinkroot=/mnt/,mmap,access=client,msize=262144,trans=virtio"
            	Test:       	TestLogger/default
            	Messages:   	Set failed

invalid format for MountPoint in WSL2

The most recent version of goreleaser includes this module, and I encountered an error running goreleaser under WSL. I use Docker Desktop with the WSL engine, which creates mounts inside Linux that point to Windows paths. The exact error is:

invalid format for MountPoint: "2619 80 0:302 / /Docker/host rw,noatime - 9p drvfs rw,dirsync,aname=drvfs;path=C:\\Program Files\\Docker\\Docker\\resources;symlinkroot=/mnt/,mmap,access=client,msize=262144,trans=virtio"

From a quick scan of the code, I assume the mount-line splitting breaks because of the spaces and double backslashes in the path option. I am using version 1.5.2 of this module.

Doesn't work on ECS (nested cgroups)

Unlike Kubernetes, ECS only allows you to apply a CPU quota at the task (pod) level. Containers in the task are always unbounded.

For example, when cpu: 1024 (1 vCPU) is provided in the task definition it gets the expected quota:

cat /sys/fs/cgroup/cpu,cpuacct/ecs/1576650513ed4c5d9328a6d67a8a741b/cpu.cfs_quota_us
100000

But providing cpu: 1024 to a container inside the same task doesn't have the same effect:

cat /sys/fs/cgroup/cpu,cpuacct/ecs/1576650513ed4c5d9328a6d67a8a741b/2030454c61d157d4c38f0606fe99667bca8961ce4e0019d3667f67e625f40c12/cpu.cfs_quota_us
-1

(The container's cpu value is only used for placement and CPU shares, but doesn't actually affect CPU scheduling aws/containers-roadmap#1862.)

If the container is using automaxprocs it only sees a quota of -1 and defaults to using all of runtime.NumCPU, even though the task's cgroup clamps it to 1 vCPU.

(I'm using cgroups v1 as an example here but the same is true with v2 as well, if you happen to be using an AL2023 AMI.)

It seems like the library could walk up the cgroup hierarchy to find quotas belonging to parent groups, but this is suboptimal if the task has more than one container.

I'm mostly writing this down to help anyone else avoid this rabbit hole.

Needs readme, with bonus screenshots

People shouldn't land on our repos and be expected to write/run code to see what it does. Take a look at any of our other big OSS projects for a sense of what we're looking for.

cc @prashantv not sure how this got past the internal review

v1.5.0 I am still getting that the CPU Quota is not defined.

Originally posted by @ardan-bkennedy in #44 (comment)

This change has not fixed the problem. I switched my code from "github.com/emadolsky/automaxprocs/maxprocs" to v1.5.0 and I am still getting that the CPU Quota is not defined.

I noticed that in v1.5.0

https://github.com/uber-go/automaxprocs/blob/v1.5.0/internal/runtime/cpu_quota_linux.go#L35

func CPUQuotaToGOMAXPROCS(minValue int) (int, CPUQuotaStatus, error) {
	cgroups, err := newQueryer()
	if err != nil {
		return -1, CPUQuotaUndefined, err
	}

	quota, defined, err := cgroups.CPUQuota()
	if !defined || err != nil {
		return -1, CPUQuotaUndefined, err
	}

	maxProcs := int(math.Floor(quota))
	if minValue > 0 && maxProcs < minValue {
		return minValue, CPUQuotaMinUsed, nil
	}
	return maxProcs, CPUQuotaUsed, nil
}

var (
	_newCgroups2 = cg.NewCGroups2ForCurrentProcess
	_newCgroups  = cg.NewCGroupsForCurrentProcess
)

func newQueryer() (queryer, error) {
	cgroups, err := _newCgroups2()
	if err == nil {
		return cgroups, nil
	}
	if errors.Is(err, cg.ErrNotV2) {
		return _newCgroups()
	}
	return nil, err
}

In the emadolsky repo, it's working. I don't know enough to tell you why.

func CPUQuotaToGOMAXPROCS(minValue int) (int, CPUQuotaStatus, error) {
	var quota float64
	var defined bool
	var err error

	isV2, err := cg.IsCGroupV2()
	if err != nil {
		return -1, CPUQuotaUndefined, err
	}

	if isV2 {
		quota, defined, err = cg.CPUQuotaV2()
		if !defined || err != nil {
			return -1, CPUQuotaUndefined, err
		}
	} else {
		cgroups, err := cg.NewCGroupsForCurrentProcess()
		if err != nil {
			return -1, CPUQuotaUndefined, err
		}

		quota, defined, err = cgroups.CPUQuota()
		if !defined || err != nil {
			return -1, CPUQuotaUndefined, err
		}
	}

	maxProcs := int(math.Floor(quota))
	if minValue > 0 && maxProcs < minValue {
		return minValue, CPUQuotaMinUsed, nil
	}
	return maxProcs, CPUQuotaUsed, nil
}

The log format should be configurable or customizable

After importing automaxprocs, the process prints "2019/07/09 14:18:22 maxprocs: Leaving GOMAXPROCS=2: CPU quota undefined" to STDOUT.
There is no way to tell whether this line is an INFO, WARNING, or ERROR log.

If the log format were configurable or customizable, a service using automaxprocs could emit the message in its own logging format.

go test -cpu

I want to define the number of CPUs for go test via -cpu.
Would it be possible to expose this library directly as a tool (with a main package) that prints the CPU quota, so it can be used for testing?

For example, I want to do something like that:

- go install go.uber.org/automaxprocs@latest
- automaxprocs > GOMAXPROCS.txt
- go test -cpu $(cat GOMAXPROCS.txt) ./...

Provide an API to expose the read value without setting GOMAXPROCS

The logic that reads the core count from cgroups is very valuable. In some settings it is desirable to read the value without setting GOMAXPROCS. This issue requests that the maxprocs subpackage provide a method returning the computed value without touching the runtime setting.

i don't even know what's going on here send help

Contour calling into automaxprocs, leading to an ungoogleable abomination of an error message

time="2023-05-23T12:59:09Z" level=fatal msg="failed to set GOMAXPROCS" error="path \"/docker.slice/docker-519e5d83d83094f4e960534da76ca770702b022fbe0f96c91d99699ab507a292.scope\" is not a descendant of mount point root \"/docker.slice/docker-519e5d83d83094f4e960534da76ca770702b022fbe0f96c91d99699ab507a292.scope/kubelet\" and cannot be exposed from \"/sys/fs/cgroup/misc/kubelet\""

running in kind, fedora WSL host, daemon.json:

{
    "default-cgroupns-mode": "private",
    "cgroup-parent": "docker.slice"
}

booted with systemd.unified_cgroup_hierarchy=1

is not a descendant of mount point root

version: v1.5.3

error info:
msg="failed to set GOMAXPROCS: path "/kubepods.slice/kubepods-pod3f8b57a9_829a_4d47_b220_b830da173956.slice/cri-containerd-2b7a54862ac13ec3bf45d691f6e900ba67b27ca3718dab124f6d04e1644ee53e.scope" is not a descendant of mount point root "/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod497e810a_3a2e_49d6_a6a3_ca9b49062993.slice/cri-containerd-f3ee784b971d2e8e2bff1714496b5c813a9f6dc0e726530b004a4a1a01bc470e.scope" and cannot be exposed from "/sys/fs/cgroup/systemd""

maxprocs: CPU quota undefined

This has not always been the case and I have tried to trace the code, but I don't know enough.

The project in question is https://github.com/ardanlabs/service and I am using KIND.

It seems to me that /proc/self/cgroup is missing from the list of SuperOptions. I'm not sure why this is happening now.

Any guidance on debugging or understanding this would be appreciated. I can walk someone through running the code.

[go.mod] Bump Go version to Go 1.21

This is a placeholder issue to bump the Go version in the go.mod files and any other references, so the repository tracks the latest release and picks up its security fixes and performance improvements.

As of today, the latest Go version is Go 1.21

Message change on upgrade to 1.2.0

Go 1.13 and go modules are being used.

I had been using v1.1.0 with Set and a logger. v1.1.0 tells me that GOMAXPROCS has been reduced from 4 to 1 because of cgroups CPUQuota restrictions, which is what I expect.

 {"level":"info","ts":1574415010.177917,"caller":"logger/logger.go:77","msg":"Cores allocation","GOMAXPROCS":4}                                                                                                      
 {"level":"info","ts":1574415010.1784537,"caller":"maxprocs/maxprocs.go:47","msg":"maxprocs: Updating GOMAXPROCS=1: determined from CPU quota"}  

I tried v1.2.0, and now the log message tells me GOMAXPROCS is set to 1 because that is the minimum. This seems incorrect. Note also that the log message format has changed.

 maxprocs/maxprocs.go:47 maxprocs: Updating GOMAXPROCS=1: using minimum allowed GOMAXPROCS.

I have reverted to v1.1.0.

The code fragment is:

    logger.Sugar.Infow(
        "Cores allocation",
        "GOMAXPROCS", runtime.GOMAXPROCS(-1),
    )
    undoMaxProcs, err = maxprocs.Set(maxprocs.Logger(logger.Sugar.Infof))
    if err != nil {
        logger.Sugar.Errorw(
            "Error for automaxprocs",
            "err", err,
        )
    }

Another improvement would be to emit in the log message the old value of GOMAXPROCS as well as what it is changed to.

version 1.5.2 increased CPU consumption

Hey folks,

I upgraded from 1.5.1 to 1.5.2 a few days ago and noticed an increase in my service's CPU consumption.


I still haven't done much digging, but is there any change in this version that could be responsible for an increased CPU usage?

Support for soft RAM limit?

Hello 👋🏻

Are there any plans to support the soft RAM limit introduced in Go 1.19?

(Either in this library or a new one)

Support cgroups v2 and its unified hierarchy

It seems from the code that the library only consults cgroups v1 and its per-subsystem hierarchy. More and more systems are moving to cgroups v2 and the unified hierarchy, so I'm creating this issue to track support for v2.

Add benchmarks to README?

Hi! I was wondering if it would be possible to include some benchmarks in the README? We run some open-source software with CPU quotas, and being able to link to benchmarks in the README might go a long way toward convincing other people to incorporate automaxprocs into their projects as well.

What happens if cpu quota on k8s side is less than 1 core?

Am I right that, with a CPU limit of less than 1000m (i.e. one core), the Go process still thinks it has one core available and tries to use more than its limit, eventually getting throttled? If so, my understanding is that limits of less than one core put the Go scheduler in a very non-ideal position:

a.) 4 replicas with 500m cpu limit
b.) 2 replicas with 1000m cpu limit

In total, both cases use the same number of cores (2), but case b) should be more efficient because the Go scheduler knows how much it can utilize. Is that right?

Sorry that I created a bug ticket for a simple question, but if my assumptions are correct, it would be good to make this clear in the README.

Thanks for the awesome library 👍
