Comments (13)
@BenTheElder, thank you for some of this. Appreciated.
The rule that is catching this specific User-Agent is most definitely the:
rule {
  action = "deny(403)"
  priority = "920"
  match {
    expr {
      expression = "evaluatePreconfiguredWaf('scannerdetection-v33-stable', {'sensitivity': 1})"
    }
  }
  description = "Scanner detection"
  preview = false
}
Per Google's own documentation:
This will lead us to the following:
- https://github.com/coreruleset/coreruleset/blob/main/rules%2FREQUEST-913-SCANNER-DETECTION.conf
- https://github.com/coreruleset/coreruleset/blob/main/rules%2Fscanners-user-agents.data
This is where the OWASP folks decided to block some of the tools, including the "BFAC" project. That said, some of the project names on this list do pass... so it's a bit puzzling which version of the ruleset Google is using exactly.
That said, I don't think there is anything that we could sensibly do here....
- Removing the ruleset from WAF will invite bots, spam and scams, so not ideal
- Trying to manually replicate the "scanner" ruleset from OWASP would be an unmaintainable headache
- Adding a sort of an allowlist that includes most of the popular container runtimes would invite abuse eventually
Some of the projects already use different User-Agent strings, often to mimic curl or popular browsers, so there is no helping it here too, sadly.
As such, we on the CRI-O's side will strip the extra build and release information from the User-Agent, which should limit the possibility of running into some other combination of letters, like "bfac", that would match WAF rules.
On the note of the WAF rules... I wish these were a bit tighter, such that they would match User-Agents more precisely, rather than just a specific word or letter combination anywhere within the entire header. However, it's faster this way and requires less maintenance over time, so it is what it is.
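To illustrate the point about substring matching, here is a minimal sketch, not the actual Cloud Armor or CRS implementation. The token list is a tiny hypothetical excerpt, the User-Agent value is made up for illustration (it is not CRI-O's real format), and `matches_scanner` and `strip_build_info` are hypothetical helper names. Case-insensitive matching is assumed.

```python
# Hypothetical, heavily abbreviated excerpt of a scanner token list;
# the real scanners-user-agents.data file is much longer.
SCANNER_TOKENS = ["bfac", "nikto", "sqlmap"]


def matches_scanner(user_agent: str) -> bool:
    """Return True if any token appears anywhere in the header value."""
    ua = user_agent.lower()  # assumption: matching is case-insensitive
    return any(token in ua for token in SCANNER_TOKENS)


def strip_build_info(user_agent: str) -> str:
    """Keep only the leading product/version token, dropping build metadata."""
    return user_agent.split(" ", 1)[0]


# Made-up User-Agent whose commit hash happens to contain "bfac".
ua = "cri-o/1.30.0 go1.22.0 commit=1a2bfac9d"

print(matches_scanner(ua))                    # True
print(strip_build_info(ua))                   # cri-o/1.30.0
print(matches_scanner(strip_build_info(ua)))  # False
```

With the build and commit information removed, there is far less random text in the header that could collide with a token, although any product name that itself contains a listed token would still match.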
So, this is it, I suppose. Unless you have some more thoughts?
from registry.k8s.io.
At least this appears to be limited to non-release-tagged versions? But that's still going to impact someone at some point.
I was thinking about this some more. I think we could drop the standard WAF rules and instead write some pretty simple rules that reject most garbage requests at the edge purely based on path, and hope that's sufficient.
WIP at kubernetes/k8s.io#6969
It will be a little bit more annoying to support additional endpoints in the future, but that seems OK
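As a rough sketch of what path-only filtering could look like (this is not the rule set in the linked WIP change, and the allowed paths here are assumptions for illustration), the idea is to allow only the endpoints the registry is known to serve and reject everything else before it reaches the application. `allow_at_edge` is a hypothetical name.

```python
def allow_at_edge(path: str) -> bool:
    """Allow only the landing page and OCI distribution API paths.

    Assumption: the service only needs "/" and paths under "/v2/".
    Any additional endpoint would have to be added here explicitly.
    """
    return path in ("/", "/v2") or path.startswith("/v2/")


for candidate in ["/v2/", "/v2/pause/manifests/3.9", "/wp-login.php", "/.env"]:
    print(candidate, allow_at_edge(candidate))
```

The trade-off is the one noted above: every new endpoint needs an explicit entry, but the check no longer depends on header contents at all.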
from registry.k8s.io.
Would you be able to verify Cloud Armor configuration, just out of curiosity and to make sure it is indeed it?
We're using standard rules, the full configuration is open source:
https://registry.k8s.io => https://github.com/kubernetes/registry.k8s.io
The community deployment configs are documented in the k8s.io repo with the rest of the community infra deployments, but primarily here.
https://github.com/kubernetes/k8s.io/tree/main/infra/gcp/terraform/k8s-infra-oci-proxy-prod is the main deployment
The armor rules are here: https://github.com/kubernetes/k8s.io/blob/main/infra/gcp/terraform/modules/oci-proxy/cloud-armor.tf
from registry.k8s.io.
I'm not sure which ruleset contains this, but we can drop most of these.
We shouldn't disable armor entirely because we're using a custom policy for rate limiting but most of these rule sets are probably irrelevant.
We can iterate on the staging instance (DO NOT depend on this endpoint, but for testing purposes we can iterate at registry-sandbox.k8s.io).
from registry.k8s.io.
The other complication: The main reason we've kept these WAF rules is actually to deny spammy vuln scanner noise at the edge.
We get a TON of noisy requests from automated scanning (... and pull-through caches attempting to pull anything and everything) and any request we can deny at the loadbalancer saves the project funds versus letting them get through to the application we use to split load for valid requests between the different cloud storage endpoints ... funds we can use for CI etc instead.
So, on balance, we'll still want to block known "attack" requests with WAF, and it's much easier to use a pre-supplied ruleset than to develop and maintain our own.
from registry.k8s.io.
we on the CRI-O's side will strip the extra build and release information from the User-Agent
Yes please. That's it. (Rules are going to be constantly updated no matter what to keep up with new spam/bot crap)
from registry.k8s.io.
/cc @AkihiroSuda
So Suda-san can take a look at User-Agent in containerd.
This was once discussed and rejected
from registry.k8s.io.
This is now deployed, though I can't make promises about the behavior of any leaky backend hosts we redirect to.
We're considering handling that differently but it would be more of a long term project.
I don't think anything we currently use would block requests purely based on header substrings anymore, only invalid request paths, or excessive usage.
from registry.k8s.io.
@kwilczynski thanks for digging deep into this. You can see all the code we use for responding to the curl command here - https://github.com/kubernetes/registry.k8s.io/tree/main/cmd/archeio
it's a Cloud Run application running in Google infra. While we do get the client IP, we do not try to parse User-Agent, you can see some of the code here:
registry.k8s.io/pkg/net/clientip/clientip.go
Lines 27 to 37 in 5443169
please feel free to clone the repo and peek if you spot something!
Scanning the github-verse quickly, the 403 may be an attempt by some application firewall (Cloud Armor) to reject traffic from some tools they consider hostile?
https://github.com/mazen160/bfac/blob/18fb0b5dc05005d4f39c242609bbf2347ca0d421/bfac#L257-L259
(No, i have no clue what other strings may be considered in the same fashion!)
from registry.k8s.io.
[...]
it's a cloud run application running in google infra. While we do get the client IP, we do not try to parse User-Agent, you can see some of the code here:
[...]
Scanning the github-verse quickly, the 403 may be an attempt by some application firewall (Cloud Armor) to reject traffic from some tools they consider hostile? https://github.com/mazen160/bfac/blob/18fb0b5dc05005d4f39c242609bbf2347ca0d421/bfac#L257-L259
@dims, since the registry service itself is very simple, and we didn't expect it to be anything but, what blocks these requests is probably set up somewhere as part of the infrastructure that Google donates that runs and supports the registry itself.
You mentioned Cloud Armor. We were thinking that there is perhaps some sort of a transparent proxy or WAF (Web Application Firewall) deployed somewhere, or even that the registry is fronted by Cloudflare or similar (which is also popular).
The IP address 34.96.108.209 we get back for registry.k8s.io, which resolves to the same IP from different networks/locations, is within Google's 34.64.0.0/10 network. As such, I bet it's a WAF/Cloud Armor setting of sorts (Cloud Armor is quite sophisticated) that is looking for the string "bfac" anywhere within the User-Agent value it gets as part of the request.
whois for 34.96.108.209
NetRange: 34.64.0.0 - 34.127.255.255
CIDR: 34.64.0.0/10
NetName: GOOGL-2
NetHandle: NET-34-64-0-0-1
Parent: NET34 (NET-34-0-0-0-0)
NetType: Direct Allocation
OriginAS:
Organization: Google LLC (GOOGL-2)
RegDate: 2018-09-28
Updated: 2018-09-28
Ref: https://rdap.arin.net/registry/ip/34.64.0.0
OrgName: Google LLC
OrgId: GOOGL-2
Address: 1600 Amphitheatre Parkway
City: Mountain View
StateProv: CA
PostalCode: 94043
Country: US
RegDate: 2006-09-29
Updated: 2019-11-01
Comment: *** The IP addresses under this Org-ID are in use by Google Cloud customers ***
Comment:
Comment: Direct all copyright and legal complaints to
Comment: https://support.google.com/legal/go/report
Comment:
Comment: Direct all spam and abuse complaints to
Comment: https://support.google.com/code/go/gce_abuse_report
Comment:
Comment: For fastest response, use the relevant forms above.
Comment:
Comment: Complaints can also be sent to the GC Abuse desk
Comment: ([email protected])
Comment: but may have longer turnaround times.
Comment:
Comment: Complaints sent to any other POC will be ignored.
Ref: https://rdap.arin.net/registry/entity/GOOGL-2
OrgAbuseHandle: GCABU-ARIN
OrgAbuseName: GC Abuse
OrgAbusePhone: +1-650-253-0000
OrgAbuseEmail: [email protected]
OrgAbuseRef: https://rdap.arin.net/registry/entity/GCABU-ARIN
OrgNOCHandle: GCABU-ARIN
OrgNOCName: GC Abuse
OrgNOCPhone: +1-650-253-0000
OrgNOCEmail: [email protected]
OrgNOCRef: https://rdap.arin.net/registry/entity/GCABU-ARIN
OrgTechHandle: ZG39-ARIN
OrgTechName: Google LLC
OrgTechPhone: +1-650-253-0000
OrgTechEmail: [email protected]
OrgTechRef: https://rdap.arin.net/registry/entity/ZG39-ARIN
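The claim that 34.96.108.209 falls inside the 34.64.0.0/10 range reported by whois can be double-checked with Python's standard `ipaddress` module. This is just a quick sanity check; `in_network` is a hypothetical helper name.

```python
import ipaddress


def in_network(ip: str, cidr: str) -> bool:
    """Return True if the address belongs to the given CIDR block."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


# 34.64.0.0/10 spans 34.64.0.0 through 34.127.255.255.
print(in_network("34.96.108.209", "34.64.0.0/10"))  # True
```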
Would you be able to verify Cloud Armor configuration, just out of curiosity and to make sure it is indeed it?
Re: https://github.com/mazen160/bfac: the project has an option to randomly pick a different User-Agent so that it appears to be a popular browser, etc. As such, I am not sure how much "bad traffic" simply blocking "bfac" sheds; perhaps not a lot.
from registry.k8s.io.
@dims I think containerd also includes git commit for pre-release builds, but that's maybe less concerning since tagged releases don't (I think??) ... we should probably take a look at how likely we are to run into this again with other common tools.
I don't love any of the options here. We could invest in custom rules but I think it would take a lot of time and effort to maintain, at the moment this is pretty hands-off and we're spending a lot of time on other sustainability areas.
from registry.k8s.io.
/cc @AkihiroSuda
So Suda-san can take a look at User-Agent in containerd.
from registry.k8s.io.
[...]
I don't love any of the options here. We could invest in custom rules but I think it would take a lot of time and effort to maintain, at the moment this is pretty hands-off and we're spending a lot of time on other sustainability areas.
@BenTheElder, yeah. Like I said, it would be a headache, indeed.
Protecting the registry, whichever way we can, takes precedence here. This goes without saying.
from registry.k8s.io.