Web Risk Client App | Container & Go

Web Risk is the enterprise version of Google's Safe Browsing API, which protects 5 billion devices globally from dangerous URLs, including phishing, malware, unwanted software, and social engineering.

This client implements the Web Risk Update API, which lets you check whether URLs are unsafe through a privacy-preserving, low-latency API. It works out of the box with either Docker or Go.

This README provides a quickstart guide to running a client either with Docker or as Go binaries. It also serves as a reference implementation of the API. The GoDoc and API documentation in the .go source files provide more details on fine-tuning the parameters if desired.

Supported clients:

  • wrserver runs a thin HTTP client that can query URLs via a POST request or a redirection endpoint that diverts bad URLs to a warning page. This is the client wrapped by Docker.
  • wrlookup is a command line service that takes URLs from STDIN and outputs results to STDOUT. It can accept multiple URLs at a time on separate lines.

Supported blocklists:

The client was originally forked from the Safebrowsing Go Client.

Enable Web Risk

To begin using Web Risk, you will need a GCP Account and a project to work in.

  1. Enable the Web Risk API.

  2. Create an API Key.

  3. Enable Billing for your account and make sure it's linked to your project.

Install Docker and/or Go

To use the Container App, you will need Docker. To compile binaries from source or run tests, install Go.

Docker Quickstart (recommended)

We have included a Dockerfile to accelerate and simplify onboarding. This container wraps the wrserver binary detailed below.

Clone and Build Container

Building the container is straightforward.

First, clone this repo into a local directory.

git clone https://github.com/google/webrisk && cd webrisk

Build the container. This will run all tests before compiling wrserver into a distroless container.

docker build --tag wr-container .

Run Container

We supply the APIKEY as an environment variable to the container at runtime so that the API key is not revealed in the Dockerfile or in docker ps. This example also provides a port binding.

docker run -e APIKEY=XXXXXXXXXXXXXXXXXXXXXXX -p 8080:8080 wr-container

wrserver defaults to port 8080, but you can bind any port on the host machine. See the Docker documentation for details.

See Using wrserver below for how to query URLs or use the redirection endpoint.

Go Binary Quickstart | wrlookup example

The Go Client can be compiled and run directly without Docker. In this example we will use that to run the wrlookup binary that takes URLs from STDIN and outputs to STDOUT.

Before compiling from source you should install Go and have some familiarity with Go development. See here for a good place to get started.

Clone Source & Install Dependencies

To download and install this branch from the source, run the following commands.

First clone this repo into a local directory and switch to the webrisk directory.

git clone https://github.com/google/webrisk && cd webrisk

Next, install dependencies.

go install .

Build and Execute wrlookup

After installing dependencies, you can build and run wrlookup.

go build -o wrlookup cmd/wrlookup/main.go

Run the binary and supply an API key.

./wrlookup -apikey=XXXXXXXXXXXXXXXXXXXXXXX

You should see some output similar to below as wrlookup starts up.

webrisk: 2023/01/27 19:36:46 database.go:110: no database file specified
webrisk: 2023/01/27 19:36:53 database.go:384: database is now healthy
webrisk: 2023/01/27 19:36:53 webrisk_client.go:492: Next update in 30m29s

wrlookup will take any URLs from STDIN. Test your configuration with a sample:

http://testsafebrowsing.appspot.com/s/social_engineering_extended_coverage.html #input
Unsafe URL: [SOCIAL_ENGINEERING_EXTENDED_COVERAGE] # output

Using wrserver

wrserver runs a WebRisk API lookup proxy that allows users to check URLs via a simple JSON API. This local API will use the API key supplied by the Docker container or the command line that runs the binary.

First start the wrserver by either running the container or binary.

To run in Docker:

docker run -e APIKEY=XXXXXXXXXXXXXXXXXXXXXXX -p 8080:8080 <container_name>

To run from the command line, compile wrserver the same way as wrlookup above, then run:

./wrserver -apikey=XXXXXXXXXXXXXXXXXXXXXXX

With the default settings this will start a local server at 0.0.0.0:8080.

The server has a lightweight implementation of a Web Risk Lookup API-like endpoint at v1/uris:search. To check a URL against the local endpoint, send a POST request to 0.0.0.0:8080/v1/uris:search with a JSON body similar to the following.

{
  "uri":"http://testsafebrowsing.appspot.com/s/social_engineering_extended_coverage.html"
}

A sample cURL command:

curl -H 'Content-Type: application/json' \
	-d '{"uri":"http://testsafebrowsing.appspot.com/s/social_engineering_extended_coverage.html"}' \
	-X POST '0.0.0.0:8080/v1/uris:search'
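
The same request can be built from Go. This is a minimal sketch, not part of the repository: the helper name newSearchRequest and the base address http://localhost:8080 are assumptions for illustration, and the response format is deliberately not decoded because it is not described here.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// searchRequest mirrors the JSON body shown above: {"uri": "..."}.
type searchRequest struct {
	URI string `json:"uri"`
}

// newSearchRequest is a hypothetical helper that builds, but does not send,
// a POST request for the local wrserver endpoint.
// base is an assumption, for example "http://localhost:8080".
func newSearchRequest(base, uri string) (*http.Request, error) {
	body, err := json.Marshal(searchRequest{URI: uri})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, base+"/v1/uris:search", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newSearchRequest(
		"http://localhost:8080",
		"http://testsafebrowsing.appspot.com/s/social_engineering_extended_coverage.html",
	)
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	// To send it against a running wrserver:
	//   resp, err := http.DefaultClient.Do(req)
}
```

Building the body with encoding/json rather than string concatenation keeps the URL correctly quoted even when it contains characters that need JSON escaping.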

See Sample URLs below to test the different blocklists.

wrserver also serves a URL redirector listening on /r?url=... which will show an interstitial for anything marked unsafe.

If the URL is safe, the client is automatically redirected to the target. Otherwise an interstitial warning page is shown as recommended by Web Risk.

Try some sample URLs:

http://0.0.0.0:8080/r?url=https://testsafebrowsing.appspot.com/s/social_engineering_extended_coverage.html
http://0.0.0.0:8080/r?url=https://testsafebrowsing.appspot.com/s/malware.html
http://0.0.0.0:8080/r?url=https://www.google.com/
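
When generating redirector links programmatically, it may be safer to query-escape the target so that a target with its own query string is not split at an ampersand. The sketch below assumes wrserver decodes the url parameter with standard query parsing, which is not confirmed here; redirectorURL and the base address are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// redirectorURL is a hypothetical helper that builds a link to the
// /r endpoint with the target URL query-escaped.
// base is an assumption, for example "http://localhost:8080".
func redirectorURL(base, target string) string {
	q := url.Values{}
	q.Set("url", target)
	return base + "/r?" + q.Encode()
}

func main() {
	fmt.Println(redirectorURL("http://localhost:8080", "https://www.google.com/"))
	// prints: http://localhost:8080/r?url=https%3A%2F%2Fwww.google.com%2F
}
```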

Differences from Web Risk Lookup API

There are two significant differences between this local endpoint and the public v1/uris:search endpoint:

  • The public endpoint accepts GET requests instead of POST requests.
  • The local wrserver endpoint uses the privacy-preserving and lower latency Update API making it better suited for higher-demand use cases.

Sample URLs

For testing the blocklists, you can use the following URLs:

Troubleshooting

4XX Errors

If you start the client without proper credentials or project set up, you will see an error similar to what is shown below on startup:

webrisk: 2023/01/27 19:36:13 database.go:217: ListUpdate failure (1): webrisk: unexpected server response code: 400

For 400 errors, this usually means the API key is incorrect or was not supplied correctly.

For 403 errors, this could mean the Web Risk API is not enabled for your project or your project does not have Billing enabled.

About the Social Engineering Extended Coverage List

This is a newer blocklist that includes a greater range of risky URLs that are not included in the Safebrowsing blocklists shipped to most browsers. The extended coverage list offers significantly more coverage, but may have a higher number of false positives. For more details, see here.

WebRisk System Test

To perform an end-to-end test on the package with the WebRisk backend, run the following command after exporting your API key as $APIKEY:

go test github.com/google/webrisk -v -run TestWebriskClient

webrisk's Issues

JSON message is not unmarshalling correctly into protobuf message

It seems that when the database is updated and receives a new version token the unmarshalling is not done correctly.

Expected: Cg0IARAGGAEiAzAwMTABEP/7BhoCGAnR73ir
unmarshalled: [10 13 8 1 16 6 24 1 34 3 48 48 49 48 1 16 255 251 6 26 2 24 9 209 239 120 171]

This could be an issue with https://godoc.org/github.com/golang/protobuf/jsonpb itself.

I created the branch json-marshalling-bug for future investigation. Meanwhile, I am going to revert the changes that introduced this bug.
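
One observation that may help narrow this down: the two values reported above appear to be the same data in different encodings. The proto3 JSON mapping represents bytes fields as base64 strings, and the sketch below checks, using only the standard library, that the expected string decodes to the reported byte slice. It does not use jsonpb and does not reproduce the bug; sameToken is a hypothetical helper.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
)

// Values copied from the report above.
const reportedToken = "Cg0IARAGGAEiAzAwMTABEP/7BhoCGAnR73ir"

var reportedBytes = []byte{
	10, 13, 8, 1, 16, 6, 24, 1, 34, 3, 48, 48, 49, 48, 1,
	16, 255, 251, 6, 26, 2, 24, 9, 209, 239, 120, 171,
}

// sameToken reports whether the base64 string decodes to raw.
func sameToken(b64 string, raw []byte) (bool, error) {
	decoded, err := base64.StdEncoding.DecodeString(b64)
	if err != nil {
		return false, err
	}
	return bytes.Equal(decoded, raw), nil
}

func main() {
	same, err := sameToken(reportedToken, reportedBytes)
	if err != nil {
		fmt.Println("decode:", err)
		return
	}
	fmt.Println("same bytes:", same) // prints: same bytes: true
}
```

If the token is compared as a string in one place and as raw bytes in another, a mismatch like the one reported could come from the comparison rather than from the unmarshalling itself.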

Wrong doc for search browsing

The documentation is not correct for the wrserver as /v4/threatMatches:find and /v4/threatLists are not implemented anymore.

// API endpoints:
//	/v4/threatMatches:find
//	/v4/threatLists
//	/status
//	/r

As I understand it, the Safe Browsing API is not available to commercial users, so we must use the Web Risk API, which does not include these features (threatMatches).

Maybe you should add a deprecation notice or warning to the README; I spent some time figuring it out.

Unable to build wr-container

When following the instructions here:

https://github.com/google/webrisk#clone-and-build-container

I'm unable to build wr-container. I'm getting the error below:

corey@CROBINSON:~/webrisk$ docker build --tag wr-container .
[+] Building 10.1s (4/4) FINISHED                                                                                                                                                                           docker:default
 => [internal] load .dockerignore                                                                                                                                                                                     0.0s
 => => transferring context: 2B                                                                                                                                                                                       0.0s
 => [internal] load build definition from Dockerfile                                                                                                                                                                  0.0s
 => => transferring dockerfile: 766B                                                                                                                                                                                  0.0s
 => ERROR [internal] load metadata for gcr.io/distroless/static-debian11:latest                                                                                                                                      10.1s
 => [internal] load metadata for docker.io/library/golang:1.19                                                                                                                                                        5.4s
------
 > [internal] load metadata for gcr.io/distroless/static-debian11:latest:
------
Dockerfile:19
--------------------
  17 |     RUN CGO_ENABLED=0 go build -o /go/bin/wrserver cmd/wrserver/main.go
  18 |     
  19 | >>> FROM gcr.io/distroless/static-debian11 as wrserver
  20 |     
  21 |     COPY --from=build /go/bin/wrserver /
--------------------
ERROR: failed to solve: gcr.io/distroless/static-debian11: failed to do request: Head "https://gcr.io/v2/distroless/static-debian11/manifests/latest": net/http: TLS handshake timeout

I've also verified that docker is able to build containers from other Dockerfiles.

corey@CROBINSON:~/getting-started-app$ docker build -t getting-started .
[+] Building 40.0s (13/13) FINISHED                                                                                                                                                                         docker:default
 => [internal] load build definition from Dockerfile                                                                                                                                                                  0.0s
 => => transferring dockerfile: 182B                                                                                                                                                                                  0.0s
 => [internal] load .dockerignore                                                                                                                                                                                     0.0s
 => => transferring context: 2B                                                                                                                                                                                       0.0s
 => resolve image config for docker.io/docker/dockerfile:1                                                                                                                                                            6.5s
 => [auth] docker/dockerfile:pull token for registry-1.docker.io                                                                                                                                                      0.0s
 => docker-image://docker.io/docker/dockerfile:1@sha256:ac85f380a63b13dfcefa89046420e1781752bab202122f8f50032edf31be0021                                                                                              2.7s
 => => resolve docker.io/docker/dockerfile:1@sha256:ac85f380a63b13dfcefa89046420e1781752bab202122f8f50032edf31be0021                                                                                                  0.0s
 => => sha256:ac85f380a63b13dfcefa89046420e1781752bab202122f8f50032edf31be0021 8.40kB / 8.40kB                                                                                                                        0.0s
 => => sha256:657fcc512c7369f4cb3d94ea329150f8daf626bc838b1a1e81f1834c73ecc77e 482B / 482B                                                                                                                            0.0s
 => => sha256:a17ee7fff8f5e97b974f5b48f51647d2cf28d543f2aa6c11aaa0ea431b44bb89 1.27kB / 1.27kB                                                                                                                        0.0s
 => => sha256:9d9c93f4b00be908ab694a4df732570bced3b8a96b7515d70ff93402179ad232 11.80MB / 11.80MB                                                                                                                      2.5s
 => => extracting sha256:9d9c93f4b00be908ab694a4df732570bced3b8a96b7515d70ff93402179ad232                                                                                                                             0.2s
 => [internal] load metadata for docker.io/library/node:18-alpine                                                                                                                                                     5.7s
 => [auth] library/node:pull token for registry-1.docker.io                                                                                                                                                           0.0s
 => [1/4] FROM docker.io/library/node:18-alpine@sha256:435dcad253bb5b7f347ebc69c8cc52de7c912eb7241098b920f2fc2d7843183d                                                                                               9.0s
 => => resolve docker.io/library/node:18-alpine@sha256:435dcad253bb5b7f347ebc69c8cc52de7c912eb7241098b920f2fc2d7843183d                                                                                               0.0s
 => => sha256:51490771aba658439d29b1b03b60fc31e67bf0da3e01cb5903716310df4be1c1 1.16kB / 1.16kB                                                                                                                        0.0s
 => => sha256:d1517ab6615b781f3b81f339100063d1b2b41f1a32a9efb8563ecd1375311c22 6.78kB / 6.78kB                                                                                                                        0.0s
 => => sha256:96526aa774ef0126ad0fe9e9a95764c5fc37f409ab9e97021e7b4775d82bf6fa 3.40MB / 3.40MB                                                                                                                        0.9s
 => => sha256:3130715204cf4a9be94608d180505f50862416589a8f03eba7b664f15b9c0283 47.88MB / 47.88MB                                                                                                                      5.5s
 => => sha256:b06de8ab1c4feccaf7b687bb7ebd5180c2bd1f59d91749619d52af77fd38ea13 2.34MB / 2.34MB                                                                                                                        5.9s
 => => sha256:435dcad253bb5b7f347ebc69c8cc52de7c912eb7241098b920f2fc2d7843183d 1.43kB / 1.43kB                                                                                                                        0.0s
 => => extracting sha256:96526aa774ef0126ad0fe9e9a95764c5fc37f409ab9e97021e7b4775d82bf6fa                                                                                                                             0.1s
 => => sha256:90ef3ffc51561ffa7dfafd2dc93f44601f8d4d4273ad8d54dbf34326c746a142 448B / 448B                                                                                                                            5.3s
 => => extracting sha256:3130715204cf4a9be94608d180505f50862416589a8f03eba7b664f15b9c0283                                                                                                                             3.1s
 => => extracting sha256:b06de8ab1c4feccaf7b687bb7ebd5180c2bd1f59d91749619d52af77fd38ea13                                                                                                                             0.1s
 => => extracting sha256:90ef3ffc51561ffa7dfafd2dc93f44601f8d4d4273ad8d54dbf34326c746a142                                                                                                                             0.0s
 => [internal] load build context                                                                                                                                                                                     0.1s
 => => transferring context: 6.87MB                                                                                                                                                                                   0.1s
 => [2/4] WORKDIR /app                                                                                                                                                                                                0.1s
 => [3/4] COPY . .                                                                                                                                                                                                    0.0s
 => [4/4] RUN yarn install --production                                                                                                                                                                              14.5s
 => exporting to image                                                                                                                                                                                                1.1s
 => => exporting layers                                                                                                                                                                                               1.1s
 => => writing image sha256:248f5ebbe002147aa6959857ad43f5b3850291153eae8a1e41cea5c8297f8a04                                                                                                                          0.0s
 => => naming to docker.io/library/getting-started

Anyone have any suggestions on what I'm doing wrong?

Local DB setup

Hi, I'm trying to use it on a local machine but can't figure out how to set up the local database file. As I understand from the code, it must be a gzip archive, but what must be inside the archive?

ListUpdate failure in database.go

Hi, I just tried to run wrserver but I got the following error:

go get github.com/google/webrisk/cmd/wrserver

wrserver -apikey MY_API_KEY
webrisk: 2022/02/16 16:25:41 database.go:111: no database file specified
webrisk: 2022/02/16 16:25:41 database.go:218: ListUpdate failure (1): webrisk: unexpected server response code: 403
webrisk: 2022/02/16 16:25:41 webrisk_client.go:496: Next update in 29m6.45817924s
Starting server at localhost:8080

Probably something is wrong with the API client implementation. Could you take a look at this issue, please?

Make cloud request configurable

Hello Team,

Could you make the request that is sent to the Cloud API configurable? What I mean is: if a URL lookup via hash is inconclusive against the local database and a request to the cloud has to be made (via sb.api.HashLookup), this request could also be made by passing the full URL instead, right? This should obviously not be the default behaviour, but hidden behind a feature flag.

Let me know what you think.

Allow use of the Instance Metadata Service instead of using an API Key

Hello,

I realized when browsing this project that this will require an API Key to work whereas the Webrisk API seems to be compatible with tokens generated by the IMDS service.
More specifically, using the library https://godoc.org/cloud.google.com/go/webrisk/apiv1beta1 that is referenced in the Webrisk doc doesn't require an API Key.

Is this a feature that is planned for this project?

Or would you be willing to accept a PR to add a new API struct that would implement the api interface and leverage the webrisk library above to use the IMDS instead of an API Key?

Local DB setup

Hi,
Is there a specific format for the DB path (to use the Update API)? I tried an empty dbpath, and it's not working. After reading the code, I tried a .gz file (which is in gob format). Even then, it reports an inconsistent db. My API key works fine with the Google Web Risk API. What am I missing?

Make cache configurable

I am wondering if the cache within WebriskClient can be made extendable and configurable.

The specific use case I have is to back the cache by Redis/Memcache which can be shared by multiple replicas of WebriskClient running on separate boxes.

IP address with space may not be canonicalized as intended

In urls.go there is this comment:

           // The Windows resolver allows a 4-part dotted decimal IP address to have a
           // space followed by any old rubbish, so long as the total length of the
           // string doesn't get above 15 characters. So, "10.192.95.89 xy" is
           // resolved to 10.192.95.89. If the string length is greater than 15
           // characters, e.g. "10.192.95.89 xy.wildcard.example.com", it will be
           // resolved through DNS.

This is really important as it’s yet another way an attacker could potentially bypass the system. There is a test that covers this case by testing the string "10.192.95.89 xy". However, this test only covers the parseIPAddress function. When this same string is passed through the canonicalURL function it fails because the host is escaped and thus the string passed to parseIPAddress is "10.192.95.89%20xy". It might be a good idea to run all test cases through canonicalURL (which is what I did.)
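
To make the failure mode concrete, here is a sketch of the behavior the comment describes, applied after unescaping the host. It is not the code in urls.go: ipFromEscapedHost is a hypothetical helper, and the 15-character rule is taken directly from the quoted comment.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strings"
)

// ipFromEscapedHost is a hypothetical helper. It unescapes the host first,
// then applies the rule from the comment: a dotted IPv4 address followed by
// a space and trailing text is treated as that address, as long as the
// whole string is at most 15 characters.
func ipFromEscapedHost(host string) (net.IP, bool) {
	unescaped, err := url.PathUnescape(host)
	if err != nil || len(unescaped) > 15 {
		return nil, false
	}
	if i := strings.IndexByte(unescaped, ' '); i >= 0 {
		unescaped = unescaped[:i]
	}
	ip := net.ParseIP(unescaped).To4()
	return ip, ip != nil
}

func main() {
	for _, h := range []string{
		"10.192.95.89%20xy",
		"10.192.95.89%20xy.wildcard.example.com",
	} {
		ip, ok := ipFromEscapedHost(h)
		fmt.Printf("%q -> %v %v\n", h, ip, ok)
	}
}
```

The point of the sketch is the ordering: without the unescape step, the first input contains %20 rather than a space and is not recognized as an IP address.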

Client may not be handling percent escape the right way

Per Webrisk documentation:

In the URL, percent-escape all characters that are <= ASCII 32, >= 127, #, or %. The escapes should use uppercase hex characters.

Currently urls.go percent escapes things with a lower case hex:

urls.go:164 b.WriteString(fmt.Sprintf("%%%02x", c))

We need to investigate whether percent escaping is done correctly.
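
For comparison, a minimal sketch of the escaping rule as quoted from the documentation, using uppercase hex. This is an illustration and not the code in urls.go; escapeUpper is a hypothetical name. The only difference from the line cited above is the %02X verb in place of %02x.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeUpper percent-escapes bytes that are <= ASCII 32, >= 127, '#', or '%',
// writing the escapes with uppercase hex digits.
func escapeUpper(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		c := s[i]
		if c <= 32 || c >= 127 || c == '#' || c == '%' {
			fmt.Fprintf(&b, "%%%02X", c)
		} else {
			b.WriteByte(c)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(fmt.Sprintf("%%%02x", byte(0xab))) // prints: %ab
	fmt.Println(fmt.Sprintf("%%%02X", byte(0xab))) // prints: %AB
	fmt.Println(escapeUpper("a b#%\x7f"))          // prints: a%20b%23%25%7F
}
```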

doRequest does not provide enough information when the response is not a 200

Google returns a reason with some failed requests, so it would be helpful to surface that response when debugging issues.

We have an issue right now that we can't figure out: after a few days or weeks of our app running, we start getting 403s from the API, but all we get is

webrisk: 2023/01/11 21:01:51 database.go:218: ListUpdate failure (3882620): webrisk: unexpected server response code: 403

Which is not really useful at all, and I've tracked it down to line 96 in api.go.

Maybe the return error could contain the body as a string or something until a more developed response error can be created?

I figure this isn't handled yet because the documentation doesn't provide a definition of the possible response bodies when it's not a 200 so without just breaking random parts of the request to see what the responses are it's impossible to know what some type of ErrorResponse would look like.

`No Database File Specified` error when first setting up

Howdy! I'd emailed this in to web-risk-support last week, figured I may get more traction here as I'd not heard back.

pi@amethyst:~ $ wrserver -apikey $APIKEY
webrisk: 2019/10/21 15:17:04 database.go:111: no database file specified
webrisk: 2019/10/21 15:17:05 database.go:218: ListUpdate failure (1): webrisk: unexpected server response code: 403
webrisk: 2019/10/21 15:17:05 webrisk_client.go:496: Next update in 21m33.942768468s
Starting server at localhost:8080
^C
pi@amethyst:~ $

When I try to run it, I'm both getting a 403 error (which may be out of scope of the client library, unless something's malformed?), as well as no database file specified -- which seems odd as according to the config, if it's not specified / empty, the tool should just operate in a non-persistent manner.

db.log.Printf("no database file specified")

webrisk/webrisk_client.go

Lines 156 to 160 in 1694bb2

// DBPath is a path to a persistent database file.
// If empty, WebriskClient operates in a non-persistent manner.
// This means that blacklist results will not be cached beyond the lifetime
// of the WebriskClient object.
DBPath string

Have I missed something in setting it up, or is there something that could be added to the docs to simplify?

How to compute hash prefixes

I read the docs but really didn't understand how to compute hash prefixes. It says it should be between 4-32 bytes but how can we decide the size?
