dnsgrep's Introduction

DNSGrep

A utility for quickly searching presorted DNS names. Built around the Rapid7 rdns & fdns datasets.

How does it work?

This utility assumes the file provided is presorted, with symbols ordered along with letters (see the LC_COLLATE=C note under Stack Overflow References).

The algorithm is pretty simple:

  1. Use a binary search algorithm to seek through the file, looking for a substring match against the query.
  2. Once a match is found, the file is scanned backwards in 10KB increments looking for a non-matching substring.
  3. Once a non-matching substring is found, the file is scanned forwards until all exact matches are returned.

Limits

There is a built-in limit system. It prevents two things:

  1. scanning too far backwards (MaxScan)
  2. scanning too far forwards after scanning backwards (MaxOutputLines)

This allows for any input while stopping requests that are taking too long.

Additionally, this utility does not handle the edge cases (start/end) of files and will return an error if one is encountered.

Install

go get the following packages:

# used for dnsgrep cli flags
go get "github.com/jessevdk/go-flags"
# used by the experimental server for http routing
go get "github.com/gorilla/mux"
# pull in a string reversal function
go get "github.com/golang/example/stringutil"

Run

The following steps were tested with Ubuntu 16.04 & go 1.11.5.

Generate fdns_a.sort.txt and rdns.sort.txt first using the scripts found in the scripts/ folder:

# Each of these scripts requires:
# * 3 hours+ on an SSD
# * 300GB+ temp disk space (under the same folder)
# * ~65GB for output (under the same folder)
# * jq to be installed
./scripts/fdns_a.sh
./scripts/rdns.sh

Run the command line utility:

go run dnsgrep.go -f DNSBinarySearch/test_data.txt -i "amiccom.com.tw"

Run the experimental server in the same folder as fdns_a.sort.txt and rdns.sort.txt:

go run experimentalServer.go

Docker

You can also run the command line utility using Docker:

docker build -t dnsgrep .
docker run --rm -it -v "$PWD"/DNSBinarySearch:/files dnsgrep -f /files/test_data.txt -i ".amiccom.com.tw"

Data Source

The source of the data referenced throughout this repository is Rapid7 Labs. Please review the Terms of Service: https://opendata.rapid7.com/about/

https://opendata.rapid7.com/sonar.rdns_v2/

https://opendata.rapid7.com/sonar.fdns_v2/

Stack Overflow References

via https://unix.stackexchange.com/a/35472

  • We need to sort with LC_COLLATE=C so that symbols such as . and , are also taken into account when sorting.

via https://unix.stackexchange.com/a/350068

  • To sort a large file: split it into chunks, sort the chunks and then simply merge the results

License

See LICENSE file.

dnsgrep's Issues

No regular expressions?

Looking at the code, it doesn't appear to do regular expressions? It looks like it just does a binary search for something containing the string and finds other matches nearby.

The name seems to imply it should do something similar to the command 'grep', which if I recall was named from the g(lobal)/(regular expression)/p(rint) command sequence in ed. If it doesn't do grep, it probably shouldn't be called grep...

Getting this error (gzip: fdns_a.gz: not in gzip format)

/opt/DNSGrep# ./scripts/fdns_a.sh
--2019-08-14 15:38:21-- https://opendata.rapid7.com/sonar.fdns_v2/2019-01-25-1548417890-fdns_a.json.gz
Resolving opendata.rapid7.com (opendata.rapid7.com)... 52.200.143.45, 34.225.120.26
Connecting to opendata.rapid7.com (opendata.rapid7.com)|52.200.143.45|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: /sonar.fdns_v2/ [following]
--2019-08-14 15:38:21-- https://opendata.rapid7.com/sonar.fdns_v2/
Reusing existing connection to opendata.rapid7.com:443.
HTTP request sent, awaiting response... 200 OK
Length: 496866 (485K) [text/html]
Saving to: ‘fdns_a.gz’

fdns_a.gz 100%[===================>] 485.22K 2.24MB/s in 0.2s

2019-08-14 15:38:22 (2.24 MB/s) - ‘fdns_a.gz’ saved [496866/496866]

gzip: fdns_a.gz: not in gzip format
sort: cannot read: 'fileChunk*': No such file or directory

Change split from bytes to lines

Hi.
Please change this line in scripts/fdns_a.sh and scripts/rdns.sh from
split -b100M rdns.rev.lowercase.txt fileChunk
to
split -l2000000 rdns.rev.lowercase.txt fileChunk
because splitting by bytes loses some records when sorting.

I get the error: cannot find package "dnsgrep/DNSBinarySearch"

When I run the command: go run experimentalServer/experimentalServer.go
I get the error:
experimentalServer/experimentalServer.go:4:2: cannot find package "dnsgrep/DNSBinarySearch".
I get the same error when I run: go run dnsgrep.go -f DNSBinarySearch/test_data.txt -i "amiccom.com.tw"

What should I do to fix it?

go: inconsistent vendoring in /usr/local/go/src:

First, I did the following

root@pentest:~/internet/rapid# https://github.com/erbbysam/DNSGrep
-bash: https://github.com/erbbysam/DNSGrep: No such file or directory
root@pentest:~/internet/rapid# git clone https://github.com/erbbysam/DNSGrep
Cloning into 'DNSGrep'...
remote: Enumerating objects: 53, done.
remote: Counting objects: 100% (25/25), done.
remote: Compressing objects: 100% (18/18), done.
remote: Total 53 (delta 10), reused 19 (delta 6), pack-reused 28
Unpacking objects: 100% (53/53), 1.47 MiB | 3.18 MiB/s, done.
root@pentest:~/internet/rapid# go get "github.com/jessevdk/go-flags"
root@pentest:~/internet/rapid# go get "github.com/gorilla/mux"
root@pentest:~/internet/rapid# go get "github.com/golang/example/stringutil"
go get: github.com/golang/example@none updating to
        github.com/golang/[email protected]: parsing go.mod:
        module declares its path as: golang.org/x/example
                but was required as: github.com/golang/example
root@pentest:~/internet/rapid# cd DNSGrep/
root@pentest:~/internet/rapid/DNSGrep# go run dnsgrep.go
dnsgrep.go:6:2: package dnsgrep/DNSBinarySearch is not in GOROOT (/usr/local/go/src/dnsgrep/DNSBinarySearch)
dnsgrep.go:11:2: no required module provides package github.com/jessevdk/go-flags: go.mod file not found in current directory or any parent directory; see 'go help modules'
root@pentest:~/internet/rapid/DNSGrep#

Second, I tried cloning dnsgrep into the GOROOT src directory, as mentioned in the last comment of #3.

root@pentest:/usr/local/go/src# git clone https://github.com/erbbysam/DNSGrep
Cloning into 'DNSGrep'...
remote: Enumerating objects: 53, done.
remote: Counting objects: 100% (25/25), done.
remote: Compressing objects: 100% (18/18), done.
remote: Total 53 (delta 10), reused 19 (delta 6), pack-reused 28
Unpacking objects: 100% (53/53), 1.47 MiB | 1.19 MiB/s, done.
root@pentest:/usr/local/go/src# lsmv DNSGrep dnsgrep^C
root@pentest:/usr/local/go/src# mv DNSGrep dnsgrep
root@pentest:/usr/local/go/src# cd dnsgrep/
root@pentest:/usr/local/go/src/dnsgrep# go run dnsgrep.go -f DNSBinarySearch/test_data.txt -i "amiccom.com.tw"
go: inconsistent vendoring in /usr/local/go/src:
        github.com/gorilla/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
        github.com/jessevdk/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
        golang.org/x/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
        golang.org/x/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
        golang.org/x/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
        golang.org/x/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
        golang.org/x/[email protected]: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod
        golang.org/x/[email protected]: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod
        golang.org/x/[email protected]: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod
        golang.org/x/[email protected]: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod

        To ignore the vendor directory, use -mod=readonly or -mod=mod.
        To sync the vendor directory, run:
                go mod vendor
root@pentest:/usr/local/go/src/dnsgrep#

I tried running go mod tidy and go mod vendor, as suggested in the error message and in https://stackoverflow.com/questions/58511588/fixing-go-inconsistent-vendoring-in-c-go-src/58512507

The same error occurs.

Any thoughts on how to solve this?
