dank-dns's People

Contributors

mrpickles, zxlin

dank-dns's Issues

Write query request/response pairs to disk with the ELK stack

The third way of storing the query data is the ELK stack: Elasticsearch, Logstash, and Kibana. The stack is built for ingesting, indexing, and visualizing large volumes of log-style records, which makes it a natural fit for DNS query data.
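As a rough sketch of what shipping query records to Elasticsearch could look like, the snippet below renders records as a newline-delimited JSON body in the shape the Elasticsearch `_bulk` API expects (an action line followed by a document line). The index name `dns-queries` and every field name in the sample record are assumptions made for illustration, not this project's schema. Logstash could equally ingest the same JSON lines from a file.

```python
import json


def to_bulk_ndjson(records, index="dns-queries"):
    """Render records as an Elasticsearch _bulk request body (NDJSON).

    `index` and the record fields are hypothetical; adapt them to the
    real query/response layout.
    """
    lines = []
    for rec in records:
        # Action line: index the document that follows into `index`.
        lines.append(json.dumps({"index": {"_index": index}}))
        # Document line: the query/response pair itself.
        lines.append(json.dumps(rec, sort_keys=True))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


# Hypothetical query/response pair.
sample = [
    {"ts": 1500000000.25, "src_ip": "192.0.2.10", "dst": "old",
     "qname": "example.com.", "qtype": "A", "rcode": "NOERROR"},
]
payload = to_bulk_ndjson(sample)
```

The resulting `payload` would be sent as the body of a `POST /_bulk` request with content type `application/x-ndjson`.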

TOQ Analysis

Analysis - Types of Queries (TOQ)

Records the total number of each type of query sent to both the old and new IP addresses. Malformed queries (in terms of structure, not due to the use of invalid types/classes) are placed in their own category.
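A minimal sketch of the counting described above, assuming each parsed query has already been reduced to a `(destination, qtype)` pair where `destination` is `"old"` or `"new"` and `qtype` is `None` when the packet could not be parsed structurally. That input shape and the names used here are assumptions for illustration. Queries with unusual or invalid type values keep their own label and are not folded into the malformed bucket.

```python
from collections import Counter

MALFORMED = "MALFORMED"  # bucket for structurally broken queries


def count_query_types(queries):
    """Count queries per (destination, qtype).

    `queries` yields (destination, qtype) pairs; qtype is None when the
    query could not be parsed at all (hypothetical convention).
    """
    counts = Counter()
    for dst, qtype in queries:
        counts[(dst, MALFORMED if qtype is None else qtype)] += 1
    return counts


queries = [
    ("old", "A"), ("old", "AAAA"), ("old", "A"),
    ("new", "A"), ("new", None), ("new", "TYPE65280"),
]
toq = count_query_types(queries)
```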

SCS Analysis

Analysis - Source Counts and Swaps (SCS)

Records the number of queries each source sent to the new and old IP addresses before and after the old server started to advertise the new IP address. It also notes the number of times (if any) each source switched between querying the old and new IP addresses.

In addition, records the time at which a source that initially queried the old IP address first switches to the new one. Note that this will not count a transition such as New -> Old -> New.

Also captures unique IP addresses that queried the old server more than 3 times after the switch occurred. This lower bound (3) is used to avoid counting priming queries, which occur under correct resolver operation.
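The bookkeeping above can be sketched roughly as follows. The input shape (`(timestamp, source, destination)` tuples), the field names, and the function name are assumptions for illustration, not the project's actual code.

```python
from collections import defaultdict


def analyze_sources(queries, announce_time, threshold=3):
    """Per-source counts, swaps, and first Old -> New switch time.

    `queries` is an iterable of (timestamp, source, destination) with
    destination "old" or "new"; `announce_time` is when the old server
    started advertising the new address (hypothetical inputs).
    """
    stats = defaultdict(lambda: {
        "old_before": 0, "old_after": 0, "new_before": 0, "new_after": 0,
        "swaps": 0, "first_dst": None, "last_dst": None, "switch_time": None,
    })
    for ts, src, dst in sorted(queries):  # process in time order
        s = stats[src]
        phase = "before" if ts < announce_time else "after"
        s[f"{dst}_{phase}"] += 1
        if s["first_dst"] is None:
            s["first_dst"] = dst
        elif dst != s["last_dst"]:
            s["swaps"] += 1
            # Record the switch only for sources that started on Old, so a
            # New -> Old -> New sequence never sets a switch time.
            if (s["first_dst"] == "old" and dst == "new"
                    and s["switch_time"] is None):
                s["switch_time"] = ts
        s["last_dst"] = dst
    # Sources still hitting the old server more than `threshold` times
    # after the announcement (priming queries stay under the bound).
    stragglers = {src for src, s in stats.items() if s["old_after"] > threshold}
    return dict(stats), stragglers


queries = [
    (1, "a", "old"), (2, "a", "old"), (12, "a", "new"), (13, "a", "new"),
    (3, "b", "new"), (11, "b", "old"), (14, "b", "new"),
    (11, "c", "old"), (12, "c", "old"), (13, "c", "old"), (14, "c", "old"),
]
stats, stragglers = analyze_sources(queries, announce_time=10)
```

In this toy input, source `a` switches once from old to new, `b` swaps twice without a recorded switch time because it started on the new address, and `c` is flagged for querying the old server four times after the announcement.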

Write query request/response pairs to disk as C structs

One way to get persistent, quick-to-read query data is to store each query as a C struct and write the structs to disk. A user who wants fast access to the data for scripting or analysis can then write code that reads the file and loads the bytes directly back into the same C structs.
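To illustrate how a script could read such a file, here is a sketch using Python's `struct` module against a fixed-size record. The struct layout shown in the comment is entirely hypothetical (the real layout would come from the project's headers), and it assumes a packed, little-endian struct so that no compiler padding appears on disk.

```python
import io
import ipaddress
import struct

# Hypothetical on-disk record, mirroring a packed little-endian C struct:
#   struct query_record {
#       uint32_t ts_sec;      /* capture time, seconds       */
#       uint32_t ts_usec;     /* capture time, microseconds  */
#       uint8_t  src_ip[16];  /* IPv6 or IPv4-mapped source  */
#       uint16_t qtype;
#       uint16_t qclass;
#       uint8_t  rcode;
#       uint8_t  to_new_ip;   /* 1 if sent to the new address */
#   };
RECORD = struct.Struct("<II16sHHBB")  # "<": little-endian, no padding


def write_records(fp, records):
    """Append each record tuple to the binary file object `fp`."""
    for rec in records:
        fp.write(RECORD.pack(*rec))


def read_records(fp):
    """Yield record tuples from `fp` until the data runs out."""
    while True:
        chunk = fp.read(RECORD.size)
        if len(chunk) < RECORD.size:
            break  # end of file, or a truncated trailing record
        yield RECORD.unpack(chunk)


# BytesIO stands in for a file opened with open(path, "wb") / "rb".
src = ipaddress.IPv6Address("::ffff:192.0.2.10").packed
records = [(1500000000, 250000, src, 1, 1, 0, 0),
           (1500000001, 0, src, 28, 1, 3, 1)]
buf = io.BytesIO()
write_records(buf, records)
buf.seek(0)
loaded = list(read_records(buf))
```

Fixed-size records keep reads cheap: record `i` starts at byte offset `i * RECORD.size`, so a script can seek straight to it.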

Multi-process or multi-thread pcap processing

In its current state, the executable processes pcaps/packets one at a time on a single thread. This is utterly unacceptable when we are crunching big data on a beefy big data machine.

We can utilize the multiple cores via concurrent pcap processing or concurrent packet processing. Doing each file concurrently is probably a simpler task, though making the concurrency work on the packet level is probably better, especially if the number of files is low.
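A sketch of the simpler per-file option, written in Python for brevity. The worker here only counts packets by walking the record headers of a classic pcap file (24-byte global header, then a 16-byte header before each packet); it is a stand-in for the real per-file processing, and the function names are hypothetical. pcapng and nanosecond-resolution captures are not handled.

```python
import struct
from concurrent.futures import ProcessPoolExecutor


def count_packets(path):
    """Count packet records in one classic pcap file."""
    with open(path, "rb") as fp:
        header = fp.read(24)  # pcap global header
        if len(header) < 24:
            raise ValueError(f"{path}: too short to be a pcap file")
        magic = struct.unpack("<I", header[:4])[0]
        if magic == 0xA1B2C3D4:
            endian = "<"      # file written little-endian
        elif magic == 0xD4C3B2A1:
            endian = ">"      # file written big-endian
        else:
            raise ValueError(f"{path}: unsupported magic {magic:#x}")
        count = 0
        while True:
            rec = fp.read(16)  # ts_sec, ts_usec, incl_len, orig_len
            if len(rec) < 16:
                break
            incl_len = struct.unpack(endian + "IIII", rec)[2]
            fp.seek(incl_len, 1)  # skip the captured bytes
            count += 1
        return path, count


def process_files(paths, executor_cls=ProcessPoolExecutor, workers=None):
    """Run count_packets on every file concurrently, one task per file."""
    with executor_cls(max_workers=workers) as pool:
        return dict(pool.map(count_packets, paths))
```

Calling `process_files(list_of_pcap_paths)` returns a dict mapping each path to its packet count. One task per file means no shared state between workers, which is why this is the easier route. It also shows the limitation noted above: with only a handful of files, most cores sit idle, and splitting work at the packet level would be needed to use them.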

Database benchmarks

MongoDB aggregation for:

  • Queries per second
  • Top N source IPs (note: IPs are stored as binary in the DB)
  • Top N questions (note: questions are arrays)
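The three aggregations could be expressed as pipelines along these lines. The field names `ts` (assumed to be numeric epoch seconds), `src_ip`, and `questions` are guesses at the document schema, not taken from the actual collection. The pipelines are plain lists of dicts, so no database driver is needed to build them.

```python
import ipaddress


def queries_per_second(ts_field="ts"):
    """Group documents by whole second and count them."""
    return [
        {"$group": {"_id": {"$floor": f"${ts_field}"}, "count": {"$sum": 1}}},
        {"$sort": {"_id": 1}},
    ]


def top_n(field, n, unwind=False):
    """Top-n values of `field`; set unwind=True for array fields."""
    pipeline = []
    if unwind:
        # Emit one document per array element so each question is counted.
        pipeline.append({"$unwind": f"${field}"})
    pipeline += [
        {"$group": {"_id": f"${field}", "count": {"$sum": 1}}},
        {"$sort": {"count": -1}},
        {"$limit": n},
    ]
    return pipeline


def decode_ip(raw):
    """Turn a 4- or 16-byte binary address from the DB into text."""
    return str(ipaddress.ip_address(bytes(raw)))


top_sources = top_n("src_ip", 10)
top_questions = top_n("questions", 10, unwind=True)
```

With pymongo, each pipeline would be passed to `collection.aggregate(pipeline)`. Because the source IP is stored as binary, the `_id` of each top-source result would need a pass through something like `decode_ip` before display.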

Traffic Validity Analysis

Analysis - Traffic Validity

Determines the amount of valid/invalid traffic for both the new and old IP addresses, before and after the old server started announcing the new IP address. In addition, for any invalid traffic breaking the TLD rule (3), we keep track of the number of infractions for each unique TLD.

I am basing the rules on valid/invalid traffic from the papers "A Day at the Root of the Internet" (SIGCOMM 08) and "Wow, That's a Lot of Packets" (PAM 03). The rules are listed in priority order in the structure below, meaning that each invalid query is attributed only 1 invalidation reason.
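The "first matching rule wins" attribution can be sketched as below. The rules shown are placeholders invented purely to demonstrate the mechanism; they are not the rule list from the papers or from this issue, and `KNOWN_TLDS`, the query fields, and the reason labels are all hypothetical.

```python
from collections import Counter

KNOWN_TLDS = {"com", "net", "org"}  # hypothetical stand-in for the real TLD list

# Placeholder rules only: (reason, predicate), highest priority first.
RULES = [
    ("malformed", lambda q: q.get("malformed", False)),
    ("unknown_class", lambda q: q.get("qclass") != "IN"),
    ("invalid_tld", lambda q: q.get("tld") not in KNOWN_TLDS),
]


def classify(query):
    """Return the first (highest-priority) reason the query is invalid, or None."""
    for reason, violates in RULES:
        if violates(query):
            return reason
    return None


def tally(queries):
    """Count queries per reason and TLD infractions per unique TLD."""
    reasons, tld_infractions = Counter(), Counter()
    for q in queries:
        reason = classify(q)
        reasons[reason or "valid"] += 1
        if reason == "invalid_tld":
            tld_infractions[q["tld"]] += 1
    return reasons, tld_infractions


sample = [
    {"qclass": "IN", "tld": "com"},
    {"qclass": "IN", "tld": "local"},
    {"qclass": "IN", "tld": "local"},
    {"qclass": "CH", "tld": "bogus"},  # class rule outranks the TLD rule
    {"malformed": True},
]
reasons, tld_infractions = tally(sample)
```

Splitting the counts by new/old address and by before/after the announcement would only require keying the counters on those two extra values.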
