
Comments (5)

enotspe commented on July 20, 2024

Please make that PR!!!!

I had in mind adding this kind of integration later on (very far away), or waiting for Elastic to add it.
I was thinking of using MineMeld as a threat intelligence aggregator, and I also noticed that there is a MISP module.

Anyway, I reviewed it a little and found several challenges:

  1. Threat intel aging: we had a MineMeld instance pulling IOCs from around 15 feeds, and the numbers kept growing and growing; after a couple of weeks we had around 500K IOCs. To put that in perspective, FortiGuard manages fewer than 100K. I think MISP can already provide some value for aging on the feed side; with MineMeld it would have to be set manually (how long is too long?).
  2. Logstash vs. ingest pipeline: for such a heavy lookup, I am not sure which performs better, Logstash or an ingest pipeline with enrich processors. Some benchmark tests would help (if somebody can share or find one).
  3. Backwards lookup: with threat intel, you not only want to enrich your current log feed, but also to look backwards and check whether an old log matches a new IOC, so you can find past traces of malicious activity. This could be done with Logstash and the Elasticsearch filter plugin, but again, performance might be an issue. A better approach might be the new async searches (see the sketch after this list).
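
As a rough, untested sketch of what that backwards lookup could look like with the async search API (the IOC list below is a placeholder, and the index pattern and field name are just ECS-style guesses, not anything this project ships):

import time
import requests

ES = "http://localhost:9200"                      # assumed local cluster, no auth
NEW_IOCS = ["103.129.98.17", "185.104.45.20"]     # freshly ingested indicators (placeholder)

# Submit an async search over historical firewall logs for the new IOCs.
resp = requests.post(
    f"{ES}/ecs-*/_async_search",
    params={"wait_for_completion_timeout": "2s", "keep_on_completion": "true"},
    json={
        "size": 100,
        "query": {"terms": {"destination.ip": NEW_IOCS}},
        "sort": [{"@timestamp": "desc"}],
    },
)
body = resp.json()

# Poll until the search finishes, then report how many old events matched.
while body.get("is_running"):
    time.sleep(5)
    body = requests.get(f"{ES}/_async_search/{body['id']}").json()

hits = body["response"]["hits"]["hits"]
print(f"{len(hits)} historical events matched the new IOCs")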

Let us know what you think and what challenges you see, and also what strategy you plan to use.

Again, many thanks for your contributions and ideas


Cyb3rSn0rlax commented on July 20, 2024

Hello,
Sorry for the late response.
I was thinking about basically three milestones:

  • Short-term milestone:
    Let repository users manually add blacklists, whether they are usernames of people on vacation, bad IPs, malicious domains, etc. The intention here is to make it easy for people to enrich events in real time against pre-defined lists (YAML dictionaries in our case). This is something that can be added today as a pipeline between geo_enrichment and logstash_enrichment. Here is an example I am currently using with this project:
input {
    pipeline {
        address => "blacklist"
    }
}

filter {
    if "Firewall" in [observer][type] and "public" in [destination][locality] {
        translate {
            field => "[destination][ip]"
            destination => "BLACKLISTED"
            dictionary_path => '/home/ubuntu/intel/blacklist/BlackListIP.yml'
            fallback => "NO"
        }
    }
}

output {

    pipeline {
        send_to => "logstash_enrichment"
    }

}

Where BlackListIP.yml is something like:

#"IP":"YES for Blacklisted"
"103.129.98.17" : "YES"
"103.253.73.77" : "YES"
"103.83.81.144" : "YES"
"104.18.36.98" : "YES"
"107.175.64.210" : "YES"
"108.171.216.194" : "YES"
"110.4.45.119" : "YES"
"184.168.221.43" : "YES"
"185.104.45.20" : "YES"
"185.174.100.116" : "YES"
"185.193.38.74" : "YES"
"192.138.20.112" : "YES"
"217.174.152.68" : "YES"
"50.63.202.57" : "YES"
"62.173.145.104" : "YES"
"69.167.178.28" : "YES"
"79.96.191.147" : "YES"
"79.98.28.30" : "YES"
"85.93.145.251" : "YES"
"91.189.114.7" : "YES"
"94.23.64.40" :  "YES"

Then we can use some field coloring and even scripted fields to automate the analysis further and make extra correlations:
(screenshot)

  • Mid-term milestone:

Threat aging and backwards lookup are real challenges, due to the design of Elasticsearch, which deep down creates new documents instead of updating them in place.

For threat aging, it really depends on the consumer, their threat intel needs, and most importantly the size of their cluster, because you can't just keep all the data. Each organization should define its needs in this matter, and if we are being honest, according to the Pyramid of Pain, IOCs are among the most easily changed indicators.
However, we can handle this challenge by creating dedicated indices under the ecs-* index pattern, with a dedicated mapping keyed on the IOC creation timestamp, and with ILM (index lifecycle management) in a hot/warm/cold architecture (a sketch follows below).
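
Just to illustrate the idea (not something this project ships), an aging policy along those lines could be created with one API call; the policy name, phase ages, and cluster URL below are placeholders:

import requests

ES = "http://localhost:9200"   # assumed local cluster, no auth

# Hypothetical aging policy: roll over daily while hot, move to warm after
# 7 days, delete IOC indices after 30 days.
policy = {
    "policy": {
        "phases": {
            "hot": {"actions": {"rollover": {"max_age": "1d"}}},
            "warm": {"min_age": "7d", "actions": {"set_priority": {"priority": 50}}},
            "delete": {"min_age": "30d", "actions": {"delete": {}}},
        }
    }
}

resp = requests.put(f"{ES}/_ilm/policy/threat-intel-aging", json=policy)
resp.raise_for_status()

How long each phase lasts is exactly the "how long is too long?" question from the first comment, so the ages would have to be tuned per feed and per organization.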

For the backwards lookup, I think we can use some fingerprinting with Logstash in order to avoid duplicates and to update documents instead of creating new ones. The challenge here is that an IP might be good today but turn malicious after a while; instead of creating a new document, we should update the first one by replacing the ID of every ingested threat intel document with its fingerprint. This is something that was used in this project (https://git.deepaknadig.com/deepak/sdn-threat-intelligence/-/tree/master/):

filter {
  # Count how many times the same indicator has been seen
  mutate {
    add_field => { "occurrences" => 1 }
  }

  # Fingerprinting to remove duplicates
  fingerprint {
    concatenate_sources => true
    source => ["[threat-data][type1]", "[threat-data][value1]"]
    target => "[@metadata][fingerprint]"
    method => "MURMUR3"
  }
}

and the output looks like this:

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    action => "update"
    doc_as_upsert => true
    script => "if (ctx._source.occurrences != null) {ctx._source.occurrences += 1}"
    # Use the fingerprint as the document id
    document_id => "%{[@metadata][fingerprint]}"
    index => "threats"
  }
}

The other key for this to work properly is to create the threat intel index under the ECS index pattern, so that we can create scripted fields to correlate IOCs with the observer fields (source.ip, destination.ip, hash values, domain names, ...). Maybe we can then build our queries based on the date of ingestion (not the creation date of the IOC), and once an IOC is old enough our ILM policy moves it to warm and then deletes it. (This theory definitely has some flaws, but we should try and test everything.)
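
For instance, restricting lookups to recently ingested indicators could be a simple date-range query; the threats index name comes from the output above, while the event.ingested field is only an assumption about where the ingestion date would be stored:

import requests

ES = "http://localhost:9200"   # assumed local cluster, no auth

# Only count indicators ingested in the last 30 days; older ones are
# expected to be aged out (warm, then deleted) by the ILM policy.
query = {
    "query": {
        "bool": {
            "filter": [
                {"range": {"event.ingested": {"gte": "now-30d/d"}}}
            ]
        }
    },
    "size": 0,
}

resp = requests.post(f"{ES}/threats/_search", json=query)
resp.raise_for_status()
print(resp.json()["hits"]["total"])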

  • Long-term milestone:

Create a web application (Flask, Python, or anything simple) to automate the upload, download, and ingestion of these lists for SIEM engineers and threat intelligence analysts, for example:

  • Add a list of users on vacation.
  • Update the list by deleting or adding a user.
  • Add a temporary list of internal IPs of possibly infected workstations.

Basically, this application would replace the manual effort of creating YAML dictionaries or cron jobs that download CTI feeds (a rough sketch follows below).
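
Purely as a sketch of that long-term idea (the endpoints, file path, and port are made up for illustration), a small Flask app could maintain the same YAML dictionary that the translate filter already reads:

import yaml
from flask import Flask, jsonify

app = Flask(__name__)

# Same dictionary file the Logstash translate filter reads (placeholder path)
DICTIONARY = "/home/ubuntu/intel/blacklist/BlackListIP.yml"


def load() -> dict:
    try:
        with open(DICTIONARY) as f:
            return yaml.safe_load(f) or {}
    except FileNotFoundError:
        return {}


def save(entries: dict) -> None:
    with open(DICTIONARY, "w") as f:
        yaml.safe_dump(entries, f, default_flow_style=False)


@app.route("/blacklist", methods=["GET"])
def list_entries():
    # Return the current blacklist
    return jsonify(load())


@app.route("/blacklist/<ip>", methods=["PUT"])
def add_entry(ip):
    # Add (or re-add) an IP to the blacklist
    entries = load()
    entries[ip] = "YES"
    save(entries)
    return jsonify({"added": ip})


@app.route("/blacklist/<ip>", methods=["DELETE"])
def delete_entry(ip):
    # Remove an IP from the blacklist
    entries = load()
    entries.pop(ip, None)
    save(entries)
    return jsonify({"removed": ip})


if __name__ == "__main__":
    app.run(port=5000)

Something like curl -X PUT http://localhost:5000/blacklist/103.129.98.17 would then show up in the enrichment on the next dictionary refresh, without touching Logstash at all.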

Tell me what you think of the validity of these actions.


enotspe commented on July 20, 2024

Wow! That looks super interesting. The mid-term solution seems especially promising.

Just as a suggestion, I think we should work with ingest processors instead of Logstash dictionary lookups. That way you could also visualize your threat intel data. Actually, the next steps for the project are to move all lookups to enrich processors (roughly along the lines sketched below).
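
A minimal sketch of what such an enrich-processor lookup could look like, again with placeholder index, policy, and field names (threat.indicator.ip and friends are ECS-style guesses, not this project's actual mapping):

import requests

ES = "http://localhost:9200"   # assumed local cluster, no auth

# 1. Enrich policy: match incoming destination.ip values against an IOC index.
requests.put(f"{ES}/_enrich/policy/ioc-ip-policy", json={
    "match": {
        "indices": "threats",
        "match_field": "threat.indicator.ip",
        "enrich_fields": ["threat.indicator.type", "threat.feed.name"],
    }
})

# 2. Build the enrich index from the policy.
requests.post(f"{ES}/_enrich/policy/ioc-ip-policy/_execute")

# 3. Ingest pipeline with an enrich processor that decorates matching events.
requests.put(f"{ES}/_ingest/pipeline/ioc-ip-lookup", json={
    "processors": [
        {
            "enrich": {
                "policy_name": "ioc-ip-policy",
                "field": "destination.ip",
                "target_field": "threat.enrichments",
                "ignore_missing": True,
            }
        }
    ]
})

The nice part is that the IOC index itself stays a normal index, so it can be visualized in Kibana and aged with ILM like any other data.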

About the short term: as I said, a threat intel database can easily grow to 100K IPs, so I don't know how a Logstash lookup would impact performance, especially because firewall logs can get very heavy (if you enable log-all on the implicit deny rule, for example). You can easily go above 2K EPS. It is worth trying out though; let's push the limits until it breaks.


nicpenning commented on July 20, 2024

I have some general comments/questions:

Does a dictionary file get updated in near real time when changed, or does it require a deploy to take effect?

What about SIEM detection rules, tripping not only on the Fortinet logs but on any log that makes it into Elastic? A simple PowerShell or Python script, in conjunction with some webhook functionality, should be able to create these rules (something like the sketch below).
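
For what it's worth, a rough Python-only sketch of that idea against the Kibana detection engine API (the URL, rule fields, and the BLACKLISTED query are placeholders taken from the example above; authentication is omitted):

import requests

KIBANA = "http://localhost:5601"   # assumed Kibana URL; auth omitted

# Minimal query rule that trips whenever an event was tagged as blacklisted
# by the enrichment pipeline, regardless of which log source it came from.
rule = {
    "rule_id": "blacklisted-destination-ip",
    "name": "Traffic to blacklisted destination IP",
    "description": "Created by script: destination.ip matched a blacklist entry.",
    "type": "query",
    "query": 'BLACKLISTED: "YES"',
    "index": ["ecs-*"],
    "risk_score": 73,
    "severity": "high",
    "interval": "5m",
    "enabled": True,
}

resp = requests.post(
    f"{KIBANA}/api/detection_engine/rules",
    headers={"kbn-xsrf": "true"},
    json=rule,
)
resp.raise_for_status()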

I do like the idea of having events tagged, though, because then you could use alerting when those tags trip.

Good ideas all around.


enotspe commented on July 20, 2024

Looks like these guys are one step ahead: https://www.youtube.com/watch?v=8yf9DJ_TO6o

Especially when taking scalability into consideration: blacklists can grow a lot, and we all know firewalls generate tons of logs as well.

@nicpenning when you put new data into a dictionary, logs get enriched with it automatically; the translate filter re-reads the dictionary file on its refresh interval, so there is no need to restart your Logstash service.

