fmadio / pcap2json
High Speed PCAP to JSON conversion utility
License: Other
Hi,
this may simply be an incomplete implementation:
g_CPUCore is an array of size 2, but the command line arg --cpu-core can only set g_CPUCore[0]; g_CPUCore[1] is generally unused.
g_CPUOutput is completely unused; output threads do not seem to be implemented.
Hi,
I need help running the command lines below to upload packet data directly into the Elastic Stack. I get "Unknown command line option" when using the pcap2json utility.
I cloned the project on an Ubuntu 20.04 VM and built the pcap2json utility with make. Let me know if anything is amiss.
root@es7:~/pcap2json# cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.3 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.3 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
root@es7:~/pcap2json# cat /home/student/ELK/http.cap | ./pcap2json --json-packet --capture-name http --output-espush --es-compress --es-host 192.168.1.248:9200
pcap2json https://www.github/fmadio/pcap2json build:Mar 31 2023 06:07:57
[--json-packet]
Write JSON Packet meta data
[--capture-name]
Unknown command line option [--capture-name]
root@es7:~/pcap2json# cat /home/student/ELK/http.cap | ./pcap2json --json-packet --output-espush --es-compress --es-host 192.168.1.248:9200
pcap2json https://www.github/fmadio/pcap2json build:Mar 31 2023 06:07:57
[--json-packet]
Write JSON Packet meta data
[--output-espush]
Unknown command line option [--output-espush]
root@es7:~/pcap2json# cat /home/student/ELK/http.cap | ./pcap2json --json-packet --es-compress --es-host 192.168.1.248:9200
pcap2json https://www.github/fmadio/pcap2json build:Mar 31 2023 06:07:57
[--json-packet]
Write JSON Packet meta data
[--es-compress]
Unknown command line option [--es-compress]
root@es7:~/pcap2json# cat /home/student/ELK/http.cap | ./pcap2json --json-packet --es-host 192.168.1.248:9200
pcap2json https://www.github/fmadio/pcap2json build:Mar 31 2023 06:07:57
[--json-packet]
Write JSON Packet meta data
[--es-host]
Unknown command line option [--es-host]
root@es7:~/pcap2json# ./pcap2json --help
pcap2json https://www.github/fmadio/pcap2json build:Mar 31 2023 06:07:57
[--help]
fmad engineering all rights reserved
http://www.fmad.io
pcap2json is a high speed PCAP meta data extraction utility
example converting a pcap to json:
cat /tmp/test.pcap | pcap2json > test.json
Command Line Arguments:
--index-name : capture name to use for ES Index data
--verbose : verbose output
--config : read from config file
--cpu-core : cpu map for core thread
--cpu-flow <cpu0..cpu n-1> : cpu count and map for flow threads
--cpu-output <cpu0..cpu n-1> : cpu map for output threads
--json-packet : write JSON packet data
--json-flow : write JSON flow data
Instance Info
--instance-id : instance id of this pcap2json FE
--instance-max : total number of pcap2json FE instances
Output Mode
--output-stdout : writes output to STDOUT
--output-espush : writes output directly to ES HTTP POST
--output-histogram : Enable histogram output and writes it to file
--output-buffercnt : number of output buffers (default is 64)
--output-keepalive : enable keep alive (persistent) ES connection
--output-filterpath : reduce data back from the ES cluster
--output-threadcnt : number of worker threads for ES push (default is 32)
--output-mergemin : minimum number of blocks to merge on output
--output-mergemax : maximum number of blocks to merge on output
Flow specific options
--flow-samplerate : scientific notation flow sample rate. default 100e6 (100msec)
--flow-index-depth : number of root flow index to allocate defulat 6
--flow-max : maximum number of flows (default 250e3)6
--flow-top-n : only output the top N flows
--flow-top-n-circuit <sMAC_dMAC> : output top N flows based on specified src/dest MAC
--flow-template "" : Use a customized template for JSON output
--flow-roll-read "temp file" : Capture roll read parital snapshot to disk
--flow-roll-write "temp file" : Capture roll write parital snapshot to disk
Elastic Stack options
--es-host hostname:port : Sets the ES Hostname
--es-timeout : Sets ES connection timeout in milliseconds (Default: 2000 msec)
--es-compress : enables gzip compressed POST
--es-null : use ES Null target for perf testing
--es-queue-path : ES Output queue is file backed
ICMP options
--icmp-overwrite : overwrite IP Proto info for ICMP packets
TopN is currently calculated across all flows. How to specify multiple TopN circuits is not yet clear.
e.g. filtering with
--flow-top-n-circuit 00:11:22:33:44:55_66:77:88:99:aa:bb
would create a TopN flow list just for that MAC pair, with all other flows going into a generic TopN list.
Multiple circuits could be specified; please advise on the best approach.
Instead of filtering only on MAC src/dst, also add an option for TopN filtering using a VLAN tag.
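One way the per-circuit TopN discussed above could behave, sketched in Python. This is not pcap2json's implementation: the flow record keys `mac_src`, `mac_dst` and `bytes` are hypothetical, and ranking by byte count is an assumption. Each configured MAC pair gets its own TopN list and everything else falls into a generic list.

```python
import heapq
from collections import defaultdict

def top_n_by_circuit(flows, circuits, n):
    """Build one TopN list per configured sMAC_dMAC pair plus a generic list."""
    wanted = {c.lower() for c in circuits}
    buckets = defaultdict(list)
    for flow in flows:
        # Same sMAC_dMAC spelling as the --flow-top-n-circuit argument.
        key = f"{flow['mac_src']}_{flow['mac_dst']}".lower()
        buckets[key if key in wanted else "generic"].append(flow)
    # Rank each bucket independently by byte count (assumed ranking metric).
    return {name: heapq.nlargest(n, group, key=lambda f: f["bytes"])
            for name, group in buckets.items()}
```

Swapping the key for a VLAN tag would only change how `key` is built; the bucketing stays the same.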
Add 100usec microburst information per flow.
Need to work out what a memory-efficient implementation of that would look like.
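A minimal sketch of one memory-efficient option, in Python. It assumes packets arrive in timestamp order within a flow and uses fixed (not sliding) 100usec windows, so each flow only carries three integers. The class and field names are hypothetical.

```python
BUCKET_NS = 100_000  # 100 usec expressed in nanoseconds

class BurstTracker:
    """Per-flow state: the current 100 usec bucket and the largest bucket seen."""
    __slots__ = ("bucket", "bucket_bytes", "max_bytes")

    def __init__(self):
        self.bucket = -1
        self.bucket_bytes = 0
        self.max_bytes = 0

    def add(self, ts_ns, length):
        """Account one packet; assumes non-decreasing timestamps per flow."""
        bucket = ts_ns // BUCKET_NS
        if bucket != self.bucket:
            # A new window started, so the running byte count resets.
            self.bucket = bucket
            self.bucket_bytes = 0
        self.bucket_bytes += length
        if self.bucket_bytes > self.max_bytes:
            self.max_bytes = self.bucket_bytes

    def peak_bps(self):
        """Peak rate in bits per second over the busiest 100 usec window."""
        return self.max_bytes * 8 * 1_000_000_000 // BUCKET_NS
```

A burst that straddles two windows is undercounted with fixed windows; that is the trade-off for keeping the state this small.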
Support round-robin across multiple ES hosts/ports so it can scale up.
Using the FCS flag in the chunked output, add an FCS counter per flow. This won't be possible with the regular PCAP mode, but that's ok.
The current bulk upload script is inefficient as it stalls the output of pcap2json while the HTTP transfer is in progress.
Modify this so multiple upload processes can run in parallel while pcap2json is still running.
e.g. make it fully pipelined with multiple concurrent pushes.
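The pipelining idea can be sketched in Python with a thread pool. This is an illustration only: `upload` is a caller-supplied function (hypothetical) that pushes one block, and `blocks` may be a generator so the producer keeps running while earlier blocks are still being uploaded.

```python
from concurrent.futures import ThreadPoolExecutor

def push_pipelined(blocks, upload, workers=4):
    """Upload blocks concurrently and return the results in submission order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submitting never waits for an upload to finish, so the producer
        # is not stalled by an in-flight HTTP transfer.
        futures = [pool.submit(upload, block) for block in blocks]
        return [future.result() for future in futures]
```

The pending queue here is unbounded; a real uploader would likely cap the number of outstanding blocks to bound memory.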
The idea is to remove the MPLS tag from the flow calculation. However, it means the JSON flow record will no longer be correct, e.g. multiple MPLS tags will be aggregated into a single flow record.
Is this ok?
Hi,
with pcapng now the default format for the Wireshark and tshark tools, is there any plan to make pcap2json compatible with the pcapng format?
For the time being, I am unable to convert my pcapng files back to pcap, and I get the following error when piping my pcapng into pcap2json via stdin: "invaliid PCAP format 0a0d0d0a".
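The value 0a0d0d0a in that error is the block type of a pcapng Section Header Block, which is why a classic PCAP reader rejects the file. A small Python sketch of telling the two formats apart from the first four bytes follows. The function name is hypothetical and this is not how pcap2json itself checks the header.

```python
import struct

PCAP_MAGICS = {
    0xA1B2C3D4: "pcap (microsecond timestamps)",
    0xA1B23C4D: "pcap (nanosecond timestamps)",
}
# pcapng Section Header Block type; the bytes read the same in either order.
PCAPNG_SHB = 0x0A0D0D0A

def classify_capture(header):
    """Classify a capture stream from its first four bytes."""
    if len(header) < 4:
        return "too short"
    for fmt, order in (("<I", "little-endian"), (">I", "big-endian")):
        (magic,) = struct.unpack(fmt, header[:4])
        if magic == PCAPNG_SHB:
            return "pcapng"
        if magic in PCAP_MAGICS:
            return f"{PCAP_MAGICS[magic]}, {order}"
    return "unknown"
```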
By default, a TCP RST has a window size of 0. Since we only want to track the window size of regular TCP packets, can the TCP RST window size be excluded?
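The requested behaviour amounts to checking the RST bit (0x04 in the TCP flags byte) before updating the window statistics. A Python sketch, with hypothetical field names `win_min` and `win_max`:

```python
TCP_FLAG_RST = 0x04  # RST bit in the TCP flags byte

def update_window_stats(stats, tcp_flags, window):
    """Track min/max TCP window per flow, skipping RST segments."""
    if tcp_flags & TCP_FLAG_RST:
        # RST segments typically carry window 0 and would skew the minimum.
        return stats
    stats["win_min"] = min(stats.get("win_min", window), window)
    stats["win_max"] = max(stats.get("win_max", window), window)
    return stats
```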
An FCS error counter per flow is important for looking at physical layer issues.
The HTTP ES push needs to be concerned with:
lack of output disk space for temporary files
correctly handling a retry push in the case ES rejects the upload
When parsing the raw JSON without using ES, there is no need to send the index info.
The output does not have to be in JSON format, as the runtime code only writes strings and integers into fixed-space addresses.
As such, any template could be used for the output format, enabling more variation in output formats: JSON, CSV, and pure binary output. Technically it could output IPFIX this way.
The overhead should be quite minimal.
Using a round-robin scheduler for ES push allows for better ES utilization and adds redundancy should a single ES node fail.
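A rough Python sketch of such a scheduler, assuming the caller marks a host down after a failed push. The class and method names are hypothetical and the host strings follow the hostname:port form used by --es-host.

```python
import itertools

class RoundRobinHosts:
    """Hand out ES hosts in rotation, skipping any marked as down."""

    def __init__(self, hosts):
        self._hosts = list(hosts)
        self._cycle = itertools.cycle(self._hosts)
        self._down = set()

    def mark_down(self, host):
        self._down.add(host)

    def mark_up(self, host):
        self._down.discard(host)

    def next_host(self):
        # Try each host at most once per call so a fully failed
        # cluster raises instead of looping forever.
        for _ in range(len(self._hosts)):
            host = next(self._cycle)
            if host not in self._down:
                return host
        raise RuntimeError("no ES hosts available")
```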
See here:
Lines 1580 to 1581 in 9a58a2d
This crashes when I do a live capture on my local network because I get sizes of up to 64374 bytes. So why limit it to 14 bits instead of 16?
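The arithmetic behind the question, in Python, assuming the limit really is a 14-bit length field as the question implies (the referenced lines are not reproduced here):

```python
def max_length(bits):
    """Largest value an unsigned field of the given width can hold."""
    return (1 << bits) - 1

observed = 64374  # packet size reported in the issue

for bits in (14, 16):
    print(bits, max_length(bits), observed <= max_length(bits))
# prints:
# 14 16383 False
# 16 65535 True
```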
Add DSCP flags as part of the flow hash and output the IP.DSCP into the flow data.
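DSCP is the upper six bits of the IPv4 TOS (or IPv6 Traffic Class) byte. A Python sketch of extracting it and folding it into a flow key; the tuple layout is an assumption for illustration, not pcap2json's actual hash input.

```python
def dscp_from_tos(tos):
    """Return the DSCP value: the upper six bits of the TOS byte."""
    return (tos >> 2) & 0x3F

def flow_key(ip_src, ip_dst, proto, port_src, port_dst, tos):
    """Flow key that separates otherwise identical flows by DSCP."""
    return (ip_src, ip_dst, proto, port_src, port_dst, dscp_from_tos(tos))
```

With this key, two packets of the same 5-tuple but different DSCP markings land in different flow records.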
Use case is:
bulk upload packet and flow data to separate indexes
option to reduce the total upload bandwidth by disabling the packet data (but keep sending flow data)
flow data is ~ x100 reduction to the PCAP
packet data is ~ x4 reduction to the PCAP
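In the ES bulk API each document is preceded by an action line naming its target index, so separating packet and flow data comes down to which index name goes into that line. A Python sketch; the index names in the usage note are placeholders.

```python
import json

def bulk_body(records, index):
    """Build an ES bulk (NDJSON) body that indexes every record into `index`."""
    lines = []
    for record in records:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(record))                        # document source
    # The bulk API requires every line, including the last, to end in a newline.
    return "".join(line + "\n" for line in lines)
```

For example, `bulk_body(flows, "capture_flow")` would always be sent, while `bulk_body(packets, "capture_packet")` could be skipped entirely when packet upload is disabled.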
Need to work out a way to calculate some TCP retransmission stats. It's a problem because all worker threads run independently.
ES uses Netty, which supports HTTP 1.1
https://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html
In theory we should be able to use persistent connections for the ES push side of things. This requires some investigation into what's possible.
In the upload script, enable HTTP compression to reduce the bandwidth.
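A sketch of what a compressed bulk request could look like, in Python. It only prepares the payload and headers; the function name is hypothetical, and actually sending the request is left to whichever HTTP client the upload script uses.

```python
import gzip

def compress_bulk(body):
    """Gzip an NDJSON bulk body and return it with matching HTTP headers."""
    payload = gzip.compress(body.encode("utf-8"))
    headers = {
        "Content-Type": "application/x-ndjson",
        # Tells the server the request body itself is gzip-compressed.
        "Content-Encoding": "gzip",
    }
    return payload, headers
```

JSON flow and packet records are highly repetitive, so they tend to compress well, though the exact ratio depends on the capture.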
Add space for MPLS.3 for triple tagging.
While reading through the code I found the command line option --output-pipe, which is great! But it's missing from the help text :-)
Add a command line option to only output the top N flows per snapshot.