Comments (8)

JarryShaw commented on August 21, 2024

Sure, PyPCAPKit has an interface called pcapkit.foundation.traceflow.TraceFlow. It follows each TCP stream and either saves each stream to a target file or stores the streams as an attribute of the instance. You can simply specify trace=True when calling the pcapkit.foundation.extraction.Extractor constructor. For more information, please refer to the documentation.
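To make the idea of "following" a TCP stream concrete, here is a minimal, library-independent sketch. It is not PyPCAPKit's TraceFlow implementation: it only groups segments by a direction-independent connection key and does no sequence-number reassembly. The names flow_key and follow_streams, and the sample ports and payloads, are hypothetical.

```python
from collections import defaultdict


def flow_key(src, sport, dst, dport):
    """Return the same key for both directions of one TCP connection."""
    a, b = (src, sport), (dst, dport)
    return (a, b) if a <= b else (b, a)


def follow_streams(segments):
    """Group (src, sport, dst, dport, payload) tuples by connection.

    Payloads are kept in arrival order; no reordering or retransmission
    handling is attempted in this sketch.
    """
    streams = defaultdict(list)
    for src, sport, dst, dport, payload in segments:
        streams[flow_key(src, sport, dst, dport)].append(payload)
    return dict(streams)


# Made-up sample traffic: two connections between the same pair of hosts.
segments = [
    ("10.20.30.131", 50000, "10.20.30.130", 80, b"GET / HTTP/1.1\r\n\r\n"),
    ("10.20.30.130", 80, "10.20.30.131", 50000, b"HTTP/1.1 200 OK\r\n\r\n"),
    ("10.20.30.131", 50001, "10.20.30.130", 80, b"GET /a HTTP/1.1\r\n\r\n"),
]
streams = follow_streams(segments)
```

Each value in streams is the list of payloads seen on one connection, which is roughly the per-stream view that "Follow TCP stream" gives in Wireshark.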

JarryShaw commented on August 21, 2024

Please provide the source PCAP file and the corresponding JSON output. Presumably, this is an issue in the JSON serialiser of DictDumper. I shall look into it ASAP.

Thanks for your cooperation and support of the PyPCAPKit project.

evandrix commented on August 21, 2024

any of these should work, doesn't really matter, you're looking for literal, unescaped hex in the payload, which should be pretty easy to make up

evandrix commented on August 21, 2024

meanwhile, being unsatisfied that dictdumper absolutely requires a file to write output to (plus your custom implementation is not readily exposed

class DictDumper(output):
), I wrote my own json object hook / serializer class:

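import json
import datetime
import ipaddress

class json_serialize(json.JSONEncoder):
	# encode() is overridden (rather than default()) so that bytes, datetime and
	# IP address values are handled wherever they appear; as a consequence,
	# dicts and lists are assembled by hand with "," and ":" as separators
	def encode(self,o):
		if isinstance(o,bytes):
			# raw bytes become a hex string
			return json.encoder.py_encode_basestring(o.hex())
		elif isinstance(o,datetime.datetime):
			return json.encoder.py_encode_basestring(o.isoformat())
		elif isinstance(o,(ipaddress.IPv4Address,ipaddress.IPv6Address)):
			return json.encoder.py_encode_basestring(str(o))
		elif isinstance(o,dict):
			# keys are assumed to be str; values are encoded recursively
			xs = []
			for k,v in o.items():
				xs.append(":".join([json.encoder.py_encode_basestring(k),self.encode(v)]))
			xs = ",".join(xs)
			return "{%s}"%xs
		elif isinstance(o,list):
			# encode each element recursively
			return "[%s]"%(",".join(self.encode(v) for v in o))
		return super().encode(o)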

USAGE: print(json.dumps(frame.info.info2dict(),ensure_ascii=False,sort_keys=True,separators=(",",":"),cls=json_serialize))
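For comparison, the same conversions can be written with the standard json.JSONEncoder.default() hook, which the encoder calls only for values it cannot serialise itself. This is a sketch, assuming the dict to be dumped contains only str keys and that bytes, datetime and IP address objects are the only non-JSON values; the class name PacketJSONEncoder and the sample record are hypothetical.

```python
import datetime
import ipaddress
import json


class PacketJSONEncoder(json.JSONEncoder):
    """Hypothetical encoder for non-JSON values found in packet dicts."""

    def default(self, o):
        # default() is only called for objects json cannot encode natively.
        if isinstance(o, (bytes, bytearray)):
            return o.hex()
        if isinstance(o, (datetime.datetime, datetime.date)):
            return o.isoformat()
        if isinstance(o, (ipaddress.IPv4Address, ipaddress.IPv6Address)):
            return str(o)
        return super().default(o)  # raises TypeError for anything else


# Made-up record that mimics the kinds of values seen in this thread.
record = {
    "time": datetime.datetime(2019, 11, 6, 20, 10, 47, 848399),
    "src": ipaddress.ip_address("10.20.30.131"),
    "packet": bytes.fromhex("000c297d1db4"),
}
text = json.dumps(record, sort_keys=True, separators=(",", ":"), cls=PacketJSONEncoder)
```

Because dicts and lists are still walked by the stock encoder, sort_keys and separators keep working. The hook does not apply to dict keys, so bytes keys would still raise a TypeError.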

evandrix commented on August 21, 2024

also, the default engine doesn't seem to be able to deal with ARP packets (attached)
switching to dpkt works, but perhaps is there a way to fix it for engine=default?
I saw ...protocol...arp is defined in this repository, but I don't know where to modify the engine=default code

capture.0.pcap.gz

JarryShaw commented on August 21, 2024

Could you please confirm that you're using the latest version of DictDumper, i.e. dictdumper==0.8.4.post2? The JSON serialiser should've been fixed in that release.

As for the original purpose of the DictDumper project, it was designed to be a stream output handler to the filesystem, so that we can write a structured file piece by piece, instead of going through a dump -> load -> dump loop (as plain JSON would require, for instance).
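The difference between writing piece by piece and a dump -> load -> dump loop can be illustrated with a small sketch. This is not DictDumper's code or API: StreamingJSONWriter is a hypothetical helper that appends one key/value pair at a time to a single top-level JSON object, so earlier output never has to be read back.

```python
import io
import json


class StreamingJSONWriter:
    """Hypothetical helper: writes one top-level JSON object key by key."""

    def __init__(self, stream):
        self._stream = stream
        self._first = True
        self._stream.write("{")  # open the top-level object once

    def write_item(self, key, value):
        # Emit a separating comma before every item except the first.
        if not self._first:
            self._stream.write(",")
        self._first = False
        self._stream.write(json.dumps(key) + ":" + json.dumps(value))

    def close(self):
        self._stream.write("}")  # close the top-level object


# An in-memory buffer stands in for the output file here.
buffer = io.StringIO()
writer = StreamingJSONWriter(buffer)
writer.write_item("Frame 1", {"number": 1, "len": 60})
writer.write_item("Frame 2", {"number": 2, "len": 60})
writer.close()
result = buffer.getvalue()
```

Only the current item is held in memory, and the output becomes valid JSON once close() has been called.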

You can of course write your own handler for the output stream in PyPCAPKit, as well as a customised serialiser based on the DictDumper project. The simplest way, actually, is to use pcapkit.foundation.extraction.Extractor in auto=False mode as an iterator and handle each extracted packet with your own code.

As for the extractor engine, PyPCAPKit supports several different modules; you can select one through the CLI and/or the API.

For the ARP parsing issue, please open a pull request against the pcapkit.protocols.link.arp.ARP class if you have an idea for a fix.

Thanks for your cooperation and support of the PyPCAPKit project.

evandrix commented on August 21, 2024

Okay

  • confirmed the version of DictDumper is what you stated
$ sudo python3 -B -u -m pip install -U dictdumper
Requirement already up-to-date: dictdumper in /usr/local/lib/python3.8/site-packages (0.8.4.post2)
  • confirmed that the dumping is working properly now i.e. jq can parse it
$ jq -rc . out.json
{"Global Header":{"magic_number":{"data":{"type":"bytes","value":"�ò�","hex":"d4c3b2a1"},"byteorder":"little","nanosecond":false},"version_major":2,"version_minor":4,"thiszone":0,"sigfigs":0,"snaplen":262144,"network":{"enum":"LinkType","desc":"[LinkType] Link-Layer Header Type Values","name":"ETHERNET","value":1},"packet":{"type":"bytes","value":"�ò�\u0002\u0000\u0004\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0004\u0000\u0001\u0000\u0000\u0000","hex":"d4c3b2a10200040000000000000000000000040001000000"}},"Frame 1":{"frame_info":{"ts_sec":1573042247,"ts_usec":848399,"incl_len":60,"orig_len":60},"time":"2019-11-06T20:10:47.848399","number":1,"time_epoch":1573042247.848399,"len":60,"cap_len":60,"packet":{"type":"bytes","value":"\u0000\f)}\u001d�\u0000\f)\u0019�a\b\u0006\u0000\u0001\b\u0000\u0006\u0004\u0000\u0001\u0000\f)\u0019�a\n\u0014\u001e�\u0000\f)}\u001d�\n\u0014\u001e�\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000","hex":"000c297d1db4000c2919dc6108060001080006040001000c2919dc610a141e83000c297d1db40a141e82000000000000000000000000000000000000"},"error":"TypeError: expected string or bytes-like object","raw":{"packet":{"type":"bytes","value":"\u0000\f)}\u001d�\u0000\f)\u0019�a\b\u0006\u0000\u0001\b\u0000\u0006\u0004\u0000\u0001\u0000\f)\u0019�a\n\u0014\u001e�\u0000\f)}\u001d�\n\u0014\u001e�\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000","hex":"000c297d1db4000c2919dc6108060001080006040001000c2919dc610a141e83000c297d1db40a141e82000000000000000000000000000000000000"},"error":"TypeError: expected string or bytes-like object"},"protocols":"Raw"},"Frame 
2":{"frame_info":{"ts_sec":1573042247,"ts_usec":848533,"incl_len":60,"orig_len":60},"time":"2019-11-06T20:10:47.848533","number":2,"time_epoch":1573042247.848533,"len":60,"cap_len":60,"packet":{"type":"bytes","value":"\u0000\f)\u0019�a\u0000\f)}\u001d�\b\u0006\u0000\u0001\b\u0000\u0006\u0004\u0000\u0002\u0000\f)}\u001d�\n\u0014\u001e�\u0000\f)\u0019�a\n\u0014\u001e�\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000","hex":"000c2919dc61000c297d1db408060001080006040002000c297d1db40a141e82000c2919dc610a141e83000000000000000000000000000000000000"},"error":"TypeError: expected string or bytes-like object","raw":{"packet":{"type":"bytes","value":"\u0000\f)\u0019�a\u0000\f)}\u001d�\b\u0006\u0000\u0001\b\u0000\u0006\u0004\u0000\u0002\u0000\f)}\u001d�\n\u0014\u001e�\u0000\f)\u0019�a\n\u0014\u001e�\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000","hex":"000c2919dc61000c297d1db408060001080006040002000c297d1db40a141e82000c2919dc610a141e83000000000000000000000000000000000000"},"error":"TypeError: expected string or bytes-like object"},"protocols":"Raw"}}
  • re:ARP parsing issue, I just switched over to dpkt, and dpkt is able to parse Ethernet/ARP packet successfully
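For reference, the "hex" value of Frame 1 in the output above can be decoded by hand with the standard library, which confirms it is an Ethernet frame carrying an ARP request. This is an independent sketch, not the code of PyPCAPKit or dpkt; the helper names mac and parse_ethernet_arp are hypothetical, and the 18 trailing zero bytes are written as padding up to the 60-byte frame length.

```python
import ipaddress
import struct

# First 42 bytes copied from the Frame 1 "hex" field, plus 18 padding bytes.
FRAME_HEX = (
    "000c297d1db4000c2919dc6108060001080006040001000c2919dc610a141e83"
    "000c297d1db40a141e82" + "00" * 18
)


def mac(raw):
    """Format 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{byte:02x}" for byte in raw)


def parse_ethernet_arp(frame):
    """Parse an Ethernet II header followed by an IPv4-over-Ethernet ARP body."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != 0x0806:
        raise ValueError("not an ARP frame")
    htype, ptype, hlen, plen, oper, sha, spa, tha, tpa = struct.unpack(
        "!HHBBH6s4s6s4s", frame[14:42]
    )
    return {
        "dst": mac(dst),
        "src": mac(src),
        "oper": oper,  # 1 = request, 2 = reply
        "sender_mac": mac(sha),
        "sender_ip": str(ipaddress.IPv4Address(spa)),
        "target_mac": mac(tha),
        "target_ip": str(ipaddress.IPv4Address(tpa)),
    }


arp = parse_ethernet_arp(bytes.fromhex(FRAME_HEX))
```

Here the EtherType is 0x0806 and the operation code is 1, i.e. an ARP request from 10.20.30.131 for 10.20.30.130.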

By the way, is PyPCAPKit able to parse TCP streams (just like how Wireshark has the feature to "Follow TCP stream"), instead of frame-by-frame?

JarryShaw commented on August 21, 2024

Fixed in v0.15.3.
