rawproto's Introduction

rawproto

Guess structure of protobuf binary from raw data

Very similar to protoc --decode_raw, but for JavaScript.

You can use this to reverse-engineer a protobuf protocol, based on a binary protobuf string.

See some example output (from the demo message in this repo) here.
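
For readers new to the wire format, here is a minimal sketch of the first step any raw decoder has to perform: reading a field key. Each field in a protobuf message starts with a varint whose low three bits are the wire type and whose remaining bits are the field number. The names readVarint, readKey and WIRE_TYPES are hypothetical and not part of rawproto's API; this illustrates the encoding, not the library's implementation.

```javascript
// Wire type names indexed by their numeric value in the protobuf encoding.
const WIRE_TYPES = ['varint', '64-bit', 'length-delimited', 'start group', 'end group', '32-bit']

// Read one base-128 varint starting at `offset`.
// BigInt is used so that 64-bit values do not overflow.
function readVarint (bytes, offset = 0) {
  let value = 0n
  let shift = 0n
  let pos = offset
  while (true) {
    if (pos >= bytes.length) throw new RangeError('truncated varint')
    const byte = bytes[pos++]
    value |= BigInt(byte & 0x7f) << shift // low 7 bits carry data
    if ((byte & 0x80) === 0) break // high bit clear: last byte of the varint
    shift += 7n
  }
  return { value, next: pos }
}

// Split a field key into its field number and wire type.
function readKey (bytes, offset = 0) {
  const { value, next } = readVarint(bytes, offset)
  return {
    fieldNumber: Number(value >> 3n),
    wireType: WIRE_TYPES[Number(value & 7n)] ?? 'unknown',
    next
  }
}
```

For the bytes 08 96 01, readKey reports field 1 with wire type varint, and the varint that follows decodes to 150. Nothing in those bytes says whether field 1 was declared as int32, uint64, bool or an enum, which is why the output of a raw decoder is a guess.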

installation

npm i rawproto will add this to your project.

You can also use npx rawproto to run the CLI.

If you just want the CLI, and don't use node, you can also find standalone builds here.

usage

In ES6:

import { readFile } from 'fs/promises'
import { getData, getProto } from 'rawproto'

const buffer = await readFile('data.pb')

// get info about binary protobuf message
console.log( getData(buffer) )

// print proto guessed for this data
console.log( getProto(buffer) )

In plain CommonJS:

var fs = require('fs')
var rawproto = require('rawproto')

var buffer = fs.readFileSync('data.pb')

// get info about binary protobuf message
console.log( rawproto.getData(buffer) )

// print proto guessed for this data
console.log( rawproto.getProto(buffer) )
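
If you have no data.pb at hand, the following sketch builds a tiny message by hand so the snippets above have something to read. The function name buildSampleMessage and the field layout (field 1 as the varint 150, field 2 as the string "testing") are made up for illustration; they are not taken from the demo message in this repo.

```javascript
// Hand-encode a two-field protobuf message using only Node's Buffer.
function buildSampleMessage () {
  const text = Buffer.from('testing', 'utf8')
  return Buffer.concat([
    // key 0x08 = field 1, wire type 0 (varint); 0x96 0x01 = 150
    Buffer.from([0x08, 0x96, 0x01]),
    // key 0x12 = field 2, wire type 2 (length-delimited); then the byte length
    Buffer.from([0x12, text.length]),
    text
  ])
}
```

Writing the result with fs.writeFileSync('data.pb', buildSampleMessage()) should give a file the examples can load.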

You can do partial-parsing, if you know some of the fields:

import { readFile } from 'fs/promises'
import protobuf from 'protobufjs'
import { getData, getProto } from 'rawproto'

const proto = await protobuf.load(new URL('demo.proto', import.meta.url).pathname)
const Test = proto.lookupType('Test')
const buffer = await readFile('data.pb')

// get info about binary protobuf message, with partial info
console.log(getData(buffer, Test))

You can use fetch, like this (in ES6 with top-level await):

import { getData } from 'rawproto'
import fetch from 'node-fetch' // node-fetch exposes fetch as its default export

const r = await fetch('YOUR_URL_HERE')
const b = await r.arrayBuffer()
console.log(getData(Buffer.from(b)))

getData(buffer, root, stringMode) ⇒ Array.<object>

Turn a protobuf into a data-object

Returns: Array.<object> - Info about the protobuf

Param       Type    Description
buffer      Buffer  The proto in a binary buffer
root        Object  protobufjs message-type (for partial parsing)
stringMode  string  How to handle strings that aren't sub-messages: "auto" - guess based on chars, "string" - always a string, "binary" - always a buffer

getProto(buffer, root, stringMode) ⇒ string

Gets the proto-definition string from a binary protobuf message

Returns: string - The proto SDL

Param       Type    Description
buffer      Buffer  The buffer
root        Object  protobufjs message-type (for partial parsing)
stringMode  string  How to handle strings that aren't sub-messages: "auto" - guess based on chars, "string" - always a string, "binary" - always a buffer
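
To make the three stringMode values concrete, here is a rough sketch of how a length-delimited field that is not a sub-message could be turned into a JS value. The function name decodeLengthDelimited is hypothetical. The "auto" rule used here (treat the field as binary if any byte is below 32) is an assumption modelled on the check visible in the minified 0.7.13 bundle quoted in an issue further down this page; other versions may guess differently.

```javascript
// Decide how to represent the raw bytes of a length-delimited field.
function decodeLengthDelimited (bytes, stringMode = 'auto') {
  if (stringMode === 'binary') return bytes // always keep the buffer
  if (stringMode === 'string') return bytes.toString() // always decode as text

  // "auto": any control character (byte value below 32) suggests binary data
  const hasControlByte = bytes.some(b => b < 32)
  return hasControlByte ? bytes : bytes.toString()
}
```

Under this rule, text that contains a tab or a newline is reported as a buffer, so passing "string" explicitly is the safer choice when you already know a payload is text.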

cli

You can also use rawproto to parse binary on the command-line!

Install with npm i -g rawproto or use it without installation with npx rawproto.

Use it like this:

cat myfile.pb | rawproto

or

rawproto < myfile.pb

or

npx rawproto < myfile.pb

Usage: rawproto [options]

Options:
      --version     Show version number                                [boolean]
  -j, --json        Output JSON instead of proto definition     [default: false]
  -m, --message     Message name to decode as (for partial raw)
  -i, --include     Include proto SDL file (for partial raw)
  -s, --stringMode  How should strings be handled? "auto" detects if it's binary
                    based on characters, "string" is always a JS string, and
                    "binary" is always a buffer.
                         [choices: "auto", "string", "binary"] [default: "auto"]
  -h, --help        Show help                                          [boolean]

Examples:
  rawproto < myfile.pb                      Get guessed proto3 definition from
                                            binary protobuf
  rawproto -i def.proto -m Test <           Guess any fields that aren't defined
  myfile.pb                                 in Test
  rawproto -j < myfile.pb                   Get JSON represenation of binary
                                            protobuf
  rawproto -j -s binary < myfile.pb         Get JSON represenation of binary
                                            protobuf, assume all strings are
                                            binary buffers

limitations

Several things cannot be inferred from the data alone. The signedness and precision of numbers can't really be guessed, ints could be enums, and my automatic string-versus-bytes detection is naive (though I don't think it could be improved without some knowledge of the protocol).

You should definitely tune the generated proto file to match how you think your data is structured. I add comments to fields to help you figure out which scalar types to use, but without the original proto file you'll have to do some guessing of your own. The bottom line is that the generated proto won't cause an error, but it's probably not exactly correct either.
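
The ambiguity described above can be shown in a few lines. This sketch does not use rawproto at all; it reads the same raw values the way several protobuf scalar types would. The variable names are illustrative only.

```javascript
// ZigZag decoding, the mapping protobuf uses for sint32 values.
const zigzagDecode = n => (n >>> 1) ^ -(n & 1)

// A varint whose raw value is 1 fits several declared types.
const rawVarint = 1
const varintReadings = {
  int32: rawVarint,
  sint32: zigzagDecode(rawVarint),
  bool: rawVarint !== 0
}

// Four little-endian bytes of a 32-bit field fit fixed32, sfixed32 and float.
const raw32 = Buffer.from([0x00, 0x00, 0x80, 0x3f])
const fixed32Readings = {
  fixed32: raw32.readUInt32LE(0),
  sfixed32: raw32.readInt32LE(0),
  float: raw32.readFloatLE(0)
}

console.log(varintReadings)
console.log(fixed32Readings)
```

Every reading is valid for the bytes given, so only knowledge of the original schema (or a larger sample of messages) can settle which type was intended.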

todo

  • Streaming data-parser for large input
  • Collection analysis: better type-guessing with more messages
  • getTypes that doesn't mess with JS data, and just gives possible types of every field
  • partial-parsing like protoc --decode. It basically tries to decode, but leaves unknown fields raw.

rawproto's People

Contributors

dependabot[bot], glitchwizard, konsumer, musab-mk, rpgwaiter

rawproto's Issues

Can’t install or update to version 0.7.2

I had version 0.7.1 installed. When I run “npm update”, I get the following error:
npm ERR! code ENOENT
npm ERR! syscall chmod
npm ERR! path /[REMOVED]/node_modules/rawproto/rawproto.cjs
npm ERR! errno -2
npm ERR! enoent ENOENT: no such file or directory, chmod '/[REMOVED]/node_modules/rawproto/rawproto.cjs'
npm ERR! enoent This is related to npm not being able to find a file.
npm ERR! enoent

I tried removing the package and reinstalling it, but I get the same error.

~# npm -v
8.0.0
~# node -v
v16.11.0

Error when using the demo code

Error [ERR_REQUIRE_ESM]: require() of ES Module /{filepath}/node_modules/rawproto/dist/rawproto.modern.js from /{filepath}/example.js not supported.
Instead change the require of rawproto.modern.js in /{filepath}/example.js to a dynamic import() which is available in all CommonJS modules.
at Object.<anonymous> (/{filepath}/example.js:2:29) {
code: 'ERR_REQUIRE_ESM'
}

node : 16.12.0
npm: 8.19.4

SyntaxError: Cannot use import statement outside a module

I get this error with both the installed CLI and npx in version 0.7.13:

$ npx rawproto
/Users/x/.npm/_npx/05c2c072889f8a67/node_modules/rawproto/dist/rawproto.cjs:1
import e from"protobufjs";const{Reader:s}=e,t=e=>Array(e).join("  "),r=(e,s="MessageRoot",a=1)=>{const o=[],n=[],u=e.map(e=>{const u=Object.keys(e).pop();switch(Array.isArray(e[u])?"array":typeof e[u]){case"object":return`${t(a+1)}bytes field${u} = ${u}; // could be a repeated-value, string, bytes, or malformed sub-message`;case"string":return`${t(a+1)}string field${u} = ${u}; // could be a repeated-value, string, bytes, or malformed sub-message`;case"number":return(e=>Number(e)===e&&e%1!=0)(e[u])?`${t(a+1)}float field${u} = ${u}; // could be a fixed64, sfixed64, double, fixed32, sfixed32, or float`:`${t(a+1)}int32 field${u} = ${u}; // could be a int32, int64, uint32, bool, enum, etc, or even a float of some kind`;case"array":if(-1===o.indexOf(u))return o.push(u),`\n${r(e[u],s,a+1)}\n${t(a+1)}\n${t(a+1)}Message${u} subMessage${u} = ${u};`;n.push(u)}}).filter(e=>e),i=[];return n.forEach(e=>{u.forEach((s,t)=>{-1!==s.indexOf(`subMessage${e}`)&&-1===i.indexOf(e)&&(u[t]=s.replace(`Message${e} subMessage${e}`,`repeated Message${e} subMessage${e}`),i.push(e))})}),`${t(a)}message ${s} {\n${u.join("\n")}\n${t(a)}}`};function a(e,t,r="auto",o=""){const n=s.create(e),u=[];for(;n.pos<n.len;){const e=n.uint64(),s=7&e,i=o+(e>>>3).toString();switch(s){case 0:u.push({[i]:n.uint32()});break;case 1:u.push({[i]:n.fixed64()});break;case 2:const e=n.bytes();try{const s=a(e,t,r,o);u.push({[i]:s})}catch(s){if("binary"===r)u.push({[i]:e});else if("string"===r)u.push({[i]:e.toString()});else{let s=!1;e.forEach(e=>{e<32&&(s=!0)}),u.push(s?{[i]:e}:{[i]:e.toString()})}}break;case 5:u.push({[i]:n.float()});break;default:n.skipType(s)}}return t&&t.decode(e),u}function o(e,s,t="MessageRoot",o="auto"){const n=a(e,s,o);let u='syntax = "proto3";\n\n';return u+=r(n,t),u}export{a as getData,o as getProto};
^^^^^^

SyntaxError: Cannot use import statement outside a module
    at Object.compileFunction (node:vm:360:18)
    at wrapSafe (node:internal/modules/cjs/loader:1094:15)
    at Module._compile (node:internal/modules/cjs/loader:1129:27)
    at Object.Module._extensions..js (node:internal/modules/cjs/loader:1219:10)
    at Module.load (node:internal/modules/cjs/loader:1043:32)
    at Function.Module._load (node:internal/modules/cjs/loader:878:12)
    at Module.require (node:internal/modules/cjs/loader:1067:19)
    at require (node:internal/modules/cjs/helpers:103:18)
    at Object.<anonymous> (/Users/jairochen/.npm/_npx/05c2c072889f8a67/node_modules/rawproto/rawproto.cjs:3:18)
    at Module._compile (node:internal/modules/cjs/loader:1165:14)

Adding 'message' before Message name.

What do you think about adding the 'message' keyword on this line?

return `${indent(level)}Message${m} {\n${lines.join('\n')}\n${indent(level)}}`

It would become:

  return `${indent(level)}message Message${m} {\n${lines.join('\n')}\n${indent(level)}}`

This makes the exported protobuf files usable. I tested it and it worked. I can open a PR if you want.
