Comments (6)

silvergasp avatar silvergasp commented on June 16, 2024

Do you think this functionality could be extended to an optional part of the API? It would be great to be able to fetch the source programmatically from GitHub. This could be done by just exposing the following in the all-functions endpoint:

  • line_number (e.g. 29)
  • commit (e.g. 660d8bf)

Then you can create a link like the following;

Having it as part of the API would also mean you can do other things, like download the file using the GitHub API. The only downside is that not all projects use git, so the commit field might need to be something more generic, or optional.

from fuzz-introspector.

DavidKorczynski avatar DavidKorczynski commented on June 16, 2024

If I understand your thought correctly then I think it would be neat -- namely to have an API available that'll provide you a link to the source code, or, perhaps the source code of each harness.

However, I'm unsure what you meant by commit? Which commits are you referring to for each harness?

My thoughts are:
Take the llhttp project with a single fuzzer: https://introspector.oss-fuzz.com/project-profile?project=llhttp -- in this case, I'd like for the profile page to have a "fuzzers table" with 1 row (because there's a single fuzzer) with a link to https://github.com/nodejs/llhttp/blob/main/test/fuzzers/fuzz_parser.c#L8-L45 for the fuzz_parser harness.

To make it an API, I would make the above URL accessible. We should be able to provide references to other repo websites, e.g. GitLab and more, and in the worst case we can provide a URL to the fuzzer's location in the code coverage report, as we'll always (or at least when coverage is working) have a link to the code coverage reports.

Are you perhaps thinking instead of https://github.com/nodejs/llhttp/blob/main/test/fuzzers/fuzz_parser.c#L8-L45 the right link to provide is https://github.com/nodejs/llhttp/blob/8498ef9d8b0e9539c8c331cf59213529287789e1/test/fuzzers/fuzz_parser.c#L8-L45?

DavidKorczynski avatar DavidKorczynski commented on June 16, 2024

We may run into some issues with having to predict branch names. Hmm, I'm not sure if there are many edge cases we'll have to handle.

One option is to reduce this to links to the location in the code coverage reports. I think that's useful in and of itself, but I also think that having the source repo URLs, and even the source code itself, provides high value.

silvergasp avatar silvergasp commented on June 16, 2024

> Are you perhaps thinking instead of https://github.com/nodejs/llhttp/blob/main/test/fuzzers/fuzz_parser.c#L8-L45 the right link to provide is https://github.com/nodejs/llhttp/blob/8498ef9d8b0e9539c8c331cf59213529287789e1/test/fuzzers/fuzz_parser.c#L8-L45?

Yeah, that's pretty close to what I was saying. I guess what I was getting at is that you need 3 bits of information to reproducibly find a function:

  • The specific version (my suggestion being the specific git SHA of the project that introspector is being run on). If it's not already available, this could be extracted using the following commands:
$ cd llhttp # Insert project here
$ git rev-parse HEAD
8498ef9d8b0e9539c8c331cf59213529287789e1
  • The file path e.g. test/fuzzers/fuzz_parser.c
  • The line range e.g. L8-L45

If you stitch all of these pieces together you can reproducibly find the specific function again, and it will always remain the same, e.g.

https://github.com/nodejs/llhttp/blob/8498ef9d8b0e9539c8c331cf59213529287789e1/test/fuzzers/fuzz_parser.c#L8-L45
https://github.com/nodejs/llhttp/blob/{---------------commit-sha-------------}/{-----------path----------}#{range}
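As a rough sketch of that stitching on the consumer side, assuming the API hands back a commit SHA, a file path, and a line range: the function name `github_permalink` and its parameters are hypothetical and not part of fuzz-introspector. Only the URL layout is taken from the example above.

```python
def github_permalink(owner: str, repo: str, commit: str, path: str,
                     start: int, end: int) -> str:
    """Build a GitHub URL pinned to a commit, with a line-range anchor.

    Hypothetical helper: the inputs mirror the three pieces of
    information listed above (version, file path, line range).
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    # Paths are repo-relative, so drop any accidental leading slash.
    return (f"https://github.com/{owner}/{repo}/blob/{commit}/"
            f"{path.lstrip('/')}#L{start}-L{end}")


url = github_permalink(
    "nodejs", "llhttp",
    "8498ef9d8b0e9539c8c331cf59213529287789e1",
    "test/fuzzers/fuzz_parser.c", 8, 45,
)
print(url)
```

Because the URL is pinned to a SHA rather than a branch name, it keeps pointing at the same lines even after the branch moves on.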

My suggestion is to just include those pieces of information in the API, and leave the URL building up to the user. For example, the same thing would be reproducible from the command line, e.g.

$ git clone https://github.com/nodejs/llhttp.git
$ cd llhttp
$ git checkout 8498ef9d8b0e9539c8c331cf59213529287789e1
$ # Snip out lines 8-45
$ sed -n '8,45 p' test/fuzzers/fuzz_parser.c 
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
 // Truncated ...
}

The latter being closer to what I would likely be doing.
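For scripting, the same steps could look roughly like the sketch below. It assumes a local clone already exists and that `git` is on the PATH. The function names are made up for illustration. It uses `git show <commit>:<path>` instead of `git checkout`, so the working tree is left untouched.

```python
import subprocess


def slice_lines(text: str, start: int, end: int) -> str:
    """Return lines start..end (1-indexed, inclusive), like sed -n 'start,end p'."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    lines = text.splitlines(keepends=True)
    return "".join(lines[start - 1:end])


def read_range_at_commit(repo_dir: str, commit: str, path: str,
                         start: int, end: int) -> str:
    """Read a line range of `path` as it existed at `commit`.

    Hypothetical helper: runs `git show <commit>:<path>` inside
    `repo_dir` and slices the requested lines from its output.
    """
    result = subprocess.run(
        ["git", "-C", repo_dir, "show", f"{commit}:{path}"],
        check=True, capture_output=True, text=True,
    )
    return slice_lines(result.stdout, start, end)
```

A call such as `read_range_at_commit("llhttp", "8498ef9d8b0e9539c8c331cf59213529287789e1", "test/fuzzers/fuzz_parser.c", 8, 45)` would then correspond to the `sed` invocation above.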

> We may run into some issues with having to predict branch names. Hmm, I'm not sure if there are many edge cases we'll have to handle.

I don't think branch prediction would be an issue with the above approach. A branch is just a stream of sequential commits, whereas a commit is an atomic representation of a git repository. So as long as you collect the git SHA when you run introspector, you should be able to reproducibly restore that commit (or use the GitHub API to view the file at that commit).

silvergasp avatar silvergasp commented on June 16, 2024

That's assuming you meant git branch and not some other definition of a branch :)

silvergasp avatar silvergasp commented on June 16, 2024

Also worth noting that GitLab, Bitbucket and others have a similar URL structure available as well, e.g.

https://gitlab.com/gnuwget/wget2/-/blob/8271687e29568e9a271afa1b3112325611f48183/fuzz/libwget_atom_url_fuzzer.c#L31-53
https://bitbucket.org/snakeyaml/snakeyaml/src/a4df9e7d7ffdc0c21fe268f872a1e30d03aa8f02/src/main/java9/module-info.java#lines-14:44
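One way a consumer could handle the different hosts is a small table of URL templates, sketched below. The templates are inferred only from the example links in this thread, so treat them as assumptions rather than documented formats. `PERMALINK_TEMPLATES` and `permalink` are hypothetical names.

```python
# Templates inferred from the example URLs in this thread (assumptions,
# not official formats). `project` is the "owner/repo" part of the URL.
PERMALINK_TEMPLATES = {
    "github": "https://github.com/{project}/blob/{commit}/{path}#L{start}-L{end}",
    "gitlab": "https://gitlab.com/{project}/-/blob/{commit}/{path}#L{start}-{end}",
    "bitbucket": "https://bitbucket.org/{project}/src/{commit}/{path}#lines-{start}:{end}",
}


def permalink(host: str, project: str, commit: str, path: str,
              start: int, end: int) -> str:
    """Render a commit-pinned link for one of the known hosts."""
    try:
        template = PERMALINK_TEMPLATES[host]
    except KeyError:
        raise ValueError(f"unsupported host: {host!r}") from None
    return template.format(project=project, commit=commit, path=path,
                           start=start, end=end)


print(permalink("gitlab", "gnuwget/wget2",
                "8271687e29568e9a271afa1b3112325611f48183",
                "fuzz/libwget_atom_url_fuzzer.c", 31, 53))
```

Keeping the host-specific part in data rather than code means a project hosted elsewhere only needs a new template entry, and projects with no known host can simply fall back to the raw commit, path, and line range.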
