
lain's Introduction

NOTE: As of September 2022, this repository is no longer maintained.

To continue using lain, please use the lain repository at https://github.com/landaire/lain.

lain

This crate provides functionality one may find useful while developing a fuzzer. A recent nightly Rust build is required for the specialization feature.

Please consider this crate to be in "beta": until 1.0, minor version releases may include breaking changes.


Documentation

Please refer to the wiki for a high-level overview.

For API documentation: https://docs.rs/lain

Installation

lain requires a nightly Rust build for specialization support.

Add the following to your Cargo.toml:

[dependencies]
lain = "0.5"

Example Usage

extern crate lain;

use lain::prelude::*;
use lain::rand;
use lain::hexdump;

#[derive(Debug, Mutatable, NewFuzzed, BinarySerialize)]
struct MyStruct {
    field_1: u8,

    #[lain(bits = 3)]
    field_2: u8,

    #[lain(bits = 5)]
    field_3: u8,

    #[lain(min = 5, max = 10000)]
    field_4: u32,

    #[lain(ignore)]
    ignored_field: u64,
}

fn main() {
    let mut mutator = Mutator::new(rand::thread_rng());

    let mut instance = MyStruct::new_fuzzed(&mut mutator, None);

    let mut serialized_data = Vec::with_capacity(instance.serialized_size());
    instance.binary_serialize::<_, BigEndian>(&mut serialized_data);

    println!("{:?}", instance);
    println!("hex representation:\n{}", hexdump(&serialized_data));

    // perform small mutations on the instance
    instance.mutate(&mut mutator, None);

    println!("{:?}", instance);
}

// Output:
//
// MyStruct { field_1: 95, field_2: 5, field_3: 14, field_4: 8383, ignored_field: 0 }
// hex representation:
// ------00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
// 0000: 5F 75 00 00 20 BF 00 00 00 00 00 00 00 00         _u...¿........
// MyStruct { field_1: 160, field_2: 5, field_3: 14, field_4: 8383, ignored_field: 0 }
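
The byte layout in the sample output above can be reproduced by hand. The following is a minimal sketch using only the standard library; it is not lain's implementation. The function names `pack_my_struct` and `to_hex` are hypothetical, and the bit order (field_2 in the low 3 bits, field_3 in the high 5 bits of the shared byte) is an assumption inferred from the `5F 75 00 00 20 BF ...` line of the output.

```rust
/// Packs the fields of `MyStruct` by hand, big-endian.
/// Assumption: field_2 sits in the low 3 bits and field_3 in the high 5 bits
/// of the shared byte, as inferred from the sample output (not from lain's source).
fn pack_my_struct(field_1: u8, field_2: u8, field_3: u8, field_4: u32, ignored_field: u64) -> Vec<u8> {
    let mut out = Vec::with_capacity(14);
    out.push(field_1);
    // Two bitfields (3 + 5 bits) share a single serialized byte.
    out.push((field_2 & 0b111) | ((field_3 & 0b1_1111) << 3));
    out.extend_from_slice(&field_4.to_be_bytes());
    out.extend_from_slice(&ignored_field.to_be_bytes());
    out
}

/// Formats bytes as space-separated uppercase hex pairs.
fn to_hex(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|b| format!("{:02X}", b))
        .collect::<Vec<_>>()
        .join(" ")
}
```

Calling `to_hex(&pack_my_struct(95, 5, 14, 8383, 0))` yields the same 14 bytes as the first hexdump line above: one byte for field_1, one shared byte for the two bitfields, four bytes for field_4, and eight bytes for the ignored field.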

A complete example of a fuzzer and its target can be found in the examples directory. The server is written in C: it takes data over a TCP socket, parses a message, and mutates some state. The fuzzer has Rust definitions of the C data structure, sends fully mutated messages to the server, and uses the Driver object to manage fuzzer threads and state.

Contributing

This repo is no longer maintained, and therefore is not accepting new contributions.

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License: MIT

lain's People

Contributors

comcma, dependabot[bot], drchat, landaire, microsoft-github-policy-service[bot]


lain's Issues

Outdated Rand crate makes it hard to replace Impl

We're using lain in an example for LibAFL, a fuzzing library that supports multiple faster (non-cryptographic) random implementations. However, we are unable to update our rand crate dependency because it breaks with lain's version; see AFLplusplus/LibAFL#327.
As far as I can tell, the rand API has not changed drastically, so maybe you can update it at some point. Thanks

fuzz enum variance

Why is fuzzing of enum variants not supported?

#[derive(NewFuzzed)]
   |          ^^^^^^^^^ variant or associated item cannot be called on `ACCESS` due to unsatisfied trait bounds

UnstableEnum

What is the purpose of this type? It seems to always cast the variant to Invalid, and it punts on mutation by mutating only the primitive value within the original variants.

In the example, the mutator does not only mutate between the provided variants; it passes a u32 to mutate and wraps the result in UnstableEnum::Invalid(u32). That isn't what I would expect from an enum mutation, except maybe very rarely (like ignoring tags).

Field mutation issues

I seem to be having issues getting fields to mutate within a defined structure. The simplest way to reproduce this is to remove the Fixup implementation on the example_fuzzer and comment out the code that connects and sends packets.

The mutator will also ignore any #[lain(ignore)] flags on the first field of the structure and still do a walking bit-flip pass. While this is just odd behavior, the real problem is the lack of mutation on the fields within the structure even during Havoc mode mutations. I have run this through 300 million iterations looking for a change in length with this setup and have yet to see one. Additionally, typ does not seem to mutate after the initial run through bit flips. Looking through the code it looks like there may be an issue with duplicated unsigned fields? I am not sure how this could happen, and I am still looking around the code, but obviously offset is being mutated properly. Based on the trace! info, typ is getting into the macro generated mutate function, but then failing to call mutate_from_mutation_mode. Meanwhile the length field seems to be placed under what looks to be an ignore flag, by trying to get the 1% chance to ignore the flag.

Obviously this could be chalked up to the mutator focusing on a single field first, but that is not my understanding of how Havoc works, so I am curious if there is any other insight on this issue, or if it is reproducible for other people.

Sending packet: PacketData { typ: Invalid(2), offset: 17328406766963615603, length: 16032434953803588567, data: [] }
[2019-09-04T00:26:03Z TRACE lain::mutator] generating number between 0 and 2
[2019-09-04T00:26:03Z TRACE lain::mutator] got 0
[2019-09-04T00:26:03Z INFO  lain::mutator] Havoc
[2019-09-04T00:26:03Z INFO  lain::mutator] num is: 2
[2019-09-04T00:26:03Z INFO  lain::mutator] 0, 1, 1, 543
[2019-09-04T00:26:03Z TRACE lain::mutatable] Mutating unsigned value
[2019-09-04T00:26:03Z INFO  lain::mutator] Havoc
[2019-09-04T00:26:03Z INFO  lain::mutator] num is: 17328406766963615603
[2019-09-04T00:26:03Z INFO  lain::mutator] 1, 1, 1, 543
[2019-09-04T00:26:03Z TRACE lain::mutator] Operation selected: BitFlip
[2019-09-04T00:26:03Z TRACE lain::mutator] xoring bit 33
[2019-09-04T00:26:03Z TRACE lain::mutatable] Mutating unsigned value
[2019-09-04T00:26:03Z INFO  lain::mutator] Havoc
[2019-09-04T00:26:03Z INFO  lain::mutator] num is: 16032434953803588567
[2019-09-04T00:26:03Z INFO  lain::mutator] 2, 1, 1, 543
[2019-09-04T00:26:03Z TRACE lain::mutator] generating number between 0 and 100
[2019-09-04T00:26:03Z TRACE lain::mutator] got 85.98688
[2019-09-04T00:26:03Z TRACE lain::mutator] generating 1% chance. got 85.98688, so returning false
[2019-09-04T00:26:04Z TRACE lain::mutator] generating number between 0 and 100
[2019-09-04T00:26:04Z TRACE lain::mutator] got 72.75632
[2019-09-04T00:26:04Z TRACE lain::mutator] generating 1% chance. got 72.75632, so returning false
[2019-09-04T00:26:04Z DEBUG example_fuzzer] getting serialized size of PacketData
[2019-09-04T00:26:04Z TRACE lain::buffer] getting serialized size for Vec
[2019-09-04T00:26:04Z TRACE lain::buffer] returning 0 since there's no elements
[2019-09-04T00:26:04Z DEBUG example_fuzzer] size of PacketData is 0x14
[2019-09-04T00:26:04Z DEBUG example_fuzzer] getting serialized size of PacketData
[2019-09-04T00:26:04Z TRACE lain::buffer] getting serialized size for Vec
[2019-09-04T00:26:04Z TRACE lain::buffer] returning 0 since there's no elements
[2019-09-04T00:26:04Z DEBUG example_fuzzer] size of PacketData is 0x14
Sending packet: PacketData { typ: Invalid(2), offset: 17328406775553550195, length: 16032434953803588567, data: [] }

Deserialization

Overview

Deserialization would be desired so we can take real "over-the-wire" data sent to a target server, application, or library, and deserialize it back into a Rust structure as defined in the fuzzer. This would help with reproducibility and using real corpuses to get started with fuzzing.

Blockers

lain does not use a 1:1 mapping between Rust objects and target objects. The expectation is that the user defines a struct with a layout that's similar to how the object looks while serialized. This includes the user defining things like the padding as part of the data structure. Bitfields are an exception where the Rust struct would appear larger than the target structure since these are defined as a whole-sized type (e.g. if you have two bitfields with 4 bits each, it'd be defined as two u8 fields).

Given the nature of fuzzing these data structures as well, it's not reliable to decode from a C type to Rust types for things like dynamic-sized arrays, strings, or things that rely on run-length encoding. For example, given the following struct:

struct Foo {
    len: u8,
    items: DynamicArray<Bar>,
}

How would we know the true length of items given that len will most of the time not exactly match the true length of items? Or null-terminated strings with null bytes embedded before the actual end of the string?

Possible solutions

The only realistic solution is to always assume well-formed data. Fuzzer-generated data will produce errors and cannot be deserialized. All fields will be considered trusted and well-formed. Users should not rely on decoding C structs for corpuses and instead should convert to a Rust-native serializing solution such as bincode.
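
To make the trade-off concrete, here is a minimal sketch of a decoder that trusts the length field completely. It is not part of lain: `parse_foo` is a hypothetical function, and `Bar` is assumed to be a single byte so that the example stays self-contained.

```rust
/// Simplified stand-in for the `Foo` struct above.
/// Assumption: each `Bar` item is one byte.
#[derive(Debug, PartialEq)]
struct Foo {
    len: u8,
    items: Vec<u8>,
}

/// Decodes a `Foo` while treating `len` as trusted and well-formed.
/// Any input where `len` disagrees with the remaining bytes is rejected.
fn parse_foo(bytes: &[u8]) -> Result<Foo, &'static str> {
    // The first byte is the length prefix; everything after it is item data.
    let (&len, rest) = bytes.split_first().ok_or("input is empty")?;
    if rest.len() != len as usize {
        return Err("len does not match the number of items");
    }
    Ok(Foo { len, items: rest.to_vec() })
}
```

With this approach, `parse_foo(&[2, 0xAA, 0xBB])` succeeds, while a fuzzer-style input such as `&[5, 0xAA, 0xBB]` is rejected rather than guessed at, which is the behavior the paragraph above describes.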

Support for shrinking/growing Vec<T>

When calling Mutatable::mutate on an object that contains a Vec, the current impl Mutatable<T> for Vec<T> will treat itself as a slice and mutate all objects in-place. Shrinking/growing the Vec is desirable and should, for a baseline, require the following work:

  1. Fix mutate methods not passing down Constraints.
  2. Add two new impls:
impl<T> Mutatable for Vec<T>
where
    T: Mutatable

Where only shrinking a Vec would be performed

and:

impl<T> Mutatable for Vec<T>
where
    T: Mutatable + NewFuzzed + SerializedSize

Where shrinking or growing a Vec would be performed. The separation here allows us to call NewFuzzed::new_fuzzed() on the new elements in the vec, while also respecting the max size constraint if it's provided.
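
The split between the two impls can be sketched with plain functions. This is an illustration only and does not use lain's traits: `XorShift`, `shrink`, and `grow` are hypothetical names, the random generator is a small stand-in so that no external crate is needed, and the size limit is expressed as an element count (`max_len`) rather than a serialized size in bytes.

```rust
/// Tiny xorshift64 generator; a stand-in for a real RNG. The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Returns a value in `0..bound`. `bound` must be greater than zero.
    fn below(&mut self, bound: usize) -> usize {
        (self.next_u64() % bound as u64) as usize
    }
}

/// Shrink-only mutation: needs nothing from `T`, since elements are only removed.
fn shrink<T>(v: &mut Vec<T>, rng: &mut XorShift) {
    if v.is_empty() {
        return;
    }
    // Pick a new length strictly smaller than the current one.
    let new_len = rng.below(v.len());
    v.truncate(new_len);
}

/// Growing mutation: needs a way to create new elements (the role `NewFuzzed`
/// plays in the proposal) and must respect the size limit.
fn grow<T>(
    v: &mut Vec<T>,
    rng: &mut XorShift,
    max_len: usize,
    mut new_elem: impl FnMut(&mut XorShift) -> T,
) {
    if v.len() >= max_len {
        return;
    }
    // Add between 1 and (max_len - len) new elements.
    let extra = 1 + rng.below(max_len - v.len());
    for _ in 0..extra {
        v.push(new_elem(&mut *rng));
    }
}
```

The shrink path places no requirements on the element type, which is why it can be offered for any `T: Mutatable`, while the grow path needs both an element constructor and a size budget.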

Another desired feature would be to randomly mutate fields or slice elements to be the length of any known vectors in the data structure (including encapsulating structures), but this is a pretty large undertaking.

General future improvements

Known issues to fix:

  1. SerializedSize and NewFuzzed are oddly coupled since the former is used when a Constraint with max_size is supplied.
  2. String mutation isn't very well-tested.
  3. NewFuzzed is slower than it needs to be as a result of initializing fields in random order. This could be fixed with a const fn that checks if any of the fields (recursively) contain dynamic-size members and if not, use static initialization ordering. The randomness only matters when we have dynamic fields in order to try influencing the number of items generated for a dynamic field when working with a max_size constraint. Done as part of the release of lain
  4. When things fail, they panic. This is mostly by-design to have things fail-fast, but some errors are just swallowed (e.g. serializing to a buffer that's too small).
  5. Add support for unnamed struct fields and named enum fields
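
As an illustration of item 4, one way to avoid swallowing a too-small buffer is to return the I/O error to the caller. This is a sketch built on the standard library's `Write` implementation for `&mut [u8]`, not lain's serialization code; `serialize_u32_be` is a hypothetical helper.

```rust
use std::io::{self, Write};

/// Writes `value` big-endian into a fixed-size buffer.
/// If the buffer is too small, `write_all` reports an error instead of
/// silently truncating, and the caller decides how to handle it.
fn serialize_u32_be(value: u32, mut buf: &mut [u8]) -> io::Result<()> {
    buf.write_all(&value.to_be_bytes())
}
```

A four-byte buffer succeeds, while a two-byte buffer returns an error of kind `WriteZero`, so the failure is visible without requiring a panic.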

From an overall cleanup perspective the following changes should probably be made at some point as well:

  • The mutator's state machine should possibly be decoupled from the mutator itself. I'm not a huge fan of calling mutator.begin_new_iteration() in the harness driver or mutator.begin_new_corpus() when fuzzing a new item from my corpus.
  • I found myself in my corpus management checking against MutatorMode::Havoc to switch between items in my corpus since this is the last state in the state machine, which may be misleading or not future-proof.
  • Cleanup of API exports.
  • Work on verifying reproducible runs. This is loosely verified through tests, but isn't very elaborate.
  • Proc macros are somewhat of a mess to follow.
  • Driver isn't super well thought-out and is mostly for convenience.
