
vehicle-lang / vehicle


A toolkit for enforcing logical specifications on neural networks

Home Page: https://vehicle-lang.readthedocs.io/

License: BSD 3-Clause "New" or "Revised" License

Languages: Haskell 86.37%, Python 9.46%, VCL 1.79%, Shell 1.53%, Agda 0.38%, Emacs Lisp 0.17%, Roff 0.09%, SWIG 0.09%, Dockerfile 0.07%, TeX 0.06%
Topics: agda, haskell, neural-network, specification, verification

vehicle's Introduction


Vehicle

Vehicle is a system for embedding logical specifications into neural networks. At its heart is the Vehicle specification language, a high-level, functional language for writing mathematically-precise specifications for your networks. For example, the following simple specification says that a network's output should be monotonically increasing with respect to its third input.

Example specification

These specifications can then automatically be compiled down to loss functions to be used when training your network. After training, the same specification can be compiled down to low-level neural network verifiers such as Marabou which either prove that the specification holds or produce a counter-example. Such a proof is far better than simply testing, as you can prove that the specification holds for all inputs. Verified specifications can also be exported to interactive theorem provers (ITPs) such as Agda. This in turn allows for the formal verification of larger software systems that use neural networks as subcomponents. The generated ITP code is tightly linked to the actual deployed network, so changes to the network will result in errors when checking the larger proof.

Documentation

Examples

Each of the following examples comes with an explanatory README file:

  • ACAS Xu - The complete specification of the ACAS Xu collision avoidance system from the Reluplex paper in a single file.

  • Car controller - A neural network controller that is formally proven to always keep a simple model of a car on the road in the face of noisy sensor data and an unpredictable cross-wind.

  • MNIST robustness - A classifier for the MNIST dataset that is proven to be robust around the images in the dataset.

In addition to the above, further examples of specifications can be found in the test suite, alongside the corresponding output of the Vehicle compiler.

Support

If you are interested in adding support for a particular format/verifier/ITP then open an issue on the Issue Tracker to discuss it with us.

Neural network formats

Dataset formats

Verifier backends

Interactive Theorem Prover backends

Related papers

vehicle's People

Contributors

bobatkey, dependabot[bot], katyakom, matthewdaggitt, ndslusarz, pre-commit-ci[bot], tgl70, wenkokke


vehicle's Issues

Improve machine name generation based on types

Name generation for machine variables should be dependent on the type, e.g. _n1, _n2, ... for natural, _xs1, _xs2, ... for lists/tensors and only fall back to _x1, _x2, ... by default...
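A toy sketch of the idea in Haskell (the VType constructors and helper names here are illustrative, not Vehicle's actual types): pick a prefix from the variable's type and number the machine names per prefix.

data VType = TNat | TList VType | TTensor VType | TOther

namePrefix :: VType -> String
namePrefix TNat        = "_n"
namePrefix (TList _)   = "_xs"
namePrefix (TTensor _) = "_xs"
namePrefix TOther      = "_x"    -- default fallback

-- e.g. freshNames [TNat, TNat, TList TNat] == ["_n1","_n2","_xs1"]
freshNames :: [VType] -> [String]
freshNames = go []
  where
    go _      []       = []
    go counts (t : ts) =
      let p = namePrefix t
          n = 1 + length (filter (== p) counts)
      in  (p ++ show n) : go (p : counts) ts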

Integrate with Marabou

Duplicate the VNNLib file for now, and rename it Marabou, strip out the meta-network steps and make the minor tweaks necessary for it to work.

Support properties of multiple networks and/or multiple inputs

The steps required to support these are as follows:

  1. Let-lift all network applications in a property to the top-level (#26)
  2. If we have multiple applications of the same network to different inputs, then update the ONNX file to contain that number of independent copies of the network, and label the inputs/outputs X1 ... XN, X[N+1] ... X[2N], etc. (see the labelling sketch below)
  3. If we have multiple networks, then merge the ONNX files to contain both networks independently and again label the inputs X1 ... XN, X[N+1] ... XM
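A small sketch of the labelling convention from steps 2 and 3 (inputLabels is a hypothetical helper, not part of the codebase): with k copies of a network that has n inputs, copy i gets inputs X[(i-1)*n + 1] .. X[i*n].

inputLabels :: Int -> Int -> [[String]]
inputLabels k n =
  [ [ "X" ++ show ((i - 1) * n + j) | j <- [1 .. n] ] | i <- [1 .. k] ]

-- inputLabels 2 3 == [["X1","X2","X3"],["X4","X5","X6"]]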

Think more carefully about compilation of rational literals to SMTLib

Currently when we compile rational literals to SMTLib we simply cast the Rational value back to a double and then print the result. Unfortunately this obviously runs into floating point precision issues, and e.g. in andGate.vnnlib we currently end up with -3.1365920000000003 rather than -3.136592.

The right way to do this is to distinguish between rational and floating point literals in the frontend (e.g. 1.31 and 1.31f) and then make a similar distinction in SMTLib (which also supports that).
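Until that distinction exists, one way to avoid the rounding artefact is to print the Rational exactly instead of casting it through Double. A minimal sketch, not the compiler's actual printer:

import Data.Ratio (denominator, numerator)

-- Render a Rational as an SMT-LIB Real term without going through Double.
smtLibRational :: Rational -> String
smtLibRational r
  | r < 0              = "(- " ++ smtLibRational (negate r) ++ ")"
  | denominator r == 1 = show (numerator r) ++ ".0"
  | otherwise          = "(/ " ++ show (numerator r) ++ ".0 "
                               ++ show (denominator r) ++ ".0)"

-- e.g. smtLibRational (3 / 4) == "(/ 3.0 4.0)"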

de Bruijn -> named needs to process annotations at the same time as the rest of the AST

This issue arises from trying to get the following to type check:

polyId : forall t. t -> t
polyId = \{t} -> \x -> x

Vehicle currently crashes with Error: DeBruijn index 0 greater than current context size 0 (Line 3, Columns 20-21) (perhaps with different line numbers). This error is thrown by the conversion from de Bruijn back to concrete named representation.

Currently the de Bruijn to concrete names conversion works in two stages:

  1. de Bruijn indices in the AST are converted to Names, which either copies over the user's original name, or uses Machine for machine generated binders
  2. Then a second pass converts all Names to Symbols (so they can be printed) and simultaneously converts all the annotations (which contain the types of each sub term) from de Bruijn to names (and then symbols).

The problem is that when the second pass converts annotations from de Bruijn to Names, it operates on each annotation in isolation without taking into account the binding context. The variable occurrence x at the end is annotated by the type checker with the type TVar 0, where the de Bruijn index 0 makes sense in the context of the binder {t}. However, the conversion of the annotation TVar 0 does not know about this context, yielding the error above.

The solution therefore is to convert de Bruijn to Names in the main AST and in the annotations in the same pass.

However, this violates the current setup of the DataflowT monad transformer. Processing of type and expression arguments (sorts 'TARG 'EARG) is assumed to only result in bindings being added to the context. But arguments are annotated with their kind/type, so they need converting from de Bruijn to Names. To do this, processing needs access to the current context which is not allowed by the current definition of DataflowT.

Two solutions come to mind:

  1. Do not treat arguments as a special kind, so they get handled by their surrounding AST node.
  2. Tweak DataflowT to be more permissive, and use a state monad for arguments.
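A toy illustration of the single-pass idea on a miniature AST (not Vehicle's): the binding context is threaded through sub-terms and their annotations together, so an annotation like TVar 0 is resolved against the same context as the node it annotates.

-- Every node carries its type as an annotation, which is itself an expression.
data Expr
  = Star                  -- base sort, so that types bottom out somewhere
  | Var  Expr Int         -- annotation, de Bruijn index
  | Lam  Expr String Expr -- annotation, binder name hint, body
  | App  Expr Expr Expr   -- annotation, function, argument

data NamedExpr
  = NStar
  | NVar NamedExpr String
  | NLam NamedExpr String NamedExpr
  | NApp NamedExpr NamedExpr NamedExpr

-- The context lists binder names, innermost first. A variable's annotation is
-- converted with exactly the context in force at that occurrence; converting
-- it with a smaller context is what produces the "index greater than context
-- size" crash above.
toNamed :: [String] -> Expr -> NamedExpr
toNamed _   Star          = NStar
toNamed ctx (Var ann i)   = NVar (toNamed ctx ann) (ctx !! i)
toNamed ctx (Lam ann x b) = NLam (toNamed ctx ann) x (toNamed (x : ctx) b)
toNamed ctx (App ann f e) = NApp (toNamed ctx ann) (toNamed ctx f) (toNamed ctx e)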

Generalize the getter type classes using a trick from WGP'14

We can generalize our getter type classes using the following code:

{-# LANGUAGE ConstraintKinds       #-}
{-# LANGUAGE DataKinds             #-}
{-# LANGUAGE FlexibleContexts      #-}
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables   #-}
{-# LANGUAGE TypeFamilies          #-}

import Data.Kind  (Type)
import Data.Proxy (Proxy (..))

-- A path to a component inside a nested pair structure.
-- (GoLeft/GoRight avoid a name clash with Prelude's Left/Right.)
data Pos = Here | GoLeft Pos | GoRight Pos

-- Type-level search: at which position, if any, does 'a' occur inside 'b'?
type family Elem (a :: Type) (b :: Type) :: Maybe Pos where
  Elem a a      = 'Just 'Here
  Elem a (l, r) = Choose (Elem a l) (Elem a r)
  Elem a b      = 'Nothing

-- Prefer the left branch, and record which branch the hit came from.
type family Choose (l :: Maybe Pos) (r :: Maybe Pos) :: Maybe Pos where
  Choose ('Just x) _         = 'Just ('GoLeft x)
  Choose _         ('Just y) = 'Just ('GoRight y)
  Choose _         _         = 'Nothing

-- Extract a value of type 'a' from a value of type 'b', guided by the position.
class Get (mp :: Maybe Pos) (a :: Type) (b :: Type) where
  get :: Proxy mp -> b -> a

instance Get ('Just 'Here) a a where
  get _p a = a

instance Get ('Just p) a b => Get ('Just ('GoLeft p)) a (b, c) where
  get _p (b, _c) = get (Proxy :: Proxy ('Just p)) b

instance Get ('Just p) a c => Get ('Just ('GoRight p)) a (b, c) where
  get _p (_b, c) = get (Proxy :: Proxy ('Just p)) c

type Has a b = Get (Elem a b) a b

This should let us define classes such as HasProvenance in a way that doesn't restrict them to being strictly left or right biased, and which happens to also generalize over the type that we are getting.

Some adaptation may be required to let this search through the AST.
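For example, assuming the definitions and extensions above, extracting an Int from a nested pair needs no hand-written instance:

getInt :: (Bool, (Char, Int)) -> Int
getInt = get (Proxy :: Proxy (Elem Int (Bool, (Char, Int))))

-- getInt (True, ('a', 42)) == 42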

The trick is taken from a paper presented at WGP'14.

Implementation of property level quantifiers

Thrashed out with @wenkokke this morning.

  • Use keywords every and some to avoid the type-level forall
  • Two types of quantifiers
    1. Quantification over types: every x . intruderFarAway x => advisesClearOfConflict x desugars to primEvery (\x -> intruderFarAway x => advisesClearOfConflict x)
    • Will need to do some domain analysis when generating loss functions to avoid naively implementing rejection sampling
    2. Quantification over containers: every x in theDataset . robustAround x desugars to fold and true (map robustAround dataset) (see the sketch below)
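A toy sketch of the container case over plain Haskell lists (Vehicle's containers and fold are not Haskell lists; this only shows the shape of the desugaring):

everyIn :: [a] -> (a -> Bool) -> Bool
everyIn xs p = foldr (&&) True (map p xs)

someIn :: [a] -> (a -> Bool) -> Bool
someIn xs p = foldr (||) False (map p xs)

-- every x in theDataset . robustAround x  ~~>  everyIn theDataset robustAround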

Implement "check" mode

The command-line mode check DATABASE_FILEPATH PROPERTY_UUID already exists. Therefore we need to:

  • read in the provided database file
  • look up the property from the UUID and look up any networks it involves.
  • check the stored hash of the network model against its current hash (sketched below)
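A hypothetical sketch of that check (the record types and hash function are illustrative stand-ins, not Vehicle's actual database schema):

import qualified Data.ByteString as BS

data NetworkEntry = NetworkEntry
  { networkPath :: FilePath
  , storedHash  :: Int          -- hash recorded when the property was verified
  }

data PropertyEntry = PropertyEntry
  { propertyUUID :: String
  , networksUsed :: [NetworkEntry]
  }

-- Stand-in for a real cryptographic hash of the network file.
hashNetworkFile :: FilePath -> IO Int
hashNetworkFile path =
  BS.foldl' (\h b -> h * 33 + fromIntegral b) 5381 <$> BS.readFile path

-- The stored result is only trusted if every network the property used is
-- unchanged on disk.
checkProperty :: PropertyEntry -> IO Bool
checkProperty prop = and <$> mapM unchanged (networksUsed prop)
  where
    unchanged net = (== storedHash net) <$> hashNetworkFile (networkPath net)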

Extend normalization with an allow/block list

Different backends may or may not support different operators, so it would be a good idea to extend the normalization algorithm to optionally normalize/not normalize certain built-ins (or even defined functions).
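A minimal sketch of the interface (the Builtin constructors here are illustrative, not Vehicle's actual set): the normaliser takes a predicate saying which built-ins must be expanded away for the target backend.

data Builtin = BMap | BFold | BImplies | BNot
  deriving (Eq, Show)

newtype NormOptions = NormOptions
  { shouldNormalise :: Builtin -> Bool  -- True = expand this built-in away
  }

-- A hypothetical backend that supports negation natively but nothing else
-- on the list.
exampleBackendOptions :: NormOptions
exampleBackendOptions = NormOptions (\b -> b /= BNot)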

BNFC printing bug with layout braces

Switching back to the BNFC printing code has revealed that BNFC does not distinguish between layout braces and non-layout braces. Printing anything with implicit arguments results in the following mess:

(Tensor Real) ((([((5 {
                      ?2
                    }
                    ) {{ ?3 }})] {
                      ?5
                    }
                    ) {
                      ?4
                    }
                    ) {{ ?6 }});

instead of

(Tensor Real) ((([((5 {?2}) {{ ?3 }})] {?5}) {?4}) {{ ?6 }});

TODO: submit an upstream issue to BNFC

Document the general design principles behind Vehicle

We should document:

  • the naming schemes behind short variable names;
  • design decisions such as "everything that doesn't influence the semantics of an expression should be in the generic ann field";
  • programming guidelines such as "don't pattern match on Binder or Ann, rather use their accessors";
  • etc.

Let bindings don't type check

When running the andGate example we end up type-checking the type of the let binding (initially a hole) in infer mode, and therefore we error out with:

Error: the type of '_' could not be resolved (Line 16, Columns 8-9)

Improve normalisation of arithmetic

Currently we only normalise things like 2 + 3 but not (2 + x) + 3.

@wenkokke has found some work which she reckons might be relevant.

First off, there's this:
https://gallais.github.io/pdf/icfp13.pdf

Secondly, there's the extended abstract by Conor and Fred for their language feet: adapters.pdf

The feet project is where Fred and I muck around with this stuff, and there are loads of goodies in the pipe: free abelian groups, categories, presheaves and profunctors! Once the penny drops that walking open terms can be designed totally differently from running closed terms, fabulous possibilities emerge. You can have as much type information as you could wish for. You can do algebra as well as arithmetic. You can sneakily impose a total ordering on the terms of any type.
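Independently of the heavier machinery above, a toy constant-folding pass already covers the motivating example: flatten nested additions, sum the literal parts, and rebuild, so that (2 + x) + 3 becomes 5 + x. A sketch over a tiny expression type, not Vehicle's AST:

data Expr
  = Lit Integer
  | Var String
  | Add Expr Expr
  deriving (Eq, Show)

normAdd :: Expr -> Expr
normAdd e = rebuild (collect e)
  where
    -- Sum of all literal summands, plus the non-literal summands in order.
    collect :: Expr -> (Integer, [Expr])
    collect (Lit n)   = (n, [])
    collect (Var x)   = (0, [Var x])
    collect (Add a b) = let (m, xs) = collect a
                            (n, ys) = collect b
                        in  (m + n, xs ++ ys)

    rebuild (c, [])     = Lit c
    rebuild (0, v : vs) = foldl Add v vs
    rebuild (c, vs)     = foldl Add (Lit c) vs

-- normAdd (Add (Add (Lit 2) (Var "x")) (Lit 3)) == Add (Lit 5) (Var "x")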

Scope checking bug, causing type checking to fail

The following vehicle code should type check:

network A : Type 0
network B : A -> Type 0

network fn : forall {x : A}. B x -> B x

However, it fails with:

Error: Could not solve the constraint: A ~ (B (i1)) (Line 4, Columns 39-40)
Fix: Check your types

Checking the debugging output, it seems that something is going wrong in the names to de Bruijn conversion. The translation of the final line to core syntax with de Bruijn is:

(declare-network fn : (pi {x :type A} (pi (_ :type (B (i0))) (B (i0))))) 

but it ought to be:

(declare-network fn : (pi {x :type A} (pi (_ :type (B (i0))) (B (i1))))) 

(the last variable reference on the line ought to be i1, not i0, which refers to the B x argument).

Export AST to JSON

As discussed, to export to JSON we will use the Aeson Haskell library. We don't want to export the full Vehicle AST as 1) it contains lots of redundant information (annotations etc.) and 2) many of the built-in operations (map/folds etc.) will have been normalised out and therefore will never be exported to JSON.

Therefore the first step is to define a new cut-down AST for expressions, containing only the information and operations we want to translate to JSON, in a new file Vehicle.Compile.Backend.JSON, and to define a translation to it from our core AST found in Vehicle.Language.AST.Core.

For starters, @ndslusarz, the JSON expression type JExpr should be very basic. Let's include:

  • App
  • Var
  • Literal
  • BooleanOp2 builtins (see Vehicle.Language.AST.Builtin)
  • Order builtins

When trying to translate anything that is not in that list, the translation function of type CheckedExpr -> JExpr should throw a developerError if it is a type-related construct (Pi, Meta, Hole) or something that should have been normalised out (e.g. Ann, list operations, etc.), and otherwise throw a sensible user error saying that the operation is not yet supported by the JSON backend.
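A minimal sketch of such a cut-down type and its Aeson instance (JExpr is the name from this issue; the constructor set and field choices are illustrative only, not the final design):

{-# LANGUAGE DeriveGeneric #-}

import Data.Aeson   (ToJSON (..), defaultOptions, genericToEncoding)
import GHC.Generics (Generic)

data JExpr
  = App JExpr [JExpr]
  | Var String
  | LitBool Bool
  | LitNum Double
  | BoolOp2 String JExpr JExpr  -- "and", "or", "=>", ...
  | Order   String JExpr JExpr  -- "<=", "<", ">=", ">"
  deriving (Generic, Show)

instance ToJSON JExpr where
  toEncoding = genericToEncoding defaultOptions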

Allow chaining of boolean operators

e.g. allow the user to write 2 <= x < y < 3.

  • Adjust precedence of RHS of comparison operator grammar rules, change Expr8 ::= Expr9 TokLe Expr9; to Expr8 ::= Expr9 TokLe Expr8;
  • Decide how we're going to fold and unfold the and operator in an invertible way (a toy unfolding sketch follows below).
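A toy sketch of the unfolding direction (the Expr type here is illustrative, not Vehicle's AST): a chain e0 op1 e1 op2 e2 ... becomes (e0 op1 e1) and (e1 op2 e2) and ...

data Expr
  = Var String
  | Lit Rational
  | Le  Expr Expr
  | Lt  Expr Expr
  | And Expr Expr
  deriving Show

unfoldChain :: Expr -> [(Expr -> Expr -> Expr, Expr)] -> Expr
unfoldChain _  []              = error "a chain needs at least one comparison"
unfoldChain e0 [(op, e1)]      = op e0 e1
unfoldChain e0 ((op, e1) : os) = And (op e0 e1) (unfoldChain e1 os)

-- 2 <= x < y < 3 unfolds as:
-- unfoldChain (Lit 2) [(Le, Var "x"), (Lt, Var "y"), (Lt, Lit 3)]
--   == And (Le (Lit 2) (Var "x"))
--          (And (Lt (Var "x") (Var "y")) (Lt (Var "y") (Lit 3)))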

Work out what we're doing with type classes

Currently we just stick the PrimDict stuff in there and run away, never using it. I came across this when compiling literals to Agda. For example, even if we have a Nat literal, if it has a rational type then we need to resolve it and convert it to a rational. At the moment I've hacked it into the Agda compilation stage, but we should really do something principled, probably involving inserting conversion functions.

Implement "verify" mode

Need to:

  • add command line "verify" mode
  • create a database file if it doesn't already exist
  • assign every property a unique ID and hash
  • assign every network a unique ID and hash
  • compile to chosen verifier
  • call chosen verifier
  • store result in database
