vmware / differential-datalog

DDlog is a programming language for incremental computation. It is well suited for writing programs that continuously update their output in response to input changes. A DDlog programmer does not write incremental algorithms; instead they specify the desired input-output mapping in a declarative manner.

License: MIT License

Haskell 24.65% Makefile 0.06% Rust 27.00% Shell 1.43% Python 2.53% C 2.88% Java 37.86% Yacc 0.91% HTML 0.15% Dockerfile 0.05% Go 1.42% Nix 0.07% JavaScript 0.01% CSS 0.03% TypeScript 0.88% Vim Script 0.06%

datalog ddlog incremental programming-language rust

differential-datalog's Issues

differential-dataflow optimizations

use distinct_total instead of distinct where possible.
don't convert external collections to Variables in the nested scope (variables use a lot of memory)

TODOs

Basic Datalog (Leonid)

FTL (Mihai)

Generate Datalog schema from JSON
Generate input and output adapters
Adapters for external C functions invoked from FTL
Test workload

Future improvements

Debugging facility
CLI interface to ddlog
RPC interface to ddlog

Cannot have antijoins with wildcards

Also, the error message is not very good.

error: file.dl:211:234-211:235: Argument _method must be specified in this context
 
RBasic_MethodLookup(_simplename, _descriptor, _type, _method) :- RDirectSuperclass(_type, _supertype), RBasic_MethodLookup(_simplename, _descriptor, _supertype, _method), not RBasic_MethodImplemented(_simplename, _descriptor, _type, _).

No symbol table?

Currently one can have a type that depends on an undefined identifier

typedef t = x

Allow different-width arguments of << and >>

... so that we can write x << 5 vs x << 32'd5
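
For comparison, Rust (the language DDlog compiles to) already lets the shift amount have a different integer type from the value being shifted. A minimal sketch of that behavior; the function name shl_mixed is made up for illustration and is not part of DDlog:

```rust
// Illustration only: Rust's `<<` accepts a right operand whose integer type
// differs from the left operand's, which is the behavior requested for DDlog.
fn shl_mixed(x: u32, amount: u8) -> u32 {
    // A 32-bit value shifted by an 8-bit amount; no explicit cast is needed.
    x << amount
}

fn main() {
    let x: u32 = 1;
    println!("{}", shl_mixed(x, 5));
}
```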

OVN integration TODOs

TODO list for DDlog/OVN integration:

Escape sequences in interpolated strings

One should be able to escape symbols such as {, }, $, | in an interpolated string.

Add a built-in method to convert a type to a string

Also, there should be a way for users to specify the conversion to strings for user-defined types.
This will probably require string concatenation at the very least.

add support for constants

At the moment, constants can be simulated by 0-arity functions. The main downside is the need to use parentheses every time a constant is referenced. Also, function names start with lower-case letters, whereas it is customary to capitalize constant names.

Change integer literal syntax

Use conventional C syntax instead of Verilog syntax for integer literals

Statically evaluate constant expressions

E.g., parts of Compile.hs assume this

Implement bit slice logic in Compile.hs

This is a little tricky, as a bit vector is represented differently (BitUint or uXX) depending on its width. Bit slicing may involve conversion between these representations.
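
A rough sketch of what slicing could look like for vectors that fit in a machine word. The helper names bit_slice_u64 and byte_slice are hypothetical, and the BitUint case (wide vectors) is not covered here:

```rust
// Hypothetical helper: extract bits hi..=lo (inclusive) of a value stored in a u64.
fn bit_slice_u64(x: u64, hi: u32, lo: u32) -> u64 {
    assert!(lo <= hi && hi < 64, "slice bounds out of range");
    let width = hi - lo + 1;
    // A full-width mask is handled separately because `1u64 << 64` overflows.
    let mask = if width == 64 { u64::MAX } else { (1u64 << width) - 1 };
    (x >> lo) & mask
}

// Conversion between representations: an 8-bit slice of a 64-bit value
// always fits in a u8, so the result can be narrowed without loss.
fn byte_slice(x: u64, lo: u32) -> u8 {
    bit_slice_u64(x, lo + 7, lo) as u8
}

fn main() {
    println!("{:#x}", bit_slice_u64(0xABCD, 7, 4));
    println!("{:#x}", byte_slice(0xABCD, 8));
}
```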

Tutorial section on FTL

Type inference using unification

Implement a proper type inference algorithm based on unification. The current ad hoc implementation is hacky and fragile. No idea why I did not do the right thing straight away...

Automatic string conversion for type variables

At the moment, the compiler refuses to convert type variables to strings.

This can be fixed by:

Inferring that a function expects a type variable to be printable
In the generated Rust function, adding a Display trait bound on that type variable
Whenever the function is called, checking that its concrete arguments satisfy the trait
Implementing the Display trait for types that have a user-defined 2String method
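
The steps above could translate into Rust along these lines. This is a sketch under assumptions: describe and Port are made-up names, and the code DDlog would actually generate may look different.

```rust
use std::fmt::{self, Display};

// Hypothetical generated function: the type variable `T` carries a `Display`
// bound because the body converts its argument to a string.
fn describe<T: Display>(x: &T) -> String {
    format!("value: {}", x)
}

// Hypothetical user-defined type. Its `Display` impl plays the role of the
// user-supplied string conversion.
struct Port {
    id: u32,
}

impl Display for Port {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "port#{}", self.id)
    }
}

fn main() {
    // The Rust compiler checks the trait bound at every call site.
    println!("{}", describe(&42));
    println!("{}", describe(&Port { id: 7 }));
}
```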

Optimizations

string interning
use arrangements for relations that are used as the first atom in multiple rules (a special case of common subexpression elimination)
use Box or Arc type for Value to save memory
avoid redundant cloning in the generated code
use per-key distinct when possible
extract "by-self" arrangement from distinct
32-bit weights
use unreachable() instead of panic!() in Compile.hs (see TODO in Compile.hs)

https://github.com/frankmcsherry/differential-dataflow/issues/113

Speed up compilation, reduce disk footprint of multiple datalog programs

Currently every test in tests/datalog_tests downloads and compiles its own Rust dependencies, as well as makes its own copy of differential_dataflow, taking a couple of gigabytes of space per test. There must be a way to share common parts across all tests.

Mechanism to track provenance of output records

Complete the language reference document

Add examples
Add semantic constraints
Ref and & syntax
Standard library reference

fix goldenVsFiles

goldenVsFiles behaves funny when more than one file has changed: it first complains about the first file only. Deleting the corresponding .expect file causes it to report that a new golden file has been generated (without an error). Finally, the third run reports a mismatch in the second golden file.

Support for namespaces

Seems important for very large Datalog programs

DDlog lints

Detect unused variables in rules and functions
Detect unused intermediate relations (i.e., relations that are neither output nor used to compute other relations).
Grouping by complex variables or variables wrapped in Ref<> is inefficient. Ideally, one should group by one or several small identifiers.
???

differential-dataflow doesn't compile for compiler tests

test.log
The stack test log is attached. This might be related to #84; it might be fixed by tracking down which git versions of timely and abomonation worked and tracking them in rust/differential_datalog/Cargo.toml.

Rules that express all constants are not allowed

RHeapAllocation_Type(_heap, _type) :- var _heap = "<<main method array>>", var _type = "java.lang.String[]".

Installation procedure

Command to install differential datalog binary.

String interpolation

There should be a way to perform string interpolation.
https://en.wikipedia.org/wiki/String_interpolation
I suggest the C# syntax:
$"The value between braces ${expression + other} is treated like an expression"

`return` statement

to structure complex control flow

Syntax improvements

Rename int -> bigint
rename ground relation -> input relation
add return statement

Incomplete implementation of assignments in Compile.hs

We currently don't have a way to compile arbitrary l-values to Rust, e.g.,

Constructor{var x, var y} = z

is ok, but

Constructor{x, y} = z

where x and y are previously introduced variables is illegal.

Either implement the logic to support this translation (e.g., by assigning to intermediate variables) or modify validation logic to disallow such programs
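
One way the first option (assigning through intermediate variables) might look in Rust. The struct and function below are hypothetical stand-ins, not code emitted by Compile.hs:

```rust
// Hypothetical type standing in for a DDlog constructor with two fields.
struct Constructor {
    x: u32,
    y: String,
}

// `x` and `y` are previously introduced variables, passed here by mutable reference.
fn assign_fields(z: Constructor, x: &mut u32, y: &mut String) {
    // Step 1: destructure into fresh intermediate variables.
    let Constructor { x: tmp_x, y: tmp_y } = z;
    // Step 2: assign the intermediates to the existing variables.
    *x = tmp_x;
    *y = tmp_y;
}

fn main() {
    let mut x = 0u32;
    let mut y = String::from("old");
    assign_fields(Constructor { x: 5, y: "new".to_string() }, &mut x, &mut y);
    println!("{} {}", x, y);
}
```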

Test and document parsec's string parser

We rely on parsec's standard parser for strings, which supports unicode and escaping.

TODO:

check and document its exact functionality
can we handle all OVSDB strings?

mpsc interface to ValMap

ValMap is a data structure that stores the content of output tables. It is updated by each change callback invoked by the differential inspect() operator. ValMap is protected by a mutex, potentially introducing contention in workloads where outputs are frequently updated. One possible solution is to maintain ValMap in a separate thread and use an mpsc queue to communicate with this thread from workers. We must be careful, though, to flush the queue before transaction_commit() returns. Another (even better?) option is to keep a ValMap per worker and only merge them when the client requests a copy of an output table.
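
A minimal sketch of the first option, using only the Rust standard library. The Table type, the Msg enum, and the functions spawn_valmap and flush are all assumptions made for illustration; the real ValMap and record types differ. Flushing is modeled as a message carrying a reply channel, so the caller blocks until every earlier message has been applied.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::sync::mpsc::{channel, Sender};
use std::thread;

// Hypothetical stand-in for ValMap: relation name -> set of records (i64 here).
type Table = BTreeMap<String, BTreeSet<i64>>;

enum Msg {
    // A change reported by a worker: insert or delete one record.
    Update { relation: String, record: i64, insert: bool },
    // Flush request: the owner replies once all earlier messages are applied.
    Flush(Sender<Table>),
}

// Spawns the thread that owns the map; workers keep clones of the returned sender.
fn spawn_valmap() -> Sender<Msg> {
    let (tx, rx) = channel::<Msg>();
    thread::spawn(move || {
        let mut map = Table::new();
        // The loop ends when every sender has been dropped.
        for msg in rx {
            match msg {
                Msg::Update { relation, record, insert } => {
                    let set = map.entry(relation).or_default();
                    if insert {
                        set.insert(record);
                    } else {
                        set.remove(&record);
                    }
                }
                Msg::Flush(reply) => {
                    let _ = reply.send(map.clone());
                }
            }
        }
    });
    tx
}

// Would be called before a commit returns: blocks until the queue has drained
// up to this point, then returns a snapshot of the tables.
fn flush(tx: &Sender<Msg>) -> Table {
    let (reply_tx, reply_rx) = channel();
    tx.send(Msg::Flush(reply_tx)).expect("ValMap thread stopped");
    reply_rx.recv().expect("ValMap thread stopped")
}

fn main() {
    let tx = spawn_valmap();
    let worker_tx = tx.clone();
    let worker = thread::spawn(move || {
        for record in 0..3 {
            let msg = Msg::Update { relation: "Out".to_string(), record, insert: true };
            worker_tx.send(msg).unwrap();
        }
    });
    // Wait for the worker so its updates are queued before the flush request.
    worker.join().unwrap();
    println!("{:?}", flush(&tx));
}
```

No mutex is needed because only the owner thread touches the map. The cost is that every read of an output table goes through the queue.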

Allow the use of constants in patterns

syntax to select tuple fields by number

x == (x,y).0
y == (x,y).1
z == (x,y,z).2
...
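
Rust tuples already use this notation, so the proposed syntax would map directly onto the generated code. A small illustration with made-up function names:

```rust
// Illustration only: positional field access on tuples in Rust.
fn first_of_pair(p: (u32, &str)) -> u32 {
    p.0
}

fn third_of_triple(t: (u32, &str, bool)) -> bool {
    t.2
}

fn main() {
    let t = (7u32, "seven", true);
    println!("{} {} {}", t.0, t.1, t.2);
    println!("{}", first_of_pair((7, "seven")));
    println!("{}", third_of_triple(t));
}
```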

indentation for statements

statements are printed without any indentation

Support underscores in integer literals

0x123456_abcd_abcd_abcd. Allow this in:

command language
DDlog

@blp
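
A sketch of how a literal with underscores could be parsed, written in Rust for illustration (the DDlog parser itself is written in Haskell). The helper parse_hex_literal is hypothetical and handles only the 0x form; u128 is used because the example literal above is wider than 64 bits.

```rust
// Hypothetical helper: parse a hexadecimal literal such as "0x1234_abcd",
// ignoring any underscores between digits.
fn parse_hex_literal(lit: &str) -> Option<u128> {
    // Literals without the "0x" prefix are rejected.
    let digits = lit.strip_prefix("0x")?;
    // Drop the underscores, then parse what remains as base 16.
    let cleaned: String = digits.chars().filter(|c| *c != '_').collect();
    u128::from_str_radix(&cleaned, 16).ok()
}

fn main() {
    println!("{:?}", parse_hex_literal("0x123456_abcd_abcd_abcd"));
    println!("{:?}", parse_hex_literal("123"));
}
```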

Should strings be allowed to contain newlines?

E.g., start on a line and end on another one.
Also, if that is allowed, can the newline be escaped?

Fix string interpolation syntax

Replace {} with ${}.

foreign keys for more compact FTL

Ben's version of FTL allows traversing cross-table pointers, e.g.:

let O = lspip.lsp.dhcpv6_options.option_args

To support this without using nested for loops, we'd need to add some notion of foreign keys and possibly other SQL constraints to the language

Generate a meaningful error message when type unification fails

An error message that looks like this is generated whenever type unification fails:

Expression parameterized(string, int) has unknown type in CtxTop

Set up CI

Should we set up travis CI?

Get rid of `let` syntax

We use let in FTL, and var in the rest of the language. There is no reason for these two forms of variable declaration.

extern function syntax

use extern keyword to make it explicit that a function is defined outside of Datalog

Allow access to enum fields

The Rust backend currently does not allow access to fields of types with multiple constructors (i.e., types compiled to enums), even if all constructors of a type have the same field
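
One way the backend could support this is to generate an accessor that matches on every constructor. A sketch with a hypothetical type; Endpoint and its fields are invented for illustration:

```rust
// Hypothetical type with two constructors that both have the field `addr`.
#[allow(dead_code)]
enum Endpoint {
    V4 { addr: String, prefix: u8 },
    V6 { addr: String, scope: u32 },
}

impl Endpoint {
    // Accessor that could be generated when all constructors share the field.
    fn addr(&self) -> &str {
        match self {
            Endpoint::V4 { addr, .. } | Endpoint::V6 { addr, .. } => addr.as_str(),
        }
    }
}

fn main() {
    let e = Endpoint::V4 { addr: "10.0.0.1".to_string(), prefix: 24 };
    println!("{}", e.addr());
}
```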

The Rust code generated for split does not compile

The unmerged tutorial has an example which does not seem to generate correct Rust code.
This is the dl code:

extern function split(s: string, sep: string): set<string>
function split_ip_list(x: string): set<string> =
   split(x, " ")

The generated code does not compile:

fn split_ip_list(x: &String) -> set<String> {
    split(x, (&r###" "###.to_string()))
}

since there is no set type in Rust.
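
For reference, a version that does compile, assuming set<string> is mapped to std::collections::BTreeSet<String>. That mapping and the hand-written body of split are assumptions; the compiler may settle on a different set type.

```rust
use std::collections::BTreeSet;

// Assumed hand-written implementation of the extern function `split`.
fn split(s: &String, sep: &String) -> BTreeSet<String> {
    s.split(sep.as_str()).map(|part| part.to_string()).collect()
}

// What the generated function could look like once `set<string>` is
// translated to a concrete Rust type.
fn split_ip_list(x: &String) -> BTreeSet<String> {
    split(x, &" ".to_string())
}

fn main() {
    let ips = split_ip_list(&"10.0.0.1 10.0.0.2 10.0.0.1".to_string());
    println!("{:?}", ips);
}
```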

Recursive type definitions

This is allowed by the parser

typedef t = (t, t)

Test calling DDlog from Java

It should be possible to load a compiled DDlog program and execute transactions from Java.

This requires a couple of preparatory steps:

DDlog currently generates static libraries. We have to change the crate type to build .so instead.
DDlog still does not have a complete C API.

Background:

The generated library file is written to the target/release directory, e.g., tests/datalog_tests/path/target/release/libpath.a.
Associated header file is tests/datalog_tests/path/ddlog.h