
ergo's Introduction


Design Goals

Ergo brings first-order logic to the .NET world through a lightweight and extensible Prolog implementation written entirely in C#. It is a relatively young project, so it's neither ISO-compliant nor stable, but it's been consistently improving over the past few years.

Its main design goals are to be flexible and customizable, to handle interop with C# seamlessly, and to be efficient enough to be worthwhile as a scripting language in high-demand applications such as games. Thanks to its versatile syntax and extensible architecture, Ergo can be adapted to any use case and lends itself well to the creation of domain-specific languages. Unification allows for very complex pattern-matching, and users can even implement their own parsers for their own abstract types that override standard unification, or add their own built-ins.

Ergo already supports several advanced features, including:

  • Compilation (Ergo targets a VM -- the ErgoVM)
  • Optimization (Custom built-ins can be further optimized at compile time)
  • Libraries (C# entry points for various Ergo extensions; linked to Ergo modules)
  • Hooks (compiled queries that call specific predicates, one-way bridge between C# and Ergo)
  • Tail Call Optimization (for the execution of tail recursive predicates)
  • Predicate Expansions (macros/term rewriting)
  • Dynamic Predicates (assert/retract)
  • Lambdas & Higher-Kinded Predicates
  • "Virtual" Predicates (which execute custom VM instructions, a quick and dirty alternative to BuiltIns)
  • Tabling (memoization)
  • Abstract Terms & Abstract Term Parsers (for custom types implemented on top of canonical terms)
    • Dictionaries (akin to SWI-Prolog)
    • Ordered Sets
    • Lists
    • Tuples (comma-lists)
  • Marshalling of CLR objects, delegates and events to/from Ergo terms (both complex-style and dictionary-style)
  • Unbounded Numeric Types (i.e. BigDecimal as the underlying numeric type)
    • In the future, optimizations for integer and float arithmetic could be added, but performance-critical codepaths can be delegated to C#
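
To give a feel for the syntax, here is a small hypothetical snippet touching a few of these features (the lambda notation follows the {Free}/[Args]>>Goal form discussed later in this document; the exact spelling of the dynamic and table directives, and the availability of a maplist-style built-in, are assumptions):

% Dynamic predicates can be asserted and retracted at runtime.
:- dynamic(counter/1).
counter(0).

% Tabling memoizes answers, turning the naive fibonacci into a linear one.
:- table(fib/2).
fib(0, 0).
fib(1, 1).
fib(N, F) :- N > 1, A is N - 1, B is N - 2, fib(A, FA), fib(B, FB), F is FA + FB.

% Lambdas: double every element of a list.
double_all(Xs, Ys) :- maplist([X, Y]>>(Y is X * 2), Xs, Ys).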

Example: Shell

Setting up an Ergo shell is easy:

// The "Standard" Ergo Facade is the recommended pre-configured default environment.
// You can extend it, modify it, or start from an empty facade.
var facade = ErgoFacade.Standard;
// You can use the Ergo VM directly or through the Ergo Shell, which provides a
// bunch of useful commands to interact with knowledge bases and module files.
// The Shell manages its own interpreter scope, knowledge base, and VM(s).
var shell = facade.BuildShell();
await foreach (var _ in shell.Repl())
{
    ;
}

With it, you can start experimenting in seconds! Refer to the wiki for more details.

Example: Marshalling

This example showcases several advanced features and demonstrates how to bridge Ergo with C# and vice-versa.

using Ergo.Facade;
using Ergo.Interpreter;
using Ergo.Interpreter.Libraries;
using Ergo.Lang.Compiler;

// The main way to customize Ergo is to add custom Libraries to your environment.
// Libraries are where you export and scope your BuiltIns and Directives.
var facade = ErgoFacade.Standard;
//   .AddLibrary(() => new MyLibrary());

// The main component is the ErgoInterpreter, which reads source files into a module tree.
var interpreter = facade.BuildInterpreter(InterpreterFlags.Default);
// The "scope" that it creates represents the state of the module tree, which is modified at load time by directives.
var interpreterScope = interpreter.CreateScope();
// From the InterpreterScope we can then create a KnowledgeBase containing all the predicates that were loaded, 
// as well as any that we want to define directly in C#. These are then compiled into ErgoVM.Ops which run on the VM.
var kb = interpreterScope.BuildKnowledgeBase(CompilerFlags.Default, beforeCompile: kb =>
{
    // Predicates can be defined from compiled code (Ops), making them a quick and dirty alternative to BuiltIns.
    // Note that BuiltIns support optimizations, while virtual predicates don't. Regardless, they both allow Ergo
    // code to call arbitrary C# code, acting as a one-way bridge between the two languages. Hooks are the opposite.
    if (interpreterScope.Parse<Complex>("message_sent_delegate(M)")
        .TryGetValue(out var head))
        kb.AssertA(Predicate.Virtual(WellKnown.Modules.User, head, vm =>
        {
            BuiltInNode.SetArgs(head.Arguments)(vm);
            // ITerm.IsClr() can be used to extract raw C# objects contained within atoms.
            // It is not idiomatic to use non-primitive types in such a way, but it is allowed.
            if (vm.Arg(0).IsClr(out string value)) // in this case, we know that our atom will wrap a string, which is a primitive in Ergo
                Console.WriteLine($"D: {value}");
        }));
    if (interpreterScope.Parse<Complex>("message_sent_event(M)")
        .TryGetValue(out head))
        kb.AssertA(Predicate.Virtual(WellKnown.Modules.User, head, vm =>
        {
            BuiltInNode.SetArgs(head.Arguments)(vm);
            // ITerm.Match() is much more powerful because it uses the TermMarshall behind the scenes, which allows for complex pattern matching,
            // as well as custom serialization of C# objects. This is the idiomatic way to share data between the two languages.
            if (vm.Arg(0).Match(out string value)) // in this case, no complex pattern matching is necessary so it is equivalent to calling ITerm.IsClr()
                Console.WriteLine($"E: {value}");
        }));
});
// With a compiled KnowledgeBase, we can now create the VM and run queries on it! 
var vm = facade.BuildVM(kb);
// Ergo supports advanced marshalling to and from C#. Hooks provide a complete example that showcases how
// objects, delegates and events can be marshalled easily. Hooks are essentially precompiled queries, and
// they can be used to call arbitrary Ergo code from C#, acting as a one-way bridge between the two languages.
var eventTest = new EventTest();
var sendMessage = Hook.MarshallDelegate(eventTest.SendMessage, new Atom("message_sent_delegate"), WellKnown.Modules.User, insertAfterCall: true)(vm);
var messageEvent = typeof(EventTest).GetEvent(nameof(EventTest.MessageSent));
using (var hook = Hook.MarshallEvent(messageEvent, eventTest, new Atom("message_sent_event"), WellKnown.Modules.User)(vm))
{
    // these will call message_sent_event/1 as well as message_sent_delegate/1
    sendMessage("hello,");
    sendMessage("world!");
}
// this will only call message_sent_delegate/1
sendMessage("don't read this!");
// Expected output:
// E: hello,
// D: hello,
// E: world!
// D: world!
// D: don't read this!

sealed class EventTest
{
    // When this event fires, a hook will call message_sent_event/1 automatically,
    // by marshalling the string argument into a term, in this case an Atom(string).
    public event Action<string> MessageSent = _ => { };
    // When the combined delegate `sendMessage` is called, a hook will call message_sent_delegate/1 automatically,
    // same as for MessageSent, except that in the case of delegates it works via composition instead of subscription.
    public void SendMessage(string msg)
    {
        MessageSent?.Invoke(msg);
    }
    // When either (virtual) predicate is called, the control will pass to its execution graph, which we implemented as a custom Op.
    // This Op will need to handle the VM's arguments and it is at that point that it can marshall them back into CLR objects.
}

Roadmap

At the time of writing, Ergo is a fully interpreted, partially compiled toy language with much room for optimization.

For a rough roadmap, please refer to: https://github.com/users/G3Kappa/projects/1/views/1

ergo's People

Contributors

giovinazzo-kevin, gkevin-kappas


ergo's Issues

Merge BuiltIn resolution and KnowledgeBase match steps into a single step

The Evaluations produced by built-ins are similar to the KBMatches produced by knowledge bases.

They are both IAsyncEnumerables that produce objects containing a SubstitutionMap.

However the semantics are a bit different. Built-ins can return a different value than what was passed in, as opposed to pure predicates that always return true or false. For instance, math:eval(3.14) is effectively replaced by the result of the evaluation, 3.14.

As of now, built-ins are evaluated recursively. So, as long as the evaluation of a built-in is another built-in call, the evaluation will be performed immediately and the result will be passed on. But as it turns out, this isn't needed by anything. After all, built-ins are supposed to behave like regular predicates in most scenarios.

When eliminating this recursive resolution, it becomes evident that evaluations are pretty much the same thing as matches where the Lhs is mutable instead of immutable, and where the Rhs is a stubbed version of the built-in that simply returns true.
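
A minimal C# sketch of that stubbed shape (KBMatch's constructor and Predicate.Fact are hypothetical here; only the structure is the point):

// Rewrite a built-in evaluation as an ordinary match whose Rhs trivially
// succeeds: the built-in's "returned value" has already been written into the
// (mutable) Lhs, so the knowledge base machinery can treat both uniformly.
KBMatch StubBuiltInAsMatch(ITerm lhs, SubstitutionMap substitutions)
{
    var rhs = Predicate.Fact(lhs); // a fact that simply returns true
    return new KBMatch(lhs, rhs, substitutions);
}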

Term Expansions As Predicates

Term expansions exist first to support literal definitions, and then to support macros.

Literal definitions are simple, in that they associate two ground terms.

Macros are fundamentally represented as a predicate with an outbound variable that stores the expansion.
Thus, a term may unify with several macros, and a single macro may produce several expansions.

It follows that an expansion should create multiple choice points. The missing piece is how to bind the expansion variable to the actual expansion. The most logical choice is to define a directive expand/2 where the first argument is the predicate itself, and the second argument is the outbound variable.

Proposed Syntax

:- expand(
    (dict(_, Args)$Key :- 
        dict_key_value(dict(_, Args), Key, Outbound)),
    Outbound
).

Parsing errors

Long term issue for all parsing errors. There are currently no outstanding parsing errors to look at.

Load-time Expansions

At the time of writing, expansions work fine but they do unnecessary work because they are called during the solving step.
Expansions should only happen when the solver is initialized and when a query is parsed.

Right now the head of predicates is expanded correctly at initialization time, but the body isn't. This is because some macros, such as :=/1, replace themselves with the result of their computation.

Either the macro semantics are limited such that predicate expansion is no longer a problem, or some corner cases need to be addressed:

  • Expansions that happen inside complex terms in the body of a predicate need to be bound to an intermediate variable in a clause that is resolved right before the complex term.
    • Expansions that happen inside the head of a predicate don't benefit from this.
  • Expansions that happen inside complex terms in the head of a predicate need to be bound to an intermediate variable in a clause that is resolved after any other clause in the predicate.

Abstract Terms as proper first-class ITerms

Technically, once a parser returns an abstract term, there should no longer be any need to check whether any given term is abstract. Additionally, abstract terms can be preserved during unification, so there's no longer a need to enforce the canonical form, even if it's still useful to implement.

Thus by implementing ITerm directly and doing away with explicit canonical forms, abstract terms will become real first class citizens, and parsing should become much faster too.

This issue implies a complete removal of the problematic abstract term cache, which has always been a hack. This change will eliminate the question of whether the cache should be static or lugged around by all consumers.

Instrumentation and Tooling

I started working on some simple instrumentation for profiling and debugging. In the long run, it's imperative to build tooling for:

  • Interactive debugging
  • Profiling of all engine functions
  • Knowledge base visualization and manipulation
  • Text editor integration (language server)
  • Static Analysis

And, maybe, for:

  • Automatic test generation?
  • CI?

Improve trace mode

Trace mode is supposed to be of help during debugging, but as it is right now it merely complicates things further by printing walls of incomprehensible text.

It should also be interactive, like SWIPL's trace mode.

Refactor TryGet methods and Maybe<T> methods to be more consistent across the board

Sometimes the TryGet API feels more intuitive than the Maybe monad, partly because there are situations in which Map and Reduce don't work well, such as within a readonly struct.

On the contrary, the Maybe monad allows for easier composition. It has currently been extended to be a mix between the proper Maybe monad and the Option monad, to facilitate pattern matching in situations where Map and Reduce don't work.

The goal is to draw a boundary between the two cases in order to make the API more consistent, and possibly to restore the Maybe monad to its proper usage. A quick sketch of the two styles side by side follows.
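
(Parse and TryGetValue appear elsewhere in this document; the Map call is illustrative.)

// TryGet style: convenient inside readonly structs and imperative code.
if (interpreterScope.Parse<Complex>("foo(bar)").TryGetValue(out var term))
    Console.WriteLine(term);

// Monadic style: composes nicely, but closures are awkward in readonly structs.
var mapped = interpreterScope.Parse<Complex>("foo(bar)")
    .Map(t => interpreterScope.Parse<Complex>($"baz({t})"));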

Lambdas/Higher-Order Predicates

  • Create abstract type Lambda that parses the forms {Free}/[List]>>Lambda and [List]>>Lambda.
  • Create a built-in that applies the lambda and forwards the arguments to call.
  • Figure out if macro expansion is necessary, because it would be problematic since macros are to be implemented as lambdas.
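
For instance, applying a lambda of the second form might look like this (a hypothetical goal, assuming a maplist-style higher-order built-in):

?- maplist([X, Y]>>(Y is X * 2), [1, 2, 3], Ys).
Ys = [2, 4, 6].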

Hooks

https://www.swi-prolog.org/pldoc/man?section=hooks

Hooks are Ergo predicates that will be called asynchronously by engine-internal C# code during the normal course of program execution. They give Ergo code the ability to have a say in how it's interpreted and, specifically, they provide a mechanism on top of which attributed variables can be built.

Hooks should be a general mechanism that can be plugged in whenever the engine wants to query a well-known, user-defined predicate.

TCO

The current solver architecture doesn't provide the flexibility of a virtual machine.
In particular, it makes it complicated to reason about choice points and stack frames, obstructing several optimization paths.

A blocking issue is that, with the current architecture, I can't find a way to implement last call optimization. The Solve() method is implemented as a pair of mutually recursive IAsyncEnumerables: one acting on queries and one acting on terms. It's reasonable to assume that the first method could be refactored away, leaving only the iteration constructs. At that point LCO is only a matter of reassigning some variables and issuing a goto.

There are more elegant solutions. The next logical step would be refactoring the mess of loops into a state machine. This is where we can introduce concepts such as the stack and the trail in a way that mirrors the WAM without mirroring its internals. After all, the WAM works on compiled Prolog while Ergo is still completely interpreted. This step should however lay the groundwork for what will eventually become the Ergo Abstract Machine, so it is very important.

String/List Duality

In Prolog, strings are represented as proper lists of characters. In Ergo, they are plain atoms, since those are represented internally as CLR strings. This solution is simple but incorrect, as predicates that are supposed to work on lists won't work on strings.

Strings could be implemented as abstract types on top of regular lists, with their own modified parser.
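
For comparison, here is how the duality plays out in standard Prolog (with the double_quotes flag set to chars) versus Ergo today:

% Standard Prolog: a string is a list of characters, so list predicates apply.
?- "abc" = [C|_].
C = a.

% Ergo: "abc" is an atom wrapping a CLR string, so the unification above fails.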

Shell UX Improvements

The shell is not a core component of Ergo by itself, but it's the one a developer would spend the most time on, before interfacing with Ergo directly from C#.

So it needs to be functional and modern. Theming is one concern, another is ease of use.

SWI-Prolog's shell isn't so amazing in that regard. Pasting snippets usually goes very awry, since there's not a dedicated "insert mode" such as Vim's. Asserting predicates is also a pain.

Save current module to disk

The current knowledge base can't be exported to disk from the shell, because the code that would do that is commented out and needs to be rewritten.

Runtime Variables / Caching Execution Graph Delegates

Ergo is built on an immutable base layer. Terms, predicates, scopes. Everything is immutable by default -- and this plays nice with the interpreter as per the original architecture.

The switch to a VM-based architecture, however, is pointing out one obvious flaw with the immutable model: efficiency. Profiling reveals that most of the time spent while tail-recursing over for/3 is actually spent instantiating and substituting predicates (and their associated knowledge graphs). The time spent actually compiling the graph is rather small by comparison, but still significant.

And this makes sense to a degree, because when substituting multiple copies of the same variable a lot of redundant operations have to be performed. Additionally, terms have to be traversed fully every time.

A runtime-friendly model would represent variables as mutable so that multiple copies can be updated with one operation and so that terms are updated automatically without ever having to traverse them, by just applying a substitution to the corresponding RuntimeVariable and having the effect propagate through all terms that reference it.

One issue with this approach is that Variables were never meant to be mutable in the initial architecture. Therefore they are readonly structs that can't be inherited from. If I made a RuntimeVariable wrapper that inherited AbstractTerm, then there would be a few problems:

  • Some places in the code check, e.g., if(term is Variable { Ignored: true }). Since RuntimeVariable can't inherit from Variable, this would require a workaround, possibly an IVariable interface.
  • Dictionaries have an Either<Atom, Variable> functor, and this presents similar challenges as the previous point.
  • A mechanism to separate runtime terms from immutable terms would still be required; encapsulating them inside predicates (or a runtime version of predicates) might work, as they are the entry point for compilation.
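
A hedged sketch of the mutable model (RuntimeVariable and Bind are hypothetical names):

// One write updates every term that references the variable, so applying a
// substitution becomes a single assignment instead of a full term traversal.
sealed class RuntimeVariable
{
    public ITerm Value; // mutable binding; null while unbound
}

static void Bind(RuntimeVariable v, ITerm t) => v.Value = t;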

Cuts As Cancellation Tokens

  1. Create a CancellationTokenSource when Solve is called
  2. Attach its token to the SolverScope
  3. Check whether the cancellation for that token was requested instead of using a volatile bool
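
A minimal sketch of the mechanism using plain .NET types (the SolverScope plumbing is implied):

using System.Collections.Generic;
using System.Threading;

static IEnumerable<string> Solve()
{
    var cts = new CancellationTokenSource(); // 1. one source per Solve call
    var token = cts.Token;                   // 2. this token travels with the SolverScope
    foreach (var goal in new[] { "a", "b", "!", "c" })
    {
        if (token.IsCancellationRequested)   // 3. replaces the volatile bool
            yield break;
        if (goal == "!")
            cts.Cancel();                    // a cut discards all pending choice points
        else
            yield return goal;
    }
}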

Knowledge Base Refactor

The current knowledge base is a somewhat static object that needs to be updated manually every time.

This works fine in simple cases but the integration with Fiero has shown that sometimes you want to partially share a knowledge base, and update it after its creation.

The new knowledge base needs to:

  • Be event-driven: assertion, retraction, matching etc. need to raise events that can be intercepted. This would enable compiling predicates on the fly, for instance, as well as notifying instrumentation like the static analyzer that can then update the dependency graph and so forth.
  • Be concurrent or immutable: multiple threads may want to access the kb at the same time. Ideally, the static part of a knowledge base can be treated as an immutable object, which makes reusing it across multiple contexts easy and safe. Dynamic predicates can be stored in a concurrent database, effectively separate from the main knowledge base, that can be shared on a different basis. This raises the question of whether a transactional model is needed.
  • Be FAST: while most calls are compiled away, some may still need to be interpreted. In these cases the knowledge base needs to provide matches instantly.
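
A hedged sketch of what the event-driven surface could look like (all names hypothetical):

public interface IKnowledgeBaseEvents
{
    // Raised on assert/retract, enabling on-the-fly compilation and letting
    // instrumentation such as the static analyzer update its dependency graph.
    event Action<Predicate> Asserted;
    event Action<Predicate> Retracted;
    // Raised whenever a goal is matched, e.g. for profiling.
    event Action<ITerm> Matched;
}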

Databases

Despite the simplicity with which Ergo can interface to C#, it would be nice to provide a default database-like API to handle persistent state idiomatically. This has become especially evident during the integration of Ergo with Fiero, the game I'm working on, where something as simple as incrementing an integer needs to be done rather inefficiently via dynamic predicates.

Associate abstract types more strongly with their canonical forms

Abstract types are working, but it's clear that they're still not a definitive improvement over the previous system.

Specifically, it seems that terms obtained via unification are not always converted back to their abstract form, because one of the operands is not recognized as an abstract term. Terms written directly in canonical form have the same issue.

Extending the abstract parsers to consider canonical forms is not guaranteed to work in scenarios where unification is the culprit, although it would solve some problems.

The previous system, and to a degree the current one, made extensive use of term unfolding, i.e. manually unwrapping the AST in order to reconstruct the abstract form. This is fine if it's guaranteed to happen once per term because terms are immutable, but there's currently no cache to keep track of these things and therefore it's a very expensive operation. Still, a cache feels like a band-aid. A better solution could be lurking in the shadows.

Also, some results of unification could be represented more efficiently than they currently are: for instance, sub-lists could be represented as slices of a parent list.

Abstract First-Class Citizens

All abstract terms can be represented canonically as a base term.

There should be an interface that allows defining new types in C# as long as a conversion to and from their canonical form is provided.
This interface would also expose the parser, which is currently not extensible, for the purposes of parsing the new type.

After this change is implemented, Dicts and possibly Lists should be rewritten to be abstract first class citizens.

Streamline exception throwing and handling

The Problem

Several engine components need to throw exceptions: the Lexer, the Parser, the Shell, the Interpreter and the Solver are first among them; but it shouldn't matter where the exception originates from.

Given the IAsyncEnumerable-based structure of the engine, catching and logging exceptions is not something you can hide from the user: any exception naturally bubbles up to the outermost scope, where it invariably stops the enumeration even when caught.

The current system consists of a random assortment of functions that will pipe any exception through a default ExceptionHandler that's defined at the ShellScope level.

Ideally, this handler should be defined at the lowest possible level, so that it can be reused painlessly up the stack, and so that any component in the stack is able to catch any exception that bubbles up to its own level.

It could also be defined at the highest possible level, so that there's no need to worry about passing it up the stack at all. This would require a buffering mechanism however, because you can't yield from within a try block.

NOTE: USE ExceptionDispatchInfo.Capture(ex)
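
That API preserves the original stack trace across the rethrow, which is exactly what a buffering mechanism needs (ThrowingOperation is a placeholder):

using System.Runtime.ExceptionServices;

ExceptionDispatchInfo captured = null;
try { ThrowingOperation(); }
catch (Exception ex) { captured = ExceptionDispatchInfo.Capture(ex); }
// ... yield any buffered items here, outside of the try block ...
captured?.Throw(); // rethrows with the original stack trace intact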

Warren Abstract Machine

While Ergo is expressed rather idiomatically in C#, it would be beneficial to study the WAM and to figure out what sorts of optimizations are available. This would extend to the task of compiling Ergo to IL.

Optimize substitution maps for variables that are only used once

Some queries that should run in constant space are creating GBs of unnecessary substitutions.

A simple example is for(I, 0, 100000), which will probably crash the system with an OOM error before terminating.

My intuition is that these are generating "useless" substitutions that should be removed "as we go" instead of doing all the heavy lifting at the end.

Inlining

Work is already underway to implement inlining of predicates as a very first "compilation" step. Inlining will be manual at first, and eventually automatic with a manual override.

The way it currently works is as follows:

  • You mark a list of predicates for inlining with the :- inline/1 directive.
    • These predicates must be local to the current module.
  • The Compiler library waits for all other SolverInitializingEvent handlers, then performs the final step: building a dependency graph.
  • The graph is then walked so as to inline each predicate in the correct order. Goals that cause cycles are not inlined.
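
In Ergo source, marking a predicate for inlining would then look something like this (assuming inline/1 takes a predicate indicator):

:- inline(double/2).
double(X, Y) :- Y is X * 2.
% Callers of double/2 get its body spliced in at compile time.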

Local Operators

https://www.swi-prolog.org/pldoc/man?section=operators

In SWI, operators are module-local and need to be exported/imported explicitly with module/2 and use_module/2, and possibly reexport/2.

Other implementations, like SICStus, don't have local operators.

Are they worth implementing?

Reasoning

Given the scope of Ergo, they're not particularly important. But they would be nice to have as they would make it possible to explicitly avoid clashes.
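
For reference, this is how SWI exports an operator alongside its predicate, using standard op/3 syntax in the module/2 export list:

:- module(my_dsl, [op(700, xfx, ===), (===)/2]).
X === Y :- X == Y.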

Rollback IAsyncEnumerable support in favor of plain IEnumerables

AsyncEnumerables are very expensive. They were initially introduced for data sinks and sources, but seeing as those are a fringe feature that is not standard Prolog, they don't warrant a massive penalty to performance. Those will need to be implemented some other way, such as by blocking the calling thread.

"Bracy" Lists (Sets)

SWI-Prolog represents shared variables within Lambda expressions as a list enclosed by braces, denoted a "bracy list".

Ergo only supports regular lists at the moment. It would be useful to define new list types that differ only in the symbols used for parsing (opening token, separator, and closing token).

It would be great to define these new types as abstract first class citizens, so that their semantics can differ from those of regular lists.
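
For example, in SWI-Prolog's yall notation the braces carry the variables shared with the enclosing context:

?- Acc = 3, maplist({Acc}/[X, Y]>>(Y is X + Acc), [1, 2], Ys).
Ys = [4, 5].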

Waiting on #4.

Refactor AbstractTermCache to be non-static (?)

The extension method Maybe<IAbstractTerm> IsAbstract<T>(this ITerm t) is sprinkled throughout the code in various places. Within it, calls are made to the static AbstractTermCache. Despite the cache using concurrent collections, these calls still sometimes require a lock in order to work properly.

In any case, having a static AbstractTermCache seems like an issue. It should be part of the interpreter or the interpreter scope. What makes this refactoring non-trivial is that a reference to the cache needs to end up at all current call sites of IsAbstract, which can be ugly sometimes.

Static predicate rewriting pipeline for libraries

Waiting on #10

The idea here is to define the predicate transformations required by some libraries, such as the one that will contain the tabling logic, within the libraries themselves.

This is similar to expansions, but it happens on the C# side.

Static Analyzer

Currently, all static analysis steps are performed haphazardly by the Compiler library through a variety of graphs.

Ideally all static analysis should be performed by a dedicated object that can be passed around.

This also raises the question of the coupling between Interpreter, Solver and KnowledgeBase. In the interpreted execution model, this makes sense somewhat. But in the executed and compiled models, the solver becomes almost a liability to lug around. Ideally, once cyclic and dynamic goals are compiled properly, the solver will no longer be necessary.

The new model should encompass all three execution modes while providing a clear interface to them, with the static analyzer playing a central role in transitioning from the interpreted to the executed modes.

This issue supersedes #11.

Libraries

https://www.swi-prolog.org/pldoc/man?section=reexport

Modules are the basic unit of Ergo code. A module exposes a list of public predicates that become available to all derived modules. A module might also reserve special semantics, as in the case of attribute modules.

A library is a special type of module that is used to group together other modules under one package that is accessible to both C# and Ergo. Libraries are also the entry point for several engine extensions that only become available once the library is imported, thus providing a mechanism to implement the aforementioned semantics while limiting their scope.

Structure

A library is an ordinary Ergo module. From C#, a matching class is added to the interpreter (if defined). That class contains:

  • A list of interpreter directives provided by that library
  • A list of solver built-ins provided by that library
  • An entry point for handling Ergo events

It should be possible to assert predicates in modules other than the current one

This is required by #27.

For instance, the following code should be possible to define in any module, not just the user module:

user:message_hook(Term, Kind, Lines) :-
    format('Term: ~w~nKind: ~w~nLines: ~w~n~n',[Term,Kind,Lines]),
    print_message_lines(current_output,kind(Kind),Lines),
    nl.

The parser currently ignores this information and instead creates a predicate with head ':'(user, message_hook)(Term, Kind, Lines) in the current module.

Unary operators ignore precedence rules

This makes defining operators involving certain problematic functors, such as ., impossible because they clash with some other symbol.

The fix requires undefining the punctuation symbol . and defining a well-known postfix operator '.'/1 with precedence 0, so that it correctly encloses statements.

Unfortunately, the current expression parser is incapable of parsing unary expressions in such a way because it only cares about precedence and associativity when parsing infix expressions.

Dismantle Old Solver

Since the ErgoVM can do everything the old solver did and faster (pending #77), the solver has become somewhat redundant.

Pros of the old solver:

  • Guarantees correct execution since it interprets terms one at a time.
  • Uses an enumerator-based architecture that is very idiomatic to C# and similar to YieldProlog.
  • Writing built-ins that target it is pretty straightforward.
  • Has been proven to be pretty robust since all past development targeted it.
  • Can benefit from static analysis optimizations (as with the old ExecutionNodes providing an IEnumerable<ExecutionScope> Execute(SolverContext ctx, SolverScope solverScope, ExecutionScope execScope) method).

Cons of the old solver:

  • It's slow, in part because it consists of a ton of nested foreach loops.
  • It doesn't offer granular control over the state of execution like the VM does, and you have to think outside the box to achieve the same results.
  • It handles choice points with CancellationTokens by halting execution of all previous contexts whenever a cut is encountered. This emulates cuts correctly, but it makes reasoning about the stack harder and it's not conventional in other Prolog implementations, which makes it harder to port their features.
  • It operates on immutable terms and does a lot of redundant work.
    • This is also the case for the VM for the time being, however.
  • It has to "yield" a solution even for very simple operations like unification and evaluation, generating additional overhead.
  • It has to manipulate and pass around an immutable "scope" that represents the current state of execution.
    • This mirrored the semantics of passing "ref scope" around in other contexts, but less efficiently.

By comparison, pros of the VM:

  • Is idiomatic to Prolog and makes it easy to port optimizations found in other Prolog specifications.
    • To start with, it has an "environment" and a "choice point stack" exactly like you would expect.
      • Simple operations like unification and evaluation can operate on the environment directly instead of producing a solution.
      • Choice points can be cut, manipulated or discarded arbitrarily at any point in time and with fine control.
  • Runs "compiled" delegates (Ops) that are produced by an optimized execution graph.
    • These delegates can be composed in a functional manner, leaving only the code that actually needs to be executed on the VM and capturing all outside context in a closure.
  • Can run both in "interactive" mode and in standard or "batch" mode.
    • The former is good for the shell as it yields control exactly like the old solver does; the latter is better for high-performance scenarios where you don't care about enumerating solutions as they are produced.
  • Needs to lug around much less context:
    • The compiled Ops it works with are the result of static analysis, so all predicates that should have been qualified have been qualified with a module. Therefore, there's no need to dispatch predicates at runtime, and the InterpreterScope that the Solver uses to determine the possible qualifications for a goal is no longer necessary.
    • The SolverScope is no longer necessary since the VM handles its own internal mutable state. If a snapshot of that state is desired, it can be generated on demand.
    • The KnowledgeBase is all that's really required, but design work is pending on that front (#75 and #77). The eventual solution may either decouple or tightly couple the KB to the VM so as to address parallelization and performance concerns.
    • With cuts being implemented traditionally, there's no need to have multiple nested "SolverContexts". It's just one VM instance with its own choice point stack.

Cons of the VM:

  • Is harder to debug, as control flow goes from delegate to delegate, and the delegates themselves represent an optimized graph, so a lot of the context that's available with the solver just isn't immediately available on the VM.
  • Does not play nice with immutability, which is its current bottleneck.
    • Eventually, it will require a solution to #77 in order to be competitive.
  • Does not follow the same architectural style as the other components (parser, interpreter, solver).
    • But that's to be expected, as it needs to handle runtime state efficiently as opposed to compile-time state comprehensively.

Now, some profiling (release mode, no trace). 5 samples per comparison.

Old Solver w/ Execution Graph optimizations:

  • Runs 102 tests in [30.9/30.1/30.8/30.3/30.8]sec.

VM w/ Execution Graph optimizations:

  • Runs 118 tests in [19.1/20.3/18.6/18.7/19]sec.

At a glance, it seems that the VM is about 33% faster even in its initial naive implementation. This may be due to other optimizations being made as part of developing the VM, but those weren't the primary focus. The real optimization effort is yet to begin. So overall the results are promising.

However, the hot path seems to run (much) hotter in one particular scenario: tail recursion of for/3. The other tail recursive cases all seem better.

I think the reason for this is that now, whenever a predicate is instantiated and substituted, so is its graph, which creates a ton of overhead. The fix is to solve #77.
