fix-project / fix


C++ 59.18% CMake 3.83% C 27.16% WebAssembly 4.46% Perl 0.04% Python 2.51% Rust 2.71% Shell 0.11%


fix's Issues

TODO: Runtime for laptop + one remote server

TODO: Add a tracing API to allow non-normative explanations for values

TODO: Make it possible for gdb to debug inside a Fix procedure when debugging Fixpoint

TODO: Web browser interface to see repository and evaluate things

Bootstrap should be self-hosting (again)

The bootstrap-in-fix script stopped working when we moved to clang 18.

Switch `DependencyGraph` to better data structures

For the sake of debuggability, DependencyGraph is (as of #98) using a bunch of off-the-shelf data structures, both for storing the true data and for checking invariants. The true data structures can be replaced with versions better optimized for our usage patterns; the invariant-checking data structures can be made debug-mode only, so release-mode code doesn't have to pay the cost of checking them.
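As a rough illustration of the "debug-mode only" half of this, the sketch below keeps a shadow set solely for invariant checks and compiles it out when `NDEBUG` is defined. `TinyDependencyGraph` and its members are hypothetical stand-ins, not the real `DependencyGraph` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for DependencyGraph: maps a task id to the ids it is waiting on.
class TinyDependencyGraph
{
  std::unordered_map<int, std::unordered_set<int>> blocked_on_; // the "true" data

#ifndef NDEBUG
  std::unordered_set<int> known_tasks_; // shadow data, present only in debug builds
#endif

public:
  void add_dependency( int task, int dependency )
  {
    assert( task != dependency && "a task cannot depend on itself" );
    blocked_on_[task].insert( dependency );
#ifndef NDEBUG
    known_tasks_.insert( task );
    known_tasks_.insert( dependency );
#endif
  }

  // Marks `dependency` as finished for `task`; returns true if `task` is now unblocked.
  bool resolve( int task, int dependency )
  {
#ifndef NDEBUG
    assert( known_tasks_.count( dependency ) == 1 && "resolving an unknown dependency" );
#endif
    auto it = blocked_on_.find( task );
    if ( it == blocked_on_.end() ) {
      return false;
    }
    it->second.erase( dependency );
    if ( it->second.empty() ) {
      blocked_on_.erase( it );
      return true;
    }
    return false;
  }

  std::size_t blocked_count() const { return blocked_on_.size(); }
};
```

In a release build (`-DNDEBUG`) the shadow set and both checks disappear, so only the `blocked_on_` bookkeeping remains on the hot path.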

TODO: Helper library (usable from C/C++) for "borrowing" spare memories/tables

TODO: Visualize memory usage

I think we can track memory pages being read by using the idle page tracking API: https://www.kernel.org/doc/html/next/admin-guide/mm/idle_page_tracking.html

And we can track pages being written by using the soft-dirty API: https://www.kernel.org/doc/html/next/admin-guide/mm/soft-dirty.html
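A minimal sketch of the write-tracking side, assuming a Linux kernel built with soft-dirty support. Per the kernel documentation, writing `4` to `/proc/PID/clear_refs` clears the soft-dirty bits, and bit 55 of each 64-bit `/proc/PID/pagemap` entry reports whether the page has been written since. The helper names are made up for illustration, and the read side (idle page tracking) is not covered here.

```cpp
#include <cassert>
#include <cstdint>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Bit positions documented for /proc/PID/pagemap entries.
constexpr std::uint64_t kSoftDirtyBit = 1ULL << 55;
constexpr std::uint64_t kPresentBit = 1ULL << 63;

constexpr bool is_soft_dirty( std::uint64_t entry ) { return ( entry & kSoftDirtyBit ) != 0; }
constexpr bool is_present( std::uint64_t entry ) { return ( entry & kPresentBit ) != 0; }

// Clears the soft-dirty bits of the calling process.
// Returns false if the file cannot be opened or written (e.g. no kernel support).
inline bool clear_soft_dirty()
{
  const int fd = open( "/proc/self/clear_refs", O_WRONLY );
  if ( fd < 0 ) {
    return false;
  }
  const bool ok = write( fd, "4", 1 ) == 1;
  close( fd );
  return ok;
}

// Reads the pagemap entry for the page containing `addr` into `entry`.
inline bool read_pagemap_entry( const void* addr, std::uint64_t& entry )
{
  const long page_size = sysconf( _SC_PAGESIZE );
  assert( page_size > 0 );
  const int fd = open( "/proc/self/pagemap", O_RDONLY );
  if ( fd < 0 ) {
    return false;
  }
  const std::uintptr_t page_index
    = reinterpret_cast<std::uintptr_t>( addr ) / static_cast<std::uintptr_t>( page_size );
  const off_t offset = static_cast<off_t>( page_index * sizeof( std::uint64_t ) );
  const bool ok = pread( fd, &entry, sizeof( entry ), offset ) == static_cast<ssize_t>( sizeof( entry ) );
  close( fd );
  return ok;
}
```

A visualizer could call `clear_soft_dirty()` at the start of a sampling interval and then walk the pages of interest with `read_pagemap_entry()` and `is_soft_dirty()` at the end of it.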

TODO: Extend Flatware enough to run the CPython interpreter

TODO: Update wasm-toolchain to latest LLVM tip-of-tree (and adapt our changes on top of that)

TODO: Project website

TODO: Make `fixpoint_equality` canonicalize names before comparison

TODO: Interlink & cleanup docs so they display nicely on GitHub

TODO: Test binaryen wasm-merge (instead of our wasm-combiner)

Send objects between nodes

Current status: only literal or canonical-and-present-in-both-local-repos data can be sent for evaluation to another node, since no objects are transferred.

MVP: Fixpoint determines what data to send and sends it automatically when an evaluation is scheduled.

Ideal: The scheduler decides how and when to transfer data between nodes.

TODO: Make an automatic mechanism for the "staging" branch to be merged to master if tests pass

TODO: Shell REPL

Should print a prompt and read a line of commands (via an unsafe Thunk), then return an unsafe Thunk that depends on a Thunk that evaluates the corresponding command.

Then the unsafe Thunk gets that value, prints it out, and prompts for a new command.

Define better hash functions

We use a lot of hashing data structures, but don't really have a principled way to hash things like Tasks. We also use both std::hash and abseil's hash for different things.
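One possible direction, sketched under heavy assumptions: define a single hasher per type that folds every field through one mixing function, and use that hasher with both `std::` and abseil containers. The `Handle`, `Operation`, and `Task` types below are simplified placeholders (a 32-byte name plus an operation tag), not the real Fixpoint definitions, and the mixing step is the familiar `hash_combine` pattern rather than anything the project has chosen.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_set>

// Placeholder types: assumed shapes, for illustration only.
struct Handle
{
  std::array<std::uint8_t, 32> bytes {};
};
inline bool operator==( const Handle& a, const Handle& b ) { return a.bytes == b.bytes; }

enum class Operation : std::uint8_t
{
  Apply,
  Eval
};

struct Task
{
  Handle handle;
  Operation op;
};
inline bool operator==( const Task& a, const Task& b ) { return a.handle == b.handle && a.op == b.op; }

// hash_combine-style mixing of one 64-bit value into a running seed.
inline std::uint64_t mix( std::uint64_t seed, std::uint64_t value )
{
  return seed ^ ( value + 0x9e3779b97f4a7c15ULL + ( seed << 6 ) + ( seed >> 2 ) );
}

// One hasher for Task: fold the four 64-bit words of the handle, then the operation.
struct TaskHash
{
  std::size_t operator()( const Task& task ) const noexcept
  {
    std::uint64_t h = 0;
    for ( std::size_t i = 0; i < task.handle.bytes.size(); i += 8 ) {
      std::uint64_t word = 0;
      std::memcpy( &word, task.handle.bytes.data() + i, sizeof( word ) );
      h = mix( h, word );
    }
    return static_cast<std::size_t>( mix( h, static_cast<std::uint64_t>( task.op ) ) );
  }
};
```

With this in place, `std::unordered_set<Task, TaskHash>` works directly, and the same functor can be passed as the hasher template argument of other hash containers.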

Schedule on more than one machine

The current scheduler always schedules tasks locally. We need to look at the task itself and make an educated decision about where to schedule it.

Required for #65 and #66.

TODO: Update wasm2c to support SIMD. Add a Fix test that uses SIMD.

TODO: Eliminate use of cryptopp/crypto++ in favor of... some other way to do SHA-256

See https://github.com/WebAssembly/wabt/blob/main/src/sha256.cc for some examples (with libcrypto and PicoSHA2), except both of these are kind of horrible, in the sense that they allocate memory, libcrypto frobs global state and calls into the pthreads API, etc. All we really need is a SHA-256 -- it shouldn't have to allocate anything, call any pthreads functions, etc.

Index the storage by canonical name when possible

Right now, everything is stored by local ID (and canonical names are looked up in a data structure that maps them to a local ID). We should flip this and allow canonical names to directly point to the object in question, and we should get more aggressive about canonicalizing names before display or storage, etc. (Which means, making it easy to canonicalize names...)

Add a full multi-node tester

This can be an addition to stateless-tester or a new version of distributed-tester, but we should have an easy way to connect to other nodes and start a distributed evaluation.

See #96, #97, and #98.

TODO: Make stack-overflow recovery in wasm2c thread safe

Right now there's just a single altstack for any SIGSEGV signal -- there needs to be one for every worker thread.
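A sketch of what "one altstack per worker thread" could look like, using the POSIX `sigaltstack` call, which registers an alternate signal stack for the calling thread only. The function names and the 64 KiB default are illustrative assumptions; this is not how wasm2c's runtime is currently structured.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <signal.h>

// Installs an alternate signal stack for the calling thread. Each worker thread
// would call this once at startup; the memory is owned by a thread_local so it
// lives as long as the thread does.
inline bool install_thread_altstack( std::size_t size = 64 * 1024 )
{
  assert( size >= static_cast<std::size_t>( MINSIGSTKSZ ) );

  thread_local std::unique_ptr<char[]> stack_memory;
  if ( stack_memory ) {
    return true; // already installed for this thread
  }
  stack_memory = std::make_unique<char[]>( size );

  stack_t ss {};
  ss.ss_sp = stack_memory.get();
  ss.ss_size = size;
  ss.ss_flags = 0;
  if ( sigaltstack( &ss, nullptr ) != 0 ) {
    stack_memory.reset();
    return false;
  }
  return true;
}

// Reports whether the calling thread currently has an alternate stack enabled.
inline bool thread_has_altstack()
{
  stack_t current {};
  return sigaltstack( nullptr, &current ) == 0 && ( current.ss_flags & SS_DISABLE ) == 0;
}
```

The SIGSEGV handler itself still has to be registered with `SA_ONSTACK` (via `sigaction`) for the kernel to switch to the per-thread stack when the fault is delivered.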

TODO: Currying example/test

TODO: Deserialize from disk by using mmap

Requires cleanup of serialization/deserialization logic so in-memory representations match the on-disk representations. Ideally this can also be done efficiently for Fixcache entries.
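The mmap half of this is small once the on-disk and in-memory layouts agree. Below is a minimal sketch of a read-only file mapping exposed as a byte view; `MappedFile` is a hypothetical helper, and it says nothing about the actual serialization format or Fixcache entries.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <string>
#include <string_view>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: maps a whole file read-only and exposes its bytes without copying.
class MappedFile
{
  void* addr_ = MAP_FAILED;
  std::size_t size_ = 0;

public:
  explicit MappedFile( const std::string& path )
  {
    const int fd = open( path.c_str(), O_RDONLY );
    if ( fd < 0 ) {
      return;
    }
    struct stat st {};
    if ( fstat( fd, &st ) == 0 && st.st_size > 0 ) {
      size_ = static_cast<std::size_t>( st.st_size );
      addr_ = mmap( nullptr, size_, PROT_READ, MAP_PRIVATE, fd, 0 );
      if ( addr_ == MAP_FAILED ) {
        size_ = 0;
      }
    }
    close( fd ); // the mapping stays valid after the descriptor is closed
  }

  ~MappedFile()
  {
    if ( addr_ != MAP_FAILED ) {
      munmap( addr_, size_ );
    }
  }

  MappedFile( const MappedFile& ) = delete;
  MappedFile& operator=( const MappedFile& ) = delete;

  bool ok() const { return addr_ != MAP_FAILED; }

  // View of the mapped bytes; only valid while this object is alive.
  std::string_view bytes() const
  {
    assert( ok() );
    return { static_cast<const char*>( addr_ ), size_ };
  }
};
```

Note that an empty file is reported as not `ok()`, since `mmap` rejects zero-length mappings; a real implementation would need to decide how to represent empty Blobs.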

Eliminate use of wasm-toolchain to compile fixpoint (use wasi-sdk instead)

TODO: Blob names contain size, and Tree names contain total size of strict entries?

TODO: Runtime for laptop + AWS Lambda

TODO: Report on "overuse" of resources (either max memory < limit, or objects in ENCODE never attached)

Clean up canonicalization

Several issues with the current canonicalization:

- canonical_to_local_ is not append-only
- Canonicalizing a local handle to a canonical handle should add entries to the DependencyGraph, so that finishing a task on either handle resolves the dependencies of the corresponding task on the other
- get_local_name and get_local_handle are duplicates
- All reads from canonical_to_local_ should use Names instead of Handles

TODO: update fixcache_ keys to struct

Move away from using m256i values as keys, for clarity.

Add support for "remote pending" to RuntimeStorage

This will likely involve modifying RuntimeStorage (and maybe fixcache) to support having jobs that are pending elsewhere, as well as queuing a local job if a remote job fails. This will probably also need some degree of integration with the scheduler program.

TODO: Get rid of abseil dependency?

TODO: Upstream our changes to lld (reading from virtual files)

TODO: Make the trusted toolchain produce objects where internal functions are relative 32-bit relocations

Investigate use of different LLVM code models (large/medium/kernel/small) for this. The only function calls that need to be absolute 64-bit are the Fix API, the wasm_rt API, and __sigsetjmp/siglongjmp/memcpy/memmove/memset/sqrt/ceilf/ceil/floor/nearbyint/__assert_fail.

TODO: Make "deterministic Trap" and "nondeterministic Trap" types, and allow cache to store the former

TODO: Support thread cancellation via signals

If Fixpoint learns the answer to a question that a worker thread is currently working on, it should be able to safely cancel an ongoing evaluation.

TODO: Garbage collection

TODO: Flatware can create new files

More threads cause slower tests

For some reason, switching to 32 or 256 threads (from the current 16) seems to make the tests run measurably slower on stagecast.org. Even though the tests probably won't benefit from the added parallelism, they shouldn't become significantly slower as a result.

This was first noticed in #98, but it wasn't really tested before then so it's not clear if it's a result of that change or an existing issue.

TODO: Submit PR to binaryen asyncify to work with reference types

Move to a non-allocating, intrinsic-aided version of SHA-256

We just switched to PicoSHA2 in #93, but it would be great to move to a non-allocating, inline assembly version that uses the Intel SHA extensions. I'm hopeful we can just copy the implementation out of OpenSSL's libcrypto (suitably tested).

TODO: Design and implement "in-place mutation"

The idea would be that if a procedure wants to "steal" the storage backing a Blob and use it to create a very similar Blob, Fixpoint could allow this as long as:

- the Blob is not currently attached
- Fixpoint knows how to recompute the Blob if it had to (maybe the procedure itself has to show up with an evaluation Tag demonstrating it knows how to recompute the Blob? or maybe Fixpoint looks for this itself? either way, obviously the Blob can't be part of the footprint of the Tag...)

The Blob's storage gets attached to a rw table, and meanwhile the Blob's name gets repointed at a tombstone that includes the Thunk needed to recreate the Blob.

If another procedure later tries to attach to the Blob that's been "stolen," Fixpoint will make it immediately return a Thunk that recomputes the missing Blob and then returns the original Thunk.

TODO: Enforce resource limits

TODO: Design and implement "unsafe thunks" (and "unsafe trees" to hold them?)

One idea is that "unsafe thunks":

- are evaluated on a new thread (but there's some cap on the total number of these at a given time)
- can link with syscall and make an arbitrary Linux syscall
- can
  - block
  - attach to a thunk to print it out
  - retrieve non-normative tags from the trace cache?
  - directly lift lazy handles?
  - read the raw contents of Names/Handles?

Maybe on starting Fixpoint, the user gives it one unsafe (or safe) Thunk to evaluate, and everything goes from there?

TODO: Get rid of fetchcontent and just use submodules

TODO: Implement "why" (e.g. explain how to compute this value)

Add multiple `Channel` implementations for different purposes.

We currently (since #97) have a generic multiple-producer, multiple-consumer lock-free Channel, using cameron314/concurrentqueue (moodycamel). However, for certain tasks, we'd like slightly different queue implementations backing a Channel; e.g., a work-stealing-based Channel for the workers' work queues. Instead of patching the necessary synchronization, with atomics and locks, on top of the existing Channel, we should have an interface IChannel with multiple implementations: LockFreeChannel, WorkStealingChannel, etc. Each implementation should internally do whatever bookkeeping and synchronization it needs to remain thread-safe.
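To make the shape of the proposal concrete, here is one way the interface split might look. The method names (`push`, `try_pop`, `steal`) are assumptions, and both implementations use a plain mutex-protected `std::deque` as a stand-in: a real `LockFreeChannel` would wrap moodycamel's queue, and a real `WorkStealingChannel` would use a proper work-stealing deque.

```cpp
#include <cassert>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>

// Assumed interface: callers only see push/try_pop; synchronization is internal.
template<typename T>
class IChannel
{
public:
  virtual ~IChannel() = default;
  virtual void push( T item ) = 0;
  virtual std::optional<T> try_pop() = 0;
};

// FIFO channel. Mutex-based stand-in for a lock-free MPMC implementation.
template<typename T>
class LockedChannel final : public IChannel<T>
{
  std::mutex mutex_;
  std::deque<T> items_;

public:
  void push( T item ) override
  {
    std::lock_guard<std::mutex> lock( mutex_ );
    items_.push_back( std::move( item ) );
  }

  std::optional<T> try_pop() override
  {
    std::lock_guard<std::mutex> lock( mutex_ );
    if ( items_.empty() ) {
      return std::nullopt;
    }
    T item = std::move( items_.front() );
    items_.pop_front();
    return item;
  }
};

// Work-stealing flavor: the owner pops its newest item, thieves steal the oldest.
template<typename T>
class WorkStealingChannel final : public IChannel<T>
{
  std::mutex mutex_;
  std::deque<T> items_;

public:
  void push( T item ) override
  {
    std::lock_guard<std::mutex> lock( mutex_ );
    items_.push_back( std::move( item ) );
  }

  // Called by the owning worker: takes from the back (most recently pushed).
  std::optional<T> try_pop() override
  {
    std::lock_guard<std::mutex> lock( mutex_ );
    if ( items_.empty() ) {
      return std::nullopt;
    }
    T item = std::move( items_.back() );
    items_.pop_back();
    return item;
  }

  // Called by other workers: takes from the front (oldest item).
  std::optional<T> steal()
  {
    std::lock_guard<std::mutex> lock( mutex_ );
    if ( items_.empty() ) {
      return std::nullopt;
    }
    T item = std::move( items_.front() );
    items_.pop_front();
    return item;
  }
};
```

Code that only produces or consumes work can hold an `IChannel<T>&`, while the scheduler that knows about stealing keeps the concrete `WorkStealingChannel<T>` type. Whether virtual dispatch is acceptable on this path, versus a template parameter or concept, is a separate design question.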

Move linking into the offline trusted toolchain

If we had a Fix jump table mapped to a known set of virtual addresses, and Runnables are otherwise position-independent, we could pre-link all the Runnables and then directly mmap them read-only (treating them the same as any other Blob). Which would let us get rid of the in-memory cache of "linked" programs.