fix's Issues
TODO: Runtime for laptop + one remote server
TODO: Add a tracing API to allow non-normative explanations for values
TODO: Make it possible for gdb to debug inside a Fix procedure when debugging Fixpoint
TODO: Web browser interface to see repository and evaluate things
Bootstrap should be self-hosting (again)
The bootstrap-in-fix script stopped working when we moved to clang 18.
Switch `DependencyGraph` to better data structures
For the sake of debuggability, `DependencyGraph` is (as of #98) using a bunch of off-the-shelf data structures (both for storing true data and checking invariants). The true data structures can be replaced with versions more optimized to our usage patterns; the invariant-checking data structures can be made debug-mode only, so release-mode code doesn't have to pay the cost of checking them.
TODO: Helper library (usable from C/C++) for "borrowing" spare memories/tables
TODO: Visualize memory usage
I think we can track memory pages being read by using the idle page tracking API: https://www.kernel.org/doc/html/next/admin-guide/mm/idle_page_tracking.html
And we can track pages being written by using the soft-dirty API: https://www.kernel.org/doc/html/next/admin-guide/mm/soft-dirty.html
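As a rough sketch of how the soft-dirty bit could be consumed: each virtual page has one 64-bit entry in `/proc/<pid>/pagemap`, where bit 55 is the soft-dirty flag, bit 62 means swapped, and bit 63 means present; writing `4` to `/proc/<pid>/clear_refs` resets the soft-dirty bits. The names below (`PagemapEntry`, `decode_pagemap_entry`, `pagemap_offset`) are hypothetical and not part of Fixpoint, and the sketch only decodes entries; it does not read the file or assume the kernel was built with soft-dirty support.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: the flags of one /proc/<pid>/pagemap entry
// that matter for tracking writes.
struct PagemapEntry {
  bool present;    // bit 63: page is in RAM
  bool swapped;    // bit 62: page is in swap
  bool soft_dirty; // bit 55: page written since soft-dirty bits were last cleared
};

// Decode one raw 64-bit pagemap entry into the flags above.
inline PagemapEntry decode_pagemap_entry(std::uint64_t raw) {
  return PagemapEntry{
    ((raw >> 63) & 1) != 0,
    ((raw >> 62) & 1) != 0,
    ((raw >> 55) & 1) != 0,
  };
}

// Byte offset within the pagemap file of the entry for virtual address `vaddr`.
inline std::uint64_t pagemap_offset(std::uint64_t vaddr, std::uint64_t page_size) {
  assert(page_size != 0);
  return (vaddr / page_size) * sizeof(std::uint64_t);
}
```

A visualizer could clear the bits, let a procedure run, then read the entries covering each Blob's address range and count the ones with `soft_dirty` set.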
TODO: Extend Flatware enough to run the CPython interpreter
TODO: Update wasm-toolchain to latest LLVM tip-of-tree (and adapt our changes on top of that)
TODO: Project website
TODO: Make `fixpoint_equality` canonicalize names before comparison
TODO: Interlink & cleanup docs so they display nicely on GitHub
TODO: Test binaryen wasm-merge (instead of our wasm-combiner)
Send objects between nodes
Current status: only literal or canonical-and-present-in-both-local-repos data can be sent for evaluation to another node, since no objects are transferred.
MVP: Fixpoint determines what data to send and sends it automatically when an evaluation is scheduled.
Ideal: The scheduler decides how and when to transfer data between nodes.
TODO: Make an automatic mechanism for the "staging" branch to be merged to master if tests pass
TODO: Shell REPL
Should print a prompt and read a line of commands (via an unsafe Thunk), then return an unsafe Thunk that depends on a Thunk that evaluates the corresponding command.
Then the unsafe Thunk gets that value, prints it out, and prompts for a new command.
Define better hash functions
We use a lot of hashing data structures, but don't really have a principled way to hash things like `Task`s. We also use both `std::hash` and abseil's hash for different things.
Schedule on more than one machine
TODO: Update wasm2c to support SIMD. Add a Fix test that uses SIMD.
TODO: Eliminate use of cryptopp/crypto++ in favor of... some other way to do SHA-256
See https://github.com/WebAssembly/wabt/blob/main/src/sha256.cc for some examples (with libcrypto and PicoSHA2), except both of these are kind of horrible, in the sense that they allocate memory, libcrypto frobs global state and calls into the pthreads API, etc. All we really need is a SHA-256 -- it shouldn't have to allocate anything, call any pthreads functions, etc.
Index the storage by canonical name when possible
Right now, everything is stored by local ID (and canonical names are looked up in a data structure that maps them to a local ID). We should flip this and allow canonical names to directly point to the object in question, and we should get more aggressive about canonicalizing names before display or storage, etc. (Which means, making it easy to canonicalize names...)
Add a full multi-node tester
TODO: Make stack-overflow recovery in wasm2c thread safe
Right now there's just a single altstack for any SIGSEGV signal -- there needs to be one for every worker thread.
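One possible shape for the fix, assuming plain POSIX `sigaltstack` (which is a per-thread setting) rather than anything specific to wasm2c's runtime: each worker thread constructs a small RAII object at the top of its entry function. `ThreadAltStack` and the 64 KiB default size are hypothetical choices for illustration.

```cpp
#include <signal.h>

#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical RAII helper: installs an alternate signal stack for the
// calling thread and disables it again on destruction. It must be
// constructed and destroyed on the same thread.
class ThreadAltStack {
public:
  explicit ThreadAltStack(std::size_t size = 64 * 1024)
    : size_(size), mem_(std::malloc(size)) {
    if (mem_ == nullptr) throw std::bad_alloc();
    stack_t ss{};
    ss.ss_sp = mem_;
    ss.ss_size = size_;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, nullptr) != 0) {
      std::free(mem_);
      throw std::runtime_error("sigaltstack failed");
    }
  }

  ~ThreadAltStack() {
    // Disable the alternate stack before releasing its memory.
    stack_t disable{};
    disable.ss_flags = SS_DISABLE;
    int rc = sigaltstack(&disable, nullptr);
    assert(rc == 0);
    (void)rc;
    std::free(mem_);
  }

  ThreadAltStack(const ThreadAltStack&) = delete;
  ThreadAltStack& operator=(const ThreadAltStack&) = delete;

  void* base() const { return mem_; }

  // True if the calling thread's current alternate stack is this one.
  bool installed() const {
    stack_t cur{};
    return sigaltstack(nullptr, &cur) == 0 && cur.ss_sp == mem_
           && (cur.ss_flags & SS_DISABLE) == 0;
  }

private:
  std::size_t size_;
  void* mem_;
};
```

The SIGSEGV handler itself still has to be registered with `SA_ONSTACK` for the kernel to switch to the alternate stack when the signal is delivered.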
TODO: Currying example/test
TODO: Deserialize from disk by using mmap
Requires cleanup of serialization/deserialization logic so in-memory representations match the on-disk representations. Ideally this can also be done efficiently for Fixcache entries.
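A minimal sketch of the mapping half, assuming POSIX `open`/`fstat`/`mmap`. `MappedFile` is a hypothetical name, and nothing here reflects Fixpoint's actual on-disk layout; it only shows a file's bytes being exposed read-only without copying them into a heap buffer.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical helper: maps a whole file read-only and exposes its bytes.
class MappedFile {
public:
  explicit MappedFile(const std::string& path) {
    int fd = ::open(path.c_str(), O_RDONLY);
    if (fd < 0) throw std::runtime_error("open failed: " + path);
    struct stat st{};
    if (::fstat(fd, &st) != 0) {
      ::close(fd);
      throw std::runtime_error("fstat failed: " + path);
    }
    size_ = static_cast<std::size_t>(st.st_size);
    if (size_ > 0) { // mmap rejects zero-length mappings
      void* p = ::mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd, 0);
      if (p == MAP_FAILED) {
        ::close(fd);
        throw std::runtime_error("mmap failed: " + path);
      }
      data_ = static_cast<const char*>(p);
    }
    ::close(fd); // the mapping stays valid after the descriptor is closed
  }

  ~MappedFile() {
    if (data_ != nullptr) ::munmap(const_cast<char*>(data_), size_);
  }

  MappedFile(const MappedFile&) = delete;
  MappedFile& operator=(const MappedFile&) = delete;

  // View of the file contents; valid for the lifetime of this object.
  std::string_view bytes() const { return std::string_view(data_, size_); }

private:
  const char* data_ = nullptr;
  std::size_t size_ = 0;
};
```

This only pays off once the in-memory representation matches the bytes on disk, which is the cleanup this issue asks for; otherwise a parse-and-copy step is still needed after mapping.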
Eliminate use of wasm-toolchain to compile fixpoint (use wasi-sdk instead)
TODO: Blob names contain size, and Tree names contain total size of strict entries?
TODO: Runtime for laptop + AWS Lambda
TODO: Report on "overuse" of resources (either max memory < limit, or objects in ENCODE never attached)
Clean up canonicalization
Several issues with the current canonicalization:
- `canonical_to_local_` is not append-only
- Canonicalizing a local handle to a canonical handle should add entries to the DependencyGraph, such that finishing a task on either of the two resolves the dependencies of the task on the other
- Duplicate `get_local_name`/`get_local_handle`
- All reads from `canonical_to_local_` should use Names instead of Handles
TODO: update fixcache_ keys to struct
Moving away from using m256i values for clarity
Add support for "remote pending" to RuntimeStorage
This will likely involve modifying `RuntimeStorage` (and maybe `fixcache`) to support having jobs that are pending elsewhere, as well as queuing a local job if a remote job fails. This will probably also need some degree of integration with the scheduler program.
TODO: Get rid of abseil dependency?
TODO: Upstream our changes to lld (reading from virtual files)
TODO: Make the trusted toolchain produce objects where internal functions are relative 32-bit relocations
Investigate use of different LLVM code models (large/medium/kernel/small) for this. The only function calls that need to be absolute 64-bit are the Fix API, the wasm_rt API, and __sigsetjmp/siglongjmp/memcpy/memmove/memset/sqrt/ceilf/ceil/floor/nearbyint/__assert_fail.
TODO: Make "deterministic Trap" and "nondeterministic Trap" types, and allow cache to store the former
TODO: Support thread cancellation via signals
If Fixpoint learns the answer to a question that a worker thread is currently working on, it should be able to safely cancel an ongoing evaluation.
TODO: Garbage collection
TODO: Flatware can create new files
More threads causes slower tests
For some reason, switching to 32 or 256 threads (from the current 16) seems to make the tests run measurably slower on stagecast.org. Even though the tests probably won't benefit from the added parallelism, they shouldn't become significantly slower as a result.
This was first noticed in #98, but it wasn't really tested before then so it's not clear if it's a result of that change or an existing issue.
TODO: Submit PR to binaryen asyncify to work with reference types
Move to a non-allocating, intrinsic-aided version of SHA-256
We just switched to PicoSHA2 in #93, but it would be great to move to a non-allocating, inline assembly version that uses the Intel SHA extensions. I'm hopeful we can just copy the implementation out of OpenSSL's libcrypto (suitably tested).
TODO: Design and implement "in-place mutation"
The idea would be that if a procedure wants to "steal" the storage backing a Blob and use it to create a very similar Blob, Fixpoint could allow this as long as:
- the Blob is not currently attached
- Fixpoint knows how to recompute the Blob if it had to (maybe the procedure itself has to show up with an evaluation Tag demonstrating it knows how to recompute the Blob? or maybe Fixpoint looks for this itself? either way, obviously the Blob can't be part of the footprint of the Tag...)
The Blob's storage gets attached to a rw table, and meanwhile the Blob's name gets repointed at a tombstone that includes the Thunk needed to recreate the Blob.
If another procedure later tries to attach to the Blob that's been "stolen," Fixpoint will make it immediately return a Thunk that recomputes the missing Blob and then returns the original Thunk.
TODO: Enforce resource limits
TODO: Design and implement "unsafe thunks" (and "unsafe trees" to hold them?)
One idea is that "unsafe thunks":
- are evaluated on a new thread (but there's some cap on the total number of these at a given time)
- can link with `syscall` and make an arbitrary Linux syscall
- can:
  - block
  - attach to a thunk to print it out
  - retrieve non-normative tags from the trace cache?
  - directly lift lazy handles?
  - read the raw contents of Names/Handles?
Maybe on starting Fixpoint, the user gives it one unsafe (or safe) Thunk to evaluate, and everything goes from there?
TODO: Get rid of fetchcontent and just use submodules
TODO: Implement "why" (e.g. explain how to compute this value)
Add multiple `Channel` implementations for different purposes.
We currently (since #97) have a generic multiple-producer multiple-consumer lockfree `Channel`, using cameron314/concurrentqueue (moodycamel). However, for certain tasks, we'd like to have slightly different queue implementations backing a `Channel`; e.g., a work-stealing-based `Channel` for the workers' work queues. Instead of patching in the necessary synchronization with atomics and locks on top of the existing `Channel`, we should have an interface `IChannel` with multiple implementations: `LockFreeChannel`, `WorkStealingChannel`, etc. The implementations should internally do whatever bookkeeping/synchronization they need to maintain thread-safety.
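A rough sketch of what the interface split could look like. The method names (`push`, `try_pop`, `try_steal`) are assumptions, and both implementations below use a plain mutex as a stand-in: `LockedFifoChannel` is a hypothetical placeholder for where a `LockFreeChannel` wrapping the existing moodycamel queue would go, and this `WorkStealingChannel` only shows the owner-pops-newest, thief-steals-oldest discipline, not a lock-free deque.

```cpp
#include <deque>
#include <mutex>
#include <optional>
#include <utility>

// Common interface; each implementation owns its own synchronization.
template<typename T>
class IChannel {
public:
  virtual ~IChannel() = default;
  virtual void push(T item) = 0;
  virtual std::optional<T> try_pop() = 0;
};

// Stand-in FIFO channel guarded by a mutex.
template<typename T>
class LockedFifoChannel final : public IChannel<T> {
public:
  void push(T item) override {
    std::lock_guard<std::mutex> lock(mutex_);
    items_.push_back(std::move(item));
  }
  std::optional<T> try_pop() override {
    std::lock_guard<std::mutex> lock(mutex_);
    if (items_.empty()) return std::nullopt;
    T item = std::move(items_.front());
    items_.pop_front();
    return item;
  }
private:
  std::mutex mutex_;
  std::deque<T> items_;
};

// Work-stealing discipline: the owner pops its newest item, thieves take the oldest.
template<typename T>
class WorkStealingChannel final : public IChannel<T> {
public:
  void push(T item) override {
    std::lock_guard<std::mutex> lock(mutex_);
    items_.push_back(std::move(item));
  }
  // Owner side: take the most recently pushed item.
  std::optional<T> try_pop() override {
    std::lock_guard<std::mutex> lock(mutex_);
    if (items_.empty()) return std::nullopt;
    T item = std::move(items_.back());
    items_.pop_back();
    return item;
  }
  // Thief side: take the oldest item.
  std::optional<T> try_steal() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (items_.empty()) return std::nullopt;
    T item = std::move(items_.front());
    items_.pop_front();
    return item;
  }
private:
  std::mutex mutex_;
  std::deque<T> items_;
};
```

Callers that only need push/pop can hold an `IChannel<T>&`, while the scheduler can keep the concrete `WorkStealingChannel<T>` type where it needs `try_steal`.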
Move linking into the offline trusted toolchain
If we had a Fix jump table mapped to a known set of virtual addresses, and Runnables are otherwise position-independent, we could pre-link all the Runnables and then directly mmap them read-only (treating them the same as any other Blob). Which would let us get rid of the in-memory cache of "linked" programs.