sunjay / brainfuck Goto Github PK

View Code? Open in Web Editor NEW

41.0 2.0 7.0 229 KB

Brainfuck interpreter companion to the brain programming language

License: MIT License

Rust 95.54% Shell 4.46%

brainfuck-interpreter rust brain brainfuck

brainfuck's Introduction

brainfuck

brainfuck is an esoteric programming language with 8 very simple instructions.

The brain compiler only officially targets this brainfuck interpreter. You may experience varying results with other brainfuck interpreters/compilers. There really isn't a definitive spec on how brainfuck should behave so it is just easier to have a static compilation target that won't vary in how it behaves.

A brainfuck specification for this brainfuck interpreter is available in the brainfuck.md file.

Installation

For people just looking to use brainfuck, the easiest way to get it right now is to first install the Cargo package manager for the Rust programming language.

Then in your terminal run:

cargo install brain-brainfuck

If you are upgrading from a previous version, run:

cargo install brain-brainfuck --force

Usage

For anyone just looking to run brainfuck code with the interpreter:

Follow the installation instructions above
Run brainfuck yourfile.bf to run a brainfuck interpreter which will run your generated brainfuck code

For anyone looking to work with the interpreter source code:

To run brainfuck programs:

cargo run --bin brainfuck -- filename

where filename is the brainfuck program you want to run

Make sure you have rust and cargo (comes with rust) installed.

Examples

There are various brainfuck examples in the examples/ directory which you can run with the brainfuck interpreter using the instructions above.

brainfuck's People

Contributors

Stargazers

Watchers

Forkers

tronsie helvm hugolundin chrismit3s rektinator edlyr soulaimaneyahya

brainfuck's Issues

Extension: Arbitrary Runtime Move

One limitation of brainfuck is that you can't move left or right in the tape and arbitrary amount during runtime.

For example, you can't start from a cell, and move left or right by the amount given in that cell. This extension proposes two new instructions:

{ - move left by the unsigned 32-bit or 64-bit integer in the next 4 or 8 cells (the offset) starting at the current position of the pointer
} - move right by the unsigned 32-bit or 64-bit integer in the next 4 or 8 cells (the offset) starting at the current position of the pointer
The cells containing the offset are zeroed.

This is not a total solution at the moment. There are issues that need to be worked out. For example, if I move to a given position via { or }, how do I get back? How can these instructions be used for things like copying if they always use up the current cell? It may be that copies will need to allocate some extra room for a pointer so that the offset can be copied. But then how do you copy the offset in the first place if it's already gone when you use it?

Optimize by default

This Brainfuck interpreter is used for running code, so it isn't necessarily useful to simulate a Turing machine exactly. Some optimizations should be enabled by default.

In particular, grouping instructions before execution should be on by default. Batching instructions like this greatly speeds up interpretation. Grouping should be done once, lazily, as required for execution.

In debug mode, default to having optimizations off (unless an optimization flag is explicitly passed in). But make sure optimizations can be enabled afterwards if requested by the debugger.

Make sure optimizations can be turned off too if someone doesn't want them.

Log Debug Information Using A Better Format

Instead of the string formatting we use for JSON now, use a good logging crate like slog. Logging each message as JSON and supporting different output formats like plaintext will make this tool easier to use with or without brain-debug.

Note that these changes only effect stderr. The output of the running program is unchanged. Part of this should also be replacing panics with well formatted error log messages.

Remove debug logging code from Interpreter struct and move it into the executable
- Evaluate whether to use #15, #16, or something else
Support two debug formats: human readable (text) and machine readable (JSON)
Update brain-debug to support JSON instead of CSV
Update the --debug option to set the logging level appropriately
Add a logging level option to the CLI
Measure the runtime cost of this change (make the benchmark in #3 first)

Clarify Jump Descriptions in Spec

From the spec:

Jump If Zero ([)

Jumps to the matching ] instruction if the value of the current cell is zero.
...

Jump Unless Zero (])

Jumps to the matching [ instruction if the value of the current cell is not zero.
...

This is misleading because it implies that you should jump directly to the other jump instruction whenever you need to jump. This is incorrect. You should not jump to another jump instruction as that would result in the evaluation of the condition for a second time. This is wasteful, unnecessary and not what this reference implementation actually does. We clarified that the jump should be matching, but neglected to mention how to avoid that extra, unnecessary evaluation.

The examples in those sections are misleading as well.

Clarify that a [ instruction should jump to the instruction just after the matching ]
- Fix example for [
Clarify that a ] instruction should jump to the instruction just after the matching [
- Fix example for [
Clarify behaviour (or undefined behaviour) in case where no matching jump is found
Ensure implementation is compliant
Bump patch version of spec and release new version

Setup benchmark

Setup a benchmark to run in the Travis build (or elsewhere). Run this for every PR and successful commit. For PRs, report the result as a comment.

The benchmark should use two Brainfuck scripts at least:

One that is mostly simple - uses all the commands but is not trivial
Another that is very complex and time consuming like Mandelbrot

Starting resource: https://beachape.com/blog/2016/11/02/rust-performance-testing-on-travis-ci/

Can do this in an after_success hook if bench results are written to a file. Then just use the Github API to post a comment.

Test the PR and the branch it is based on

Create benchmark using cargo bench
Create bash script for PRs

Cargo benchcmp may solve many problems: https://github.com/BurntSushi/cargo-benchcmp/

Travis script will be a bash script which will do the following:

If this is not a pull request, quit
Run cargo bench on the PR branch - $TRAVIS_BRANCH (save results to a file)
Checkout the base branch for the PR - $TRAVIS_PULL_REQUEST_BRANCH
Run cargo bench on the base branch (with silent option so not too much build output is produced) (save results to a different file)
Create a comment on the pull request using $TRAVIS_PULL_REQUEST with the output from each branch clearly labeling what the old results were and what the new results are
- GitHub key is stored in env

This script will only run if the build and tests succeed. If this script fails, nothing happens and the error is printed in the build log.

Maybe do this as a GitHub integration?

Abort on Panic

There are very few cases where unwinding is really useful here.

Try adding the following to Cargo.toml and measure the difference it makes in the release binary size.

[profile.release]
panic = "abort"

From: http://doc.crates.io/manifest.html#the-profile-sections

More info: http://lucumr.pocoo.org/2014/10/30/dont-panic/

Debugger Support

Related to sunjay/brain-debug#1

msgpack-rpc
CLI arguments for setup (only valid when --debug is specified)
Supports reloading and restarting the code
- Hot reloading?
More from related issue

Idea for IPC

Since the interpreter is launched by the debugger, maybe the IPC scheme could be as simple as sending/receiving data on the stdin/stderr channels of the interpreter process. Normally, stdin is reserved for the interpreter to use as input for the interpreted program. stderr is already used for debug messages.

Since we already have a --debug-format option, we can simply use that to communicate when to activate command support:

--debug-format text - same behaviour as before, no debugger commands, input is directly used for the interpreted program
--debug-format json - when the debug format is json, all input is interpreted as a json command
- json messages sent from the debugger to the interpreter encode different debug commands like start, pause, reset, etc.
- There is a json command that can be sent from the debugger to the interpreter to provide input for the interpreted program
  - This is necessary since all input is now expected to be json
- The interpreter buffers the received input and sends a message to the debugger whenever it requires more input and has run out
- The debugger can send its input at any time using the input command, but must respond to requests for input in order for interpreter execution to continue
--debug-format msgpack - same as json but using the more compact msgpack format

Better Error Messages

The current precompilation step destroys most of the important information necessary to accurately report the position of instructions when we need to report an error. That's why this isn't already implemented properly.

We should store the instruction's position when we create an Instruction instance, before the other precompilation steps destroy the necessary information. Then we could report the position properly whenever an error occurs.

As far as error messages go, they are pretty rare. This interpreter is targeted towards the brain compiler which is designed to produce valid code.

Change Instruction tuple structs to have named fields
Add a pos field to each Instruction variant that contains a struct with a line and col field
- impl Display for the struct so printing is easy
Log this position when --debug is enabled instead of the index of the instruction
Update error messages

Precompile Cache

Much like the Python interpreter, we would benefit from caching the parsed, precompiled version of the program. This way, on multiple runs, loading the precompiled version would be much faster than reparsing and recompiling the same source over and over again.

We would have to check the file to see if it is older that the precompiled cache. If the precompiled cache isn't openable or parsable, it should be ignored and overwritten with a new cache. Failing to write the cache file is not an error and should not stop the program.

We should try to take what works and fix what doesn't work from our predecessors:

Disabled in debug mode
CLI flag to disable

Panics, Aborts and Custom Exit Codes

Consider whether this should be changed to [exit 0] instead of just [0] to be more forwards compatible if we decide to use infinite loops for other FFI things.
- May not be necessary and would impose some extra characters if used

In the Rust programming language, when a panic occurs, the program cleans up, prints an error message about the panic and then aborts (ends) the program.

Brainfuck has no sophisticated jump instruction, but these kinds of aborts are still necessary for writing useful programs.

To resolve this, I am proposing a new way to communicate panics using brainfuck instructions that is completely backwards compatible with current brainfuck. This is not a new instruction or anything that would require updating other brainfuck interpreters. If another interpreter does not implement this functionality, it can gracefully fallback to its default behaviour and everything will still function mostly as expected.

Aside: I should note that since brainfuck is turing complete, it is possible to implement this in brainfuck already. One naive implementation could be to simply wrap the remaining program in an if else statement (implemented in brainfuck of course). That way, if whatever the condition is fails, the rest of the program would not run at all. The downside to this is that it severely complicates the generated brainfuck. Imagine being inside several nested brainfuck loops and then having to somehow navigate your way out to the end of the program. It would be extremely messy and nearly impossible to do so. Every single thing from that point in the program onward would need to be wrapped in conditionals.

Proposal

If a brainfuck program needs to abort, it should enter an infinite loop by using the instructions:

[]

This is an infinite loop. From a syntax point of view, an infinite loop is defined as a brainfuck [ instruction, followed by any number of non-instruction characters, concluded by a ] instruction. Without any further instructions inside a loop, if the loop is entered, it is guaranteed to run forever. The runtime semantics of this are a bit more complicated and are described below.

Several important points about this:

The word abort was used instead of panic. Panicking is actually only analogous to aborting. When panicking you need to clean up and output a message. Those things can already be done in brainfuck as it is. The only necessary addition is the last part of panicking, quitting the program and thus ending its execution.
Normal loop semantics apply. If the cell prior to the first jump instruction is zero, the program will continue onward without aborting. When integrating this into your own brainfuck code, if you have a place where you always want to abort, you can sometimes get away with using +[] as a replacement for the code above. The reason this is not specified that way above is because there is no guarantee that + will not cause a overflow and reset the current cell back to zero. It is up to the programmer to ensure that the current cell is non-zero before the [] instructions if they wish for the abort to occur.
- This is important because it makes it possible to conditionally abort when necessary but otherwise move on in the program.

Error messages and clean up

Printing out an error message and doing whatever clean up is necessary is already possible in brainfuck. Those things are left up to the programmer to do on their own.

Recommendation: once it is decided that the panic will occur, use some other cells to print the error message, then return back to the cell that was used to decide to panic and place the [] instructions.

Exit Status Codes

On Unix-based operating systems and not-so-much Windows, it is possible to return an "exit code" from a program to indicate whether the program succeeded or failed. This is represented as an integer. The convention is to return 0 when the program succeeds, and a non-zero value when the program fails. The non-zero value is not specific and can be anything the program wants.

To facilitate this, an additional extension to this extension is proposed.

In addition to being able to abort with this syntax:

[]

It will also be possible to abort with the following syntax:

[0]

or:

[-1]

In this case, and only in this case, the inner number will be interpreted as the exit code.

The following cases and their variations will not be accepted for this.

Not a number:

[abc]

Not just a number:

[-1 abc]

Number in the wrong position:

[abc 1]

Other valid brainfuck instructions:

[-1>>+]

These will be interpreted using the standard brainfuck interpretation and not used for the exit code.

If no exit code is specified, the default exit code will be 0. So just using [] results in the program exiting with a code of 0.

Backwards Compatible Behaviour

As stated above, this code will work in a reasonable way on brainfuck interpreters that do not support abort semantics. If a user prints a message before the abort, the message will be displayed and then the program will loop infinitely until the user decides to abort it manually.

For exit codes, those characters are ignored by default anyway, so adding them causes no issues with existing implementations.

Brainfuck Specification Updates

The actual content of the updates to the specification can largely be taken from the text above.

Bump the minor version number since this is an additional feature that does not break backwards compatibility
Add a section in the introduction about brainfuck extensions like the section in the GFM spec
- Make sure to include that while extensions are mandatory to implement, they are mandatory to support all code generated by the brain compiler
- Extensions can be put behind an interpreter flag as part of the CLI of the interpreter if desired
Add the abort extension clearly marking it an extension as done in the GFM spec
Add the exit code extension and specify that it extends the abort extension
- The reason this is an extension on the abort extension and not part of that extension is because exit codes are really only

Implementation

Implement this in the interpreter

Smarter Jumping

As it is implemented now, we perform a naive search every time we see a jump instruction. It would be better if as we searched for a matching jump, we cached the positions of other matching jump instructions we find. This would save us a lot of searching the same thing over and over again.

Search should be lazy and should only happen when a jump instruction is encountered (and when that jump instruction actually makes us jump)
Search should only search to the extent that it finds what it is looking for (should only search until the matching jump to what it is looking for is found)
Should work only with a statically allocated hashmap (analyse maximum jump depth in brainfuck examples)
Should drop jumps once we reach past their ending ] instruction (since there is no way to reach the beginning [ anymore)
Should prioritize caching longer jumps since a short jump is not that costly even if you have to search a bit

Replace NullWrite with Sink in benchmarks

Turns out that Rust provides an equivalent to the NullWrite struct defined in the benchmark. That means that we can get rid of that code and use the standard library equivalent.

This is an easy, drop in fix that should not take very much time at all.

https://doc.rust-lang.org/std/io/fn.sink.html

Split Up Code and Test Thoroughly

The code is currently all in one file. Some attempts have been made to better organize this, but have been reverted due to performance concerns.

We want to thoroughly test that this implementation of brainfuck meets the specification. That way we can have some guarantees about its functionality. This will also make it easier to extend the interpreter as time goes on.

When making changes to this codebase, it is critical that performance be maintained. The single file version isn't pretty, but it does perform quite well. We should build a benchmark to ensure that we aren't slowing down the interpreter and adding extra overhead as we refactor it.

Create benchmark for brainfuck interpreter performance (#3)
Split brainfuck code into multiple files
Replace print! in output by adding a generic parameter that accepts any Write instance and using write directly since it supports bytes - this change will help with unit testing - check out: https://doc.rust-lang.org/std/io/struct.LineWriter.html
Add unit tests
Replace stdin with a generic Read parameter in the function
Make sure all code still builds and all examples run

Linear Optimizations

The fastest brainfuck interpreter bff4 uses some clever linear optimization techniques to execute certain brainfuck loops in one step. We could also take advantage of these and speed things up quite a bit.

Here's my proposal: after the precompile step, we should go through the precompiled instructions as lazily as possible and determine which loops match the linearizable form that we are looking for. If you look at mandel.bf, there are many examples of such loops. By the end of this, we should be able to beat bff4.

Make sure to test on simple programs to see how this impacts both simple and loop heavy programs.

This should only run when optimization is enabled.