
accelerated-zig-parser's Introduction

A little about me

  • Accelerated the Zig parser by up to 3x 🚀🚀🚀
    • ⚡ Zig is my personal favorite programming language
  • Assembly auditor
    • Reduced the instruction count of must_be_2_3_continuation in simdutf/simdjson from 6 to 4 on x86-64 (with similarly small improvements on other architectures). Although this sounds trivial, it yielded a 4% performance uplift in UTF-8 validation!
  • 📃 Invented a data structure [demo, paper] that improves upon prefix trees (i.e. tries) to solve the scored autocomplete problem orders of magnitude faster 🚀🚀🚀🚀🚀
  • Co-Developed the initial version of roblox-ts with @Osyrisrblx
  • Created the RoStrap project
  • My first programming language was Lua
  • Big-Theta connoisseur
    • Did you know that a priority queue implemented as a 1-2-3 Skip list can perform the extract-min operation in amortized constant time?
      • And yes, insertions are still logarithmic! And any other value can be extracted in logarithmic time!

Click the following image for a demo of my data structure:

[Image: my prefix trie data structure]


Favorite talks

Performance Matters (Strange Loop 2019)
Data-Oriented Design and C++
Practical Data-oriented Design

Exciting new tech

  • Mill instruction set architecture
    • An in-order statically scheduled architecture that achieves the performance of an out-of-order superscalar with extremely innovative tricks
    • Eats loops like goats eat underwear
  • LuaJIT Remake
    • Automatically generates a blazingly fast interpreter and multi-level JIT compiler given only a semantic description of a language's bytecodes
  • Pijul version control system
    • Based on the theory of patches and not slow like DARCS
  • Bun JavaScript runtime
    • Blazingly fast runtime and toolkit for JavaScript

accelerated-zig-parser's People

Contributors

validark

accelerated-zig-parser's Issues

embedded as fallback without simd (1) and parser design (2-4)

  1. As far as I understand, using the Zig compiler on embedded systems (100-300 MB RAM) is a potential use case. Is compatibility without SIMD possible or planned, and what would the slowdown and/or drawbacks be?
  2. I guess you plan to make the parser iterative so that it can backtrack. Can the same be applied to zig fmt, which currently traverses things recursively?
  3. If yes: is incremental parsing planned or possible?
  4. If yes: will there be an API for incremental parsing, for language-server error recovery and robust parsing?

I guess you have probably made up your mind about 1-2 with a draft implementation, so I wanted to ask about it.
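
On question 1, one common way to do SIMD-style byte matching on targets without vector instructions is SWAR (SIMD within a register), which handles 8 bytes at a time in an ordinary 64-bit integer. The sketch below only illustrates that general technique and is not the parser's actual fallback; the function name `swar_eq_mask` and the fixed 8-byte chunk size are assumptions made for the example.

```python
LOW7 = 0x7F7F7F7F7F7F7F7F    # low 7 bits of every byte lane
ONES = 0x0101010101010101    # the value 1 in every byte lane
MASK64 = 0xFFFFFFFFFFFFFFFF  # emulate a 64-bit register in Python


def swar_eq_mask(chunk: bytes, needle: int) -> int:
    """Return a bitmask where bit i is set if chunk[i] == needle.

    Hypothetical helper: works on exactly 8 bytes with plain integer ops.
    """
    if len(chunk) != 8:
        raise ValueError("chunk must be exactly 8 bytes")
    word = int.from_bytes(chunk, "little")
    # Broadcast the needle to all lanes; matching bytes become 0x00.
    x = word ^ (ONES * needle)
    # Per lane: the high bit of t is set if any of the low 7 bits of x is set.
    # The sum is at most 0xFE per lane, so no carry crosses a lane boundary.
    t = (x & LOW7) + LOW7
    # A lane is zero only if neither t nor x has its high bit set.
    hits = ~(t | x | LOW7) & MASK64
    # Compress the 0x80 markers into one bit per byte.
    return sum(1 << i for i in range(8) if (hits >> (8 * i + 7)) & 1)


# Newlines sit at byte positions 1, 3 and 4.
print(bin(swar_eq_mask(b"a\nb\n\n   ", ord("\n"))))  # → 0b11010
```

The same mask could come from a vector compare plus a movemask-style instruction where SIMD is available, so code consuming the mask would not need to change. The cost is that each step covers 8 bytes instead of a full vector.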

Investigate slowdown of reading files

I happened to notice that reading files now takes ~70ms, whereas it used to take ~30ms. I'm not sure what's going on there, but something is not right.

more crazy idea for benchmarking: compare against optimal AST structure (found by applying simplex)

A relatively optimal AST structure can be found by running the simplex algorithm with constraints derived from the first AST parse, then copying things over.

See also https://en.wikipedia.org/wiki/Simplex_algorithm. The general idea of non-recursive tree traversal is given here: https://stackoverflow.com/questions/28544980/data-oriented-tree-traversal-without-recursion/28616278#28616278.

I got the idea from user Prokop on Discord, who asked: "is the adjacency of ast nodes in the node list used somehow to store information?".
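
The question about node adjacency can be made concrete with a small sketch. Assuming nodes are stored in pre-order in a flat list, a node's first child is simply the next entry, and one "end of subtree" index per node is enough to walk children or skip whole subtrees without recursion. The layout and the names `tags`, `subtree_end`, `children`, and `depths` are hypothetical and are not taken from the Zig AST or from this project.

```python
# Hypothetical flat AST for `a + (b * c)`, nodes listed in pre-order.
tags = ["add", "a", "mul", "b", "c"]
# subtree_end[i] is the index one past the last descendant of node i.
subtree_end = [5, 2, 5, 4, 5]


def children(node: int):
    """Yield the direct children of `node` without recursion."""
    child = node + 1                  # first child is adjacent in pre-order
    while child < subtree_end[node]:
        yield child
        child = subtree_end[child]    # jump over the child's whole subtree


def depths():
    """Compute every node's depth in one linear pass with an explicit stack."""
    result = []
    open_ends = []                    # subtree ends of the current ancestors
    for i in range(len(tags)):
        while open_ends and open_ends[-1] <= i:
            open_ends.pop()           # these ancestors are already finished
        result.append(len(open_ends))
        open_ends.append(subtree_end[i])
    return result


print([tags[c] for c in children(0)])  # → ['a', 'mul']
print(depths())                        # → [0, 1, 1, 2, 2]
```

In this layout adjacency carries the "first child" relation for free, and only the sibling link needs to be stored.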

split files by license

It sounds nicer to me to split out the Apache-derived files and ideally put them into a package.
This would also allow eventual use of SPDX IDs via the package manager. Opinions?

Afaik, SPDX does not allow explicit tagging of code ranges, so doing things differently would not strictly conform to the standard. However, feel free to disagree about what the package manager should use.
