Comments (3)
I see. From what I can tell, there's a lot of performance to be gained from improving the existing backend/runtime. BOLT can still help if the resulting code is large and causes a significant number of instruction cache and TLB misses. On Linux you can measure those by running the application under `perf stat -e instructions,L1-icache-misses -- <your app with options>`. If you end up with under 5 misses per thousand instructions, then it's better to look at other options to improve the code. Compute-heavy applications, for example, spend most of their time in loops and are not great candidates for the code layout optimizations provided by BOLT.
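To make the rule of thumb concrete, here is a minimal sketch of the arithmetic, assuming you have already read the two counter totals off the perf output. The function names and the sample numbers are hypothetical and only illustrate the calculation; they are not measurements.

```python
# Rule of thumb from the comment above: under 5 L1 instruction-cache misses
# per thousand instructions, code layout is unlikely to be the bottleneck.
LAYOUT_THRESHOLD = 5.0


def misses_per_kilo_instructions(icache_misses: int, instructions: int) -> float:
    """Return L1 instruction-cache misses per 1,000 executed instructions."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return icache_misses * 1000 / instructions


def layout_worth_trying(icache_misses: int, instructions: int) -> bool:
    """True when the miss rate reaches the threshold, so layout work may pay off."""
    return misses_per_kilo_instructions(icache_misses, instructions) >= LAYOUT_THRESHOLD


if __name__ == "__main__":
    # Made-up counter totals for illustration only.
    misses, instructions = 12_000_000, 1_000_000_000
    print(misses_per_kilo_instructions(misses, instructions))  # 12.0
    print(layout_worth_trying(misses, instructions))           # True
```

With these sample totals the rate is 12 misses per thousand instructions, which is above the threshold, so a layout optimizer would be worth a try.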
I don't have experience with GHC, and I don't have access to version 8.4.3. However, I tried 7.6.3 with the "Hello World" program, and although BOLT processes the binary, the resulting executable segfaults. It appears there are multiple custom control transfer tables embedded in the code, and they are unmarked in the symbol table. `stg_ap_p_fast` is one instance of a function that uses such tables. Again, I'm unfamiliar with GHC and don't know the purpose of this function.
In general, since GHC uses non-standard code sequences involving indirect jumps, it will require special support in BOLT. How important is this for you? Do you have a large performance-critical application written in Haskell?
Cool, I appreciate you trying it out!
I'm out of my depth here but certainly not surprised that BOLT doesn't work on GHC code as I know it's not a platform you test against.
> How important is this for you? Do you have a large performance-critical application written in Haskell?
Difficult question! I'm trying out BOLT because I'm curious and wanted to see what would happen and whether I could write a blog post about it (though I do write a lot of Haskell for a living, some of which needs to be tuned or carefully tested for performance).
But I have real reasons to be interested in BOLT (or similar tools... I honestly don't have a great sense of what BOLT does besides reorder blocks for better locality) vis-à-vis GHC/Haskell:
- much performance depends on optimizations that happen early in compilation (rewrite rules), which are dependent on inlining; in general marking functions INLINE is good for performance, but this does lead to bloat (whether or how much code bloat leads to performance penalties I think is mostly rumor; BOLT seems like a great tool to explore this)
- there are no LTO or similar whole-program optimization passes available
- the native code generator gets very little active development AFAICT; almost no strength-reduction-type optimizations are done, the register allocator is not so great (at least I've observed it do a worse job than the LLVM backend), etc. The runtime, which is written in C, doesn't get much active development either
- the LLVM backend is also, apparently, sort of crufty or in need of tuning or in some way is not well-suited to what GHC needs; maybe some kind of simple peephole pass would make a huge performance improvement
If you are asking whether I'd be interested in contributing time or money, the answer is that I probably don't have the knowledge to do that at this point. I might be able to help coordinate or shepherd a GHC ticket, if something could be done on that end.