llvmbindings's Introduction

LLVMBindings

LLVM C bindings from Pharo (https://llvm.org/doxygen/group__LLVMC.html).

Repository for the C source code: https://github.com/QDucasse/LLVM-C

Repository for the booklet and tutorial: https://github.com/SquareBracketAssociates/Booklet-LLVMCompilationWithPharo

Example of usage


The following code corresponds to the tutorial described in the booklet. It builds a sum function in LLVM intermediate representation (IR).

Module definition


The base "object" of the LLVM IR is a module. It holds all the objects created afterwards and acts as their container.

  mod := LLVMModule withName: 'mod'.

Parameters Array Definition


The function we aim to create has a signature that looks like sum(number1, number2). To create a proper LLVM function representation, we first need to define its parameter array. We create it by specifying the type of each argument and then putting each type at the correct position.

  paramArray := LLVMParameterArray withSize: 2.
  t1 := LLVMInt32 create handle getHandle.
  t2 := LLVMInt32 create handle getHandle.
  paramArray at: 1 put: t1.
  paramArray at: 2 put: t2.

Function Signature Definition


The function signature is created from the previously defined parameter array, the function's return type, the function's arity and a flag indicating whether it is variadic (accepts a variable number of arguments).

  retType := LLVMInt32 create.
  sumSig := LLVMFunctionSignature withReturnType: retType parametersVector: paramArray arity: 2 andIsVariadic: false.

Function Definition


From the function signature, we can create a memory representation of the function. The LLVMFunction object corresponds to this representation and can be created by adding the previously defined function signature to the top-level module.

  sum := sumSig addToModule: mod withName: 'sum'.

Basic Block creation


A basic block is a sequence of instructions executed in order, with no branching in or out except at its start and end (an if-then-else needs several basic blocks to be translated properly). Every function needs an entry basic block. Since our function is simple enough, this entry block holds the whole function body.

  block := LLVMBasicBlock appendToFunction: sum withName: 'entry'.
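As a rough sketch of the multi-block case, an if-then-else would typically get one block per branch plus a block where control flow joins again. The snippet only reuses the appendToFunction:withName: call shown above; maxFn is a hypothetical LLVMFunction assumed to be created the same way as sum, and the block names are illustrative.

```smalltalk
"Assumption: maxFn is a hypothetical LLVMFunction, built like sum above."
entryBlock := LLVMBasicBlock appendToFunction: maxFn withName: 'entry'.
"One block for each branch of the conditional."
thenBlock := LLVMBasicBlock appendToFunction: maxFn withName: 'then'.
elseBlock := LLVMBasicBlock appendToFunction: maxFn withName: 'else'.
"One block where both branches join again."
mergeBlock := LLVMBasicBlock appendToFunction: maxFn withName: 'merge'.
```

The branch instructions that link these blocks together are left out on purpose, since the corresponding builder selectors are not covered in this readme.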

Builder Creation and Usage

The builder (short for Instruction Builder) is the LLVM element that writes instructions in LLVM IR. The builder can be positioned at the end of a basic block or after a given instruction. Here, we place the builder at the end of our entry block in the sum function.

  builder := LLVMBuilder create.
  builder positionBuilderAtEndOfBasicBlock: block.

Next, we need to get the actual values passed to the function and perform the ADD instruction. Parameter indices are 0-based, as in LLVM. Because of LLVM IR restrictions, every temporary variable must have a unique name. Finally, we build the return statement of the sum function from the result of adding the two parameters.

  param1 := sum parameterAtIndex: 0.
  param2 := sum parameterAtIndex: 1.
  tmp := builder buildAdd: param1 getHandle to: param2 getHandle andStoreUnder: 'tmp'.
  builder buildReturnStatementFromValue: tmp.

Enumeration initialisation


External enumerations have to be initialised at the start. They are used when emitting machine code from LLVM IR to specify the output file type, the optimisation level, the code model and the relocation mode. All four enumerations have to be initialised.

  LLVMCodeGenFileType initialize.
  LLVMCodeGenOptLevel initialize.
  LLVMCodeModel initialize.
  LLVMRelocMode initialize.

Target Machine Code Emission


LLVM uses a system of Triple, Target and TargetMachine to specify a given architecture. The Triple is the textual representation and consists of three strings joined by hyphens: <arch>-<vendor>-<sys>. In our case, we will use x86_64 alone, leaving the vendor and system to their defaults. The Target has to be initialised and defined as follows:

  LLVMTarget initializeX86.
  target := LLVMTarget getTargetFromTriple: 'x86_64'.

Next, the TargetMachine can be derived from the Target.

  targetMachine := LLVMTargetMachine fromTarget: target withTriple: 'x86_64'.
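A triple can also spell out the architecture, vendor and system fields explicitly. The sketch below assumes a 64-bit Linux host and that these bindings pass the string to LLVM unchanged; the fullTriple variable and its value are illustrative, not taken from the tutorial.

```smalltalk
"Assumption: 64-bit Linux host; adjust the vendor and system fields otherwise."
fullTriple := 'x86_64-unknown-linux-gnu'.
LLVMTarget initializeX86.
"Use the same triple for both the Target lookup and the TargetMachine."
target := LLVMTarget getTargetFromTriple: fullTriple.
targetMachine := LLVMTargetMachine fromTarget: target withTriple: fullTriple.
```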

Finally, we can emit the object (OBJ) or assembly (ASM) version of our function's LLVM IR by emitting the module to a memory buffer.

  memBuffASM := mod emitASMFromTargetMachine: targetMachine.
  memBuffObj := mod emitObjFromTargetMachine: targetMachine.

llvmbindings's People

Contributors

qducasse, guillep

llvmbindings's Issues

Failing callout with ExternalObject/ExternalData and subclasses

From the example of code you sent me, I had the following.
Where param1 and param2 are of type LLVMFunctionParameter, and LLVMFunctionParameter inherits from LLVMValue.

param1 := sum parameterAtIndex: 0.
param2 := sum parameterAtIndex: 1.
tmp := builder buildAdd: param1 to: param2 andStoreUnder: 'tmp'.

>> buildAdd: aValue to: anotherValue andStoreUnder: aTemporaryValueName [
	^ self ffiCall: #(LLVMValue LLVMBuildAdd(LLVMBuilder self,
                                            LLVMValue 	 aValue,
                                            LLVMValue 	 anotherValue,
                                            const char * aTemporaryValueName ))
]

This callout fails (with a rather obscure error message that is not useful).

Note, though I don't know if it is related, that LLVMFunctionParameter contains an ExternalData and not an ExternalReference.

I've tried the following then:

param1 := sum parameterAtIndex: 0.
param2 := sum parameterAtIndex: 1.
tmp := builder buildAdd: param1 getHandle to: param2 getHandle andStoreUnder: 'tmp'.

>> buildAdd: aValue to: anotherValue andStoreUnder: aTemporaryValueName [
	^ self ffiCall: #(LLVMValue LLVMBuildAdd(LLVMBuilder self,
                                            void* 	 aValue,
                                            void* 	 anotherValue,
                                            const char * aTemporaryValueName ))
]

And it worked.

This means there is something in the coercion of those objects that's not letting the callout pass.

To add more info:

  1. parameters are obtained with a callout defined as:
^ self ffiCall: #(LLVMFunctionParameter LLVMGetParam(LLVMFunction self, uint anInteger))
  2. LLVMValue inherits from FFIExternalObject

Add example and instructions in the readme

I have this from an email with you:

"MODULE DEFINITION"
mod := LLVMModule withName: 'mod'.

"PARAMETERS ARRAY DEFINITION"
paramArray := LLVMParameterArray withSize: 2.
t1 := LLVMInt32 create handle getHandle.
t2 := LLVMInt32 create handle getHandle.
paramArray at: 1 put: t1.
paramArray at: 2 put: t2.

"FUNCTION SIGNATURE DEFINITION"
retType := LLVMInt32 create.
sumSig := LLVMFunctionSignature withReturnType: retType parametersVector: paramArray arity: 2 andIsVaridic: false.

"FUNCTION DEFINITION"
sum := sumSig addToModule: mod withName: 'sum'.
param1 := sum parameterAtIndex: 0.
param1 type. "i32"

param2 := sum parameterAtIndex: 1.
param2 type. "CRASH"

"BASIC BLOCK DEFINITION"
block := LLVMBasicBlock appendToFunction: sum withName: 'entry'.

"BUILDER DEFINITION AND OPERATIONS"
builder := LLVMBuilder create.
builder positionBuilderAtEndOfBasicBlock: block.
param1 := sum parameterAtIndex: 0.
param2 := sum parameterAtIndex: 1.
tmp := builder buildAdd: param1 getHandle to: param2 getHandle andStoreUnder: 'tmp'.
builder buildReturnStatementFromValue: tmp.

"TARGET DEFINITION"
LLVMTarget initializeX86.
target := LLVMTarget getTargetFromTriple: 'x86_64'.

"TARGET MACHINE DEFINITION"
targetMachine := LLVMTargetMachine fromTarget: target withTriple: 'x86_64'.
memBuff := mod emitASMFromTargetMachine: targetMachine.

The readme should also say that all enumerations should be initialized.

LLVMFunction parameterAtIndex: is problematic

In the example code you sent me by email, I got, for a function with two arguments:

param1 := sum parameterAtIndex: 1.
param2 := sum parameterAtIndex: 2.

However, that's not correct and creates a buffer overflow (reading after the end of data) :).
In LLVM, indexes are 0 based, not 1 based like in Pharo ;)

I think we should:

  1. support Pharo-like indexes (1-based), automatically transforming them to 0-based (that is, subtracting 1?)
  2. still have a lower-level function that does the 0-based access for power users
  3. the high-level function should maybe check bounds (you should not access arg N if your function has < N args)

What do you think?
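A minimal sketch of what the three points could look like, assuming the existing parameterAtIndex: stays as the 0-based low-level accessor. The selectors parameterAt: and arity are hypothetical names chosen for illustration; arity is assumed to answer the number of parameters of the function.

```smalltalk
LLVMFunction >> parameterAt: anIndex [
	"Hypothetical high-level accessor: 1-based and bounds-checked.
	Assumption: #arity is a hypothetical selector answering the parameter count."
	(anIndex between: 1 and: self arity)
		ifFalse: [ ^ self errorSubscriptBounds: anIndex ].
	"Delegate to the existing 0-based accessor."
	^ self parameterAtIndex: anIndex - 1
]
```

With such an accessor, the original example would read `sum parameterAt: 1` and `sum parameterAt: 2` without reading past the end of the data, while `parameterAtIndex:` stays available for 0-based access.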
