singhcoder / compiler-construction Goto Github PK
View Code? Open in Web Editor NEWCompiler in C for a custom language
Compiler in C for a custom language
We are storing lexeme in heap here, which could create a problem when function and other invocation are done. String local variable has value in the heap which is not "reset" until the program terminates. Also when we change it to storing lexeme within token then we have to pass the information of lexeme appropriately.
Grammar is changed at rule numbers 61-63, 75-77, 79 , 90 to do left factoring properly
Implement these semantic rules for ERPLAG
An identifier cannot be declared multiple times in the same scope.
While installing in symbol table, make a check if already present, flag an error (make sure that while checking you only search current symbol table and not parent too)
An identifier must be declared before its use.
Whenever there is a usage and symbol table says no entry, flag an error
The types and the number of parameters returned by a function must be the same as that of the parameters used in invoking the function.
Check semantics of FUNCTIONCALLSTMT
The parameters being returned by a function must be assigned a value. If a parameter does not get a value assigned within the function definition, it should be reported as an error.
There must be a assignmentstmt for each output parameter type within function
The function that does not return any value, must be invoked appropriately.
Same as 3
Function input parameters passed while invoking it should be of the same type as those used in the function definition.
Semantics of FUNCTIONCALLSTMT
A switch statement with an integer typed identifier associated with it, can have case statement with case keyword followed by an integer only and the case statements must be followed by a default statement.
Semantics of SWITCH stmt
A switch statement with an identifier of type real is not valid and an error should be reported.
same as 7
A switch statement with a boolean type identifier can have the case statements with labels true and false only. The switch statement then should not have a default statement.
same as 7
Function overloading is not allowed.
Same as identifier multiple installations
A function declaration for a function being used (say F1) by another (say F2) must precede the definition of the function using it(i.e. F2), only if the function definition of F1 does not precede the definition of F2.
might require 2 passes of AST
If the function definition of F1 precedes function definition of F2(the one which uses F1), then the function declaration of F1 is redundant and is not valid.
A for statement must not redefine the variable that participates in the iterating over the range.
The function cannot be invoked recursively.
An identifier used beyond its scope must be viewed as undefined
You can find the changes done to incorporate symbol table in these two commits 9aef702 and 59f8140
Currently, Single AST pass has been implemented for semantic rules checking
Please write test cases and verify correctness for following
Whenever it encounters an assignment statement, it prints :
"id" = .. type checking
LHS_TYPE : REAL | BOOLEAN | ARRAY | INTEGER | TYPE_ERROR
RHS_TYPE : same as above
If types match
Else
If an identifier is not defined,
If array is not dynamic,
Description of type checking rules here
Type Extractor and Checker:
Type of an identifier is extracted from the declaration statement that declares the identifier. The data types supported in the language you are implementing are: integer, real, boolean and array. The type of an element of an array is the type of the data the array refers to. For instance, consider the statement declare a: array[1..15] of integer; the type of an element a[5], say is integer. The type checker verifies the type of an expression appearing at the right hand side of the assignment statement such as value := (a+bโc ) and checks if it matches with that of the identifier on the left hand side. An arithmetic operator can have two operands of the similar type, where types can be integer and real data types. Example : let a,b, c and d be declared as integer then
An expression in a statement a := (b2โ4c)+5*d; its RHS has a type integer and since the identifier a is also of the same type, the type checker approves the expression assignment
An expression in a statement a:= (b2 +c)/d <= c10; has a boolean valued expression which is assigned to the real valued identifier, hence the expression value assignment is wrong.
The Static type checking rules are:
The type of an identifier is the type appearing while declaring the variable.
The type of NUM is integer.
The type of RNUM is real.
The type of true or false value is boolean.
The type of an array variable A of type array [12..20] of real (say) is defined as an expression <real, 12, 20>. The type of an array element A[13] (say) is real if 13 is within the bounds [12, 20]
The type of a simple expression (say E) of the form expression(say E1) Expression(say E2)
is integer, if both expressions are of type integer and the operator is arithmetic operator.
is real, if both the expressions are of type real and the operator is arithmetic operator.
is boolean, if both expressions are of type integer and the operator is a relational operator.
is boolean, if both expressions are of type real and the operator is relational.
is boolean, if both expressions are of type boolean and the operator is logical.
The type of the expression is ERROR, if the above rules do not derive the type of E appropriately.
Type checking rules for array construct are as follows:
The operations +, -, *, / and all relational and logic operators cannot be applied on array variables of type <type, 12, 20> (say)
The assignment operator applied to two array variables of the same type is valid. For example, if A and B are the array variables of type array[12..20] of real, then A:= B; is a valid statement. This applies to dynamic arrays of type array[a..b] of real as well.
Consider array elements with index represented by integer identifier say A[k]. Here type checking of variable k is done at compile time. If the type of k is integer, then it is valid else it is reported as an error. Also, the type checking of A[13], for type of index (NUM), is done at compile time.
The bound checking of A[13] where A is a static array is done at compile time. If A is a dynamic array, then bound checking of A[13] is done at run time [see below]
The type of an identifier or an expression is computed by traversing the abstract syntax tree.
@happy please remove it. It's no longer required.
Also #define MAX_CARSNUM_IN_EXPR
Checkout functionality for searching for a lexeme in hash table(key_present_in_table),
Type checker's description - here
Array bounds, if of type a..b were kept as a constant OBTAIN_DYNAMICALLY while installing in symbol table to tell that it will be obtained dynamically.
But it isn't working as expected, and gives some other integral values check it out
Maybe keep a label is_dynamic to ensure it?
While making hash_table to static instead of dynamic added a #define NUM_TABLE_ENTRIES but it's not needed due to presence of HASH_SIZE already, so remove it
Currently each error printing again and again types code for printing colors, make a separate module print_error
with following declaration
void print_error( char *error_type /*LEXICAL / SYNTAX / SEMANTIC */, char *error_msg);
Right now, to get second/third child, it's hardcoded to as leftmost_child->sibling->sibling->...
Define a function giving nth child
Store number of children of a tree_node also in a node. Will help in generating codes later
There is an unnecessary error_present variable present in parser.h/c which is not used anywhere later. Remove it
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.