Compiler structure
- Modern compilers generally consist of a front end and a back end. The boundary between the two phases is the intermediate code (intermediate representation, IR).
- The front end and back end are completely separated; they communicate only through the IR.
- Front end: depends only on the source code and produces the IR. It consists of the scanner (spelling), the parser (grammar), and the semantic analyzer. It may optionally optimize.
- Back end: depends only on the IR and translates it into the target program. It generates object code (assembly) and optimizes.
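The separation through the IR can be sketched as follows. This is only a toy illustration, not a real compiler: the function names `front_end` and `back_end`, the tuple-based IR, and the stack-machine instructions are all assumptions invented for the example.

```python
# Hypothetical sketch: the front end sees only source text, the back end
# sees only IR. The IR format and instruction names are made up.

def front_end(source):
    """Turn source such as '1 + 2 + 3' into a list of IR instructions."""
    parts = source.split()
    ir = [("push", parts[0])]
    # The remaining parts come in (operator, operand) pairs.
    for op, operand in zip(parts[1::2], parts[2::2]):
        if op != "+":
            raise SyntaxError(f"unsupported operator: {op}")
        ir.append(("push", operand))
        ir.append(("add", ""))
    return ir

def back_end(ir):
    """Translate IR into assembly-like text for an imaginary stack machine."""
    lines = []
    for opcode, arg in ir:
        lines.append(f"PUSH {arg}" if opcode == "push" else "ADD")
    return lines

ir = front_end("1 + 2 + 3")
target = back_end(ir)
```

Because `back_end` never looks at the source string, a different front end that emits the same IR could reuse it unchanged.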
Scanner
- Reads the source program as a stream of characters
- Performs lexical analysis: collects sequences of characters into meaningful units called lexemes/tokens
- Token form: <token name, attribute value> ex) <main-function, main()>
- Performs spelling check
- White space is eliminated
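The scanner steps above can be sketched with Python's standard `re` module. The token names (`NUMBER`, `IDENT`, `OP`) and the tiny language they describe are assumptions chosen for illustration, not part of any particular compiler.

```python
import re

# Hypothetical token classes for a tiny language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),   # white space: matched, then discarded
    ("ERROR", r"."),    # any other character is a lexical ("spelling") error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def scan(source):
    """Return a list of (token name, attribute value) pairs."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {lexeme!r}")
        tokens.append((kind, lexeme))
    return tokens

tokens = scan("x = 42 + y")
```

Here each pair plays the role of <token name, attribute value>: the name is the token class and the value is the lexeme that was read.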
Parser
- Uses tokens to build a syntax tree or parse tree.
- Performs syntax analysis using the tokens obtained from the scanner
- Grammatical analysis is based on a CFG (context-free grammar)
- Grammar classes include CFG and CSG (context-sensitive grammar).
- CSG is for analyzing natural language.
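As a rough sketch of CFG-based syntax analysis, the recursive-descent parser below consumes (token name, attribute value) pairs and builds a tree of nested tuples. The grammar, the token names, and the tuple layout are assumptions made up for this example.

```python
# Hypothetical grammar (a CFG) for the sketch:
#   expr -> term { "+" term }
#   term -> NUMBER | "(" expr ")"

def parse(tokens):
    """Build a nested-tuple tree from (token name, attribute value) pairs."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():
        node = term()
        while peek() == ("OP", "+"):
            advance()
            node = ("+", node, term())
        return node

    def term():
        kind, lexeme = advance()
        if kind == "NUMBER":
            return ("num", lexeme)
        if (kind, lexeme) == ("OP", "("):
            node = expr()
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {lexeme!r}")

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree

# Tokens for "1 + (2 + 3)"
tree = parse([("NUMBER", "1"), ("OP", "+"), ("OP", "("),
              ("NUMBER", "2"), ("OP", "+"), ("NUMBER", "3"), ("OP", ")")])
```

Each grammar rule maps to one function, so a token sequence that does not fit the grammar ends in a `SyntaxError`.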
Semantic analyzer
- Static semantics: language features that can be determined prior to execution but cannot be expressed in syntax. ex) declaration, type check
- Dynamic semantics: determined during execution. ex) array bound checking
The semantic analyzer makes full use of the symbol table and produces an annotated parse tree.
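A minimal sketch of static semantic checking with a symbol table might look like this. The statement tuples, the type names, and the `check` function are hypothetical and exist only for the example.

```python
# Hypothetical mini-language: statements are tuples, types are strings.

def check(program):
    """Static checks: variables are declared before use and types match."""
    symbol_table = {}   # variable name -> declared type
    errors = []
    for stmt in program:
        if stmt[0] == "declare":        # ("declare", name, type)
            _, name, var_type = stmt
            symbol_table[name] = var_type
        elif stmt[0] == "assign":       # ("assign", name, type of the value)
            _, name, value_type = stmt
            if name not in symbol_table:
                errors.append(f"{name}: used before declaration")
            elif symbol_table[name] != value_type:
                errors.append(f"{name}: expected {symbol_table[name]}, got {value_type}")
    return errors

program = [
    ("declare", "count", "int"),
    ("assign", "count", "int"),      # fine
    ("assign", "count", "string"),   # type error
    ("assign", "total", "int"),      # never declared
]
errors = check(program)
```

Both errors are found without running the program, which is what makes them static. An array bound check is dynamic by contrast: in Python, indexing a three-element list at position 5 raises `IndexError` only when that line executes.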
Annotated parse tree
- A parse tree whose nodes carry attribute values (such as types). It is used for semantic checking.
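One way to picture the annotation step is a function that attaches a type attribute to every node of a small expression tree. The node shapes and the dictionary layout are assumptions invented for this sketch.

```python
# Hypothetical node shapes: ("num", text), ("str", text), ("+", left, right).

def annotate(node):
    """Return a copy of the tree in which every node carries a type."""
    kind = node[0]
    if kind == "num":
        return {"node": node, "type": "int"}
    if kind == "str":
        return {"node": node, "type": "string"}
    left, right = annotate(node[1]), annotate(node[2])
    if left["type"] != right["type"]:
        raise TypeError(f"cannot add {left['type']} and {right['type']}")
    return {"node": kind, "type": left["type"], "children": [left, right]}

annotated = annotate(("+", ("num", "1"), ("num", "2")))
```

The type of the `+` node is computed from the types of its children, so a mismatch is detected while the tree is being annotated.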
Syntax tree and parse tree
- Parse tree: follows the grammar; it consists of grammar symbols, including nonterminals.
- Syntax tree: removes the grammar nonterminals.
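The difference can be shown on the input "1 + 2". The grammar, the tuple encoding of the trees, and the helper `to_syntax_tree` are assumptions made for this sketch.

```python
# Hypothetical grammar: expr -> expr "+" term | term ; term -> NUMBER
# Parse tree for "1 + 2": every nonterminal in the derivation is a node.
parse_tree = ("expr",
              ("expr", ("term", ("NUMBER", "1"))),
              ("+",),
              ("term", ("NUMBER", "2")))

def to_syntax_tree(node):
    """Drop nonterminal nodes, keeping only operators and operands."""
    label, *children = node
    if label == "NUMBER":
        return children[0]                    # leaf: keep the value
    if len(children) == 1:
        return to_syntax_tree(children[0])    # chain such as expr -> term
    left, op, right = children                # expr -> expr "+" term
    return (op[0], to_syntax_tree(left), to_syntax_tree(right))

syntax_tree = to_syntax_tree(parse_tree)
```

The parse tree records which grammar rules were applied, while the syntax tree keeps only the operator and its two operands.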