C++ build process
Introduction
Prerequisite
Preprocessing
In the first step, the preprocessor essentially performs textual substitution. It can do more than that, though: it can conditionally include (or ignore) portions of code, and it can expand macros that behave like functions. The preprocessor does the following:
- File inclusion. The preprocessor pulls in header files for other libraries, classes, etc. by copying the entire header into your source file, e.g.
    #include <string>
- Macro expansion. Each use of a macro in the code is literally replaced by its definition, so the arguments and the whole body should be parenthesized, e.g.
    #define MAX(a, b) ((a) > (b) ? (a) : (b))
- Conditional compilation. The preprocessor keeps the code inside a conditional block only if the condition is met. You can use these directives much like if-else statements, choosing from #if, #ifdef, #ifndef, #elif, and #else, and closing the block with #endif, e.g.
    #if defined(WIN32) || defined(_WIN32) || defined(__WIN32__)
        // Define something common for Windows 32-bit and 64-bit.
        #ifdef _WIN64
            // Define something for Windows 64-bit only.
        #else
            // And for Windows 32-bit only.
        #endif
    #elif __APPLE__
        // macOS code goes here.
    #elif __linux__
        // Linux code.
    #else
        #error "Unknown compiler."
    #endif
- Comment removal. Each comment is replaced with a single space.
If you want to see what your file looks like after preprocessing, pass gcc the -E option, which tells the compiler to perform preprocessing only (no compilation, assembly, or linking). The -o option specifies the desired name of the preprocessed source file.
To see the preprocessor in action, let's refer to our test project. The Greeting class has one macro and a small comment. After preprocessing, the comment is removed and the macro is replaced.
Compilation
In the second step, the compiler does its main task. It processes each preprocessed source file (which no longer contains directives) to produce assembly code. This is an intermediate step between the high-level programming language and machine (binary) code.
Pass gcc the -S option to stop after this step and get assembly code in the output file. The resulting files are in the build/asm folder.
This code is still pretty readable (if you know assembly), but machines cannot work with it. They work with machine code, which is obtained in the next step.
Compilation itself is divided into the following stages:
- Lexical analysis (producing tokens and lexical errors).
- Syntactical analysis (producing a parse tree and syntactical errors).
- Semantic analysis (producing a symbol table, scoping info and scoping/typing errors).
- Optimization.
Here is a simple image that represents the compilation step:
Assembly
Assembly is the third stage. The assembler takes the assembly source code and translates it into machine code, which is stored in object files.
Machine code looks like:
To get an object file, use the as program (the assembler).
The assembly step is pretty simple to illustrate:
Linking
In the fourth step, the linker combines the object files for a program, along with any library functions that are necessary, into a file containing the complete executable program.
To get the executable program, let's apply the final command:
There are two types of linking:
- Static linking. The linker copies the required library code into the executable, and each call jumps directly to the function. This is slightly more efficient at run time but less flexible, because picking up a new version of a library means relinking the program.
- Dynamic linking. Calls go through a table that contains our functions, so the program looks up where to jump before jumping to the desired function. This is a little bit slower but much more flexible, and it is the standard way to ship a library.
And here is an image that illustrates the linking stage:
Conclusion
As we just saw, building an executable file from C++ source files is a multi-step process. In practice, all of these steps can be run with a single command that produces the executable.