Compilers and Self-hosting
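Building an assembler or a compiler can seem paradoxical at first glance: the tool you need in order to build it appears to be the tool itself. In practice, creating an assembler, or indeed any software development tool, follows a logical sequence in which each step builds on the previous one. Here's a breakdown of how this works: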
Historical Context: Bootstrap Compilers and Assemblers
Manual Encoding:
- In the early days of computing, the first programs, including the earliest assemblers, were written in machine code by hand. Programmers worked out the binary encoding of each instruction themselves and entered it directly, for example by setting switches on the front panel of the computer.
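As a rough illustration of what hand-entering a program involves, the sketch below runs a hand-encoded byte sequence on a made-up machine. The opcode values (0x01 load, 0x02 add, 0xFF halt) and the `run_machine` function are assumptions invented for this example and do not correspond to any real instruction set.

```python
# Hypothetical 8-bit accumulator machine; opcode values are invented for illustration.
LOAD_IMM = 0x01   # acc = operand
ADD_IMM = 0x02    # acc = acc + operand (wraps at 8 bits)
HALT = 0xFF       # stop and report the accumulator


def run_machine(memory):
    """Execute the bytes in `memory` and return the final accumulator value."""
    acc = 0
    pc = 0  # program counter: index of the next instruction byte
    while True:
        opcode = memory[pc]
        if opcode == LOAD_IMM:
            acc = memory[pc + 1]
            pc += 2
        elif opcode == ADD_IMM:
            acc = (acc + memory[pc + 1]) & 0xFF
            pc += 2
        elif opcode == HALT:
            return acc
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at address {pc}")


# The "program" is just numbers worked out by hand from the opcode table,
# much as an operator might have toggled them in one byte at a time.
HAND_CODED = bytes([
    0x01, 0x02,   # load 2
    0x02, 0x03,   # add 3
    0xFF,         # halt
])

result = run_machine(HAND_CODED)
print(result)
```

Nothing in the byte sequence says "load" or "add"; the meaning lives entirely in the programmer's opcode table, which is what made this style of programming slow and error-prone.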
First Assembler:
- An assembler translates assembly language (symbolic instruction names and operands) into machine code. Once the first assembler had been written by hand in machine code, further software no longer had to be encoded manually, which made programming much easier.
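The translation step an assembler performs can be sketched in a few lines. This is a minimal sketch under stated assumptions: the mnemonics, the opcode values, and the `assemble` function are hypothetical, and a real assembler would also handle labels, addressing modes, and an object-file format.

```python
# Hypothetical mnemonics and opcode values, invented for this sketch.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "HALT": 0xFF}


def assemble(text):
    """Translate toy assembly text into machine-code bytes."""
    machine_code = bytearray()
    for raw_line in text.splitlines():
        line = raw_line.split(";", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        mnemonic, *operands = line.split()
        if mnemonic.upper() not in OPCODES:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
        machine_code.append(OPCODES[mnemonic.upper()])
        # Operands may be decimal or 0x-prefixed hex; each must fit in one byte.
        for operand in operands:
            value = int(operand, 0)
            if not 0 <= value <= 0xFF:
                raise ValueError(f"operand out of range: {operand}")
            machine_code.append(value)
    return bytes(machine_code)


SOURCE = """
    LOAD 2      ; acc = 2
    ADD  3      ; acc = acc + 3
    HALT
"""

machine_code = assemble(SOURCE)
print(machine_code.hex(" "))
```

The programmer now writes `LOAD 2` instead of looking up and entering `01 02`; the lookup table has moved from the programmer's head into the tool.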
Higher-Level Languages:
- With an assembler available, programmers could write larger and more sophisticated programs, including compilers for higher-level languages like C.
- The first compilers for these higher-level languages were often written in assembly language. Once a compiler existed, it could compile higher-level language code into machine code.
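To make the compiler's job concrete, here is a small sketch that translates arithmetic expressions into assembly for a made-up stack machine. The instruction names (`PUSH`, `ADD`, `MUL`) and the `compile_expr` function are assumptions for illustration, and the sketch borrows Python's own `ast` module for parsing rather than implementing a parser, which a real early compiler would have had to do itself.

```python
import ast

# Hypothetical stack-machine instructions for the two supported operators.
OPERATORS = {ast.Add: "ADD", ast.Mult: "MUL"}


def compile_expr(source):
    """Compile an expression such as '1 + 2 * 3' into a list of instructions."""
    tree = ast.parse(source, mode="eval")
    instructions = []

    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            instructions.append(f"PUSH {node.value}")
        elif isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
            # Post-order walk: operands first, then the operator that combines them.
            emit(node.left)
            emit(node.right)
            instructions.append(OPERATORS[type(node.op)])
        else:
            raise SyntaxError(f"unsupported construct: {ast.dump(node)}")

    emit(tree.body)
    return instructions


for line in compile_expr("1 + 2 * 3"):
    print(line)
```

The output is assembly text, which an assembler would then turn into machine code. That layering is why an existing assembler made the first compilers much easier to write.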
Modern Development: Bootstrapping
Writing Assemblers in Higher-Level Languages:
- Modern assemblers are often written in higher-level languages like C for ease of development, maintainability, and portability.
- A C compiler is itself a large and complex program, but it is built the same way as any other: its source code is translated into an executable by a compiler that already exists.
Bootstrapping Process:
- Initial Compiler/Assembler: The initial compiler or assembler for a new language can be written in assembly or an already established higher-level language.
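- Cross-Compilation: A compiler or assembler for a new platform is often developed on an existing system, using that system's existing tools. When the compiler runs on one system (the host) but generates code for a different system (the target), this is called cross-compilation. It allows software to be built for a machine that does not yet have any development tools of its own.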
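The host/target distinction can be sketched as follows. This is a toy under stated assumptions: the two targets (`toy-stack`, `toy-accumulator`), their instruction names, and `compile_sum` are all hypothetical. The point is only that the generated code depends on the chosen target, not on the machine the compiler happens to run on.

```python
import platform


# Hypothetical backends; each emits assembly text for a made-up machine.
def emit_stack(a, b):
    return [f"PUSH {a}", f"PUSH {b}", "ADD"]


def emit_accumulator(a, b):
    return [f"LOAD {a}", f"ADD {b}"]


BACKENDS = {"toy-stack": emit_stack, "toy-accumulator": emit_accumulator}


def compile_sum(a, b, target):
    """Compile the expression `a + b` for the named target."""
    if target not in BACKENDS:
        raise ValueError(f"unsupported target: {target}")
    return BACKENDS[target](a, b)


host = platform.machine()  # the machine this compiler is running on
for target in BACKENDS:
    code = compile_sum(2, 3, target)
    print(f"host={host!r} target={target!r}: {code}")
```

Running the same script on a different host changes only the `host` value in the output; the code emitted for each target stays the same.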
Self-Hosting:
- Once the source code of a new compiler for a higher-level language like C has been written, it is first compiled using an existing compiler.
- After it’s compiled and functional, subsequent versions of the compiler can be written in the language it compiles and built with the previous version of itself. For example, a C compiler can be written in C and compiled by an earlier build of the same compiler. This is known as self-hosting.
more on Self-hosting later...
Example: GCC Compiler
- Initial Development: The GNU Compiler Collection (GCC) was initially written in C.
- Bootstrapping: The first version of GCC was compiled using an existing C compiler. Once GCC was working, subsequent versions of GCC could be compiled using GCC itself.
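The idea of a compiler rebuilding itself can be modeled with a toy. In the sketch below, a "binary" is simply Python source text that defines `compile_source`, and "running" a binary means executing that text. The names `COMPILER_SOURCE`, `existing_compiler`, and `run_compiler` are hypothetical, and this is not how GCC's build system is implemented. It only mirrors the staged idea behind GCC's bootstrap build, in which the compiler is built with the existing compiler, rebuilt with itself, rebuilt once more, and the last two results are compared.

```python
# Source code of a toy compiler, written in the language it compiles.
COMPILER_SOURCE = r'''
# Toy compiler: "code generation" just drops comments and blank lines.
def compile_source(source):
    # Strip trailing whitespace, then keep only lines that carry code.
    lines = [line.rstrip() for line in source.splitlines()]
    kept = [line for line in lines if line.strip() and not line.lstrip().startswith("#")]
    return "\n".join(kept)
'''


def existing_compiler(source):
    """Stand-in for a pre-existing compiler with a different code generator."""
    return "# built by the existing compiler\n" + source


def run_compiler(binary, source):
    """Load a compiler 'binary' and use it to compile `source`."""
    namespace = {}
    exec(binary, namespace)
    return namespace["compile_source"](source)


# Stage 1: build the new compiler with the existing one.
stage1 = existing_compiler(COMPILER_SOURCE)
# Stage 2: rebuild the new compiler using the stage 1 build of itself.
stage2 = run_compiler(stage1, COMPILER_SOURCE)
# Stage 3: rebuild again using stage 2; the output should no longer change.
stage3 = run_compiler(stage2, COMPILER_SOURCE)

print("stage1 == stage2:", stage1 == stage2)
print("stage2 == stage3:", stage2 == stage3)
```

Stage 1 differs from stage 2 because a different compiler produced it, but stages 2 and 3 were both produced by the new compiler's own code generator and should match. A mismatch at that point would suggest the compiler miscompiles itself.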
Conceptual Process
- Manual Bootstrapping: The very first tools were written manually in machine code.
- First Assemblers/Compilers: These tools enabled the creation of slightly more complex tools.
- Iterative Improvement: With each step, new tools enabled the creation of even more sophisticated software.
- Self-Hosting Compilers: Eventually, languages could be used to write their own compilers, creating a virtuous cycle of improvement.