sop (2023-12-03) - jlxip's blog

The first program a computer runs once its firmware finishes initializing is the bootloader. On x86 BIOS systems, the first part (stage 1) of a bootloader is called a bootsector: a piece of code exactly one sector long which starts bootstrapping the system. This is the very first code you can control, and it must be written in assembly.

It's not regular 64-bit assembly: the CPU boots in a backwards-compatible mode known as real mode, which is 16-bit and compatible with the 8086 processor from 1978. There are many differences between modern assembly and real mode; for instance, there's no paging and all memory accesses go through segmentation. In my opinion, the reduced instruction set and addressing modes, as well as the lack of an OS, actually make it an easier language to learn than regular asm. Furthermore, the BIOS interface is available, which makes some operations very easy, even though it's generally not considered reliable (some BIOS implementations are buggy).
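To make the segmentation point concrete, here is a minimal sketch of how a real mode segment:offset pair maps to a physical address. It's written in Python only so it can be run anywhere. The function name is my own, and the 20-bit wraparound assumes the original 8086 (later CPUs with the A20 line enabled don't wrap).

```python
def physical_address(segment: int, offset: int) -> int:
    """Translate a real mode segment:offset pair into a physical address.

    Hypothetical helper for illustration; assumes an original 8086,
    which has 20 address lines.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # The segment is shifted left by 4 bits (multiplied by 16) and the
    # offset is added. On the 8086 the result wraps around at 1 MiB.
    return ((segment << 4) + offset) & 0xFFFFF


# The BIOS loads the bootsector at physical address 0x7C00.
# Different segment:offset pairs can name that same byte:
a = physical_address(0x0000, 0x7C00)
b = physical_address(0x07C0, 0x0000)
```

Both `a` and `b` come out as 0x7C00, which is why two bootsectors can disagree on segment register values and still run from the same place.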

On floppies and most magnetic hard disks, a sector is 512 bytes long. A bootsector always ends with the two-byte signature 0xAA55 (stored on disk as the bytes 0x55, 0xAA), so you have 510 bytes to do whatever you want. Nothing else. Most bootloaders use stage 1 for very basic initialization and for loading a stage 2 with looser size restrictions. Especially because of the 510-byte restriction and the real mode situation, a bootsector ends up being a very interesting environment for those of us who are nerdy enough to appreciate it. When the point is to create the smallest program with the greatest functionality, one must think outside the box to use the instructions that take the least space, play with side effects, and be smart about the control flow to avoid as many jumps as possible. Assembly is not structured programming: nothing's stopping you from falling through functions or even overlapping them.
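The layout described above can be sketched in a few lines. This is not how sop is built (an assembler normally does the padding for you); it's just an illustration in Python, with a function name and a two-byte payload I made up, of what a valid 512-byte bootsector image looks like.

```python
SECTOR_SIZE = 512
SIGNATURE = b"\x55\xAA"  # the word 0xAA55, stored little-endian


def make_bootsector(code: bytes) -> bytes:
    """Pad raw machine code to one sector and append the boot signature.

    Hypothetical helper for illustration only.
    """
    max_code = SECTOR_SIZE - len(SIGNATURE)  # 510 bytes of usable space
    if len(code) > max_code:
        raise ValueError(f"code is {len(code)} bytes, limit is {max_code}")
    # Zero-fill the unused space, then place the signature in the
    # last two bytes (offsets 510 and 511).
    return code.ljust(max_code, b"\x00") + SIGNATURE


# Example payload: EB FE is `jmp $`, an infinite loop in two bytes.
sector = make_bootsector(b"\xEB\xFE")
```

Writing `sector` to a file gives a raw image that does nothing but hang, which is about the smallest bootsector there is. Everything interesting has to fit in the remaining 508 bytes.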

I'd recommend checking out the bootsect project from porta, a friend of mine. She has some very interesting and fun bootsector demos.

Porta's bootsect

So, what's sop? It's short for Sector OPerations; it's a command-line interface that fits in a bootsector. Something like a terminal, a tty, or a very simple shell. It has features like basic keyboard support and scrolling. You have a prompt in which you enter commands. In vanilla sop there are two: "l" for loading a sector into memory, and "c" for calling a given address. sop is an environment in which it's extremely easy to write real mode assembly code (sop extensions), in a highly rewarding way. You can completely omit the command-line interface and jump straight into writing your project while having the sop functions available for things like printing to screen and reading user input. Alternatively, you can choose to keep it and effortlessly implement your own suite of commands and, in a few hours, end up with something you can call your own kind-of OS.

It's especially great for learning. If you already know C and want to get into assembly, I would recommend you check out real mode asm with sop before jumping into userspace with all its additional complexity like syscalls, external symbols, linking, and binary sections.

After a month and a half of working on it in my free time, it's ready. I finished it last week and documented it yesterday. sop ships with some example extensions to give you an idea of its capabilities and show how easy it is to extend. I encourage you to try it: it only takes five minutes.

Check out sop.

Thanks for reading.

-- jlxip