loader.asm

The loader.asm program is a demonstration boot loader which is capable of loading a "home-grown" operating system from either a floppy or hard drive. I have also included a demo target program beroset.asm which just displays a message and halts, but it shows that the program was really loaded into memory. The thing that makes this boot loader different from most is that it loads an ordinary DOS file into memory and is not dependent on the target program being in a particular location on the hard drive or floppy.

Purpose

The purpose of this program is twofold:

to demonstrate some methods for manipulating DOS files without DOS
to show a somewhat practical boot loader

Building the Code

There are differences between the MASM version and TASM version of loader.asm. Building the TASM version is simply a matter of issuing the commands:

tasm /la /m2 loader.asm
tlink loader, loader.bin

With MASM, the process looks like this:

ml /Fl /Sa loader.asm

MASM's ml will run the linker automatically, but be aware that it only generates an executable and doesn't seem to be able to generate the same kind of binary file that TASM does. (Or more accurately, if there is a way to do it, I don't know it and would be grateful to learn about it if you've got a solution.)

Technical Description

If you look through the source code as you read this description, it will be easier to follow along. The sections of description which follow are in roughly the same order as the corresponding code appears in the program.

Structure Declarations

The first few non-comment lines of loader contain structure definitions. The first structure is PartEntry which is not actually used in this program. It's included because it could be part of a usable hard-disk-based version which doesn't rely on a secondary loader. As it is written, this loader is intended to be the secondary loader, i.e. the loader that the master boot record entry would load and execute.
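For readers who think more easily in C, here is a minimal sketch of the kind of record a PartEntry structure describes. It assumes the conventional 16-byte master boot record partition entry; the type name PartEntryC, the field names and the helper functions are my own inventions for illustration and are not taken from loader.asm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 16-byte entry of the MBR partition table (conventional layout).
   Field names are illustrative, not the ones used in loader.asm. */
typedef struct {
    uint8_t  boot_flag;     /* 80h = active (bootable), 00h = inactive */
    uint8_t  start_chs[3];  /* packed head/sector/cylinder of first sector */
    uint8_t  type;          /* partition type code */
    uint8_t  end_chs[3];    /* packed head/sector/cylinder of last sector */
    uint32_t start_lba;     /* first sector, counted from start of disk */
    uint32_t sector_count;  /* partition length in sectors */
} PartEntryC;

static_assert(sizeof(PartEntryC) == 16, "partition entry must be 16 bytes");

/* Read a little-endian 32-bit value regardless of host byte order. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Decode the 16 raw bytes of one partition table entry. */
PartEntryC parse_part_entry(const uint8_t raw[16])
{
    PartEntryC e;
    assert(raw != NULL);
    e.boot_flag = raw[0];
    for (int i = 0; i < 3; i++) {
        e.start_chs[i] = raw[1 + i];
        e.end_chs[i]   = raw[5 + i];
    }
    e.type         = raw[4];
    e.start_lba    = rd32(raw + 8);
    e.sector_count = rd32(raw + 12);
    return e;
}
```

Decoding byte by byte, rather than casting the buffer to the structure, keeps the sketch independent of the host's byte order and padding rules.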

The second definition is BootSector which is used extensively. The BootSector structure is a DOS-based structure located within the boot record of the disk, either the first sector loaded from floppy or, as mentioned above, the second sector loaded and executed from a hard drive. It contains a number of useful bits of information including the essential clues about the geometry of the FAT (File Allocation Table) and where to find the root directory of the disk. Because this loader loads a regular file from disk, this kind of information is necessary.
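As a rough illustration of the information such a structure carries, the following C sketch decodes the usual FAT12/FAT16 BIOS Parameter Block fields from a raw boot sector. The offsets are those of the standard DOS boot record; the Bpb type, its field names and parse_bpb are hypothetical and may not match the names used in loader.asm's BootSector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* BIOS Parameter Block fields of a FAT12/FAT16 boot sector.
   Names are illustrative; loader.asm may name them differently. */
typedef struct {
    uint16_t bytes_per_sector;    /* offset 0Bh */
    uint8_t  sectors_per_cluster; /* offset 0Dh */
    uint16_t reserved_sectors;    /* offset 0Eh */
    uint8_t  fat_count;           /* offset 10h */
    uint16_t root_entries;        /* offset 11h */
    uint16_t total_sectors16;     /* offset 13h */
    uint8_t  media;               /* offset 15h */
    uint16_t sectors_per_fat;     /* offset 16h */
    uint16_t sectors_per_track;   /* offset 18h */
    uint16_t heads;               /* offset 1Ah */
    uint32_t hidden_sectors;      /* offset 1Ch */
} Bpb;

/* Little-endian readers, independent of host byte order. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Decode the BPB from the first sector of a FAT12/FAT16 volume. */
Bpb parse_bpb(const uint8_t *sector)
{
    Bpb b;
    assert(sector != NULL);
    b.bytes_per_sector    = rd16(sector + 0x0B);
    b.sectors_per_cluster = sector[0x0D];
    b.reserved_sectors    = rd16(sector + 0x0E);
    b.fat_count           = sector[0x10];
    b.root_entries        = rd16(sector + 0x11);
    b.total_sectors16     = rd16(sector + 0x13);
    b.media               = sector[0x15];
    b.sectors_per_fat     = rd16(sector + 0x16);
    b.sectors_per_track   = rd16(sector + 0x18);
    b.heads               = rd16(sector + 0x1A);
    b.hidden_sectors      = rd32(sector + 0x1C);
    return b;
}
```

On a 1.44 MB floppy these fields typically decode to 512 bytes per sector, 1 sector per cluster, 2 FATs of 9 sectors each, 224 root entries, 18 sectors per track and 2 heads.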

The third structure definition is DirEntry which describes the directory entry structure of a FAT-formatted disk. Note that it only describes the "old-style" directory entries from back in the Bronze Age during which DOS was popular. Specifically, it does not account for the long file names used by Windows 95 or any variation after that. However, it does allow searching for a file which has a simple 8.3 (eight characters for file name and three for file extension) name, and that's what this loader does.
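To make the 8.3 search concrete, here is a small C sketch of the idea, not a transcription of the loader's code. It assumes the classic 32-byte directory entry, in which the first 11 bytes hold the space-padded name and extension and byte 11 holds the attributes. The function names to_83_name and find_entry are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { DIR_ENTRY_SIZE = 32, NAME_LEN = 11 };

/* Convert a name such as "BEROSET.SYS" to the 11-byte, space-padded,
   upper-case form stored on disk ("BEROSET SYS").
   Returns 0 on success, -1 if the name does not fit the 8.3 pattern. */
int to_83_name(const char *name, uint8_t out[NAME_LEN])
{
    size_t i = 0, o = 0;
    assert(name != NULL && out != NULL);
    memset(out, ' ', NAME_LEN);
    /* base name: 1 to 8 characters before the dot */
    while (name[i] != '\0' && name[i] != '.') {
        if (o >= 8) return -1;
        out[o++] = (uint8_t)toupper((unsigned char)name[i++]);
    }
    if (o == 0) return -1;
    /* extension: up to 3 characters after the dot */
    if (name[i] == '.') {
        i++;
        o = 8;
        while (name[i] != '\0') {
            if (o >= NAME_LEN || name[i] == '.') return -1;
            out[o++] = (uint8_t)toupper((unsigned char)name[i++]);
        }
    }
    return 0;
}

/* Scan a directory image for an 11-byte name. Returns the index of the
   matching entry, or -1 if there is none. A first byte of 00h ends the
   directory; E5h marks a deleted entry. */
int find_entry(const uint8_t *dir, size_t entry_count,
               const uint8_t name[NAME_LEN])
{
    for (size_t i = 0; i < entry_count; i++) {
        const uint8_t *e = dir + i * DIR_ENTRY_SIZE;
        if (e[0] == 0x00) break;          /* no more entries */
        if (e[0] == 0xE5) continue;       /* deleted entry */
        if (e[11] & 0x08) continue;       /* volume label or long-name slot */
        if (memcmp(e, name, NAME_LEN) == 0) return (int)i;
    }
    return -1;
}
```

Skipping entries with the volume-label attribute bit set also skips the long file name slots that later Windows versions write, since those carry attribute value 0Fh.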

These structure declarations are compatible with either Microsoft's MASM or Borland's TASM, which is what I actually used to assemble and link this program. If you are attempting to use NASM, you will have to do some translation because NASM handles structures differently (through its struc and endstruc macros) and its support for them is more limited.

Note, however, that this comment applies to the structures only, and not to the rest of the code. There are a number of syntactical differences between the MASM version and TASM version of loader.asm.

Over Yonder

In American English slang, "yonder" is a word meaning some distant location. In this loader yonder is the name of the segment to which the loader will jump after it has loaded the operating system. The destination is called "destination" for reasons which remain mysterious.

The Code Begins

There are a few lines which set up the segment in which the loader is to run. They look like this:

code segment para public use16 '_CODE'
        .386
        assume cs:code, ds:code, es:code, ss:code
        org 7c00h
main PROC

The first line is a segment declaration which states that there is a segment named code with paragraph alignment and the public combine type, assembled as a 16-bit segment with the class name _CODE. There isn't really anything special about the name code or the class name _CODE except that, for some strange reason, the debugging program CodeView from Microsoft is (or at least at one time was) unable to understand any code segment whose class name did not end in "_CODE".

The next line just says that it's expected that a minimum of a 386 is going to be running this code. There's not much here that wouldn't run on an 8086, but I didn't have a need to do that when I wrote this program, and 8086 machines are pretty rarely encountered these days unless you're looking through my junk box.

The assume directive just tells the assembler which segment each segment register is assumed to address upon entry to the code. NASM doesn't support the ASSUME directive, which is too bad because, as you'll see later, it's a useful construct.

Finally, the code is declared to start at hex address 7C00h. That's because when a boot loader begins, it is loaded at address 0:7C00h and execution begins from that address. In our case, the linker will eventually produce a .COM file which has 7C00h bytes of filler in it, but the sector.asm program understands that and skips that many bytes before copying the sector to a floppy or hard drive. If you use some other means of copying the file, you'll need to take that into account.
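If you do use some other means of copying, the skipping step might look something like the C sketch below. It is only an illustration under stated assumptions: it takes the text above at its word that the linked image carries 7C00h bytes of filler, it assumes a 512-byte sector, and the name extract_boot_sector is hypothetical. It is not a description of how sector.asm is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { LOAD_OFFSET = 0x7C00, SECTOR_SIZE = 512 };

/* Copy the loader code that follows the 7C00h bytes of filler into a
   zero-padded sector buffer. Returns 0 on success, or -1 if the image has
   no code after the filler or the code is larger than one sector. */
int extract_boot_sector(const uint8_t *image, size_t image_len,
                        uint8_t out[SECTOR_SIZE])
{
    size_t avail;
    assert(image != NULL && out != NULL);
    if (image_len <= (size_t)LOAD_OFFSET) return -1;
    avail = image_len - (size_t)LOAD_OFFSET;
    if (avail > (size_t)SECTOR_SIZE) return -1;
    memset(out, 0, SECTOR_SIZE);              /* pad any unused tail with zeros */
    memcpy(out, image + LOAD_OFFSET, avail);  /* skip the filler, keep the code */
    return 0;
}
```

The resulting buffer is what would then be written to the boot sector of the floppy or partition.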

When the main procedure runs, its first job is to jump over the header information to the location labelled over. From there, the program copies the contents of the CS register into the SS, DS and ES registers for later use. The code then drops through into the CalcClustOff routine.

CalcClustOff

The CalcClustOff routine is almost a C-callable function; it calculates the starting logical sector number of cluster zero. It is not quite C-callable because it doesn't have a proper return and instead just falls through to the next routine in this example. However, the code needed to turn it into a generic routine callable from C is there; it is just commented out since it's not needed in this case.
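The arithmetic behind a "cluster zero" offset can be sketched in C as follows. This follows the usual FAT12/FAT16 volume layout (reserved sectors, then the FATs, then a fixed-size root directory, then the data clusters, which are numbered from two) and is not taken from loader.asm. The FatGeometry type and all function names are hypothetical, and the results are relative to the start of the volume, so on a hard drive the partition's hidden sector count would still have to be added.

```c
#include <assert.h>
#include <stdint.h>

/* The boot sector fields this calculation needs (illustrative names). */
typedef struct {
    uint16_t bytes_per_sector;
    uint8_t  sectors_per_cluster;
    uint16_t reserved_sectors;
    uint8_t  fat_count;
    uint16_t root_entries;
    uint16_t sectors_per_fat;
} FatGeometry;

/* Sectors occupied by the root directory: 32 bytes per entry, rounded up. */
uint32_t root_dir_sectors(const FatGeometry *g)
{
    assert(g->bytes_per_sector != 0);
    return ((uint32_t)g->root_entries * 32u + g->bytes_per_sector - 1u)
           / g->bytes_per_sector;
}

/* First sector of the data area, which is where cluster 2 begins. */
uint32_t first_data_sector(const FatGeometry *g)
{
    return g->reserved_sectors
         + (uint32_t)g->fat_count * g->sectors_per_fat
         + root_dir_sectors(g);
}

/* Sector at which an imaginary cluster 0 would start. */
int32_t cluster_zero_sector(const FatGeometry *g)
{
    return (int32_t)first_data_sector(g) - 2 * (int32_t)g->sectors_per_cluster;
}

/* With that offset, any cluster maps to a sector by one multiply and add. */
uint32_t cluster_to_sector(const FatGeometry *g, uint16_t cluster)
{
    return (uint32_t)(cluster_zero_sector(g)
                      + (int32_t)cluster * g->sectors_per_cluster);
}
```

The point of computing the offset of a cluster that does not really exist is that the cluster-to-sector conversion then needs no special case for the numbering starting at two.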

More Cluster Calculations

The next part of the code calculates the location of logical cluster two which happens to be the start of the root directory. This result is returned in terms of a logical sector number. After that, the logical sector number is translated into the physical geometry of the drive as seen by the BIOS Int 13h disk functions. Once all those calculations are made, the root directory is loaded into memory and searched for the name of the file which contains the operating system we wish to load. In this demo code, the name of that file is BEROSET.SYS but it can, of course, be easily changed.
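The translation from a logical sector number to drive geometry is the standard cylinder/head/sector division, sketched here in C. The Chs type and the function names are my own for illustration. The packing of the cylinder and sector numbers follows the register convention of the Int 13h read function (AH=02h), where CH holds the low eight bits of the cylinder and CL holds the sector number in bits 0 to 5 with cylinder bits 8 and 9 in bits 6 and 7.

```c
#include <assert.h>
#include <stdint.h>

/* A sector address in the form Int 13h expects (illustrative type). */
typedef struct {
    uint16_t cylinder;  /* 0-based */
    uint8_t  head;      /* 0-based */
    uint8_t  sector;    /* 1-based */
} Chs;

/* Convert a logical sector number into cylinder, head and sector using the
   sectors-per-track and head counts from the boot sector. */
Chs lba_to_chs(uint32_t lba, uint16_t sectors_per_track, uint16_t heads)
{
    Chs c;
    uint32_t track;
    assert(sectors_per_track != 0 && heads != 0);
    track      = lba / sectors_per_track;           /* tracks before this one */
    c.sector   = (uint8_t)(lba % sectors_per_track + 1);
    c.head     = (uint8_t)(track % heads);
    c.cylinder = (uint16_t)(track / heads);
    return c;
}

/* Pack cylinder and sector into the CX register layout used by Int 13h. */
uint16_t pack_cx(Chs c)
{
    return (uint16_t)(((c.cylinder & 0xFFu) << 8)    /* CH: cylinder bits 0-7 */
                    | ((c.cylinder >> 2) & 0xC0u)    /* CL bits 6-7: cyl 8-9  */
                    | (c.sector & 0x3Fu));           /* CL bits 0-5: sector   */
}
```

For example, on a floppy with 18 sectors per track and 2 heads, logical sector 36 is the first sector of the second cylinder: cylinder 1, head 0, sector 1.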

Leap Into the Unknown

Finally, after the operating system file is loaded into memory, the loader program jumps to that location and executes whatever code was loaded. That's the code line that says, somewhat enigmatically: jmp destination

Ed Beroset
Last modified: Fri Jan 15 22:25:04 2004