
Comments (13)

dariusk commented on July 17, 2024

Okay, I think I've figured out what's going on here. What follows is a long but not-very-technical introduction to memory pages by way of a weird analogy. I have no idea if this is appropriate for the book or what, but it's where I'm at mentally right this second.

I accept that I may still be completely wrong here.

Normally ld does a thing called "page alignment". It's a complex idea but here's an analogy to give you the gist: visualize a refrigerator that's only 5 inches wide but just as deep as a normal fridge, with a bunch of shelves. Imagine you had an apple, an orange, a pear, and a peach you wanted to store in the fridge. You could put them all on the same narrow shelf, stacking them one behind the other until you have a stack that's 4 items deep, and you label it "fruit". Or you could put each fruit on a different shelf so every shelf is only stacked one item deep. Each fruit is right at the front so as soon as you open the fridge door you can see every fruit at a glance and know exactly which fruit is where.

In the first case, you're being efficient by packing everything in and you're leaving plenty of room for a vegetable shelf or a cheese shelf or whatever else, but if you want to find a particular fruit you might have to rummage through that fruit shelf for a bit. You also know that it's the fruit shelf, so if someone says "get me these two fruits" you know they're both on the same shelf. In the second case, you can glance at the front of every shelf and know exactly what you're going to get and can quickly retrieve it, but at the cost of having a mostly empty fridge with a small number of fruits.

That's kind of a weird image, but computer memory is broken up into sections like our fridge shelves, called "pages". "Page alignment" is when we put things right at the top of these sections of memory, which like the front part of our shelf is the easiest place for the CPU to look for something. When we do page alignment, we sacrifice space (think of the mostly empty fridge) for speed (think of how fast you can find that peach).

GRUB is designed to be a lazy fridge user. It really only looks at the front of the fridge shelf. If there's nothing at the front, it assumes the whole shelf is empty and complains that it can't find any food. Ideally you could say, "Hey GRUB, you lazy fool, crane your neck, there's food at the back of the shelves!" But you can't. Instead what we do is provide GRUB a fridge that consists of a zillion tiny shelves that just store one fruit each.

That is what --nmagic does. It waves a magic wand and transforms our fridge from having a few very deep shelves into having a ton of very shallow shelves, which is the way GRUB likes it. To put it technically, without --nmagic, page alignments occur once every 2MiB (2 * 1024 * 1024 bytes). With --nmagic, page alignments occur once every 16 bytes, so GRUB can find anything pretty much anywhere.
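
As a rough illustration of the space cost described above, here is a minimal Python sketch of the rounding arithmetic behind alignment. The starting position 0x125 and the helper names are made up for the example; the 2MiB and 16-byte figures are the ones quoted in this comment, and the sketch says nothing about how ld itself is implemented.

```python
def align_up(address: int, alignment: int) -> int:
    """Round address up to the next multiple of alignment."""
    return (address + alignment - 1) // alignment * alignment


def padding_needed(address: int, alignment: int) -> int:
    """Number of bytes skipped to reach the next aligned address."""
    return align_up(address, alignment) - address


TWO_MIB = 2 * 1024 * 1024

# Hypothetical position where the next item would otherwise start.
start = 0x125

for alignment in (TWO_MIB, 16):
    aligned = align_up(start, alignment)
    wasted = padding_needed(start, alignment)
    print(f"alignment {alignment}: starts at {aligned:#x}, {wasted} bytes of padding")
```

The coarser the alignment, the more padding is skipped before the next item can start.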

from book.

dariusk commented on July 17, 2024

I was curious as to why we enable --nmagic here so I did some reading. This turns off automatic page alignment (reducing file size of the .bin, I have it at 1.1M without the flag but 901 bytes with the flag).

It also sets the output of the objects to NMAGIC format instead of OMAGIC format. This write-protects the .text section, which I guess completely makes sense for a kernel -- if we accidentally overwrite the kernel code we're up shit creek. This also has the side effect of us needing to manually align any .data sections to the next page so that we can actually write .data...

So, is the write protection why we're doing this?


phil-opp commented on July 17, 2024

The write protection does not work without os support.

The reason is that, without the -n or --nmagic flag, the multiboot header is not at the beginning of the file. See
http://os.phil-opp.com/multiboot-kernel.html#building-the-executable


dariusk commented on July 17, 2024

Oh I see! For a book-friendly explanation of this I might word it as:

By enabling the --nmagic flag we're turning on manual memory alignment. This lets us position the multiboot header exactly where GRUB is going to look for it: at the beginning of the file. Otherwise the linker might put it somewhere else, and then GRUB won't be able to do its thing.

I have a PR of various edits coming up. I'll make that one of the commits and reference this issue.


phil-opp commented on July 17, 2024

The placement of the sections is specified in the linker script. In it, we ensure that the multiboot header is at the beginning. This is unrelated to the --nmagic flag.

The --nmagic flag just turns off the automatic page alignment. The automatic page alignment wouldn't be a problem, but somehow it also changes the offset of sections in the ELF file. Without --nmagic:

> objdump -h build/kernel-x86_64.bin 

build/kernel-x86_64.bin:     file format elf64-x86-64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .boot         00000018  0000000000100000  0000000000100000  00100000  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text         0000000b  0000000000100020  0000000000100020  00100020  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE

With --nmagic:

> objdump -h build/kernel-x86_64.bin 

build/kernel-x86_64.bin:     file format elf64-x86-64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .boot         00000018  0000000000100000  0000000000100000  00000080  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text         0000000b  0000000000100020  0000000000100020  000000a0  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE

It's completely identical except for the File off column. I don't know why, but the automatic page alignment also makes the sections start at offset 0x100000 in the ELF file. The multiboot header is still the first section, but it isn't at the beginning of the kernel file anymore. Thus GRUB can't find it.
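
For anyone who wants to compare the two dumps mechanically, here is a small Python sketch that pulls the File off column out of objdump -h output. The regular expression and the function name file_offsets are assumptions made for this example, and the sample text is copied from the two listings above.

```python
import re

# One section row of `objdump -h`: index, name, size, VMA, LMA, file offset, alignment.
SECTION_ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+2\*\*(\d+)\s*$"
)


def file_offsets(objdump_text):
    """Map each section name to its file offset, parsed from `objdump -h` text."""
    offsets = {}
    for line in objdump_text.splitlines():
        match = SECTION_ROW.match(line)
        if match:
            offsets[match.group(2)] = int(match.group(6), 16)
    return offsets


WITHOUT_NMAGIC = """\
Idx Name          Size      VMA               LMA               File off  Algn
  0 .boot         00000018  0000000000100000  0000000000100000  00100000  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text         0000000b  0000000000100020  0000000000100020  00100020  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
"""

WITH_NMAGIC = """\
Idx Name          Size      VMA               LMA               File off  Algn
  0 .boot         00000018  0000000000100000  0000000000100000  00000080  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text         0000000b  0000000000100020  0000000000100020  000000a0  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
"""

for label, dump in (("without --nmagic", WITHOUT_NMAGIC), ("with --nmagic", WITH_NMAGIC)):
    for name, offset in file_offsets(dump).items():
        print(f"{label}: {name} at file offset {offset:#x}")
```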

Another strange thing is that the text section starts at address 0x100020, which is not page aligned… Maybe they had a reason for naming it “magic”.


varjmes commented on July 17, 2024

Great explanation, thanks @dariusk, it really helped!


rylev commented on July 17, 2024

It seems like there's a lot of complexity here that would become clearer later on in the book. Maybe it's best to do a bit of hand waving here and try to explain these concepts later on.


steveklabnik commented on July 17, 2024

I would be open to that.


phil-opp commented on July 17, 2024

@rylev I agree completely. There's absolutely no need to understand it at this early stage (if at all).


dato commented on July 17, 2024

(I started with the intermezzOS book yesterday, and I dug into this part a bit. In the end, I've switched to using 32-bit only myself, but I'll leave the notes below in case they're helpful or you want to follow up.)

> It's completely identical except for the File off column. I don't know why, but the automatic page alignment also makes the sections start at offset 0x100000 in the ELF file. The multiboot header is still the first section, but it isn't at the beginning of the kernel file anymore. Thus GRUB can't find it.

Alternatively, you could use -zmax-page-size=0x1000 to force 4K pages even in ELF64 mode. That would make the multiboot header fit below the 8K mark. And, at the same time, a flag saying "set our page size to 4K" is less obscure, and less difficult to hand-wave, than something called --nmagic.

> The placement of the sections is specified in the linker script. In it, we ensure that the multiboot header is at the beginning.

It is at the beginning of the linker script, though the setting of the current address to 1M confuses me. It would seem to me it's (partly) at fault.

-d


phil-opp commented on July 17, 2024

> Alternatively, you could use -zmax-page-size=0x1000 to force 4K pages even in ELF64 mode.

That's a great alternative!

> It is at the beginning of the linker script, though the setting of the current address to 1M confuses me. It would seem to me it's (partly) at fault.

With . = 1M, we set the current virtual and physical load address to 1M. We do that because we don't want GRUB to load our kernel to memory below the 1M mark. Thus, we avoid writing to special memory areas such as the VGA buffer at 0xb8000. Loading the kernel to 1M is just convention; 2M or even 42M would work, too.

The file offset is independent from the virtual/physical addresses. It just describes the offset of the section data in the ELF file. Most ELF loaders don't care about the file offset, so the linker tries to page align the sections in the file to allow more efficient loading. The default page size for ELF64 is 2M (= 0x200000), so the linker places the sections in a way that allows loading in 2M chunks.

Some examples:

  • With . = 4M the linker uses a file offset of 2M. Thus the loader can load the section by loading the 2M chunk starting at file offset 2M to the memory address 4M.
  • With . = 1M the linker uses a file offset of 1M. Thus, the loader can just load the first 2M chunk of the ELF file into memory at address 0 (thus the section ends up at 1M).
  • With . = 1K the linker uses a file offset of 1K. Like in the previous point, the loader can just load the first 2M of the ELF file into memory at address 0.
  • With . = 0 the linker uses a file offset of 2M (!). Why not 0? Because every ELF file starts with various headers. So the file offset 0 is already needed for those headers.
  • With . = SIZEOF_HEADERS the linker uses a file offset of SIZEOF_HEADERS (e.g. 0xb0).
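
The examples above are consistent with one simple rule: the file offset leaves the same remainder as the address when divided by the page size, and it must not overlap the ELF headers at the start of the file. The Python sketch below models that rule. It is a simplification for illustration, not a description of ld's actual algorithm; the function name is hypothetical and the 0xb0 header size is just the value mentioned in the last example.

```python
K = 1024
M = 1024 * K


def modeled_file_offset(address: int, page_size: int, headers_size: int) -> int:
    """Smallest offset >= headers_size that is congruent to address modulo page_size."""
    offset = address % page_size
    while offset < headers_size:
        # The slot is occupied by the ELF headers, so move one page further.
        offset += page_size
    return offset


PAGE_SIZE = 2 * M      # default ELF64 page size quoted above
HEADERS_SIZE = 0xB0    # example SIZEOF_HEADERS value from the list above

for label, address in [
    (". = 4M", 4 * M),
    (". = 1M", 1 * M),
    (". = 1K", 1 * K),
    (". = 0", 0),
    (". = SIZEOF_HEADERS", HEADERS_SIZE),
]:
    offset = modeled_file_offset(address, PAGE_SIZE, HEADERS_SIZE)
    print(f"{label}: file offset {offset:#x}")
```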

Our problem is that GRUB expects the multiboot header in the first 8k or so of the ELF file, so we need to ensure a file offset < 8K. From the examples we saw that the file offset depends on various factors (e.g. page size, section address, size of ELF headers, etc.).
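
To check a built kernel image against that requirement, something like the following Python sketch could be used. It only searches for a magic number at 4-byte-aligned offsets and does not validate the rest of the header, so it is not what GRUB itself does. The magic value 0xE85250D6 is the Multiboot2 one and is an assumption here, since the thread does not say which Multiboot version is in use (Multiboot 1 uses 0x1BADB002). The 8192-byte limit is the "8k or so" figure from above, and find_magic is a hypothetical helper name.

```python
import struct

MULTIBOOT2_MAGIC = 0xE85250D6  # assumption: Multiboot2; Multiboot 1 uses 0x1BADB002


def find_magic(image: bytes, magic: int, limit: int = 8192, step: int = 4):
    """Return the first step-aligned offset below limit that holds magic, else None."""
    needle = struct.pack("<I", magic)  # little-endian 32-bit value
    for offset in range(0, min(limit, len(image)) - 3, step):
        if image[offset:offset + 4] == needle:
            return offset
    return None


# Synthetic images that mimic the two file offsets seen in the objdump listings.
header = struct.pack("<I", MULTIBOOT2_MAGIC)
near_image = bytes(0x80) + header + bytes(20)      # header at 0x80
far_image = bytes(0x100000) + header + bytes(20)   # header at 0x100000

print(find_magic(near_image, MULTIBOOT2_MAGIC))
print(find_magic(far_image, MULTIBOOT2_MAGIC))
```

For a real file, the bytes could be read with open(path, "rb").read() and passed to the same function.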

To achieve a low file offset, we have multiple options:

  1. Disable the page alignment of the sections in the ELF file completely through -nmagic. Thus, loading might become slower since we can no longer copy aligned 2M chunks. However, this isn't really relevant for GRUB since it doesn't use paging.
  2. Use a page size of 4K instead of the default 2M through -zmax-page-size=0x1000. Thus the linker can do its section alignment without harm, since a section with file offset of 4k is still in the first 8k of the file.
  3. Use a carefully chosen virtual/physical load address so that the resulting file offset is <8K. For example, we could use . = 2M + SIZEOF_HEADERS; or . = 2M + 4K;.


dato commented on July 17, 2024

> The file offset is independent from the virtual/physical addresses.

Thanks for the detailed explanation. That difference between file offset and addresses still gets me every time, until I wrap my head around it.

> That's a great alternative!

Nice. :)

Cheers.


steveej commented on July 17, 2024

Maybe we should include a summary of this in the book, possibly as an extra appendix? Obviously it's not straightforward, and it would be awesome if one wouldn't have to leave the book in order to understand everything that is going on.

@steveklabnik if you agree, please reopen :-)

