Comments (13)
Okay, I think I've figured out what's going on here. What follows is a long but not-very-technical introduction to memory pages by way of a weird analogy. I have no idea if this is appropriate for the book or what, but it's where I'm at mentally right this second.
I accept that I may still be completely wrong here.
Normally ld does a thing called "page alignment". It's a complex idea but here's an analogy to give you the gist: visualize a refrigerator that's only 5 inches wide but just as deep as a normal fridge, with a bunch of shelves. Imagine you had an apple, an orange, a pear, and a peach you wanted to store in the fridge. You could put them all on the same narrow shelf, stacking them one behind the other until you have a stack that's 4 items deep, and you label it "fruit". Or you put each fruit on a different shelf so every shelf is only stacked one item deep. Each fruit is right at the front so as soon as you open the fridge door you can see every fruit at a glance and know exactly which fruit is where.
In the first case, you're being efficient by packing everything in and you're leaving plenty of room for a vegetable shelf or a cheese shelf or whatever else, but if you want to find a particular fruit you might have to rummage through that fruit shelf for a bit. You also know that it's the fruit shelf, so if someone says "get me these two fruits" you know they're both on the same shelf.
In the second case, you can glance at the front of every shelf and know exactly what you're going to get and can quickly retrieve it, but at the cost of having a mostly empty fridge with a small number of fruits.
That's kind of a weird image, but computer memory is broken up into sections like our fridge shelves, called "pages". "Page alignment" is when we put things right at the top of these sections of memory, which like the front part of our shelf is the easiest place for the CPU to look for something. When we do page alignment, we sacrifice space (think of the mostly empty fridge) for speed (think of how fast you can find that peach).
GRUB is designed to be a lazy fridge user. It really only looks at the front of the fridge shelf. If there's nothing at the front, it assumes the whole shelf is empty and complains that it can't find any food. Ideally you could say, "Hey GRUB, you lazy fool, crane your neck, there's food at the back of the shelves!" But you can't. Instead what we do is provide GRUB a fridge that consists of a zillion tiny shelves that just store one fruit each.
That is what --nmagic does. It waves a magic wand and transforms our fridge from having a few very deep shelves into having a ton of very shallow shelves, which is the way GRUB likes it. To put it technically, without --nmagic, page alignments occur once every 2MiB (2 * 1024 * 1024 bytes). With --nmagic, page alignments occur once every 16 bytes, so GRUB can find anything pretty much anywhere.
from book.
I was curious as to why we enable --nmagic here so I did some reading. This turns off automatic page alignment (reducing file size of the .bin, I have it at 1.1M without the flag but 901 bytes with the flag).
It also sets the output of the objects to NMAGIC format instead of OMAGIC format. This write-protects the .text section, which I guess completely makes sense for a kernel -- if we accidentally overwrite the kernel code we're up shit creek. This also has the side effect of us needing to manually align any .data sections to the next page so that we can actually write .data...
So, is the write protection why we're doing this?
from book.
The write protection does not work without os support.
The reason is that the multiboot header is not at the beginning without the -n or -nmagic flag. See http://os.phil-opp.com/multiboot-kernel.html#building-the-executable
from book.
Oh I see! For a book-friendly explanation of this I might word it as:
By enabling the --nmagic flag we're turning on manual memory alignment. This lets us position the multiboot header exactly where GRUB is going to look for it: at the beginning of the file. Otherwise the linker might put it somewhere else, and then GRUB won't be able to do its thing.
I have a PR of various edits coming up. I'll make that one of the commits and reference this issue.
from book.
The placement of the sections is specified in the linker script. In it, we ensure that the multiboot header is at the beginning. This is unrelated to the --nmagic flag.
The --nmagic flag just turns off the automatic page alignment. The automatic page alignment wouldn't be a problem, but somehow it also changes the offset of sections in the ELF file. Without --nmagic:
> objdump -h build/kernel-x86_64.bin
build/kernel-x86_64.bin:     file format elf64-x86-64
Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .boot         00000018  0000000000100000  0000000000100000  00100000  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text         0000000b  0000000000100020  0000000000100020  00100020  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
With --nmagic:
> objdump -h build/kernel-x86_64.bin
build/kernel-x86_64.bin:     file format elf64-x86-64
Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .boot         00000018  0000000000100000  0000000000100000  00000080  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text         0000000b  0000000000100020  0000000000100020  000000a0  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
It's completely identical except for the File off column. I don't know why, but the automatic page alignment also makes the sections start at offset 0x100000 in the ELF file. The multiboot header is still the first section, but it isn't at the beginning of the kernel file anymore. Thus GRUB can't find it.
Another strange thing is that the text section starts at address 0x100020, which is not page aligned… Maybe they had a reason for naming it “magic”.
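The comparison of the two listings can be sketched in a few lines of Python. This is only an illustration: the rows are copied from the objdump output above, the helper name `file_offsets` is made up for this example, and the 8K limit is the "first 8k or so" figure mentioned later in this thread, not a value taken from GRUB's source.

```python
import re

# Section rows copied from the two `objdump -h` listings above.
WITHOUT_NMAGIC = """\
  0 .boot 00000018 0000000000100000 0000000000100000 00100000 2**0
  1 .text 0000000b 0000000000100020 0000000000100020 00100020 2**4
"""
WITH_NMAGIC = """\
  0 .boot 00000018 0000000000100000 0000000000100000 00000080 2**0
  1 .text 0000000b 0000000000100020 0000000000100020 000000a0 2**4
"""

# Columns: Idx, Name, Size, VMA, LMA, File off, Algn
ROW = re.compile(
    r"^\s*\d+\s+(\S+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+2\*\*\d+",
    re.M,
)

def file_offsets(objdump_text):
    """Map section name -> file offset (hypothetical helper for this sketch)."""
    return {m.group(1): int(m.group(5), 16) for m in ROW.finditer(objdump_text)}

# Assumption: GRUB searches roughly the first 8K of the file.
SEARCH_LIMIT = 8 * 1024

for label, text in (("without --nmagic", WITHOUT_NMAGIC),
                    ("with --nmagic", WITH_NMAGIC)):
    off = file_offsets(text)[".boot"]
    print(f"{label}: .boot at file offset {off:#x}, "
          f"inside search window: {off < SEARCH_LIMIT}")
```

Only the file offset differs between the two builds; the VMA and LMA columns are never consulted here.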
from book.
Great explanation, thanks @dariusk, it really helped!
from book.
It seems like there's a lot of complexity here that would become clearer later on in the book. Maybe it's best to do a bit of hand waving here and try to explain these concepts later on.
from book.
I would be open to that.
from book.
@rylev I agree completely. There's absolutely no need to understand it at this early stage (if at all).
from book.
(I started with the intermezzOS book yesterday, and I dug into this part a bit. In the end, I've switched to using 32-bit only myself, but I'll leave the notes below in case they're helpful or you want to follow up.)
It's completely identical except for the File off column. I don't know why, but the automatic page alignment also makes the sections start at offset 0x100000 in the ELF file. The multiboot header is still the first section, but it isn't at the beginning of the kernel file anymore. Thus GRUB can't find it.
Alternatively, you could use -zmax-page-size=0x1000 to force 4K pages even in ELF64 mode. That would make the multiboot header fit below the 8K mark. And, at the same time, a flag saying "set our page size to 4K" is less obscure, and less difficult to hand-wave, than something called --nmagic.
The placement of the sections is specified in the linker script. In it, we ensure that the multiboot header is at the beginning.
It is at the beginning of the linker script, though the setting of the current address to 1M confuses me. It would seem to me it's (partly) at fault.
-d
from book.
Alternatively, you could use -zmax-page-size=0x1000 to force 4K pages even in ELF64 mode.
That's a great alternative!
It is at the beginning of the linker script, though the setting of the current address to 1M confuses me. It would seem to me it's (partly) at fault.
With . = 1M, we set the current virtual and physical load address to 1M. We do that because we don't want GRUB to load our kernel to memory below the 1M mark. Thus, we avoid writing to special memory areas such as the VGA buffer at 0xb8000. Loading the kernel to 1M is just convention, 2M or even 42M would work, too.
The file offset is independent from the virtual/physical addresses. It just describes the offset of the section data in the ELF file. Most ELF loaders don't care about the file offset, so the linker tries to page align the sections in the file to allow more effective loading. The default page size for ELF64 is 2M (= 0x200000), so the linker places the sections in a way that allows loading in 2M chunks.
Some examples:
- With . = 4M the linker uses a file offset of 2M. Thus the loader can load the section by loading the 2M chunk starting at file offset 2M to the memory address 4M.
- With . = 1M the linker uses a file offset of 1M. Thus, the loader can just load the first 2M chunk of the ELF file into memory at address 0 (thus the section ends up at 1M).
- With . = 1K the linker uses a file offset of 1K. Like in the previous point, the loader can just load the first 2M of the ELF file into memory at address 0.
- With . = 0 the linker uses a file offset of 2M (!). Why not 0? Because every ELF file starts with various headers. So the file offset 0 is already needed for those headers.
- With . = SIZEOF_HEADERS the linker uses a file offset of SIZEOF_HEADERS (e.g. 0xb0).
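The pattern in these examples can be captured by a small model: the file offset is the smallest value that leaves room for the ELF headers and is congruent to the section address modulo the page size. The sketch below is a simplification that reproduces the numbers in the list, not a description of ld's actual algorithm. The function name and the header size of 0xb0 (taken from the last example) are assumptions for illustration.

```python
K = 1024
M = 1024 * K

def file_offset(vaddr, page_size=2 * M, headers_size=0xB0):
    """Simplified model: smallest offset >= headers_size that is
    congruent to vaddr modulo page_size (assumes headers fit in one page)."""
    offset = vaddr % page_size
    if offset < headers_size:
        # This slot would overlap the ELF headers, so skip to the next page.
        offset += page_size
    return offset

examples = {
    ". = 4M": 4 * M,
    ". = 1M": 1 * M,
    ". = 1K": 1 * K,
    ". = 0": 0,
    ". = SIZEOF_HEADERS": 0xB0,
}

for label, vaddr in examples.items():
    print(f"{label:20} -> file offset {file_offset(vaddr):#x}")
```

Calling `file_offset(1 * M, page_size=4 * K)` in the same model gives 4K, which is why a smaller page size keeps the section inside the first 8K of the file.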
Our problem is that GRUB expects the multiboot header in the first 8k or so of the ELF file, so we need to ensure a file offset < 8K. From the examples we saw that the file offset depends on various factors (e.g. page size, section address, size of ELF headers, etc.).
To achieve a low file offset, we have multiple options:
- Disable the page alignment of the sections in the ELF file completely through -nmagic. Thus, loading might become slower since we can no longer copy aligned 2M chunks. However, this isn't really relevant for GRUB since it doesn't use paging.
- Use a page size of 4K instead of the default 2M through -zmax-page-size=0x1000. Thus the linker can do its section alignment without harm, since a section with file offset of 4k is still in the first 8k of the file.
- Use a carefully chosen virtual/physical load address so that the resulting file offset is <8K. For example, we could use . = 2M + SIZEOF_HEADERS; or . = 2M + 4K;.
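Whichever option is used, the result can be checked by looking for the header's magic number near the start of the file. The following is a minimal sketch under a few assumptions that are not stated in this thread: that the kernel uses a Multiboot2 header with magic 0xE85250D6, and that the header is 8-byte aligned. The 8K limit is a parameter because the discussion above only says "8k or so". The function name is made up for this example.

```python
import struct

# Assumption: Multiboot2 header magic, stored little-endian in the file.
MULTIBOOT2_MAGIC = 0xE85250D6

def find_header(image, limit=8 * 1024, step=8):
    """Return the offset of the first magic value that starts within the
    first `limit` bytes of `image`, or None if there is none."""
    needle = struct.pack("<I", MULTIBOOT2_MAGIC)
    end = min(limit, len(image))
    for off in range(0, end - len(needle) + 1, step):
        if image[off:off + len(needle)] == needle:
            return off
    return None

# Synthetic images standing in for the two builds discussed above.
magic = struct.pack("<I", MULTIBOOT2_MAGIC)
near = bytes(0x80) + magic + bytes(16)   # header at file offset 0x80
far = bytes(0x100000) + magic            # header at file offset 0x100000

print(find_header(near))  # offset of the header
print(find_header(far))   # None: outside the search window
```

For a real build, the bytes could be read with `open("build/kernel-x86_64.bin", "rb").read()` and passed to the same function.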
from book.
The file offset is independent from the virtual/physical addresses.
Thanks for the detailed explanation. That difference between file offset and addresses still gets me every time, until I wrap my head around it.
That's a great alternative!
Nice. :)
Cheers.
from book.
Maybe we should include a summary of this in the book, possibly as an extra appendix? Obviously it's not straightforward, and it would be awesome if one didn't have to leave the book in order to understand everything that is going on.
@steveklabnik if you agree, please reopen :-)
from book.