Comments (14)

binaryfields commented on April 28, 2024

I think this is a caching issue: the CPU has the mailbox buffer cached, so the response code written by the VC (mbox.buffer[1]) never becomes visible in the CPU's view of that memory region (the CPU reads 0x0 instead of 0x8000_0000). I ran into the same issue while working on my C64 bare-metal port.
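To make the symptom concrete, here is a minimal sketch of checking the response word of a mailbox property message. The type and function names, and the 36-word buffer size, are assumptions made for illustration and are not taken from the tutorial code. Note that a volatile read only stops the compiler from eliding the load; it does not bypass the data cache, which is why the stale 0x0 can still be observed on cached memory.

```rust
use core::ptr;

/// Value of word 1 in a request (hypothetical constant name).
const REQUEST: u32 = 0;
/// Value the VideoCore writes to word 1 on success.
const RESPONSE_SUCCESS: u32 = 0x8000_0000;

/// Property-interface message buffer. The mailbox needs 16-byte alignment
/// because the low 4 bits of the address carry the channel number.
/// The length of 36 words is an assumption for this sketch.
#[repr(C, align(16))]
pub struct MboxBuffer {
    pub words: [u32; 36],
}

#[derive(Debug, PartialEq)]
pub enum MboxError {
    /// Word 1 still holds the request code: no response is visible (yet).
    NoResponse,
    /// The VideoCore reported an error code.
    Failed(u32),
}

impl MboxBuffer {
    pub fn new() -> Self {
        Self { words: [0; 36] }
    }

    /// Read the response code (word 1) with a volatile load.
    /// This prevents compiler reordering/elision only, not cache hits.
    pub fn response(&self) -> Result<(), MboxError> {
        let code = unsafe { ptr::read_volatile(&self.words[1]) };
        match code {
            RESPONSE_SUCCESS => Ok(()),
            REQUEST => Err(MboxError::NoResponse),
            other => Err(MboxError::Failed(other)),
        }
    }
}
```

With cached memory and no invalidate, `response()` would keep returning `Err(MboxError::NoResponse)` even though the VideoCore has already answered.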

My current workaround/hack is to set up a region of memory that the MMU marks as device memory and use it for the mailbox messages.

https://github.com/digitalstreamio/zinc64/blob/master/zinc64-raspi/src/hal/mbox.rs

There is probably a cleaner way to do this that avoids copying the message buffer back and forth, but it mostly works for now (see below).

Interestingly enough, I think there is still a timing issue, as I have one use case (frame buffer allocation) that didn't quite work. I got it to work by putting in some delays (writes to the UART) while debugging it; consequently, the implementation referenced above contains a function call2 with the debugging code.

from rust-raspberrypi-os-tutorials.

andre-richter commented on April 28, 2024

Ah, that sounds very likely. I'll ponder a bit what a good solution in the scope of the tutorials could look like.

Thanks!

jakelstr commented on April 28, 2024

I just wanted to add that I was experimenting with the framebuffer on top of the rust code, and I ran into the exact same issue where it wouldn't work unless I put in a lot of writes to the UART. I assumed it was my fault.

andre-richter commented on April 28, 2024

A quite easy workaround for now for you guys would be to disable data caching altogether in mmu.rs: SCTLR_EL1::C::NonCacheable.

I have yet to decide how to introduce this topic in a way that makes sense didactically and is not completely out of touch with how this stuff is approached in the real world. So far I have thought about:

  • Declaring such buffers as global statics and linking them into a non-cacheable page.
  • Introducing a special allocator that allocates from a non-cacheable page at runtime.
  • Making use of flush and invalidate operations before/after each exchange with the Videocore.

binaryfields commented on April 28, 2024

The only issue with disabling data cache that I found is that it seems to break atomic ops (atomic xch/etc) which in my case were needed by Rust's allocator. It seems atomics need MMU + Data Cache.

What if you allocate a region for mbox messages in the linker definition file, similar to how you would define a region for the stack/heap, and use a reference to it (similar to how you refer to bss when you zero it out) to initialize the corresponding MMU pages and the mbox itself? This seems to fall under option 1 that you listed.

andre-richter commented on April 28, 2024

FYI, I have decided to go the allocator way, since this one comes closest to real world kernels where you would use functions called like dma_alloc() to obtain a piece of memory that can be used for these purposes. Hopefully I'll get it done within the next few days.
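As a rough idea of what such an allocator could look like, here is a minimal bump-allocator sketch. It is not the code from the tutorial; the type name, the address range, and the choice to return plain addresses are all assumptions made so that the example stays self-contained. It assumes the range handed to `new` has already been mapped non-cacheable by the MMU setup.

```rust
/// Hands out aligned chunks from a fixed address range and never frees.
/// Hypothetical type for illustration only.
pub struct DmaBumpAllocator {
    next: usize,
    end: usize,
}

impl DmaBumpAllocator {
    /// `start..end` is assumed to be mapped non-cacheable elsewhere.
    pub const fn new(start: usize, end: usize) -> Self {
        Self { next: start, end }
    }

    /// Returns the start address of `size` bytes aligned to `align`
    /// (which must be a power of two), or `None` if the request
    /// cannot be satisfied from the remaining range.
    pub fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        if !align.is_power_of_two() {
            return None;
        }
        // Round the current position up to the requested alignment.
        let start = self.next.checked_add(align - 1)? & !(align - 1);
        let end = start.checked_add(size)?;
        if end > self.end {
            return None;
        }
        self.next = end;
        Some(start)
    }
}
```

A mailbox driver could then request its message buffer with something like `alloc(36 * 4, 16)` and cast the returned address to a pointer, instead of placing the buffer on the cached stack or heap.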

I could also reproduce atomics not working when the data cache is disabled. I was actually surprised to find out about that.

binaryfields commented on April 28, 2024

Sounds good. I look forward to seeing how that would work with the allocator setup. Would you need two separate allocators, one for heap and one for device memory?

binaryfields commented on April 28, 2024

Having thought about it a bit more, I think this setup would also solve an issue that I haven't hit yet but will soon run into with EMMC when reading data from the SD card. Most likely there will be other device buffer sharing scenarios where the allocator setup will come in handy.

andre-richter commented on April 28, 2024

Here is a sneak peek: https://github.com/rust-embedded/rust-raspi3-tutorial/tree/println_dma/0F_DMA_memory

This is all still subject to change and cleanups, which will happen with force pushes at any time. I also still need to write the prose.

Output currently only works on a real RPi, because I haven't come up with a solution yet to multiplex two virtual UARTs to STDIO using QEMU.

The idea is to rebase the exception lesson on top of the DMA_memory lesson, then add a lesson that transforms the NullLocks into real Spinlocks (for which we need the MMU first in order to use atomics), then enable interrupts:

  • 0E println
  • 0F DMA memory
  • 10 Exceptions
  • 11 Spinlocks and Critical Sections
  • 12 Interrupts

Cheers,
Andre
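For readers wondering why spinlocks depend on working atomics, here is a minimal spinlock sketch. It is not the planned lesson code; the `Spinlock` type and its closure-based `lock` method are assumptions for illustration. The acquire loop relies on an atomic compare-and-exchange, which on AArch64 is typically lowered to exclusive load/store pairs (or LSE atomics where available), i.e. exactly the instructions that need the MMU and data cache on the RPi3.

```rust
use core::cell::UnsafeCell;
use core::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical minimal spinlock guarding a value of type `T`.
pub struct Spinlock<T> {
    locked: AtomicBool,
    data: UnsafeCell<T>,
}

// Safe to share between cores/threads because access to `data`
// is serialized through `locked`.
unsafe impl<T: Send> Sync for Spinlock<T> {}

impl<T> Spinlock<T> {
    pub const fn new(data: T) -> Self {
        Self {
            locked: AtomicBool::new(false),
            data: UnsafeCell::new(data),
        }
    }

    /// Spin until the lock is acquired, run `f` with exclusive access,
    /// then release. If `f` panics the lock stays held; a real
    /// implementation would use a guard with `Drop`.
    pub fn lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            core::hint::spin_loop();
        }
        let result = f(unsafe { &mut *self.data.get() });
        self.locked.store(false, Ordering::Release);
        result
    }
}
```

Usage would look like `COUNTER.lock(|c| *c += 1)` for some `static COUNTER: Spinlock<u32>`.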

jakelstr commented on April 28, 2024

If one of you has a moment, could you explain why atomics don't work without data cache, or possibly direct me to something to read? I'm fairly new to this advanced level and I'm trying to learn all I can.

andre-richter commented on April 28, 2024

Unfortunately, this is a very complicated topic in terms of processor architecture, very dependent on the implementer of the SoC and on top of that, poorly documented.

On AArch64, exclusive access operations are realized using so-called monitors.
There are two kinds of them, local monitors and global monitors. I recommend you take a look at the ARMv8 Architecture Reference Manual, and additionally read https://stackoverflow.com/questions/23914222/arm-is-ldrx-strx-needed-if-interrupts-are-disabled.

What I can say is, if you execute exclusive store/load instructions on the RPi3 with exceptions enabled but MMU disabled, you get a synchronous abort (you can test it by modifying lesson 0E):

A synchronous exception happened.
    ELR_EL1: 0x80584
    ESR_EL1: 0x96000035

$ make objdump | rg 80584
   80584:       6e 7d 5f c8     ldxr    x14, [x11]

Upon further inspection of the given ESR_EL1, you can see that ESR_EL1.DFSC = 110101, which is an IMPLEMENTATION DEFINED fault (Unsupported Exclusive or Atomic access).

See pages 132 and 2463 of the ARMv8 Architecture Reference Manual, where these encodings are documented.
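The decoding of the ESR_EL1 value above can be reproduced with a few shifts and masks. The following is a small sketch; the `Esr` struct and `decode_esr` function are hypothetical names, and only the fields relevant here are extracted (EC in bits 31:26, IL in bit 25, and DFSC in bits 5:0 of the ISS for data aborts).

```rust
/// Subset of ESR_EL1 fields relevant to a data abort (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Esr {
    /// Exception Class, bits [31:26].
    pub ec: u8,
    /// Instruction Length bit, bit [25].
    pub il: bool,
    /// Data Fault Status Code, bits [5:0] of the ISS.
    pub dfsc: u8,
}

pub fn decode_esr(esr: u64) -> Esr {
    Esr {
        ec: ((esr >> 26) & 0x3f) as u8,
        il: ((esr >> 25) & 1) == 1,
        dfsc: (esr & 0x3f) as u8,
    }
}
```

For 0x96000035 this gives EC = 0b100101 (Data Abort taken without a change in Exception level) and DFSC = 0b110101, matching the value quoted above.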

andre-richter commented on April 28, 2024

Fixed by 48cf71b.

@digitalstreamio @jakelstr FYI.

binaryfields commented on April 28, 2024

I really like your separate dma allocator setup. The code and its structure look great. It keeps with the clean and easy to follow spirit of other lessons.

I have one question about the memory map that you may be able to shed some light on. I'm a bit confused about when to put things into the linker definition versus hardcoding them in the source code. Any thoughts on that?

Also, I'm looking forward to your synchronization tutorial. I would love to see it bring up multicore setup.

Thanks

andre-richter commented on April 28, 2024

I would tend to say that if something is global (in Rust speak: static) and you are confident that it is device-independent (e.g. on Cortex-A systems there should always be enough DRAM to have a stack from 0 to 0x8000), you can hardcode it in the linker script.

For everything else, e.g. the maximum amount of DRAM on a system, device MMIO addresses, etc., make your code dynamic.
In real embedded systems, all the device-specific stuff usually goes into so-called device tree blobs, which are parsed by the OS and bootloader during early boot.
