Comments (10)

taiki-e commented on July 1, 2024

The code was compiled with #![no_std] so it couldn't be the standard lib.

To be clear: I think "standard library" in my comment was misleading (I also meant core and alloc), but even with no_std, core is always linked, and it is also common to link to alloc. In riscv64gc-unknown-none-elf, both are precompiled libraries by default.

from spin-rs.

zesterer commented on July 1, 2024

This seems to me like a bug. I'm going to investigate this today.

zesterer commented on July 1, 2024

I've gone over the implementation quite carefully. Is there anything unusual about the platform you're running on? Perhaps unusual or non-existent support for atomics, or a CPU that starts with SMP enabled by default?

Lorilandly commented on July 1, 2024

Here is my qemu command

qemu-system-riscv64 -M virt -smp 1 -m 128M \
            -display none -serial stdio \
            -bios none -kernel [PATH_TO_ELF]

The ELF is compiled for the target riscv64gc-unknown-none-elf with the following rustflag:

-Ctarget-feature=-c

I don't think there's anything unusual about the configuration. The flags are set in the .cargo/config.toml file. I tried on macOS and Ubuntu, and the error is the same.

zesterer commented on July 1, 2024

Is it possible that you have something like an interrupt handler being invoked at an unexpected time?

Lorilandly commented on July 1, 2024

Just for debugging, I've changed the interrupt handler to

  .align 2
trap_entry:
  j 3f

3: // loop infinitely
  wfi
  j 3b

If an interrupt fires, execution should end up in this dead loop. Yet no interrupt is triggered, and the code still panics.

taiki-e commented on July 1, 2024

(Disclaimer: I don't know what linker scripts are being used or what assembly is being generated, so I can only speak to some guesses.)

First, one of the differences between lazy_static and Lazy is that lazy_static does not store initialization code in static, while Lazy stores initialization code in static as a function pointer (as a field of Lazy). This gives the following differences:

  • Lazy at least is aligned to the same alignment as the function pointer. (AFAIK, except for AVR, the function pointer alignment is equal to its size, i.e., greater than 8-bit)
    • This probably optimizes the alignment calculation in the 8-bit CAS used in Once. [1]
  • In some environments, storing the function pointer in a static triggers a compiler bug. However, AFAIK, this should not affect riscv*-unknown-none-elf.
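
To illustrate the alignment calculation mentioned in the list above, here is a minimal sketch. It assumes a little-endian target and emulates an 8-bit compare-and-swap on top of a 32-bit atomic word, which is roughly the shape footnote 1 describes. `LazyLike`, `word_and_shift` and `cas_u8_in_word` are hypothetical names made up for this example; this is not the code used by spin or generated by the compiler.

```rust
use core::sync::atomic::{AtomicU32, AtomicU8, Ordering};

/// Hypothetical stand-in for a Lazy-style type: a status byte plus an
/// initializer stored as a function pointer. The struct as a whole is at
/// least as aligned as the function pointer it contains.
#[allow(dead_code)]
struct LazyLike {
    status: AtomicU8,
    init: fn() -> u32,
}

/// Split a byte address into the address of the containing 4-byte-aligned
/// word and the bit offset of the byte inside that word (little-endian).
fn word_and_shift(byte_addr: usize) -> (usize, u32) {
    let word_addr = byte_addr & !0b11;
    let shift = ((byte_addr & 0b11) * 8) as u32;
    (word_addr, shift)
}

/// Emulate an 8-bit compare-and-swap using a 32-bit atomic word.
/// Returns Ok(previous byte) on success and Err(actual byte) on mismatch.
fn cas_u8_in_word(word: &AtomicU32, shift: u32, current: u8, new: u8) -> Result<u8, u8> {
    let mask = 0xffu32 << shift;
    let mut old = word.load(Ordering::Relaxed);
    loop {
        let byte = ((old & mask) >> shift) as u8;
        if byte != current {
            return Err(byte);
        }
        // Replace only the target byte and keep the other three untouched.
        let replaced = (old & !mask) | ((new as u32) << shift);
        match word.compare_exchange_weak(old, replaced, Ordering::AcqRel, Ordering::Acquire) {
            Ok(_) => return Ok(byte),
            Err(actual) => old = actual, // another byte changed or spurious failure: retry
        }
    }
}
```

If the byte is known to sit at the start of an aligned word, `word_and_shift` always yields a shift of 0, which is the kind of simplification the list item hints at. If the compiler assumes such alignment and the address is in fact misaligned, the computed word address or shift would be wrong.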

Second, rustc does not automatically adjust sp alignment in riscv*-unknown-none-elf at startup. If your code or the library you are using does not adjust the alignment here, the result can be various bugs due to misalignment. (rust-lang/rust#86693) For example:

  • The compiler performs optimization based on the assumption that references are properly aligned, which can lead to incorrect calculations in the alignment calculation mentioned above.
  • In most architectures including RISC-V, one of the requirements for memory access to be atomic is that addresses be properly aligned. [2] Misalignment can silently cause loss of atomicity.

As mentioned at the outset, I'm not certain that misalignment is the cause of the problem, but the fact is that I could not reproduce this problem in an environment [3] where sp is properly [4] aligned.
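
As a rough illustration of what "properly aligned" means for sp, the following sketch checks and rounds down an address to the stack alignment quoted in footnote 4 (16 bytes on riscv64, 8 bytes on riscv32). The constant and helper names are hypothetical and exist only for this example. Real startup code would normally do this in assembly before any Rust code runs.

```rust
/// Stack alignment values as quoted in footnote 4 (assumed here, not derived).
const STACK_ALIGN_RV64: u64 = 16;
const STACK_ALIGN_RV32: u64 = 8;

/// Returns true if `addr` is a multiple of `align` (a power of two).
fn is_aligned(addr: u64, align: u64) -> bool {
    debug_assert!(align.is_power_of_two());
    addr & (align - 1) == 0
}

/// Rounds `addr` down to a multiple of `align` (a power of two).
/// The stack grows toward lower addresses, so rounding down keeps the
/// adjusted stack pointer inside the region reserved for the stack.
fn align_down(addr: u64, align: u64) -> u64 {
    debug_assert!(align.is_power_of_two());
    addr & !(align - 1)
}
```

For example, an initial stack top of `0x8000_1238` is not 16-byte aligned, and `align_down(0x8000_1238, STACK_ALIGN_RV64)` gives `0x8000_1230`.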


-Ctarget-feature=-c

BTW, probably not related to this problem, but I guess that if you have not recompiled the standard library with this flag, the standard library compiled with the c target-feature will be linked, so the binary will have a mix of compressed parts and non-compressed parts.

Footnotes

  1. RISC-V's {8,16}-bit CAS/RMW are internally emulated by 32-bit LR/SC, so alignment calculation is required

  2. The proper alignment here is usually the same as the size. Note that this may be larger than the alignment of the Rust integer type.

  3. The minimal setup (two linker scripts (1, 2) and one line of initialization code) to test riscv*-unknown-none-elf with QEMU used in portable-atomic/semihosting did not reproduce this problem.

  4. The proper alignment here is 16 bytes (riscv64) / 8 bytes (riscv32): https://github.com/riscv-non-isa/riscv-eabi-spec/blob/HEAD/EABI.adoc#eabi-stack-alignment

Lorilandly commented on July 1, 2024

Thank you so much for this detailed explanation. The code was compiled with #![no_std] so it couldn't be the standard lib. After some troubleshooting I can confirm that the issue is caused by bad alignment. However, it wasn't sp, but the alignment of the .data section, for whatever reason. I had:

.data : {
    . = ALIGN(4096);
    PROVIDE(_data_start = .);
    *(.sdata .sdata.*) *(.data .data.*)
    PROVIDE(_data_end = .);
} >RAM AT>RAM :data

When I changed it to

.data : ALIGN(4096) {
    PROVIDE(_data_start = .);
    *(.sdata .sdata.*) *(.data .data.*)
    PROVIDE(_data_end = .);
} >RAM AT>RAM :data

the problem was solved. Some quirk with the linker!

Again, thank you so much for dealing with this bug that was caused by myself after all!

Lorilandly commented on July 1, 2024

Oh ok great! That actually answers why I was having some strange issues with core libs when I tried to run the binary on the real chip. Again, thanks for the clarification 😄

zesterer commented on July 1, 2024

Woo, good to hear this is solved! I was worried this might have been a soundness issue in spin for a moment 😀

Thanks @taiki-e for providing your knowledge!
