
Comments (10)

nilres commented on May 5, 2024

@TheGuyDanish thanks for the info that your card failed the same way.

Getting the kernel panic data via a serial console connection was also my idea, but yesterday it was too late. I will try to find my USB<->UART adapter this evening and capture the kernel panic output. I'll keep you updated.

from raspberry-pi-pcie-devices.

geerlingguy commented on May 5, 2024

Oliver mentioned I should look for the aacraid driver in the mainline kernel. Should be a recompile away!


nilres commented on May 5, 2024

I got the exact same card and tried plugging it into my Compute Module 4, but on boot Raspbian (32-bit) spat out a lot of kernel errors and panicked. I then plugged it into an externally powered riser card, just to rule out a power delivery problem, but that yielded the same result.
I then tried a 64-bit Ubuntu installation. There the boot just stopped on the multi-color screen. So no immediate luck on my side with this card. Has your card already arrived? I would be interested to know whether you see the same issues. Since mine comes from eBay, I'm not 100% certain the card works; maybe I should check it in an x86 machine...


TheGuyDanish commented on May 5, 2024

This was my observation too. With or without the aacraid module in the kernel, it immediately causes a panic. I didn't have enough time to troubleshoot it myself, which is why I sent it over to Jeff. I think the best step forward from here is to get a Pi with the card on it hooked up to UART and dump the full output from boot. Hopefully that can give a better idea of what exactly is causing the kernel to dislike the card.


geerlingguy commented on May 5, 2024

@TheGuyDanish - I'll try to get to it sometime this month. Just a bit backlogged catching up on some other projects still :(

Hopefully there's some good data I can get out via UART.


TheGuyDanish commented on May 5, 2024

> @TheGuyDanish - I'll try to get to it sometime this month. Just a bit backlogged catching up on some other projects still :(

All good man, don't be hard on yourself, I'm happy to wait! I'm in the process of moving anyway, so I've got plenty to look at myself.


nilres commented on May 5, 2024

I found my adapter, figured out the pinout, and got it working, so I now have a console on the serial port. I thought I was good to go capturing the data, but unfortunately the panic output is not printed to the serial console (maybe the panic happens before the serial port is configured in the kernel; I'm not sure what the order is).
My next idea was to enable kernel debug output to the serial port via kgdboc=serial0,115200 (and a lot of variations), but with that the kernel panics even with no card plugged in.

Any ideas what I could try to get the debug data?


nilres commented on May 5, 2024

I also tried setting "earlycon=uart8250,mmio32,0xfe215040" in my cmdline.txt, and that actually helped to get at least some output from the early boot process, but unfortunately the panic was still not printed to the serial port.

I had a quick read through the log but wasn't able to spot anything that hints at what could be going wrong. I uploaded the log here anyway in case anyone else spots something: https://gist.github.com/nilres/b2de8d64317d667dfd6fb70b8b04dae3
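For anyone repeating the earlycon step, here is a minimal Python sketch of adding a parameter to the single-line kernel command line without duplicating it. The function name add_kernel_param and the example command line are made up for illustration; only the earlycon value is taken from the comment above.

```python
def add_kernel_param(cmdline: str, param: str) -> str:
    """Append param to a kernel command line unless its key is already present.

    The result is re-joined with single spaces, because cmdline.txt is
    expected to hold everything on one line.
    """
    tokens = cmdline.split()
    key = param.split("=", 1)[0]
    if any(tok.split("=", 1)[0] == key for tok in tokens):
        return " ".join(tokens)
    return " ".join(tokens + [param])


# Hypothetical starting command line, not taken from the thread.
example = "console=serial0,115200 console=tty1 root=/dev/mmcblk0p2 rootwait"
updated = add_kernel_param(example, "earlycon=uart8250,mmio32,0xfe215040")
print(updated)
```

Calling the function a second time with the same parameter leaves the line unchanged, so it is safe to re-run.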


nilres commented on May 5, 2024

Got it!
Disabling the console=tty parameter in the cmdline leads to a full dump of everything on the serial console. I'm sorry that I spammed this thread. But here is a full gist of the failing kernel boot with the adapter plugged in: https://gist.github.com/nilres/0be7f407bf76c4537a09a3989c48d5f9
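As a rough illustration of the change described above, the following Python sketch drops virtual-terminal console entries from a command line string while leaving serial consoles alone. It assumes the entry looks like console=tty1 (the thread only says "console=tty"); the function name drop_tty_console and the sample command lines are hypothetical.

```python
import re


def drop_tty_console(cmdline: str) -> str:
    """Remove console=tty / console=ttyN tokens from a kernel command line.

    Serial consoles such as console=serial0,115200 or console=ttyS0,115200
    do not match the pattern and are kept.
    """
    kept = [
        tok for tok in cmdline.split()
        if not re.fullmatch(r"console=tty\d*", tok)
    ]
    return " ".join(kept)


# Hypothetical command line, not taken from the thread.
before = "console=serial0,115200 console=tty1 root=/dev/mmcblk0p2 rootwait"
after = drop_tty_console(before)
print(after)
```

The pattern is deliberately strict so that device names which merely start with "tty" are not removed by accident.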

And probably the most important parts:

[ 1.873603] pci 0000:00:00.0: bridge configuration invalid ([bus ff-ff]), reconfiguring
[ 1.908874] 8<--- cut here ---
[ 1.911963] Unhandled fault: asynchronous external abort (0x1211) at 0x00000000
[ 1.919353] pgd = (ptrval)
[ 1.922089] [00000000] *pgd=80000000004003, *pmd=00000000
[ 1.927568] Internal error: : 1211 [#1] SMP ARM
[ 1.932148] Modules linked in:
[ 1.935246] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.10.11-v7l+ #1399
[ 1.942019] Hardware name: BCM2711
[ 1.945474] PC is at pci_generic_config_read+0x44/0xa0
[ 1.950669] LR is at 0xc1bf6200
[ 1.953844] pc : [] lr : [] psr: 20000093
[ 1.960180] sp : c18f9b18 ip : f0830000 fp : c18f9b2c
[ 1.965461] r10: 00000000 r9 : 50000013 r8 : 00000000
[ 1.970743] r7 : c18f9bc8 r6 : c1bf6800 r5 : c07947c0 r4 : 00000004
[ 1.977341] r3 : deaddead r2 : 00008000 r1 : f0839000 r0 : f0838000
[ 1.983942] Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
[ 1.991245] Control: 30c5383d Table: 00003000 DAC: 55555555
[ 1.997054] Process swapper/0 (pid: 1, stack limit = 0x(ptrval))
[ 2.003127] Stack: (0xc18f9b18 to 0xc18fa000)

[ 2.338326] Backtrace:
[ 2.340818] [] (pci_generic_config_read) from [] (pci_bus_read_config_dword+0x80/0xbc)
[ 2.350581] r5:c07947c0 r4:c1205048
[ 2.354205] [] (pci_bus_read_config_dword) from [] (pci_bus_generic_read_dev_vendor_id+0x34/0x184)
[ 2.365023] r9:c1bf6800 r8:0000ea60 r7:00000000 r6:c18f9bc8 r5:c18f9bc8 r4:00000000
[ 2.372857] [] (pci_bus_generic_read_dev_vendor_id) from [] (pci_bus_read_dev_vendor_id+0x58/0x6c)
[ 2.383675] r10:00000000 r9:c1be9000 r8:00000000 r7:271114e4 r6:0000ea60 r5:c18f9bc8
[ 2.391590] r4:00000000
[ 2.394156] [] (pci_bus_read_dev_vendor_id) from [] (pci_scan_single_device+0x70/0xd4)
[ 2.403918] r7:00000000 r6:c1bf6800 r5:00000000 r4:c1205048
[ 2.409645] [] (pci_scan_single_device) from [] (pci_scan_slot+0x4c/0x110)
[ 2.418352] r8:c1bf6400 r7:00000000 r6:c0e66820 r5:c1bf6800 r4:c1bf6800
[ 2.425133] [] (pci_scan_slot) from [] (pci_scan_child_bus_extend+0x50/0x2ac)
[ 2.434104] r7:00000000 r6:c0e66820 r5:00000001 r4:c1bf6800
[ 2.439831] [] (pci_scan_child_bus_extend) from [] (pci_scan_bridge_extend+0x308/0x688)
[ 2.449682] r10:c1205048 r9:c1be9000 r8:c1bf6400 r7:00000000 r6:c1bf6800 r5:00000001
[ 2.457598] r4:00000001
[ 2.460162] [] (pci_scan_bridge_extend) from [] (pci_scan_child_bus_extend+0x1fc/0x2ac)
[ 2.470013] r10:00000000 r9:00000000 r8:00000001 r7:00000000 r6:c1bf6414 r5:c1be9000
[ 2.477928] r4:c1bf6400
[ 2.480493] [] (pci_scan_child_bus_extend) from [] (pci_scan_root_bus_bridge+0x70/0xdc)
[ 2.490345] r10:00000000 r9:c1a72410 r8:eff72634 r7:c1b9ec40 r6:c1bf6200 r5:c1bf6000
[ 2.498259] r4:c1bf6000
[ 2.500825] [] (pci_scan_root_bus_bridge) from [] (pci_host_probe+0x1c/0xa0)
[ 2.509706] r5:c1bf6000 r4:c1bf6000
[ 2.513326] [] (pci_host_probe) from [] (brcm_pcie_probe+0x318/0x42c)
[ 2.521595] r7:c1b9ec40 r6:c1bf6200 r5:c1a72400 r4:c1bf6000
[ 2.527324] [] (brcm_pcie_probe) from [] (platform_drv_probe+0x58/0xa8)
[ 2.535770] r10:00000000 r9:00000000 r8:c12e8268 r7:00000000 r6:c12e8268 r5:00000000
[ 2.543685] r4:c1a72410
[ 2.546248] [] (platform_drv_probe) from [] (really_probe+0x100/0x3c8)
[ 2.554605] r7:00000000 r6:c13fd634 r5:c13fd62c r4:c1a72410
[ 2.560329] [] (really_probe) from [] (driver_probe_device+0x6c/0xc4)
[ 2.568599] r10:c1353000 r9:c1053854 r8:00000132 r7:c08412bc r6:c12e8268 r5:c12e8268
[ 2.576514] r4:c1a72410 r3:00000000
[ 2.580132] [] (driver_probe_device) from [] (device_driver_attach+0x68/0x70)
[ 2.589102] r5:00000000 r4:c1a72410
[ 2.592719] [] (device_driver_attach) from [] (__driver_attach+0x68/0xdc)
[ 2.601338] r7:c08412bc r6:c1a72410 r5:c12e8268 r4:00000000
[ 2.607066] [] (__driver_attach) from [] (bus_for_each_dev+0x84/0xc4)
[ 2.615335] r7:c08412bc r6:c12e8268 r5:c1205048 r4:c1a64eb4
[ 2.621064] [] (bus_for_each_dev) from [] (driver_attach+0x2c/0x30)
[ 2.629157] r7:00000000 r6:c2073e00 r5:c12f0da8 r4:c12e8268
[ 2.634885] [] (driver_attach) from [] (bus_add_driver+0x1c8/0x1e8)
[ 2.642984] [] (bus_add_driver) from [] (driver_register+0x84/0x118)
[ 2.651164] r7:00000000 r6:c1053834 r5:c10278e4 r4:c12e8268
[ 2.656889] [] (driver_register) from [] (__platform_driver_register+0x50/0x58)
[ 2.666034] r5:c10278e4 r4:c1205048
[ 2.669657] [] (__platform_driver_register) from [] (brcm_pcie_driver_init+0x24/0x28)
[ 2.679338] [] (brcm_pcie_driver_init) from [] (do_one_initcall+0x50/0x264)
[ 2.688141] [] (do_one_initcall) from [] (kernel_init_freeable+0x258/0x2b8)
[ 2.696937] r8:00000132 r7:c180f800 r6:c1053834 r5:00000007 r4:c1088de4
[ 2.703718] [] (kernel_init_freeable) from [] (kernel_init+0x18/0x130)
[ 2.712076] r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0b61d00
[ 2.719991] r4:00000000
[ 2.722556] [] (kernel_init) from [] (ret_from_fork+0x14/0x28)
[ 2.730209] Exception stack(0xc18f9fb0 to 0xc18f9ff8)
[ 2.735316] 9fa0: 00000000 00000000 00000000 00000000
[ 2.743586] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 2.751856] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 2.758543] r5:c0b61d00 r4:00000000
[ 2.762160] Code: e3540002 0a000005 e5903000 f57ff04f (e59b2004)
[ 2.768333] ---[ end trace d6064a47070472fc ]---
[ 2.773007] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Looking at the kernel sources and figuring out what is happening here has to wait until the weekend. I also have not looked at any Linux kernel code for more than a year, so don't expect too much.
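For reference, the fault code 0x1211 in the "Unhandled fault" line can be unpacked by hand. The following is a minimal Python sketch, assuming the kernel uses the ARMv7 long-descriptor (LPAE) fault status format; the v7l kernel name and the pgd/pmd line suggest this, but the log does not state it outright. The function name decode_fsr and the lookup table are illustrative, not kernel APIs, and the table lists only a few status codes.

```python
# A few STATUS values of the ARMv7 long-descriptor fault status format.
# This table is intentionally incomplete.
LPAE_STATUS = {
    0x10: "synchronous external abort",
    0x11: "asynchronous external abort",
    0x21: "alignment fault",
}


def decode_fsr(fsr: int) -> dict:
    """Split a fault status value into the fields used by this sketch."""
    long_descriptor = bool(fsr & (1 << 9))   # LPAE bit: long-descriptor format
    status = fsr & 0x3F                      # STATUS, bits [5:0] (LPAE format only)
    return {
        "long_descriptor": long_descriptor,
        "ext": bool(fsr & (1 << 12)),        # ExT, implementation-defined abort type
        "status": status,
        "meaning": LPAE_STATUS.get(status, "not in this table") if long_descriptor
                   else "short-descriptor format, not handled here",
    }


decoded = decode_fsr(0x1211)
print(decoded)
```

With the value from the log, the LPAE bit is set and the status field is 0x11, which the table maps to "asynchronous external abort", matching the text the kernel printed.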


nilres commented on May 5, 2024

What I found: From the stack trace it is obvious that the kernel is iterating over the PCIe bus, querying each device and asking it for its vendor ID (pci_bus_read_dev_vendor_id). While doing so the processor encounters an asynchronous external abort. If my understanding is correct, this means something bad (e.g. a failed bus access) happened and was reported asynchronously, i.e. not tied to the exact instruction that caused it. My guess would be that this is the interrupt handling: the kernel asks the PCIe device for the vendor ID, the device answers with something the kernel (or maybe even the hardware?) isn't capable of handling, and this abort exception is thrown.
Since this is all asynchronous, it is also possible that the root cause of this error is something completely different, but since plugging in this PCIe device leads to the error, I think there is enough evidence that the error is related to the device initialization.
I spent an hour trying to understand how the kernel implements the ARM interrupt handling, but I don't think I have the time at the moment to learn all this from scratch. If someone has a nice idea, or maybe a patch that helps gather more debug information, I'm willing to compile it and test it out.
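To make the call path from the backtrace easier to read, here is a small Python sketch that pulls the function names out of "(callee) from [...] (caller+0x.../0x...)" lines. The sample lines are copied from the log in this thread; the function name call_chain and the regular expression are illustrative assumptions about the log layout, not an existing tool.

```python
import re

# Matches "(callee) from [<addr>] (caller+0xOFF/0xSIZE)"; the bracket
# contents may be empty, as in the pasted log.
FRAME = re.compile(r"\((\w+)\) from \[.*?\] \((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\)")

SAMPLE = """\
[ 2.340818] [] (pci_generic_config_read) from [] (pci_bus_read_config_dword+0x80/0xbc)
[ 2.350581] r5:c07947c0 r4:c1205048
[ 2.354205] [] (pci_bus_read_config_dword) from [] (pci_bus_generic_read_dev_vendor_id+0x34/0x184)
[ 2.372857] [] (pci_bus_generic_read_dev_vendor_id) from [] (pci_bus_read_dev_vendor_id+0x58/0x6c)
[ 2.394156] [] (pci_bus_read_dev_vendor_id) from [] (pci_scan_single_device+0x70/0xd4)
"""


def call_chain(log_text: str) -> list:
    """Return function names from innermost frame to outermost caller."""
    chain = []
    for line in log_text.splitlines():
        match = FRAME.search(line)
        if match is None:
            continue  # register dump lines and other noise
        callee, caller = match.groups()
        if not chain:
            chain.append(callee)
        chain.append(caller)
    return chain


chain = call_chain(SAMPLE)
print(" <- ".join(chain))
```

Run over the full log, the same function would list the whole path from pci_generic_config_read up through brcm_pcie_probe to kernel_init.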

