Comments (26)
@vegedb - Right now my goal is to get any card working, and some seem to have a better chance than others :D
Once we can prove that some card actually works, then at that point, I'll start thinking about use cases like media transcoding, AI/ML, gaming, etc.
from raspberry-pi-pcie-devices.
So... I'm trying this again today. I thought I'd be wrapping up work on a video for Friday, but as always, things behave differently if you look at them sideways. In this case I have a camera on it, so it's doing different things than it did late last year.
This time around, since the BAR space issue was resolved in newer Pi OS kernels, and since the 64-bit Pi OS now has a proper headers package that can be installed via apt instead of being compiled by hand, I'm taking another stab at installing Nvidia's latest proprietary AArch64 driver from https://www.nvidia.com/en-us/drivers/unix/linux-aarch64-archive/
First I flashed my Pi's drive with the 64-bit beta release, then I booted it and ran:
sudo apt-get update
sudo apt-get -y dist-upgrade
sudo apt-get install -y raspberrypi-kernel-headers
sudo reboot
If an X server is running (i.e. you're logged into a GUI) and you can't log out of it, run this from SSH or a terminal: sudo systemctl stop lightdm
Nvidia's driver can't be installed while an X server is running.
I copied Nvidia's driver .run file to the Pi, then I ran:
chmod +x NVIDIA-Linux-aarch64-450.119.03.run
sudo ./NVIDIA-Linux-aarch64-450.119.03.run
After a reboot, everything seemed to be coming up, and then about 30 seconds later, before X logged me in, I got the following via dmesg. Thankfully the whole system didn't lock up, and I could still access the Pi via SSH:
[ 39.313959] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000b7
[ 39.313974] Mem abort info:
[ 39.313983] ESR = 0x96000005
[ 39.313995] EC = 0x25: DABT (current EL), IL = 32 bits
[ 39.314003] SET = 0, FnV = 0
[ 39.314012] EA = 0, S1PTW = 0
[ 39.314019] Data abort info:
[ 39.314027] ISV = 0, ISS = 0x00000005
[ 39.314036] CM = 0, WnR = 0
[ 39.314047] user pgtable: 4k pages, 39-bit VAs, pgdp=00000000483fe000
[ 39.314056] [00000000000000b7] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[ 39.314096] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[ 39.314102] Modules linked in: rfcomm bnep hci_uart btbcm bluetooth ecdh_generic ecc 8021q garp stp llc nvidia_drm(PO) nvidia_modeset(PO) nvidia(PO) brcmfmac brcmutil vc4 cec cfg80211 v3d bcm2835_v4l2(C) bcm2835_codec(C) bcm2835_isp(C) raspberrypi_hwmon gpu_sched rfkill videobuf2_vmalloc drm_kms_helper bcm2835_mmal_vchiq(C) v4l2_mem2mem videobuf2_dma_contig snd_bcm2835(C) videobuf2_memops videobuf2_v4l2 videobuf2_common vc_sm_cma(C) videodev drm mc drm_panel_orientation_quirks snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_timer snd rpivid_mem syscopyarea sysfillrect sysimgblt fb_sys_fops backlight uio_pdrv_genirq uio i2c_dev aes_neon_bs sha256_generic aes_neon_blk crypto_simd cryptd ip_tables x_tables ipv6
[ 39.314353] CPU: 2 PID: 528 Comm: Xorg Tainted: P C O 5.10.17-v8+ #1403
[ 39.314358] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[ 39.314368] pstate: 40000005 (nZcv daif -PAN -UAO -TCO BTYPE=--)
[ 39.318042] pc : _nv036670rm+0x0/0x110 [nvidia]
[ 39.321855] lr : _nv029673rm+0x11c/0x9a0 [nvidia]
[ 39.321862] sp : ffffffc012a4b500
[ 39.321867] x29: ffffffc012a4b500 x28: ffffff8047215808
[ 39.321879] x27: ffffff8048754008 x26: ffffff8047215808
[ 39.321890] x25: ffffff8043c8c008 x24: ffffff8043c8c008
[ 39.321900] x23: ffffff8047215808 x22: 0000000000000000
[ 39.321911] x21: ffffff8048754008 x20: ffffff8047216008
[ 39.321922] x19: ffffffc00a1b4000 x18: 0000000000800000
[ 39.321932] x17: ffffffc0095de650 x16: ffffffc0095de718
[ 39.321943] x15: ffffffc0095defe0 x14: ffffffc0095df0f8
[ 39.321953] x13: ffffffc0095df128 x12: ffffffc0113d3a80
[ 39.321964] x11: 0000000000019560 x10: 0000000000000000
[ 39.321974] x9 : ffffffc0091c4b1c x8 : 0000000000000000
[ 39.321985] x7 : 0000000000000001 x6 : ffffffc012a4b450
[ 39.321995] x5 : 0000000000000000 x4 : ffffffc009535500
[ 39.322006] x3 : ffffffc009539278 x2 : 0000000000000001
[ 39.322016] x1 : ffffffc0097ba228 x0 : ffffff8047216008
[ 39.322028] Call trace:
[ 39.325874] _nv036670rm+0x0/0x110 [nvidia]
[ 39.329692] _nv029705rm+0x1c4/0x2e0 [nvidia]
[ 39.333476] _nv029672rm+0x5c/0x238 [nvidia]
[ 39.337292] _nv030359rm+0x90/0x130 [nvidia]
[ 39.341076] _nv009458rm+0x5c/0x790 [nvidia]
[ 39.344901] _nv019777rm+0xb8/0x1a8 [nvidia]
[ 39.348712] _nv020023rm+0x28/0x78 [nvidia]
[ 39.352566] _nv000732rm+0xee4/0x19e0 [nvidia]
[ 39.356416] rm_init_adapter+0xa8/0xb8 [nvidia]
[ 39.360232] nv_open_device+0x420/0x6e8 [nvidia]
[ 39.364044] nvidia_open+0x100/0x3a0 [nvidia]
[ 39.367851] nvidia_frontend_open+0x74/0xc0 [nvidia]
[ 39.367873] chrdev_open+0xb0/0x1a8
[ 39.367883] do_dentry_open+0x134/0x398
[ 39.367892] vfs_open+0x34/0x40
[ 39.367900] path_openat+0xa24/0xe20
[ 39.367907] do_filp_open+0x84/0x100
[ 39.367915] do_sys_openat2+0x1f8/0x2a8
[ 39.367921] do_sys_open+0x60/0xa8
[ 39.367927] __arm64_sys_openat+0x2c/0x38
[ 39.367939] el0_svc_common.constprop.2+0xac/0x1d0
[ 39.367947] do_el0_svc+0x2c/0x98
[ 39.367956] el0_svc+0x20/0x30
[ 39.367964] el0_sync_handler+0x90/0xb8
[ 39.367970] el0_sync+0x174/0x180
[ 39.367986] Code: a94363b7 17ffffbc 52800580 17ffffbc (3942d844)
[ 39.367997] ---[ end trace e68ea2ca7b20909f ]---
The attached display (HDMI0 on the Pi) showed a solid cursor, so it seems the display system may still have locked up.
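For anyone reading the oops above, the ESR line can be unpacked by hand. Here is a minimal Python sketch, assuming the standard ARMv8-A ESR_ELx field layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0). The `decode_esr` helper and the small fault-code table are illustrative names, not part of any kernel tooling.

```python
# Decode an ARMv8-A ESR value such as the 0x96000005 in the oops above.
# Only the translation-fault codes are listed here; other codes print "other".
DFSC_NAMES = {
    0b000100: "translation fault, level 0",
    0b000101: "translation fault, level 1",
    0b000110: "translation fault, level 2",
    0b000111: "translation fault, level 3",
}


def decode_esr(esr: int) -> dict:
    """Split an ESR value into the fields the kernel prints in 'Mem abort info'."""
    iss = esr & 0x1FFFFFF               # instruction specific syndrome, bits [24:0]
    return {
        "ec": (esr >> 26) & 0x3F,       # exception class, bits [31:26]
        "il": (esr >> 25) & 0x1,        # 1 means a 32-bit instruction
        "iss": iss,
        "wnr": (iss >> 6) & 0x1,        # data aborts only: 1 = write, 0 = read
        "dfsc": iss & 0x3F,             # data fault status code
    }


fields = decode_esr(0x96000005)
print(hex(fields["ec"]), fields["il"], hex(fields["iss"]), fields["wnr"])
print(DFSC_NAMES.get(fields["dfsc"], "other"))
```

The decoded fields line up with the log: EC 0x25 (data abort at the current exception level), a read rather than a write, and a translation fault, which is consistent with the NULL-ish address 0xb7 being dereferenced inside the nvidia module.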
from raspberry-pi-pcie-devices.
Nice, thanks Jeff!
from raspberry-pi-pcie-devices.
This is awesome, but it would be even more interesting to get a Quadro P400 working for Plex/Emby servers. It's the cheapest Quadro for transcoding, costing around 50-100 on eBay. I feel that if this works, it will be the ultimate low-power Plex server.
Benefits:
Power
P400 = 30W
GTX 750 Ti = 60-75W
Codecs
P400 = 6th Gen NVENC (supports H.265/HEVC)
GTX 750 Ti = 4th Gen NVENC
Additional info
https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new
https://www.elpamsoft.com/?p=Plex-Hardware-Transcoding
from raspberry-pi-pcie-devices.
01:00.0 VGA compatible controller: NVIDIA Corporation GM107 [GeForce GTX 750 Ti] (rev a2) (prog-if 00 [VGA controller])
Subsystem: eVga.com. Corp. GM107 [GeForce GTX 750 Ti]
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 0
Region 0: Memory at 618000000 (32-bit, non-prefetchable) [disabled] [size=16M]
Region 1: Memory at 600000000 (64-bit, prefetchable) [disabled] [size=256M]
Region 3: Memory at 610000000 (64-bit, prefetchable) [disabled] [size=32M]
Region 5: I/O ports at <unassigned> [disabled]
[virtual] Expansion ROM at 619000000 [disabled] [size=512K]
Capabilities: [60] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [78] Express (v2) Legacy Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR+, OBFF Via message
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
Status: NegoPending- InProgress-
Capabilities: [250 v1] Latency Tolerance Reporting
Max snoop latency: 0ns
Max no snoop latency: 0ns
Capabilities: [258 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
PortCommonModeRestoreTime=255us PortTPowerOnTime=10us
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
T_CommonMode=0us LTR1.2_Threshold=262144ns
L1SubCtl2: T_PwrOn=10us
Capabilities: [128 v1] Power Budgeting <?>
Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Capabilities: [900 v1] #19
01:00.1 Audio device: NVIDIA Corporation Device 0fbc (rev a1)
Subsystem: eVga.com. Corp. Device 3751
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin B routed to IRQ 0
Region 0: Memory at 619080000 (32-bit, non-prefetchable) [disabled] [size=16K]
Capabilities: [60] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [78] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR+, OBFF Via message
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
from raspberry-pi-pcie-devices.
[ 1.257396] brcm-pcie fd500000.pcie: host bridge /scb/pcie@7d500000 ranges:
[ 1.260080] brcm-pcie fd500000.pcie: No bus range found for /scb/pcie@7d500000, using [bus 00-ff]
[ 1.262848] brcm-pcie fd500000.pcie: MEM 0x0600000000..0x063fffffff -> 0x00c0000000
[ 1.265605] brcm-pcie fd500000.pcie: IB MEM 0x0000000000..0x00ffffffff -> 0x0200000000
[ 1.314966] brcm-pcie fd500000.pcie: link up, 5.0 GT/s PCIe x1 (SSC)
[ 1.317945] brcm-pcie fd500000.pcie: PCI host bridge to bus 0000:00
[ 1.320502] pci_bus 0000:00: root bus resource [bus 00-ff]
[ 1.323035] pci_bus 0000:00: root bus resource [mem 0x600000000-0x63fffffff] (bus address [0xc0000000-0xffffffff])
[ 1.325717] pci 0000:00:00.0: [14e4:2711] type 01 class 0x060400
[ 1.328558] pci 0000:00:00.0: PME# supported from D0 D3hot
[ 1.334679] pci 0000:00:00.0: bridge configuration invalid ([bus ff-ff]), reconfiguring
[ 1.337626] pci 0000:01:00.0: [10de:1380] type 00 class 0x030000
[ 1.340201] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x00ffffff]
[ 1.342879] pci 0000:01:00.0: reg 0x14: [mem 0x00000000-0x0fffffff 64bit pref]
[ 1.345534] pci 0000:01:00.0: reg 0x1c: [mem 0x00000000-0x01ffffff 64bit pref]
[ 1.348120] pci 0000:01:00.0: reg 0x24: [io 0x0000-0x007f]
[ 1.350701] pci 0000:01:00.0: reg 0x30: [mem 0x00000000-0x0007ffff pref]
[ 1.353566] pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x1 link at 0000:00:00.0 (capable of 126.016 Gb/s with 8.0 GT/s PCIe x16 link)
[ 1.356323] pci 0000:01:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[ 1.359054] pci 0000:01:00.1: [10de:0fbc] type 00 class 0x040300
[ 1.361747] pci 0000:01:00.1: reg 0x10: [mem 0x00000000-0x00003fff]
[ 1.368030] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[ 1.370696] pci 0000:00:00.0: BAR 9: assigned [mem 0x600000000-0x617ffffff 64bit pref]
[ 1.373239] pci 0000:00:00.0: BAR 8: assigned [mem 0x618000000-0x6197fffff]
[ 1.375813] pci 0000:01:00.0: BAR 1: assigned [mem 0x600000000-0x60fffffff 64bit pref]
[ 1.378416] pci 0000:01:00.0: BAR 3: assigned [mem 0x610000000-0x611ffffff 64bit pref]
[ 1.380948] pci 0000:01:00.0: BAR 0: assigned [mem 0x618000000-0x618ffffff]
[ 1.383428] pci 0000:01:00.0: BAR 6: assigned [mem 0x619000000-0x61907ffff pref]
[ 1.385888] pci 0000:01:00.1: BAR 0: assigned [mem 0x619080000-0x619083fff]
[ 1.388276] pci 0000:01:00.0: BAR 5: no space for [io size 0x0080]
[ 1.390562] pci 0000:01:00.0: BAR 5: failed to assign [io size 0x0080]
[ 1.392998] pci 0000:00:00.0: PCI bridge to [bus 01]
[ 1.395355] pci 0000:00:00.0: bridge window [mem 0x618000000-0x6197fffff]
[ 1.397746] pci 0000:00:00.0: bridge window [mem 0x600000000-0x617ffffff 64bit pref]
[ 1.400377] pci 0000:01:00.1: D0 power state depends on 0000:01:00.0
That new BAR space increase must've landed in the kernel I built, because I didn't have to manually tweak the BAR space anymore!
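To see how much of the window the card actually uses, the BAR assignments in the dmesg output above can be added up. This is a rough Python sketch; the regex and the `bar_sizes` helper are my own illustration, and the sample only includes the endpoint lines (feeding it the full log would also count the bridge's own BAR 8 and BAR 9 windows).

```python
import re

# Matches lines such as:
#   pci 0000:01:00.0: BAR 1: assigned [mem 0x600000000-0x60fffffff 64bit pref]
BAR_RE = re.compile(
    r"pci (?P<dev>\S+): BAR (?P<bar>\d+): assigned "
    r"\[mem 0x(?P<start>[0-9a-f]+)-0x(?P<end>[0-9a-f]+)"
)


def bar_sizes(dmesg_text: str) -> list:
    """Return (device, BAR number, size in bytes) for each assigned memory BAR."""
    sizes = []
    for m in BAR_RE.finditer(dmesg_text):
        size = int(m["end"], 16) - int(m["start"], 16) + 1
        sizes.append((m["dev"], int(m["bar"]), size))
    return sizes


SAMPLE = """\
pci 0000:01:00.0: BAR 1: assigned [mem 0x600000000-0x60fffffff 64bit pref]
pci 0000:01:00.0: BAR 3: assigned [mem 0x610000000-0x611ffffff 64bit pref]
pci 0000:01:00.0: BAR 0: assigned [mem 0x618000000-0x618ffffff]
pci 0000:01:00.0: BAR 6: assigned [mem 0x619000000-0x61907ffff pref]
pci 0000:01:00.1: BAR 0: assigned [mem 0x619080000-0x619083fff]
"""

# Outbound MEM window taken from the brcm-pcie "MEM 0x0600000000..0x063fffffff" line.
WINDOW = 0x63FFFFFFF - 0x600000000 + 1

for dev, bar, size in bar_sizes(SAMPLE):
    print(f"{dev} BAR {bar}: {size // 1024} KiB")
total = sum(size for _, _, size in bar_sizes(SAMPLE))
print(f"endpoint total: {total / 2**20:.2f} MiB of {WINDOW // 2**20} MiB window")
```

The largest single request is the 256 MiB prefetchable BAR 1, which fits comfortably in the 1 GiB window shown in the log. The only failed assignment is the I/O port BAR 5, since no I/O space is available.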
from raspberry-pi-pcie-devices.
Downloading the proprietary Nvidia AARCH64 Driver first: https://www.nvidia.com/en-us/drivers/unix/linux-aarch64-archive/
$ sudo ./NVIDIA-Linux-aarch64-460.27.04.run
ERROR: Unable to find the kernel source tree for the currently running kernel...
But it exists, inside /usr/src/linux-headers-5.10.1-v8+ (in my case). So I ran:
$ sudo ./NVIDIA-Linux-aarch64-460.27.04.run --kernel-source-path /usr/src/linux-headers-5.10.1-v8+
(Just noting that I had previously run the gist to compile kernel headers for 64-bit Pi OS as directed in this comment: #40 (comment)).
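If the installer can't find the kernel source tree, a quick check of the usual locations can save some guessing. The sketch below is only an illustration: it assumes the two conventional paths (/lib/modules/<release>/build and /usr/src/linux-headers-<release>), and the helper names are hypothetical.

```python
import platform
from pathlib import Path


def candidate_header_dirs(release: str, root: Path = Path("/")) -> list:
    """Conventional places where headers for a given kernel release may live."""
    return [
        root / "lib" / "modules" / release / "build",
        root / "usr" / "src" / f"linux-headers-{release}",
    ]


def find_kernel_headers(release: str, root: Path = Path("/")):
    """Return the first candidate directory that exists, or None."""
    for path in candidate_header_dirs(release, root):
        if path.is_dir():
            return path
    return None


if __name__ == "__main__":
    release = platform.release()  # same string that `uname -r` prints
    found = find_kernel_headers(release)
    if found is None:
        print(f"no headers found for {release}")
    else:
        # Print the installer invocation used above with the discovered path.
        print(f"sudo ./NVIDIA-Linux-aarch64-460.27.04.run --kernel-source-path {found}")
```

The `root` parameter exists only so the lookup can be tried against a scratch directory instead of the live filesystem.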
from raspberry-pi-pcie-devices.
Full log with the errors in the build: https://gist.github.com/geerlingguy/33539fd16a1b2ec7cabc6d86d0e75cd9
from raspberry-pi-pcie-devices.
Trying a cross-compile with the Nouveau driver (inside menuconfig, under Device Drivers > Graphics support).
After copying the files, I created /etc/modprobe.d/blacklist-nouveau.conf with the contents:
blacklist nouveau
And after reboot I'll see what happens when I try loading the module with sudo modprobe nouveau.
from raspberry-pi-pcie-devices.
$ dmesg --follow
[ 57.389199] pci 0000:00:00.0: enabling device (0000 -> 0002)
[ 57.389235] nouveau 0000:01:00.0: enabling device (0000 -> 0002)
[ 57.389369] nouveau 0000:01:00.0: NVIDIA GM107 (117000a2)
[ 57.651947] nouveau 0000:01:00.0: bios: version 82.07.55.00.29
[ 59.761566] nouveau 0000:01:00.0: fb: 2048 MiB GDDR5
[ 59.761591] nouveau 0000:01:00.0: bus: MMIO read of 00000000 FAULT at 3e6684 [ IBUS ]
...
And as with all the other video cards, the entire Pi just completely locks up at this point, unresponsive to input or to any remote commands. Even the little flashing cursor at the CLI prompt stops flashing. It's a complete system halt.
from raspberry-pi-pcie-devices.
Someone else with that exact same fault: https://bugzilla.kernel.org/show_bug.cgi?id=202731 — but I'm guessing it could be like with the Radeon 5350, where it's actually failing somewhere else.
from raspberry-pi-pcie-devices.
Also in https://bugs.launchpad.net/nouveau/+bug/1684123 (migrated to https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/-/issues/335).
from raspberry-pi-pcie-devices.
Trying now with a powered external riser. And a note:
- I tried the PCE164P-NO3 ver 888, and it resulted in a kernel panic (no boot). Tried 5x before giving up.
- I tried the PCE164P-NO6 ver 008S, and it resulted in kernel panics (no boot). Tried 3x.
- I tried the PCE164P-NO3 ver 006, and it resulted in kernel panics if I used the USB 3 cable it came with, but if I swapped in the beefier USB 3 cable that came with my ver 888 board, it booted.
Go figure. Cheap junk doesn't work wonderfully. And the 888 board somehow fried my 2.5G network adapter yesterday, wish I had that on video. Lots of smoke!
from raspberry-pi-pcie-devices.
With the ver 006 riser, after I run sudo modprobe nouveau, I get:
[ 172.199242] pci 0000:00:00.0: enabling device (0000 -> 0002)
[ 172.199279] nouveau 0000:01:00.0: enabling device (0000 -> 0002)
[ 172.199411] nouveau 0000:01:00.0: NVIDIA GM107 (117000a2)
[ 172.461834] nouveau 0000:01:00.0: bios: version 82.07.55.00.29
And then the system freezes completely. Watching with an external display on the console, I do see a full kernel panic. How can I get that output in text form? Here's what it looks like on the diminutive screen I have hooked up:
Edit: I was hopeful I could use the Ubuntu Linux Crash Dump Guide, but that package is not available on the Pi.
from raspberry-pi-pcie-devices.
Watching with an external display on the console, I do see a full kernel panic. How can I get that output in text form?
UART like https://www.amazon.co.uk/Serial-Converter-Adapter-Prolific-Windows-Black/dp/B08DKM6Q63/ref=sr_1_3 on GPIOs 14&15 + GND, and configured as a console (use raspi-config).
It does depend on how quickly the full kernel gets killed, as it takes a little while to dribble everything out at 115200 baud.
There is also a kernel config option NOUVEAU_DEBUG that can be set with menuconfig or similar. Crank it up before doing your cross-compile to get lots of debug out. That was the next step I was intending to do in my conversation on the nouveau kernel mailing list, but other things became the priority :-/
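To put a number on "a little while", here is a back-of-the-envelope Python sketch. It assumes 8N1 framing (one start bit, eight data bits, one stop bit, so 10 bits on the wire per byte), and the panic dump size is a made-up example, not a measurement.

```python
def uart_seconds(num_bytes: int, baud: int = 115200, bits_per_byte: int = 10) -> float:
    """Time to send num_bytes over a UART; 10 bits per byte assumes 8N1 framing."""
    return num_bytes * bits_per_byte / baud


# Hypothetical panic dump: 80 lines of roughly 70 characters each.
print(f"{uart_seconds(80 * 70):.2f} s")
```

At 115200 baud that works out to 11520 bytes per second, so a full register dump plus call trace needs a noticeable fraction of a second to drain, which is why output can be cut short if the kernel dies first.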
from raspberry-pi-pcie-devices.
@6by9 - Thanks; it seems like that's the long-term route to go, but seeing as I don't have one of those cables handy and have everything plugged in today, I was thinking maybe I could use netconsole...
I added the following in /boot/cmdline.txt:
debug [email protected]/,[email protected]/
And I made sure my Mac (142 / target) was running netcat on UDP port 6666:
nc -u -l 6666
The Pi would boot, but I don't see any output making its way to my Mac. I confirmed that if I ran nc -u 10.0.100.142 6666 on the Pi directly and typed in text, it made it across to my Mac.
Is netconsole not supported on Pi OS?
I'll also crank up NOUVEAU_DEBUG and see what that gets me.
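For reference, the kernel's netconsole documentation gives the parameter format as netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]. Below is a small Python sketch that assembles such a string. The source IP and interface name in the example are placeholders I made up (only the 10.0.100.142 target comes from the comment above), and this does not explain why the output never arrived in this particular setup.

```python
def netconsole_param(tgt_ip: str, tgt_port: int = 6666, src_ip: str = "",
                     src_port: int = 6665, dev: str = "", tgt_mac: str = "") -> str:
    """Build a netconsole= kernel command-line argument.

    Empty fields are left blank so the kernel falls back to its defaults
    (for example a broadcast MAC address when tgt_mac is omitted).
    """
    return f"netconsole={src_port}@{src_ip}/{dev},{tgt_port}@{tgt_ip}/{tgt_mac}"


# Placeholder source address and interface; substitute the Pi's real values.
print(netconsole_param("10.0.100.142", src_ip="10.0.100.50", dev="eth0"))
```

Two things worth checking if nothing shows up: the running kernel needs netconsole support (CONFIG_NETCONSOLE), and the network interface has to be up by the time netconsole initialises.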
from raspberry-pi-pcie-devices.
Set debug levels to max:
Pushing the code over to the Pi now...
from raspberry-pi-pcie-devices.
Here's the entire dump of NOUVEAU output: https://gist.github.com/geerlingguy/7a021f3fecf198bf7020b85244e772ee
But I suspect it just got cut off at the kernel panic with more output queued up in a buffer like the Radeon did when we were debugging it. It seems like it was in the middle of the devinit process, as the last main section was:
[ 182.675218] nouveau 0000:01:00.0: sw: preinit running...
[ 182.675223] nouveau 0000:01:00.0: sw: preinit completed in 0us
[ 182.675233] nouveau 0000:01:00.0: devinit: running init tables
...
Each time I try, it gets to a different line before the output is cut off.
@6by9 - I ordered AdaFruit's serial cable, from Amazon instead of direct, just because Amazon ships tomorrow :D
from raspberry-pi-pcie-devices.
Received the ADAFRUIT Industries 954 USB-to-TTL Serial Cable today from Amazon (less than 18 hours after ordering it!).
- Install the Silicon Labs driver for macOS.
- Connect the cable to the outside GPIO pins: black to ground (3rd pin), white to UART0_TXD (4th pin), green to UART0_RXD (5th pin).
- Connect the USB end to your Mac.
- Add enable_uart=1 to the bottom of /boot/config.txt, and reboot.
- Check serial ports on the Mac:
$ ls /dev/cu.*
/dev/cu.Bluetooth-Incoming-Port /dev/cu.SLAB_USBtoUART /dev/cu.usbserial-0001
- Connect using screen: screen /dev/cu.SLAB_USBtoUART 115200
- Start working on the Pi like you would over SSH or via keyboard locally.
- (Hit CTRL+a and then 'k' to kill the session, or 'd' to just detach but leave it running.)
In my case, I ran sudo modprobe nouveau, and this is the result:
[ 113.322557] SError Interrupt on CPU1, code 0xbf000002 -- SError
[ 113.322559] CPU: 1 PID: 598 Comm: modprobe Tainted: G C 5.10.2-v8+ #1
[ 113.322561] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[ 113.322563] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[ 113.322564] pc : init_rd32+0xf8/0x330 [nouveau]
[ 113.322566] lr : init_rd32+0x58/0x330 [nouveau]
[ 113.322567] sp : ffffffc011f8b350
[ 113.322568] x29: ffffffc011f8b350 x28: ffffff8042889000
[ 113.322573] x27: ffffff804a06c238 x26: ffffffc011f8b808
[ 113.322577] x25: 0000000000000002 x24: 0000000000021328
[ 113.322580] x23: 0000000000000008 x22: ffffff8045ed3800
[ 113.322583] x21: ffffff8044c89400 x20: ffffffc011f8b4c0
[ 113.322586] x19: 0000000000021328 x18: 0000000000000030
[ 113.322590] x17: 0000000000000000 x16: 0000000000000000
[ 113.322593] x15: ffffffffffffffff x14: 3078302026205d38
[ 113.322596] x13: 323331323078305b x12: ffffffc0112a2ff8
[ 113.322599] x11: 0000000000000003 x10: ffffffc01125a6b8
[ 113.322602] x9 : ffffffc0091fd018 x8 : 0000000000005c88
[ 113.322606] x7 : c0000000fffff3db x6 : ffffffc011f8af60
[ 113.322609] x5 : ffffff80fb7a58e0 x4 : 0000000000000000
[ 113.322612] x3 : 0000000000000000 x2 : 00000000deaddead
[ 113.322615] x1 : ffffffc015000000 x0 : ffffffc015021328
[ 113.322619] Kernel panic - not syncing: Asynchronous SError Interrupt
[ 113.322621] CPU: 1 PID: 598 Comm: modprobe Tainted: G C 5.10.2-v8+ #1
[ 113.322622] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[ 113.322623] Call trace:
[ 113.322625] dump_backtrace+0x0/0x1b8
[ 113.322626] show_stack+0x20/0x70
[ 113.322627] dump_stack+0xf0/0x158
[ 113.322629] panic+0x18c/0x38c
[ 113.322630] nmi_panic+0x6c/0xa0
[ 113.322631] arm64_serror_panic+0x7c/0x90
[ 113.322632] do_serror+0x38/0x98
[ 113.322634] el1_error+0x84/0x104
[ 113.322635] init_rd32+0xf8/0x330 [nouveau]
[ 113.322637] init_condition_met+0xc0/0x150 [nouveau]
[ 113.322638] init_condition+0x64/0xe0 [nouveau]
[ 113.322640] nvbios_exec+0x5c/0x120 [nouveau]
[ 113.322641] init_sub_direct+0x98/0x160 [nouveau]
[ 113.322642] nvbios_exec+0x5c/0x120 [nouveau]
[ 113.322644] nvbios_post+0xac/0x180 [nouveau]
[ 113.322645] nv04_devinit_post+0x1c/0x28 [nouveau]
[ 113.322647] nvkm_devinit_post+0x40/0xb8 [nouveau]
[ 113.322648] nvkm_device_init+0xd4/0x230 [nouveau]
[ 113.322649] nvkm_udevice_init+0x68/0xa0 [nouveau]
[ 113.322651] nvkm_object_init+0x64/0x198 [nouveau]
[ 113.322652] nvkm_ioctl_new+0x1a4/0x288 [nouveau]
[ 113.322653] nvkm_ioctl+0xd4/0x278 [nouveau]
[ 113.322655] nvkm_client_ioctl+0x18/0x28 [nouveau]
[ 113.322656] nvif_object_ioctl+0x5c/0x70 [nouveau]
[ 113.322658] nvif_object_ctor+0xcc/0x160 [nouveau]
[ 113.322659] nvif_device_ctor+0x30/0x78 [nouveau]
[ 113.322660] nouveau_cli_init+0x168/0x568 [nouveau]
[ 113.322662] nouveau_drm_device_init+0x88/0x898 [nouveau]
[ 113.322663] nouveau_drm_probe+0x15c/0x1f8 [nouveau]
[ 113.322665] pci_device_probe+0xc0/0x190
[ 113.322666] really_probe+0xec/0x3b8
[ 113.322667] driver_probe_device+0x60/0xc0
[ 113.322669] device_driver_attach+0x7c/0x88
[ 113.322670] __driver_attach+0x60/0xe8
[ 113.322671] bus_for_each_dev+0x7c/0xd0
[ 113.322672] driver_attach+0x2c/0x38
[ 113.322674] bus_add_driver+0x194/0x1f8
[ 113.322675] driver_register+0x6c/0x128
[ 113.322676] __pci_register_driver+0x4c/0x58
[ 113.322678] nouveau_drm_init+0x180/0x1000 [nouveau]
[ 113.322679] do_one_initcall+0x54/0x2c8
[ 113.322680] do_init_module+0x60/0x240
[ 113.322682] load_module+0x1f20/0x2160
[ 113.322683] __do_sys_finit_module+0xbc/0xf8
[ 113.322684] __arm64_sys_finit_module+0x28/0x38
[ 113.322686] el0_svc_common.constprop.2+0x9c/0x1a8
[ 113.322687] do_el0_svc+0x2c/0x98
[ 113.322688] el0_svc+0x20/0x30
[ 113.322690] el0_sync_handler+0x90/0xb8
[ 113.322691] el0_sync+0x158/0x180
[ 113.322709] SMP: stopping secondary CPUs
[ 113.322710] Kernel Offset: disabled
[ 113.322712] CPU features: 0x0240022,61002000
[ 113.322713] Memory Limit: none
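With a full trace captured over serial, it can help to pull out the first frame that belongs to a loadable module, since everything above it is the kernel's own panic handling. The following Python sketch does that for traces in the format above; the regex and the helper names are illustrative, and the sample is a shortened copy of the log.

```python
import re

# Matches "<timestamp>] func+0xOFF/0xSIZE [module]" with the module part optional.
FRAME_RE = re.compile(
    r"\]\s+(?P<func>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"
)


def parse_frames(log: str) -> list:
    """Return (function, offset, module or None) for each frame after 'Call trace:'."""
    frames = []
    in_trace = False
    for line in log.splitlines():
        if "Call trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        m = FRAME_RE.search(line)
        if m is None:
            break  # first non-frame line ends the trace
        frames.append((m["func"], int(m["off"], 16), m["module"]))
    return frames


def first_module_frame(frames: list):
    """Return the first frame that lives in a loadable module, or None."""
    for func, off, module in frames:
        if module is not None:
            return func, off, module
    return None


SAMPLE = """\
[ 113.322623] Call trace:
[ 113.322625] dump_backtrace+0x0/0x1b8
[ 113.322632] do_serror+0x38/0x98
[ 113.322634] el1_error+0x84/0x104
[ 113.322635] init_rd32+0xf8/0x330 [nouveau]
[ 113.322637] init_condition_met+0xc0/0x150 [nouveau]
[ 113.322709] SMP: stopping secondary CPUs
"""

print(first_module_frame(parse_frames(SAMPLE)))
```

One caveat: the panic is reported as an asynchronous SError, so the frame found this way shows where the CPU was when the error was delivered, which is not necessarily the exact access that caused it.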
from raspberry-pi-pcie-devices.
Digging around quite a bit, I found little clues in a few places:
- aarch64 Kernel Panic Asynchronous SError Interrupt on large file IO
- [aarch64] Kernel crash on v5.1-rc5, __arch_copy_from_user+0x1bc/0x240
- Kernel crash on large file operations
- Nouveau: kernel hang on Optimus+Intel+NVidia GeForce 1060m
Glancing especially at that last result, I tried setting nouveau.runpm=0 in /boot/cmdline.txt, and rebooted, but got the same result. Seems like init_rd32 is the final culprit?
static u32
init_rd32(struct nvbios_init *init, u32 reg)
{
	struct nvkm_device *device = init->subdev->device;
	reg = init_nvreg(init, reg);
	if (reg != ~0 && init_exec(init))
		return nvkm_rd32(device, reg);
	return 0x00000000;
}
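One detail in the register dump fits this function: x0 minus x1 equals the value held in x19 and x24. The Python sketch below just checks that arithmetic. Reading x1 as the base of the card's mapped register space and 0x21328 as the register being read is my interpretation, and the dump alone does not prove it.

```python
# Values copied from the register dump in the SError panic above.
X0 = 0xFFFFFFC015021328
X1 = 0xFFFFFFC015000000
X19 = 0x0000000000021328


def mmio_offset(address: int, base: int) -> int:
    """Offset of an address within a mapping that starts at base."""
    return address - base


offset = mmio_offset(X0, X1)
print(hex(offset))
print(offset == X19)
```

If that reading is right, the nvkm_rd32() call was reading register 0x21328 while the VBIOS init tables ran. Since the SError is asynchronous, an earlier access could still be the real trigger.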
from raspberry-pi-pcie-devices.
I might also try the proprietary driver once more for fun on the latest precompiled 64-bit kernel, since kernel headers are now available via package and don't have to be built from source.
from raspberry-pi-pcie-devices.
I might also try the proprietary driver once more for fun on the latest precompiled 64-bit kernel, since kernel headers are now available via package and don't have to be built from source.
Hey how is it going? I have a GT 730 from the same generation (last time I checked) and I'm hoping it will work
from raspberry-pi-pcie-devices.
Finding similar reports here: gnab/rtl8812au#92, and here: raspberrypi/linux#3222
Though it could be a variety of things, I'm really thinking the driver is doing something like memcopy() and it's breaking similar to how it broke on the RX 550, and on the Broadcom MegaRAID. The latter we got fixed/patched to work around the memory addressing limitations on 64-bit Pi OS and a PCIe bug (see: raspberrypi/linux#4158), though I'm not sure if the PCIe thing is the problem here.
from raspberry-pi-pcie-devices.
From @uruhakomachin on Twitter:
@geerlingguy I had some tries using Nvidia graphics cards with the official arm64 driver. I had the same/quite similar error as you got, RmInitAdapter failed! (0x25:0x54:1262).
So I checked the rm_init_adapter function, and it seemed that the driver tried to init the card with some probably x86 code. 0x400000 is the usual base address for 32-bit applications. Given other asm code here, this func seems to save some regs and then load/exec some init code from the graphics card.
Also, this kind of error has existed for a long time. We'll probably have to wait for NVIDIA to fix it...
For now, I think open source drivers have more chances to succeed
from raspberry-pi-pcie-devices.
Going to see what happens with nouveau on Pi 5.
The proprietary drivers weren't built for arm64 (only amd64) back when this card was supported...
from raspberry-pi-pcie-devices.
pi@pi5:~ $ dmesg | grep nouveau
[ 3.121503] nouveau 0000:01:00.0: enabling device (0000 -> 0002)
[ 3.121599] nouveau 0000:01:00.0: unknown chipset (ffffffff)
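Compare this with the earlier CM4 attempts, where the same probe printed "NVIDIA GM107 (117000a2)". A PCI read that gets no response from the device comes back as all ones, so ffffffff here most likely means the driver could not read the card's ID register at all. The Python sketch below decodes the two values. The field layout (chipset in bits 20 to 28, revision in the low byte) is an assumption inferred from the logs in this thread, where 0x117 matches GM107 and a2 matches the "rev a2" in the lspci output. The helper name is hypothetical.

```python
def decode_boot0(boot0: int):
    """Split the ID value nouveau prints at probe time into chipset and revision.

    Returns None for 0xffffffff, which is what a PCI read returns when the
    device does not respond.
    """
    if boot0 == 0xFFFFFFFF:
        return None
    return {
        "chipset": (boot0 >> 20) & 0x1FF,  # assumed layout: bits [28:20]
        "rev": boot0 & 0xFF,               # assumed layout: bits [7:0]
    }


print(decode_boot0(0x117000A2))   # value from the CM4 logs
print(decode_boot0(0xFFFFFFFF))   # value from the Pi 5 log
```

So on the Pi 5 the failure appears to happen even earlier than on the CM4: the card enumerates on the bus, but register reads through BAR 0 are not reaching it.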
from raspberry-pi-pcie-devices.