Comments (19)
Since you have two, you should try to see if RoCE can work. It would be awesome to have an RDMA-enabled Pi, and it would help remove the IRQ issue you had in the other videos.
If all of this works, then I'll be pleased to see some InfiniBand networking on it :)
from raspberry-pi-pcie-devices.
@albydnc - Heh, I'll see what I can do!
from raspberry-pi-pcie-devices.
@geerlingguy let me know if you need some help; I work on InfiniBand and RDMA.
from raspberry-pi-pcie-devices.
$ sudo lspci -vvvv
01:00.0 Ethernet controller: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s] (rev b0)
Subsystem: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s]
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 0
Region 0: Memory at 600800000 (64-bit, non-prefetchable) [disabled] [size=1M]
Region 2: Memory at 600000000 (64-bit, prefetchable) [disabled] [size=8M]
[virtual] Expansion ROM at 600900000 [disabled] [size=1M]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [48] Vital Product Data
Product Name: ConnectX-2 SFP+
Read-only fields:
[PN] Part number: MNPA19-XTR
[EC] Engineering changes: A2
[SN] Serial number: MT1148X12321
[V0] Vendor specific: PCIe Gen2 x8
[RV] Reserved: checksum good, 0 byte(s) reserved
Read/write fields:
[V1] Vendor specific: N/A
[YA] Asset tag: N/A
[RW] Read-write area: 105 byte(s) free
End
Capabilities: [9c] MSI-X: Enable- Count=128 Masked-
Vector table: BAR=0 offset=0007c000
PBA: BAR=0 offset=0007d000
Capabilities: [60] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #8, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s unlimited, L1 unlimited
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
ARICap: MFVC- ACS-, Next Function: 0
ARICtl: MFVC- ACS-, Function Group: 0
Capabilities: [148 v1] Device Serial Number 00-02-c9-03-00-53-00-fa
from raspberry-pi-pcie-devices.
$ dmesg
...
[ 1.217953] brcm-pcie fd500000.pcie: host bridge /scb/pcie@7d500000 ranges:
[ 1.217995] brcm-pcie fd500000.pcie: No bus range found for /scb/pcie@7d500000, using [bus 00-ff]
[ 1.218079] brcm-pcie fd500000.pcie: MEM 0x0600000000..0x063fffffff -> 0x00c0000000
[ 1.218178] brcm-pcie fd500000.pcie: IB MEM 0x0000000000..0x00ffffffff -> 0x0100000000
[ 1.282343] brcm-pcie fd500000.pcie: link up, 5.0 GT/s PCIe x1 (SSC)
[ 1.282710] brcm-pcie fd500000.pcie: PCI host bridge to bus 0000:00
[ 1.282742] pci_bus 0000:00: root bus resource [bus 00-ff]
[ 1.282770] pci_bus 0000:00: root bus resource [mem 0x600000000-0x63fffffff] (bus address [0xc0000000-0xffffffff])
[ 1.282866] pci 0000:00:00.0: [14e4:2711] type 01 class 0x060400
[ 1.283113] pci 0000:00:00.0: PME# supported from D0 D3hot
[ 1.286752] pci 0000:00:00.0: bridge configuration invalid ([bus ff-ff]), reconfiguring
[ 1.400521] pci 0000:01:00.0: [15b3:6750] type 00 class 0x020000
[ 1.400803] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x000fffff 64bit]
[ 1.400979] pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x007fffff 64bit pref]
[ 1.401254] pci 0000:01:00.0: reg 0x30: [mem 0x00000000-0x000fffff pref]
[ 1.402247] pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x1 link at 0000:00:00.0 (capable of 32.000 Gb/s with 5.0 GT/s PCIe x8 link)
[ 1.405844] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[ 1.405903] pci 0000:00:00.0: BAR 9: assigned [mem 0x600000000-0x6007fffff 64bit pref]
[ 1.405933] pci 0000:00:00.0: BAR 8: assigned [mem 0x600800000-0x6009fffff]
[ 1.405965] pci 0000:01:00.0: BAR 2: assigned [mem 0x600000000-0x6007fffff 64bit pref]
[ 1.406122] pci 0000:01:00.0: BAR 0: assigned [mem 0x600800000-0x6008fffff 64bit]
[ 1.406275] pci 0000:01:00.0: BAR 6: assigned [mem 0x600900000-0x6009fffff pref]
[ 1.406303] pci 0000:00:00.0: PCI bridge to [bus 01]
[ 1.406334] pci 0000:00:00.0: bridge window [mem 0x600800000-0x6009fffff]
[ 1.406363] pci 0000:00:00.0: bridge window [mem 0x600000000-0x6007fffff 64bit pref]
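For reference, the 4.000 Gb/s and 32.000 Gb/s figures in the log above are consistent with the 8b/10b line encoding used by PCIe 2.0, where 8 of every 10 transferred bits carry data. A minimal sketch of that arithmetic follows; pcie_gbps is a hypothetical helper name, and the formula only applies to Gen1/Gen2 links (Gen3 and later use a different encoding).

```shell
# pcie_gbps GT_PER_S LANES
# Hypothetical helper: usable bandwidth in Gb/s for a PCIe Gen1/Gen2 link.
# Assumes 8b/10b encoding, i.e. 8 data bits per 10 bits on the wire.
pcie_gbps() {
    awk -v g="$1" -v l="$2" 'BEGIN { printf "%.3f\n", g * l * 8 / 10 }'
}

pcie_gbps 5 1   # the Pi's x1 link at 5 GT/s
pcie_gbps 5 8   # the card's full x8 link at 5 GT/s
```

So even before any driver issues, the x1 link caps the card well below its 10 Gb/s line rate.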
from raspberry-pi-pcie-devices.
Trying this driver first: https://www.mellanox.com/products/ethernet-drivers/linux/mlnx_en
$ wget http://www.mellanox.com/downloads/ofed/MLNX_EN-5.1-1.0.4.0/mlnx-en-5.1-1.0.4.0-debian10.3-aarch64.tgz
$ tar xvf mlnx-en-5.1-1.0.4.0-debian10.3-aarch64.tgz
$ cd mlnx-en-5.1-1.0.4.0-debian10.3-aarch64/
$ sudo ./install
Error: The current mlnx-en is intended for debian10.3
How unfortunate :P
from raspberry-pi-pcie-devices.
Digging through the installer, I found --skip-distro-check as an available option.
$ sudo ./install --skip-distro-check
System has one or more unsupported device, see below.
MLNX_OFED / mlnx_en 5.1 and above supports only ConnectX-4 or newer devices.
This device could become unavailable which might result in loss of connectivity.
Use --skip-unsupported-devices-check to skip this check.
Aborting.
* 01:00.0 Ethernet controller [0200]: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s] [15b3:6750] (rev b0)
No support for older cards? What madness! Let's try with:
$ sudo ./install --skip-distro-check --skip-unsupported-devices-check
Now it's attempting to install extra stuff:
Checking SW Requirements...
One or more required packages for installing mlnx-en are missing.
/lib/modules/5.10.3-v8+/build/scripts is required for the Installation.
Attempting to install the following missing packages:
autotools-dev graphviz autoconf chrpath linux-headers-5.10.3-v8+ dpatch lsof dkms m4 automake quilt debhelper swig libltdl-dev
Failed command: apt-get install -y autotools-dev graphviz autoconf chrpath linux-headers-5.10.3-v8+ dpatch lsof dkms m4 automake quilt debhelper swig libltdl-dev
Why don't all these device manufacturers account for Raspberry Pi OS 64-bit beta?
Anyways, going to stop for now, and get back at it later. At least I have the card identified. It works through my 1x-to-16x adapter, but it wasn't showing up when I tried powering it through my external adapter...
I also received a Noctua fan PWM controller today. Nice for my ears to not run the 12V fan at maximum speed all day :D
from raspberry-pi-pcie-devices.
@albydnc - I may ask for some help figuring out a good test / benchmark for RDMA, as I know a lot of people may be interested in whether the Pi can support it.
from raspberry-pi-pcie-devices.
@geerlingguy you can use the default benchmarks available with the Mellanox driver: perftest.
You can also look at the source on GitHub. This is the optimal setup for testing network performance, since the benchmarks are written against the low-level C API for RDMA (InfiniBand verbs).
For a more general test, I suggest using MPI tests, so you can easily compare various technologies; you will see a drop in performance, but it shouldn't be significant.
from raspberry-pi-pcie-devices.
Trying this driver first: https://www.mellanox.com/products/ethernet-drivers/linux/mlnx_en
$ wget http://www.mellanox.com/downloads/ofed/MLNX_EN-5.1-1.0.4.0/mlnx-en-5.1-1.0.4.0-debian10.3-aarch64.tgz
$ tar xvf mlnx-en-5.1-1.0.4.0-debian10.3-aarch64.tgz
$ cd mlnx-en-5.1-1.0.4.0-debian10.3-aarch64/
$ sudo ./install
Error: The current mlnx-en is intended for debian10.3
How unfortunate :P
you lost me here :(
Why would the available debian10.0 driver have no chance of working?
from raspberry-pi-pcie-devices.
@mi-hol - It seems like that install script is a giant bash script that has a lot of points of entanglement where it's looking for exact strings in returned information. Pi OS, and especially Pi OS 64-bit beta, don't behave identically to Debian 10.3 / Debian 10.
The Ubuntu installer might have better success, but honestly, the drivers have a ton of warnings and checks and things that try to force you to use ConnectX-4 or later generation of cards... I'm thinking compiling in the kernel would be easier since it's not as preachy about making you buy the latest generation of card.
from raspberry-pi-pcie-devices.
So, ConnectX-2 cards, while still interesting, are not something you'll want to waste your time on.
Mellanox dropped driver support for them ages ago, and they lack the one really interesting feature of Mellanox NICs: RDMA.
You should be able to get a ConnectX-3 on eBay for cheap and get all the nice modern features.
If you want to try it, I'm willing to buy it for you @geerlingguy
from raspberry-pi-pcie-devices.
@albydnc - I figured as much... and I would gladly take you up on that offer! If you can DM me on Twitter, or email me (my email is on my website about page), I can sort out the details. And I'll happily plug your Twitter/name/whatever in an eventual video I make on 10G networking on the Pi (whether or not I can get the X3 working! I already have the ASUS card going).
from raspberry-pi-pcie-devices.
Jeff, just to (re-)pique your interest in the Mellanox cards, I have the dual NIC versions of the same venerable beasties:
$ lspci -nn | grep Mellanox
01:00.0 Ethernet controller [0200]: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s] [15b3:6750] (rev b0)
where CentOS 7 worked a treat but CentOS Stream dropped support, I discovered, when upgrading a couple of weeks ago. You should note that Linux, at least, uses the MLX4 driver for these parts.
In my case the issue was "as simple as" the drivers having GEN2 support #define'd out. I wrote some notes to self.
On my RPi4 running 64-bit there's barely support for any ethernet device:
$ ls /lib/modules/$(uname -r)/kernel/drivers/net/ethernet/
microchip qualcomm wiznet
but there are some 4.x kernels lying about (no idea why) which do have full MLX4 support. In particular you can grep out this particular card (using the PCI vendor and product IDs from lspci -nn above):
$ modinfo /lib/modules/4.19.0-16-arm64/kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko | grep -i 15b3 | grep -i 6750
alias: pci:v000015B3d00006750sv*sd*bc*sc*i*
So I'm going to guess that support is entirely feasible.
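The check above can be wrapped up so it works for any vendor/device pair reported by lspci -nn. This is only a sketch: pci_alias_prefix and module_claims_device are hypothetical helper names, and the simple prefix match assumes the module lists the exact vendor and device IDs (as mlx4_core does in the output above) rather than wildcards in those fields.

```shell
# pci_alias_prefix VENDOR DEVICE
# Hypothetical helper: print the leading part of the PCI modalias string
# for a vendor/device pair given in hex without the 0x prefix.
pci_alias_prefix() {
    printf 'pci:v%08Xd%08X\n' "0x$1" "0x$2"
}

# module_claims_device MODULE VENDOR DEVICE
# Hypothetical helper: succeeds if modinfo lists an alias starting with
# that prefix. MODULE may be a module name or a path to a .ko file.
module_claims_device() {
    modinfo "$1" 2>/dev/null | grep -qi "alias: *$(pci_alias_prefix "$2" "$3")"
}

# Example (not run here):
#   module_claims_device mlx4_core 15b3 6750 && echo "supported"
```

A match only shows that the module would bind to the card; it says nothing about whether the driver then works on the Pi's PCIe bus.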
Until last week I'd not compiled anything kernel-y before but I guess the process is similar on the RPi.
from raspberry-pi-pcie-devices.
@ianfitchet - Thanks! I'll keep that in mind next time I get back to this card. For now I'm switching my sights over to the ConnectX-3 I just got (see #143).
from raspberry-pi-pcie-devices.
Just tried with the same freshly compiled kernel I tested in #143 with a ConnectX-3 adapter, and I'm getting the exact same error:
[ 28.219483] mlx4_en: eth1: Link Up
[ 43.997574] ------------[ cut here ]------------
[ 43.997620] NETDEV WATCHDOG: eth1 (mlx4_core): transmit queue 0 timed out
[ 43.997703] WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:443 dev_watchdog+0x3a0/0x3a8
[ 43.997710] Modules linked in: bnep hci_uart btbcm bluetooth ecdh_generic ecc mlx4_en 8021q garp stp llc vc4 brcmfmac cec brcmutil drm_kms_helper v3d cfg80211 gpu_sched bcm2835_codec(C) rfkill bcm2835_v4l2(C) drm bcm2835_isp(C) v4l2_mem2mem bcm2835_mmal_vchiq(C) videobuf2_vmalloc videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common drm_panel_orientation_quirks raspberrypi_hwmon videodev mlx4_core vc_sm_cma(C) mc snd_bcm2835(C) i2c_brcmstb snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_timer snd syscopyarea rpivid_mem sysfillrect sysimgblt fb_sys_fops backlight uio_pdrv_genirq uio nvmem_rmem aes_neon_bs sha256_generic aes_neon_blk crypto_simd cryptd ip_tables x_tables ipv6
[ 43.998043] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G C 5.10.39-v8+ #1
[ 43.998050] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[ 43.998062] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
[ 43.998071] pc : dev_watchdog+0x3a0/0x3a8
[ 43.998078] lr : dev_watchdog+0x3a0/0x3a8
[ 43.998085] sp : ffffffc0115bbd10
[ 43.998092] x29: ffffffc0115bbd10 x28: ffffff804b0d3f40
[ 43.998108] x27: 0000000000000004 x26: 0000000000000140
[ 43.998124] x25: 00000000ffffffff x24: 0000000000000002
[ 43.998139] x23: ffffffc011286000 x22: ffffff804b0a03dc
[ 43.998154] x21: ffffff804b0a0000 x20: ffffff804b0a0480
[ 43.998168] x19: 0000000000000000 x18: 0000000000000000
[ 43.998183] x17: 0000000000000000 x16: 0000000000000000
[ 43.998198] x15: ffffffffffffffff x14: ffffffc011288948
[ 43.998213] x13: ffffffc01146ebd0 x12: ffffffc011315430
[ 43.998227] x11: 0000000000000003 x10: ffffffc0112fd3f0
[ 43.998242] x9 : ffffffc0100e5358 x8 : 0000000000017fe8
[ 43.998256] x7 : c0000000ffffefff x6 : 0000000000000003
[ 43.998270] x5 : 0000000000000000 x4 : 0000000000000000
[ 43.998285] x3 : 0000000000000103 x2 : 0000000000000102
[ 43.998299] x1 : 730045c0bcfb7500 x0 : 0000000000000000
[ 43.998314] Call trace:
[ 43.998324] dev_watchdog+0x3a0/0x3a8
[ 43.998339] call_timer_fn+0x38/0x200
[ 43.998349] run_timer_softirq+0x298/0x548
[ 43.998358] __do_softirq+0x1a8/0x510
[ 43.998369] irq_exit+0xe8/0x108
[ 43.998378] __handle_domain_irq+0xa0/0x110
[ 43.998386] gic_handle_irq+0xb0/0xf0
[ 43.998393] el1_irq+0xc8/0x180
[ 43.998407] arch_cpu_idle+0x18/0x28
[ 43.998416] default_idle_call+0x58/0x1d4
[ 43.998427] do_idle+0x25c/0x270
[ 43.998437] cpu_startup_entry+0x30/0x70
[ 43.998448] secondary_start_kernel+0x170/0x180
[ 43.998456] ---[ end trace 257c7cb4ef196f12 ]---
[ 43.998490] mlx4_en: eth1: TX timeout on queue: 0, QP: 0x208, CQ: 0x84, Cons: 0xffffffff, Prod: 0x1
[ 44.046185] mlx4_en: eth1: Steering Mode 1
[ 44.052169] mlx4_en: eth1: Link Down
[ 46.301966] mlx4_en: eth1: Link Up
[ 61.917527] mlx4_en: eth1: TX timeout on queue: 2, QP: 0x20a, CQ: 0x86, Cons: 0xffffffff, Prod: 0x1
[ 61.949949] mlx4_en: eth1: Steering Mode 1
[ 61.970419] mlx4_en: eth1: Link Down
[ 64.379433] mlx4_en: eth1: Link Up
The lights flash, things seem to work, but it keeps re-connecting :(
$ ip a
...
4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:02:c9:4e:e2:fa brd ff:ff:ff:ff:ff:ff
inet 169.254.135.78/16 brd 169.254.255.255 scope global noprefixroute eth1
valid_lft forever preferred_lft forever
inet6 fe80::25c8:7bfd:2254:dad4/64 scope link
valid_lft forever preferred_lft forever
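To see whether the timeouts cluster on particular queues over a longer run, the "TX timeout on queue" lines from the kernel log can be tallied. A rough sketch, assuming the mlx4_en message format shown in the excerpt above; tx_timeouts_per_queue is a hypothetical helper name.

```shell
# tx_timeouts_per_queue
# Hypothetical helper: read kernel log text on stdin and print how many
# mlx4_en "TX timeout on queue: N" messages were seen for each queue N.
tx_timeouts_per_queue() {
    grep -o 'TX timeout on queue: [0-9]*' |
        awk '{ n[$NF]++ } END { for (q in n) print "queue " q ": " n[q] }' |
        sort
}

# Example (not run here; reading the kernel log may require root):
#   dmesg | tx_timeouts_per_queue
```

In the excerpt above this would report one timeout each on queues 0 and 2, which suggests the problem is not tied to a single queue.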
from raspberry-pi-pcie-devices.
Marking this as done... can't find any way to get the thing working, unfortunately.
from raspberry-pi-pcie-devices.
(I have since confirmed these cards work fine in a few different PCs, though.)
from raspberry-pi-pcie-devices.
If you still have one of these cards around, turning off tx/rx flow control (pause frames) via ethtool -A DEVICE rx off tx off may work; I've had to do that for ConnectX-2 cards on some PC installations that saw similar timeout and link down/up behavior. The card is also designed to work with multiple tx/rx queues to split traffic among CPUs. Maybe that's interfering with something as well, and you could try using ethtool -l DEVICE and ethtool -L DEVICE rx RXCHNUM tx TXCHNUM to tweak the channel count.
Incidentally, I've also had these cards silently fail when I try to use an MTU larger than 4032 on Ethernet, but IDK if you've done anything IRT that.
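Put together, those suggestions might look like the sketch below. The interface name eth1 is taken from the earlier log, while the channel counts of 1 are placeholders rather than tested values. The run and apply_tweaks helpers are hypothetical, and by default the commands are only printed so they can be reviewed first.

```shell
# DEV and DRY_RUN are assumptions for illustration; override them as needed.
DEV="${DEV:-eth1}"
DRY_RUN="${DRY_RUN:-1}"

# run CMD ARGS...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

apply_tweaks() {
    run ethtool -A "$DEV" rx off tx off   # disable pause frames in both directions
    run ethtool -l "$DEV"                 # show current and maximum channel counts
    run ethtool -L "$DEV" rx 1 tx 1       # placeholder: one RX and one TX channel
}

# Example (not run here): sudo env DRY_RUN=0 DEV=eth1 sh -c '. ./tweaks.sh; apply_tweaks'
```

Whether the driver accepts separate rx and tx counts on this card is not confirmed in this thread, so the ethtool -l output is worth checking before changing anything.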
from raspberry-pi-pcie-devices.