
AMD OpenNIC Shell

This is one of the three components of the OpenNIC project. The other components are the OpenNIC driver (open-nic-driver) and the OpenNIC DPDK design (open-nic-dpdk).

OpenNIC shell delivers an FPGA-based NIC shell with 100Gbps Ethernet ports. The latest version is built with Vivado 2020.x, 2021.x or 2022.1. Currently, the supported boards include:

  • Xilinx Alveo U50
  • Xilinx Alveo U55N
  • Xilinx Alveo U55C
  • Xilinx Alveo U200
  • Xilinx Alveo U250
  • Xilinx Alveo U280
  • Xilinx Alveo U45N
Notes:
* In the Alveo U50 version only, Vivado may issue critical warnings about the power margin for MGTYAVtt with respect to the 10% margin on the 4A rail limit. The U50 open-nic-shell MGT current (~3.97A) is still slightly below the rail limit, but it falls outside the 10% margin described in the U50 board files. The U50 version has worked in a lab setting, but with minimal testing. This issue is considered low risk; please be aware of the condition and proceed only if that risk is acceptable.
* Vivado 2022.2 uses QDMA v5.0. The driver may need to be upgraded to support QDMA v5.0.
* Starting from OpenNIC 1.0, support for the Bittware SoC-250 is obsolete and no longer maintained.

The NIC shell consists of skeleton components, which implement the host and Ethernet interfaces, and two user logic boxes that wrap user RTL plugins. Its architecture is shown in the figure below.

-----  -----------------------------------------------
|   |  |            System Configuration             | 
|   |  -----------------------------------------------
|   |     |         |         |         |         |  AXI-lite 125MHz
|   |     V         V         V         V         V
|   |  -------   -------   -------   -------   -------
|   |  |     |   |     |   |     |   |     |   |     |
| P |  |  Q  |==>| Box |==>|  A  |==>| Box |==>|  C  |
| C |  |  D  |   |  @  |   |  D  |   |  @  |   |  M  |
| I |  |  M  |   | 250 |   |  A  |   | 322 |   |  A  |
| E |  |  A  |<==| MHz |<==|  P  |<==| MHz |<==|  C  |
|   |  |     | | |     | | |     | | |     | | |     |
-----  ------- | ------- | ------- | ------- | -------
               |         |         |         |
               -----------         -----------
             AXI4-stream 250MHz  AXI4-stream 322MHz

The shell skeleton has the following components.

  • QDMA subsystem. It includes the Xilinx QDMA IP and RTL logic that bridges the QDMA IP interface and the 250MHz user logic box. The interfaces between the QDMA subsystem and the 250MHz box use a variant of the AXI4-stream protocol, referred to below as the 250MHz AXI4-stream.
    • The U45N has two QDMA subsystems: one for the host CPU and another for the onboard Arm CPU, with two sets of AXI4-stream interfaces between the QDMA subsystems and the 250MHz box.
  • CMAC subsystem. It includes the Xilinx CMAC IP and some wrapper logic. OpenNIC shell supports either 1 or 2 CMAC ports. With 2 CMAC ports, there are two instances of the CMAC subsystem, each with dedicated data and control interfaces. The CMAC subsystem runs at 322MHz and connects to the 322MHz user logic box using a variant of the AXI4-stream protocol, referred to below as the 322MHz AXI4-stream.
  • Packet adapter. It converts between the 250MHz AXI4-stream and the 322MHz AXI4-stream. On both the TX and RX paths, the packet adapter serves as a packet-mode FIFO, which buffers a whole packet before sending it out. On the RX path, it also restores the back-pressure capability that is missing from the CMAC subsystem interface.
  • System configuration. It implements the reset mechanism and allocates the register addresses for each component. The register interface uses the AXI4-lite protocol and runs at 125MHz, which is phase-aligned with the 250MHz clock.

There are 2 user logic boxes, one running at 250MHz and the other at 322MHz. Each has an AXI-lite interface for register access and 2 pairs of slave and master AXI4-stream interfaces, for TX and RX respectively. User RTL plugins are responsible for handling these interfaces.

Repo Structure

The open-nic-shell repository is organized as follows.

|-- open-nic-shell --
    |-- constr --
        |-- au50 --
        |-- au55c --
        |-- au55n --
        |-- au200 --
        |-- au250 --
        |-- au280 --
        |-- au45n --
        |-- ... --
    |-- plugin --
        |-- p2p --
    |-- script --
        |-- board_settings --
        |-- build.tcl
        |-- ...
    |-- src --
        |-- box_250mhz --
        |-- box_322mhz --
        |-- cmac_subsystem --
        |-- packet_adapter --
        |-- qdma_subsystem --
        |-- system_config --
        |-- utility --
        |-- open_nic_shell.sv
        |-- ...
    |-- LICENSE.txt
    |-- README.md
    |-- ...

Most of the directories are self-explanatory. The code under src contains the skeleton components and the "empty" boxes. Sample plugins are available under the plugin directory.

How to Build

OpenNIC shell is built by running the Tcl script build.tcl (under script) in Vivado. Depending on the target device, the build script generates the proper files for flash programming.

It is recommended to build the design with an Internet connection, as the build relies on updated Xilinx board files available through the Xilinx Board Store. The build script automatically updates the Vivado board repository from that source. See the section below for building without Internet/GitHub access.

Build Script Options

To start building the shell, run the following command under script with a proper MODE choice (i.e., tcl, batch or gui).

vivado -mode MODE -source build.tcl -tclargs [-OPTION VALUE] ...

A list of options is available to configure the build process and customize the design parameters.
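
A hedged example invocation combining several of the options below (the board, tag and option values are illustrative; the command is echoed rather than executed so the sketch runs without a Vivado install):

```shell
# Assemble the tclargs for a hypothetical 2-port au250 build with bitstream
# generation and MCS post-processing enabled
args="-board au250 -tag demo -num_cmac_port 2 -impl 1 -post_impl 1"
# Echo the full command line instead of launching Vivado
echo vivado -mode batch -source build.tcl -tclargs $args
```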

# Build options

-board_repo  PATH
             path to local Xilinx board store repository for offline build.
             This option is used when Vivado is unable to connect to github
             and update the board repository.

-board       BOARD_NAME
             supported boards include:
             - au250
             - au280
             - au200
             - au55c
             - au55n
             - au50
             - au45n

-tag         DESIGN_TAG
             string to identify the build.  The tag, along with the board
             name, becomes part of the build directory name.

-overwrite   0 (default), 1
             indicate if the script should overwrite existing build results.

-jobs        [1, 32] (default to 8)
             number of jobs for synthesis and implementation.

-synth_ip    0, 1 (default)
             indicate if IPs are synthesized out-of-context after creation.

-impl        0 (default), 1
             indicate if the script runs through bitstream generation.  If
             set to 0, the script only creates the project and does not
             launch any run.

-post_impl   0 (default), 1
             indicate if the MCS file is generated after bitstream
             generation.

-user_plugin PATH
             path to the user plugin repository.

# Design parameters

-build_timestamp VALUE
                 VALUE should be an 8-digit hexadecimal value without prefix.
                 It serves as the timestamp to identify the build and is
                 written into the shell register 0x0.  If not specified, the
                 date and time of the build are recorded using the format
                 MMDDhhmm, where MM is the month, DD the day, hh the hour
                 and mm the minute.

-min_pkt_len     [64, 256] (default to 64)
                 minimum packet length.

-max_pkt_len     [256, 9600] (default to 1514)
                 maximum packet length.

-use_phys_func   0, 1 (default)
                 indicate if the QDMA H2C and C2H AXI4-stream interfaces are
                 included in the 250MHz user logic box.  A common scenario
                 for not using them is networking accelerators without DMA.
                 Regardless of the value of this option, the QDMA IP is
                 always present in the shell, since it also provides the
                 AXI-lite interfaces for register access.

-num_phys_func   [1, 4] (default to 1)
                 number of QDMA physical functions per QDMA subsystem.

-num_qdma        1 (default), 2
                 number of QDMA subsystems, subject to the board model.

-num_queue       [1, 2048] (default to 512)
                 number of QDMA queues.

-num_cmac_port   1 (default), 2
                 number of CMAC ports, subject to the board model.
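
The MMDDhhmm default used by -build_timestamp above matches the output of the date utility; a quick sketch:

```shell
# Reproduce the default build timestamp format: month, day, hour, minute,
# zero-padded to 8 decimal digits.  (Note: an explicitly supplied
# -build_timestamp value is interpreted as hexadecimal.)
ts=$(date +%m%d%H%M)
echo "$ts"
```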

Build Process

The build process involves four steps.

  1. IP creation and, optionally, out-of-context synthesis.
  2. Design project setup.
  3. Synthesis, implementation and bitstream generation.
  4. Post-processing.

By default, the script completes the first two steps, producing a Vivado project under the build directory. This can be customized using the -synth_ip, -impl and -post_impl options.

  • If -synth_ip is set to 0, the out-of-context IP synthesis is deferred.
  • If -impl is set to 1, the third step is performed.
  • If -post_impl is set to 1, the post-processing step is performed after bitstream generation.

The build directory is located under build and named [BOARD]_[TAG]. Under the build directory, there is a text file, DESIGN_PARAMETERS, which contains the parameters passed to the RTL top-level. All the IP files are stored under vivado_ip. The shell project is under open_nic_shell.
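
As a sketch, the layout described above can be reconstructed from the board and tag (the names here are examples, not defaults):

```shell
# Hypothetical board/tag pair; the build script names the directory [BOARD]_[TAG]
board=au250
tag=mybuild
build_dir="build/${board}_${tag}"
# Key artifacts under the build directory
echo "$build_dir/DESIGN_PARAMETERS"   # parameters passed to the RTL top-level
echo "$build_dir/vivado_ip"           # generated IP files
echo "$build_dir/open_nic_shell"      # the shell project
```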

The following Verilog macros are defined and made available to the RTL source code.

  • The __synthesis__ macro.
  • Board name, either __au250__, __au280__, __au200__, __au50__, __au55c__, __au55n__ or __au45n__.

Build without Github Access from Vivado

If Vivado does not have access to GitHub, you need a local copy of the Xilinx Board Store repository; pass its path to the build script via the -board_repo option.

User Plugin Integration

OpenNIC shell provides 2 user logic boxes for instantiating custom RTL logic, one running at 250MHz and the other at 322MHz. To build with custom plugins, pass the path of the plugin repository to the -user_plugin argument of the build script. Default plugins are available for both boxes under plugin/p2p/box_250mhz and plugin/p2p/box_322mhz respectively, which perform a simple port-to-port connection. If users are interested in only one box, they should instantiate the default plugin in the other.

Each box requires a top-level wrapper for all the user plugins instantiated in that box; in other words, only one module should be instantiated in each box. There are a few rules on how to structure the -user_plugin repository. At a minimum, it should look like the following.

|-- USER_PLUGIN_DIR --
    |-- box_250mhz --
        |-- user_plugin_250mhz_inst.vh
        |-- box_250mhz_address_map_inst.vh
        |-- box_250mhz_address_map.v
        |-- box_250mhz_axi_crossbar.tcl
    |-- box_322mhz --
        |-- user_plugin_322mhz_inst.vh
        |-- box_322mhz_address_map_inst.vh
        |-- box_322mhz_address_map.v
        |-- box_322mhz_axi_crossbar.tcl
    |-- ... --
    |-- build_box_250mhz.tcl
    |-- build_box_322mhz.tcl
    |-- ...

The files under box_250mhz and box_322mhz are glue code that connects the user plugins to OpenNIC shell.

  • user_plugin_XXXmhz_inst.vh is a Verilog header file that instantiates the top-level wrapper. It is included into src/box_XXXmhz/box_XXXmhz.sv.
  • box_XXXmhz_address_map_inst.vh is a Verilog header file that instantiates the box_XXXmhz_address_map module. It is included into src/box_XXXmhz/box_XXXmhz.sv.
  • box_XXXmhz_address_map.v implements the register address mapping in box_XXXmhz.
  • box_XXXmhz_axi_crossbar.tcl creates the AXI crossbar IP instantiated in box_XXXmhz_address_map.v.

Currently, these files need to be created and modified manually. In the next release, they will be auto-generated through a configuration file.

For each box, the build script performs the following steps.

  1. Source box_XXXmhz_axi_crossbar.tcl.
  2. Read box_XXXmhz_address_map.v.
  3. Add USER_PLUGIN_DIR/box_XXXmhz to the include_dirs property.
  4. Source build_box_XXXmhz.tcl.

To use the default plugin (i.e., plugin/p2p) in one of the boxes, remove the corresponding box_XXXmhz directory and build_box_XXXmhz.tcl.

When the design is configured with two QDMA subsystems, two sets of AXI4-stream interfaces are provided to the 250MHz box. The default p2p plugin for the 250MHz box has one ingress switch and one egress switch per QDMA physical function; for example, the U45N p2p plugin has a total of four AXI4-stream switches. The AXI4-stream switch control registers select the data path between the MAC and the two QDMA subsystems: the user can select one QDMA subsystem to be used exclusively for transmitting or receiving traffic. Please refer to the AXI4-Stream Switch Control Register section of PG085 for more details.

For example:

Data path: MAC <-> QDMA subsystem 0 (on the U45N, this is connected to the host CPU). To select QDMA subsystem 0:

# bdf is the PCIe bus ID of the first physical function of the QDMAs
bdf=<PCIe BDF> # e.g. d9
# Ingress port 0
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x100040 w*1 0
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x100000 w*1 0x2
# Egress port 0
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x1000c0 w*1 0x0
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x100080 w*1 0x2
# Ingress port 1
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x100140 w*1 0
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x100100 w*1 0x2
# Egress port 1
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x1001c0 w*1 0x0
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x100180 w*1 0x2

Data path: MAC <-> QDMA subsystem 1 (on the U45N, this is connected to the onboard Arm CPU). To select QDMA subsystem 1:

# bdf is the PCIe bus ID of the first physical function of the QDMAs
bdf=<PCIe BDF> # e.g. d9
# Ingress port 0
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x100040 w*1 0x80000000
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x100044 w*1 0x0
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x100000 w*1 0x2
# Egress port 0
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x1000c0 w*1 0x1
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x100080 w*1 0x2
# Ingress port 1
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x100140 w*1 0x80000000
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x100144 w*1 0x0
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x100100 w*1 0x2
# Egress port 1
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x1001c0 w*1 0x1
pcimem /sys/bus/pci/devices/0000\:${bdf}\:00.0/resource2 0x100180 w*1 0x2

Shell Interface

The shell and the user logic boxes communicate through 3 types of interfaces.

  • AXI-lite interface running at 125MHz for register access.
  • AXI4-stream interface running at either 250MHz or 322MHz for data path.
  • Synchronous reset interface running at 125MHz.

The 125MHz and 250MHz clock domains are phase-aligned, so signals can be sampled across these domains without clock domain crossing logic. Do note, however, that the different clock frequencies can lead to double sampling or missing samples.

The AXI-lite interface follows the standard AXI4-lite protocol without wstrb, awprot and arprot. At the system level, the address ranges for the 2 boxes are

  • 0x100000 - 0x1FFFFF for the 322MHz box, and
  • 0x200000 - 0x2FFFFF for the 250MHz box.
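
A register's absolute address is the box base plus its box-local offset; a minimal sketch (the offset is a made-up example, not a real shell register):

```shell
# 250MHz box base address from the map above
box_250mhz_base=$(( 0x200000 ))
# Hypothetical box-local register offset
reg_offset=$(( 0x1000 ))
printf '0x%06X\n' $(( box_250mhz_base + reg_offset ))
```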

The 250MHz and 322MHz AXI4-stream interfaces have slightly different semantics. The 250MHz interface has the following signals.

  • tvalid, 1 bit: same as standard AXI4-stream protocol.
  • tdata, 512 bits: data maps from lower to upper bytes.
  • tkeep, 64 bits: null bytes are only allowed when both tvalid and tlast are asserted and cannot be followed by other data bytes for the same packet. For example, a 96B packet has a tkeep value of 0xFFFFFFFFFFFFFFFF in the first beat, and 0x00000000FFFFFFFF in the second beat.
  • tlast, 1 bit: same as standard AXI4-stream protocol.
  • tuser_size, 16 bits: field to specify packet size. It contains the number of bytes in the packet and must remain valid and unchanged if tvalid is asserted.
  • tuser_src, 16 bits: source of the packet.
  • tuser_dst, 16 bits: destination of the packet.
  • tuser_user, N bits: side-band user data.
  • tready, 1 bit: same as standard AXI4-stream protocol.

For the tkeep signal, the interface's assumptions depend on the direction of the packet.

  • For packets exiting the shell, it is guaranteed that tkeep is consistent with tuser_size. This means that tkeep consists of all 1s when tvalid is asserted and tlast is de-asserted and shows a bit-mask for the valid bytes in the beat when both tvalid and tlast are asserted.
  • For packets entering the shell, tkeep can be optionally set to all 1s regardless of the value of tuser_size. This allows user plugins to drop tkeep in their implementation. If tkeep is not set to all 1s, it must be consistent with tuser_size.
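
The last-beat tkeep rule above can be sketched as a small calculation (this mirrors the 96B example given earlier; it is an illustration, not code from the repository):

```shell
# Compute the last-beat tkeep mask for a packet on the 512-bit (64-byte) bus
pkt_len=96
beat_bytes=64
last=$(( pkt_len % beat_bytes ))
[ "$last" -eq 0 ] && last=$beat_bytes
if [ "$last" -eq "$beat_bytes" ]; then
  mask=FFFFFFFFFFFFFFFF            # full final beat: all 64 bytes valid
else
  mask=$(printf '%016X' $(( (1 << last) - 1 )))
fi
echo "0x$mask"
```

For a 96B packet this yields 0x00000000FFFFFFFF, matching the second beat in the example above.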

tuser_src and tuser_dst have the following format. For packets exiting the shell, tuser_src is marked accordingly and tuser_dst is all 0s.

----------------------------------------------------
| 15                         6 | 5  4 | 3        0 |
|------------------------------|------|------------|
| P  P  P  P  P  P  P  P  P  P | R  R | F  F  F  F |
|------------------------------|------|------------|
|            MAC ports         | Rsvd |  PCIe PFs  |
----------------------------------------------------
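
Assuming the one-hot encoding implied by the figure (bits [15:6] for MAC ports, bits [3:0] for PCIe physical functions; the exact MAC-port bit positions are an assumption here), the field values can be computed as:

```shell
# One-hot tuser_src/tuser_dst values (bit positions assumed from the figure)
mac_port=0
pcie_pf=1
printf 'MAC port %d -> 0x%04X\n' "$mac_port" $(( 1 << (6 + mac_port) ))
printf 'PCIe PF  %d -> 0x%04X\n' "$pcie_pf" $(( 1 << pcie_pf ))
```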

The 322MHz interface is more restrictive due to requirements from the CMAC IP. It has the following signals.

  • tvalid, 1 bit: no de-assertion in the middle of a packet. In other words, once tvalid is asserted to indicate the start of packet, it must remain asserted until tlast.
  • tdata, 512 bits: same as in the 250MHz interface.
  • tkeep, 64 bits: same as in the 250MHz interface.
  • tlast, 1 bit: same as in the 250MHz interface.
  • tuser_err, 1 bit: indicates if the packet contains an error.
  • tready, 1 bit: same as in the 250MHz interface. However, for packets entering the shell, i.e., from the RX side of the CMAC IPs, the tready signal is not present and the master side assumes that tready is always asserted.

Behavioral simulation

Currently, cocotb (https://www.cocotb.org/) and ModelSim (https://eda.sw.siemens.com/en-US/) are used for simulation. cocotb allows writing testbenches in Python.

Building for simulation

Example command:

vivado -mode tcl -source ./build.tcl -tclargs \
  -board au280 \
  -num_cmac_port 2 -num_phys_func 2 \
  -sim 1 \
  -sim_lib_path $HOME/opt/xilinx_sim_libs/Vivado2021.2/compile_simlib \
  -sim_exec_path $HOME/opt/modelsim/modelsim-se_2020.1/modeltech/linux_x86_64 \
  -sim_top p2p_250mhz

  • -sim 1 builds simulation sources.
  • -sim_lib_path specifies the libraries used by Xilinx IPs.
  • If the directory pointed to by -sim_lib_path is empty, the build script will compile and install the simulation libraries. This is a one-time step and can take hours. The simulation libraries are global and can be shared across projects.
  • -sim_exec_path is the path to the ModelSim installation (i.e., the directory containing the vsim executable).

This command will create simulation sources in build/<board>_<tag>/open_nic_shell/open_nic_shell.sim/sim_1/behav/modelsim. These sources can be used to test any module instantiated within sim_top.

Running simulation

cd build/<board>_<tag>/open_nic_shell/open_nic_shell.sim/sim_1/behav/modelsim

# Symlink helper scripts
ln -s <path-to>/open-nic-shell/script/tb/* ./
# Example:
ln -s ../../../../../../../script/tb/* ./

# Symlink the test bench files
ln -s <path-to>/open-nic-shell/plugin/p2p/box_250mhz/tb/<module-to-test>/* ./
# Example:
ln -s ../../../../../../../plugin/p2p/box_250mhz/tb/* ./

# Compile the simulation sources using modelsim
./compile.sh  # (This script is autogenerated during building step by Vivado)

# Run the simulation
DUT=<module-to-test> GUI=0 DEBUG=0 ./run.sh  # run.sh is linked from the script/tb directory
# Example
DUT=p2p_250mhz_wrapper ./run.sh
## Set GUI=1 to see waveform

Note: in the above example, the top-level module specified to Vivado was p2p_250mhz, while the top module being simulated is p2p_250mhz_wrapper. This works because we also copied p2p_250mhz_wrapper.sv. In general, any module that is included in the top module specified to Vivado can be simulated with the exported sources.

How the simulation works

Simulation setup

OpenNIC shell uses Xilinx IPs, which cannot be simulated directly by third-party simulators; such simulators require simulation libraries compiled by Vivado.

For any project, Vivado can output simulation sources for a set of supported simulators. Vivado understands the dependencies in the project and creates sources for simulating a specified top-level module.

These sources are then compiled by the simulator (ModelSim in this case). The run file executes ModelSim and tells it to use cocotb (via the pli flag); cocotb finds the testbench file through the environment variables exported in run.sh. (See https://docs.cocotb.org/en/stable/custom_flows.html for a reference on how to tell ModelSim that cocotb is being used.)

Simulation execution

The Python testbench executes Python code until it reaches a point where it needs the Verilog design to be simulated; at that point ModelSim takes over, until some signals need to be set by the Python testbench, and the process continues back and forth.

Note: there is no fundamental restriction to the above tools. One can write a testbench in Verilog to simulate parts of OpenNIC shell using the Vivado simulator (https://docs.xilinx.com/r/en-US/ug900-vivado-logic-simulation).

Programming FPGA

After bitstream generation, the FPGA can be programmed in two ways.

  1. Program the device directly. The FPGA configuration is lost after a power cycle.
  2. Program the configuration memory and boot from it.

Because the shell bitstream contains a PCIe IP core, both approaches cause the loss of the PCIe link. This can be fatal depending on the model of the host motherboard. For example, on Dell servers, losing discovered PCIe links can lead to a forced reboot triggered by iDRAC.

To avoid this, use the Bash script script/setup_device.sh, which disables the reporting of fatal errors to the PCIe root complex before programming and triggers a PCIe link re-scan after programming. The script takes the device BDF (i.e., BB:DD.FF without the domain, which is usually 0000) as its single input, and should be run on the server hosting the FPGA card.
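
The BDF handling can be sketched as follows (the address is a made-up example):

```shell
# Full PCIe address as reported by, e.g., lspci -D (hypothetical value)
bdf="0000:d8:00.0"
# setup_device.sh expects BB:DD.FF without the domain, so strip the 0000: prefix
short=${bdf#0000:}
echo "$short"
```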

There are two limitations with the script. First, if an FPGA has not yet been programmed with a PCIe-enabled bitstream, it has no BDF address and the script does not work. Second, if the OpenNIC driver is already loaded, the kernel can hang after the script issues the link re-scan; this issue is planned to be addressed in a future driver release.

In both cases, the safest workaround is to use a different server to program the FPGA. For configuration memory programming, do a cold reboot to trigger the FPGA boot process.

For the "program the device directly" approach, consider using or referring to script/program_fpga.sh. Note that if this is the first time the FPGA is programmed with the bitstream, the server needs a warm reboot. No reboot is required for subsequent programming (unless a cold reboot is issued).

Known Issues

Server Boot Failure after FPGA Programming

A warm reboot is needed after loading the bitstream onto the FPGA, but this reboot can fail with the error message:

A PCIe link training failure is observed in Slot1 and the link is disabled.

For Dell servers, there is a temporary workaround discussed here. The trick is to issue a second warm reboot command via iDRAC while the system is rebooting and before PCIe endpoint detection. The hypothesis is that this gives the FPGA enough time to load its configuration. This has worked so far.

CMAC license

Some users have reported this error when trying to build the shell.

ERROR: [Common 17-69] Command failed: This design contains one or more cells
for which bitstream generation is not permitted:
`cmac_subsystem_inst/cmac_wrapper_inst/cmac_inst/inst/i_cmac_usplus_0_top
(<encrypted cellview>)`.  If a new IP Core license was added, in order for
the new license to be picked up, the current netlist needs to be updated by
resetting and re-generating the IP output products before bitstream
generation.

Since the CMAC is hardened in UltraScale+, cmac_usplus has a free license. To get the CMAC license, go to www.xilinx.com/getlicense. After logging in, click "Search Now" in the "Evaluation and No Charge Cores" box on the right side of the page. A popup appears with a "Search" box at the top left. Enter "100G" in the search box, and you will see "UltraScale+ Integrated 100G Ethernet No Charge License". Select it and click "Add". You will then find it in the "Certificate Based Licenses" table. Select it and click "Generate Node-Locked License". A screenshot can be found here.


Copyright Notice and Disclaimer

This file contains confidential and proprietary information of Xilinx, Inc. and is protected under U.S. and international copyright and other intellectual property laws.

DISCLAIMER

This disclaimer is not a license and does not grant any rights to the materials distributed herewith. Except as otherwise provided in a valid license issued to you by Xilinx, and to the maximum extent permitted by applicable law: (1) THESE MATERIALS ARE MADE AVAILABLE "AS IS" AND WITH ALL FAULTS, AND XILINX HEREBY DISCLAIMS ALL WARRANTIES AND CONDITIONS, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING BUT NOT LIMITED TO WARRANTIES OF MERCHANTABILITY, NONINFRINGEMENT, OR FITNESS FOR ANY PARTICULAR PURPOSE; and (2) Xilinx shall not be liable (whether in contract or tort, including negligence, or under any other theory of liability) for any loss or damage of any kind or nature related to, arising under or in connection with these materials, including for any direct, or any indirect, special, incidental, or consequential loss or damage (including loss of data, profits, goodwill, or any type of loss or damage suffered as a result of any action brought by a third party) even if such damage or loss was reasonably foreseeable or Xilinx had been advised of the possibility of the same.

CRITICAL APPLICATIONS

Xilinx products are not designed or intended to be fail-safe, or for use in any application requiring failsafe performance, such as life-support or safety devices or systems, Class III medical devices, nuclear facilities, applications related to the deployment of airbags, or any other applications that could lead to death, personal injury, or severe property or environmental damage (individually and collectively, "Critical Applications"). Customer assumes the sole risk and liability of any use of Xilinx products in Critical Applications, subject only to applicable laws and regulations governing limitations on product liability.

THIS COPYRIGHT NOTICE AND DISCLAIMER MUST BE RETAINED AS PART OF THIS FILE AT ALL TIMES.

Contributors

108anup, albert-llimos, cen256, cneely-amd, gbrebner, harisj-xlnx, hyunok-kim, laochonlam, luminousxlb, lyftfc, msbaz2013, vic0428, yanz-xlnx, zhiyisun


open-nic-shell's Issues

Can we use Vivado for behavioral simulation instead of the ModelSim simulations?

Hello, I want to check the signals/waveforms of the data.
When I directly use Vivado for simulation, I get the error "Failed to locate 'vsim' executable in the shell environment 'PATH' variable. Please source the settings script included with the installation and retry this operation again."
So my question is: can we use Vivado for behavioral simulation instead of the ModelSim simulations for OpenNIC?

Thank you so much in advance for your reply.

Best regards~

OpenNIC on U55N can't pass traffic

I used the command below to compile the project.

vivado -mode batch -source build.tcl -tclargs -board au55n -impl 1 -post_impl 1 -num_phys_func 2 -num_cmac_port 2

Then, I send traffic from both port 0 and port 1 from the host CPU side, but neither of them can send any packet out.
In the RX direction, only port 0 can receive traffic. Traffic sent to port 1 is also received by port 0 on the host CPU side instead of port 1.

Synthesis error board u45n

I encountered a synthesis error while building the OpenNIC shell for the Alveo U45N board. The error message is as follows:
ERROR: [Vivado 12-13638] Failed runs(s) : 'synth_1'
'synth_1' run failed with below errors.
ERROR: [Synth 8-524] part-select [23:0] out of range of prefix 'qdma_pcie_rxp' [F:/open-nic-shell-main/src/open_nic_shell.sv:489]
ERROR: [Synth 8-6156] failed synthesizing module 'open_nic_shell' [F:/open-nic-shell-main/src/open_nic_shell.sv:20]
I used the vivado 2024.1 version and this command: F:\xilinx\Vivado\2024.1\bin\vivado.bat -mode batch -source build.tcl -tclargs -board au45n -impl 1 -user_plugin ../plugin/p2p

Please add U55C support

I see you have added U55N support. Adding U55C is very straightforward from there. I have done it manually on my end to get my project done but I see it is started as a branch. I would start a branch/pull request but if someone already started this, please pull U55C changes in.

The link cannot be established on U200.

I built the project with the command vivado -mode batch -source build.tcl -tclargs -board_repo ../XilinxBoardStore-master/ -board au200 -jobs 32 -num_phys_func 2 -num_cmac_port 2, and the build succeeded. I programmed the device, inserted the driver, and connected the two ports directly with a QSFP28 DAC cable, but the link cannot be established. I then read the CMAC register STAT_RX_STATUS_REG with pcimem, and it shows:

[root@localhost pcimem]# ./pcimem /sys/bus/pci/devices/0000:05:00.0/resource2 0x8204 w
/sys/bus/pci/devices/0000:05:00.0/resource2 opened.
Target offset is 0x8204, page size is 4096
mmap(0, 4096, 0x3, 0x1, 3, 0x8204)
PCI Memory mapped to address 0x7fab473ce000.
**_0x8204: 0x000000C0_**

According to the CMAC manual, that means stat_rx_local_fault=1 and stat_rx_internal_local_fault=1.
I am sure that the DAC cable has no problem. How can I solve this problem? Thank you very much.

Timing closure problem with Alveo U200 and 2 interfaces

Hi!

I generated the project with command:
vivado -mode tcl -source build.tcl -tclargs -board_repo G:/Xilinx/Vivado/2020.2/data/boards/board_files -board au200 -num_phys_func 2 -num_cmac_port 2
(I added the num_phys_func argument because the included user plugin box needed it)

The project builds with Vivado 2020.2, but with timing closure errors on the PCIe clock:
(screenshot: opennic_timefail)

Project builds with 1 interface without problem.

L.

How to write HLS plugins for Box 250 MHz

Hi all,

Wishing you a good day.

Our project is to analyze incoming packet data from a telecom network: filter the packets, modify them, group some of them, and send them to the host for further processing.
We would like to write a plugin to handle the above work. As we are not familiar with Verilog/SystemVerilog, we would like to do it using Vitis HLS C/C++ or P4. Our cards are Alveo U50.
Please advise if you have any guidelines, examples, or recommendations on this!

Thanks & Best Regards,
Quang Nguyen

Is the definition of MAC port in "tuser_src" invalid?

I noticed that "s_axis_c2h_tuser_src" in the file "qdma_subsystem_function" is not used, even though it is described as an interface carrying the source MAC port and source PCIe PFs on page 10.

Is this a bug?

Similarly, I found that "m_axis_h2c_tuser_src" is only driven as follows: "assign m_axis_h2c_tuser_src = 16'h1 << FUNC_ID" (line 254). This means it only encodes PCIe PFs and never affects the MAC ports, which contradicts the description on page 10 of the document, where bits [15:6] are supposed to represent MAC ports.
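A quick model of the reported behavior, assuming the field layout described on page 10 (bits [5:0] for PCIe PFs, bits [15:6] for MAC ports; both masks are assumptions from that description):

```python
PF_MASK = 0x003F   # bits [5:0]: PCIe physical functions (assumed layout)
MAC_MASK = 0xFFC0  # bits [15:6]: MAC ports (assumed layout)

def h2c_tuser_src(func_id):
    # Mirrors the RTL quoted above: assign m_axis_h2c_tuser_src = 16'h1 << FUNC_ID
    return (1 << func_id) & 0xFFFF

# For any realistic FUNC_ID, the single set bit lands in the PF field only,
# so the MAC-port field is never driven -- the inconsistency being reported.
for func_id in range(4):
    v = h2c_tuser_src(func_id)
    assert v & PF_MASK != 0 and v & MAC_MASK == 0
```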

Can we use OpenCL to implement our IP kernel in the user plugins?

Thank you so much for your hard and valuable work.
We are interested in developing our IP kernel in the user plugins through OpenNIC on our U250 platform. Can we use OpenCL to implement the application in the user plugins? Also, where can we find specific solutions or examples for the OpenNIC user plugins?

I am looking forward to your comments and suggestions.
Thank you so much in advance!

Question about "cmac_usplus_1_au280.tcl"

Hi all,

Thanks for contributing this great OpenNIC project! I have one question about the GT_LANE configuration in cmac_usplus_1_au280.tcl:

    CONFIG.CMAC_CORE_SELECT {CMACE4_X0Y7}
    CONFIG.GT_GROUP_SELECT {X0Y44~X0Y47}
    CONFIG.LANE1_GT_LOC {X0Y40}
    CONFIG.LANE2_GT_LOC {X0Y41}
    CONFIG.LANE3_GT_LOC {X0Y42}
    CONFIG.LANE4_GT_LOC {X0Y43}

It seems that the LANEx_GT_LOC and GT_GROUP_SELECT settings are mismatched. Or is there a reason behind that?

I really appreciate your help!

~Vic

Unable to read from user box address map using pcimem

I am using the following command to read the first address of the user box @ 250MHz:
sudo ./pcimem /sys/bus/pci/devices/0000:3b:00.0/resource2 0x100000

I get the following error:

/sys/bus/pci/devices/0000:3b:00.0/resource2 opened.
Target offset is 0x100008, page size is 4096
mmap(0, 4096, 0x3, 0x1, 3, 0x100000)
Error at line 111, file pcimem.c (22) [Invalid argument]

From the mmap docs (https://linux.die.net/man/2/mmap), this error suggests "We don't like addr, length, or offset (e.g., they are too large, or not aligned on a page boundary)." Since the command sudo ./pcimem /sys/bus/pci/devices/0000:3b:00.0/resource2 0x10480 works fine, and since pcimem does page alignment, I am guessing the issue is that we are reading a large offset (0x100000 vs 0x10480). Is there a workaround to read the large offset?
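One quick sanity check: 0x100000 is already a multiple of the 4096-byte page size quoted in the output above, so page alignment alone should not make mmap fail with EINVAL; an offset that exceeds the size of the mapped BAR region is another condition that can. A minimal sketch of the base/offset split that pcimem-style tools perform (page size assumed to be 4096, per the output above):

```python
PAGE_SIZE = 4096  # assumed; matches the "page size is 4096" line above

def split_offset(target):
    """Split a target offset into a page-aligned mmap base and an in-page delta."""
    base = target & ~(PAGE_SIZE - 1)
    return base, target - base

# 0x100000 is already page-aligned, so alignment is not the problem here.
print(split_offset(0x100000))  # base 0x100000, delta 0
# The working offset 0x10480 maps the page at 0x10000 and reads at delta 0x480.
print(split_offset(0x10480))   # base 0x10000, delta 0x480
```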

Based on billfarrow/pcimem#13 (comment), I tried removing the onic kernel module in case it interferes with the mmap, but that did not help.

Additional details:

Note: in the 250MHz user box, I instantiated an axi_lite_register (independent-clock mode, with the register using the axis clock (250 MHz) and AXI-Lite using the 125MHz clock) that returns, at address offset 0, the value maintained by an axi_stream_size_counter (snooping on the tx slave axis interface). I additionally generated a reset signal in the axis clock domain (used for axi_lite_register) by updating the generic_reset module instance.

I can try running the pcimem command on the vanilla open-nic-shell, but it will likely give the same error, since the command also fails to read addresses mapped to the 322 MHz box, which uses the default p2p plugin.

sudo ./pcimem /sys/bus/pci/devices/0000:3b:00.0/resource2 0x200000
/sys/bus/pci/devices/0000:3b:00.0/resource2 opened.
Target offset is 0x200000, page size is 4096
mmap(0, 4096, 0x3, 0x1, 3, 0x200000)
Error at line 111, file pcimem.c (22) [Invalid argument]

For reference, reading 0x10480 gives output:

/sys/bus/pci/devices/0000:3b:00.0/resource2 opened.
Target offset is 0x10480, page size is 4096
mmap(0, 4096, 0x3, 0x1, 3, 0x10480)
PCI Memory mapped to address 0x7fd1268e9000.
0x10480: 0x0000A618

Based on my understanding, this address corresponds to the SLR1 temperature in the sysmon module, and 0xA618 translates to around 49 deg C (based on the transfer function in the sysmon docs).
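That conversion can be reproduced with the UltraScale+ SYSMON (SYSMONE4) internal-reference temperature transfer function. The constants below are the commonly cited internal-reference pair from the sysmon user guide; the exact values depend on the configured reference, so treat this as an approximation:

```python
def sysmon_temp_c(code):
    # SYSMONE4 internal-reference transfer function (assumed applicable here):
    # Temp(degC) = code * 507.5921310 / 2^16 - 279.42657680
    return code * 507.5921310 / 65536 - 279.42657680

temp = sysmon_temp_c(0xA618)
print(round(temp, 1))  # close to the ~49 deg C quoted above
```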

Issue building OpenNIC on U55C

Hi,
I was trying to build OpenNIC targeting the U55C board with Vivado 2022.1 and am seeing this strange issue. I was hoping to get both ports going with the DPDK driver, so I used the following build command with -num_phys_func 2 -num_cmac_port 2, as stated in the DPDK build suggestion.

../script$ vivado -mode batch -source build.tcl -tclargs -board au55c -tag build1 -synth_ip 1 -impl 1 -post_impl 1 -max_pkt_len 9000 -num_phys_func 2 -num_cmac_port 2

Am I sending in wrong parameters for a 2-port 100G OpenNIC pipeline?

WARNING: [Vivado 12-818] No files matched '/disk2/opennic/open-nic-shell-u55c/open-nic-shell/build/au55c_build1/vivado_ip/clk_wiz_50Mhz/clk_wiz_50Mhz.xci'
WARNING: [Vivado 12-818] No files matched '/disk2/opennic/open-nic-shell-u55c/open-nic-shell/build/au55c_build1/vivado_ip/clk_wiz_50Mhz/clk_wiz_50Mhz.xci'
expected floating-point number but got "Unable to get value from speedsfile for keyword MM"
ERROR: [IP_Flow 19-3476] Tcl error in create_gui procedure for IP 'clk_wiz_50Mhz'. expected floating-point number but got "Unable to get value from speedsfile for keyword MM"
ERROR: [IP_Flow 19-3428] Failed to create Customization object clk_wiz_50Mhz
CRITICAL WARNING: [IP_Flow 19-5622] Failed to create IP instance 'clk_wiz_50Mhz'. Failed to customize IP instance 'clk_wiz_50Mhz'. Failed to load customization data
ERROR: [Common 17-69] Command failed: Create IP failed with errors

    while executing
"source ${ip_tcl_dir}/${ip}.tcl"
    invoked from within
"if {[file exists "${ip_tcl_dir}/${ip}_${board}.tcl"]} {
            source ${ip_tcl_dir}/${ip}_${board}.tcl
        } elseif {[file exists "${ip_tcl_d..."
    ("foreach" body line 30)
    invoked from within
"foreach ip $ips {
        # Pre-save IP name and its build directory to a global dictionary
        dict append ip_dict $ip ${ip_build_dir}/${ip}

   ..."
    ("dict for" body line 17)
    invoked from within
"dict for {module module_dir} $module_dict {
    set ip_tcl_dir ${module_dir}/vivado_ip

    # Check the existence of "$ip_tcl_dir" and "${ip_tcl_dir}/..."
    (file "build.tcl" line 255)

U55N support?

I would love to see support for the C1100 (U55N) in this!

Other versions of Vivado supported?

Hi OpenNIC developers,

I just found out that OpenNIC only supports 2020.2. Is it possible to use OpenNIC with other versions, like 2020.1?

Thanks,
Lam

Cannot generate the bitstream on Alveo U250

When we tried to generate the open-nic-shell bitstream on an Alveo U250, we got the errors below:

[Common 17-69] Command failed: This design contains one or more cells for which bitstream generation is not permitted:
cmac_port[1].cmac_subsystem_inst/cmac_wrapper_inst/cmac_inst/inst/i_cmac_usplus_1_top (<encrypted cellview>)
cmac_port[0].cmac_subsystem_inst/cmac_wrapper_inst/cmac_inst/inst/i_cmac_usplus_0_top (<encrypted cellview>)
If a new IP Core license was added, in order for the new license to be picked up, the current netlist needs to be updated by resetting and re-generating the IP output products before bitstream generation.

Thank you so much in advance for your comments and a solution for this issue.

cmac_usplus_1 module problem.

Hello,

I installed an Alveo U50 card in each of two servers. The programming cable has been cross-connected.
My environment is below.

System Configuration
  OS Name              : Linux
  Release              : 5.15.0-91-generic
  Version              : #101~20.04.1-Ubuntu SMP Thu Nov 16 14:22:28 UTC 2023
  Machine              : x86_64
  CPU Cores            : 32
  Memory               : 128460 MB
  Distribution         : Ubuntu 20.04.6 LTS
  GLIBC                : 2.31
  Model                : X12DPi-N(T)6

XRT
  Version              : 2.13.466
  Branch               : 2022.1
  Hash                 : f5505e402c2ca1ffe45eb6d3a9399b23a0dc8776
  Hash Date            : 2022-04-14 17:43:11
  XOCL                 : unknown, unknown
  XCLMGMT              : unknown, unknown


Devices present
  0 devices found
BDF  :  Shell  Platform UUID  Device ID  Device Ready*

* Devices that are not ready will have reduced functionality when using XRT tools

I want to test the OpenNIC DPDK driver. This requires using two CMAC ports, so I used the command below:
vivado -mode tcl -source build.tcl -tclargs -board au50 -jobs 32 -synth_ip 1 -impl 1 -post_impl 1 -num_phys_func 2 -num_cmac_port 2
However, it cannot find the cmac_usplus_1 module. I think this is because the cmac_usplus_1_au50.tcl file is missing (https://github.com/Xilinx/open-nic-shell/tree/main/src/cmac_subsystem/vivado_ip).
I can generate a bitstream if I build with the -num_cmac_port 1 argument.
How can I resolve this issue?

The result is:
[screenshot omitted]

Thanks!

Requested IP 'xilinx.com:ip:cms_subsystem:4.0' cannot be created.

Hello, I am using Vivado 2020.2 to build for the au250 board. However, I hit an error that the IP "cms_subsystem" version 4.0 cannot be found anywhere. Can you please advise how to resolve this issue? Thank you.

ERROR: [Coretcl 2-1133] Requested IP 'xilinx.com:ip:cms_subsystem:4.0' cannot be created. The latest available version in the catalog is 'xilinx.com:ip:cms_subsystem:3.0'. If you do not wish to select a specific version please omit the version field from the command arguments, or use a wildcard in the VLNV.

Please add QSPI and CMS support

I have also added and tested these features in my local project, and I would open a branch/pull request, but I see someone has already started this branch as well. Please merge these features in if they are ready.

What do we do after programming the board with open nic shell ?

I am currently working on the SN1022, and I have programmed the board successfully. What is the next step? How do I connect my SmartNIC to the host and the Ethernet network and send data, both for the provided simple wire plugin and for other plugins I might generate?

U45N does not build

I cloned this repository from the current head: 8077751

From the script directory, I run this:

/m2ssd/Xilinx/Vivado/2022.1/bin/vivado -mode tcl -source build.tcl -tclargs -board au45n -overwrite 1 -jobs 12 -impl 1 -post_impl 1

After churning for a while, it fails with this:

ERROR: [Bitstream 40-47] File /m2ssd/tmp/open-nic-shell/build/au45n/open_nic_shell/open_nic_shell.runs/impl_1/open_nic_shell.bit does not exist.

Further digging shows the root cause in file ../build/au45n/open_nic_shell/open_nic_shell.runs/synth_1/runme.log:

ERROR: [Synth 8-524] part-select [23:0] out of range of prefix 'qdma_pcie_rxp' [/m2ssd/tmp/open-nic-shell/src/open_nic_shell.sv:489]

As you can see above, this is with Vivado 2022.1, one of the supported releases.

Alveo U200 not transmitting/receiving any packets

Hi!

I built the code without modification from the main repo with switches:
vivado -mode tcl -source build.tcl -tclargs -board au200 -num_phys_func 2 -num_cmac_port 2

Then I implemented the project with Vivado 2021.2 and uploaded the firmware to the card (there were no timing issues whatsoever).
The card was recognized after reboot (Ubuntu 20.04), and the onic driver loaded without errors (with RS_FEC_ENABLED=0).

Then I assigned IP addresses to the interfaces and put the card into loopback mode by writing '1' to 0x8090 and 0xC090.
I checked link status:
0x8200 = 0x00000000 (no Tx error)
0x8204 = 0x00000003 (Rx aligned OK) (also 0x820C and 0x8210 showed that all lanes are synced/locked: 0x000FFFFF)
(Also note, that link is not coming up, when I plug in 2 QSFP28 100GBASE-SR4 modules, and connect them together.)

Then I transmitted some packets and checked with "ifconfig" that they show up under TX packets, but RX packets reads '0'.

I can see that the packets reach the Tx packet adapter:
0xB000 = the amount of packets I transmitted
[CORRECTION: It always shows 0x21, no matter how many packets I transmit!
0xB008 = 0x11CE always, Tx dropped shows '0']
0xB020 = 0 (Rx packet adapter)

Also the CMAC "STAT_TX_TOTAL_PACKETS" register reads 0 (0x8500).

According to the manual, and what I see in the code, the box_322 and packet adapter AXI-Stream interfaces are connected by default, so what could cause this issue?
The 100G interfaces work using the CMAC_USPLUS example code, in both PHY and GT loopback modes.

Thanks,
L.
