basejump_stl's Introduction

BaseJump Standard Template Library (STL) Repository

This library is a comprehensive hardware library for SystemVerilog that seeks to contain all of the commonly used HW primitives.

See the paper docs/BaseJump_STL_DAC_2018_Camera_Ready.pdf, which describes the design and usage.

Please also see the BSG SystemVerilog Style Guide which describes many of the conventions used in this library, including the variants of the valid/ready handshaking protocols.

Note: bsg_misc/bsg_defines.sv contains many macros used by BaseJump STL. Make sure it is in your include path.

Contents

  • bsg_misc

Lots of digital building blocks, like counters, reset timers, gray to binary coders, etc.

  • bsg_mem

Portable SRAM and RF interfaces.

  • bsg_dataflow

For standalone modules involved in data plumbing. E.g. two-element fifos, fifo-to-fifo transfer engines, sbox units, compare_and_swap, and array pack/unpack.

  • bsg_async

This is for asynchronous building blocks, like the bsg_async_fifo, synchronizers, and credit counters.

Note: for tapeouts, you will need to pay attention to the physical design and timing constraints for these components.

  • bsg_noc

Network on chip implementations

  • bsg_cache

Reusable Cache implementation

  • bsg_link

High speed off-chip communication link (over LVCMOS I/Os, can hit 1.2 Gbps per pin to FPGA).

Unidirectional off-chip high-speed source synchronous communication interface. (also used as FPGA bridge).

  • bsg_clk_gen

Open source portable clock generator (all-standard cell)

  • bsg_dmc

LPDDR1 DRAM controller and PHY. Requires advanced knowledge to tape out.

  • bsg_test

Data, clock, and reset generator for test benches.

  • testing

Mirrors the other directories, with tests.

  • hard

Mirrors other directories, contains replacement files for specific process technologies.

Contact

Email: [email protected]

basejump_stl's People

Contributors

akankshabaranwal, barnold3, ctorng, dalance, dpetrisko, drichmond, farzamgl, flaviens, gaozihou, hema0730, infinitymdm, leonardxiang, luzh, mrutt92, muwyse, mysoreanoop, robertcrist, rovinski, shawnless, songchun-li, sripathi-muralitharan, stdavids, tanglingshu, taylor-bsg, tommydcjung, vb000, vegaluisjose, xusine, yuan-mao, zaazad


basejump_stl's Issues

question about testing

I was reading the paper and noted the following ...

(9) Testing Suite. The STL shall have unit tests for each module.

Is this the testing suite?

... is the above assertion correct (for each module)?

[QUALITY] PPA regression

We should add a PPA regression to track PPA across synthesis tools, versions and pull requests.

bsg_cache: support 64 bit word length

Needed for BlackParrot and other 64 bit processors. Support looks to be mostly there, items I see:

  1. modify read/write mask
  2. modify test makefiles to accept data width as a param
  3. modify trace_gen.py script to output 64 bit traces

System task causes a warning in Vivado

$fatal( 1, "Error: you must admit this is a bad idea before you are allowed to use the bsg_dlatch module!" );

Don't know if it's worth addressing - or if it can be - but here's the warning:

WARNING: [Synth 8-1921] elaboration system task fatal violates IEEE 1800 syntax [/mnt/bsg/diskbits/dustinar/bsg_bladerunner/bsg_f1_621ec48/cl_manycore/build/src_post_encryption/bsg_dlatch.v:10]

1RW sync bitmask does not build in verilator

When you try to compile bsg_mem/bsg_mem_1rw_sync_mask_write_bit_synth.v with Verilator, you get this error:

%Error-BLKLOOPINIT: /home/petrisko/bitbucket/test/pre-alpha-release/basejump_stl/bsg_mem/bsg_mem_1rw_sync_mask_write_bit_synth.v:42: Unsupported: Delayed assignment to array inside for loops (non-delayed is ok - see docs)

Verilator does not like this construct

 always_ff @(posedge clk_i)
   if (v_i & w_i)
       for (i = 0; i < width_p; i=i+1)
         if (w_mask_i[i])
           mem[addr_i][i] <= data_i[i];

This would work functionally, but may not synthesize to the same logic, depending on compiler quality.

 always_ff @(posedge clk_i)
   if (v_i & w_i)
     mem[addr_i] <= (~w_mask_i & mem[addr_i]) | (w_mask_i & data_i);
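
As a quick sanity check on the claim that the two forms are functionally the same, here is a minimal behavioral sketch in Python. It models only the bit-level update of one memory word; the function names are made up for illustration and nothing here says anything about how either form synthesizes.

```python
def masked_write_per_bit(old, data, mask, width):
    """Model of the for-loop form: update each bit whose mask bit is set."""
    result = old
    for i in range(width):
        if (mask >> i) & 1:
            bit = (data >> i) & 1
            # Clear bit i, then set it to the corresponding data bit.
            result = (result & ~(1 << i)) | (bit << i)
    return result


def masked_write_bitwise(old, data, mask, width):
    """Model of the single-assignment form: (~mask & old) | (mask & data)."""
    all_ones = (1 << width) - 1
    # Python integers are unbounded, so trim the result to the word width.
    return ((~mask & old) | (mask & data)) & all_ones


def forms_agree(width):
    """Exhaustively compare both forms for every old/data/mask value."""
    values = range(1 << width)
    return all(
        masked_write_per_bit(o, d, m, width) == masked_write_bitwise(o, d, m, width)
        for o in values for d in values for m in values
    )
```

Calling `forms_agree(4)` compares all 4096 combinations for a 4-bit word.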

More documentation is needed

There are too many modules in the library. Sometimes a simple function can be implemented by two or more modules. I cannot find a doc to help me choose the proper module. It would be better to offer:

  1. A user manual to specify the interface and function of each module. It can also help the user understand the differences between similar modules.
  2. A design specification to explain the principle of each module. It may be useful for senior designers.

Trace-replay assembler

It would be nice to provide a clean API for hooking into trace-replay (rather than everyone rolling their own Python generators). I'm picturing a single-file Python library that you can call from your own higher-level test generator.
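
To make the request concrete, here is one possible shape for such a library, sketched in Python. Everything in it is an assumption for illustration: the class name `TraceAssembler`, the method names, the 4-bit opcode field, the opcode values, and the `opcode_payload` text format. The encoding would have to be checked against the trace-replay module actually in use.

```python
class TraceAssembler:
    """Builds a list of trace lines from high-level calls (hypothetical API)."""

    # Hypothetical opcode values, chosen only for this sketch.
    OPCODES = {"wait": 0, "send": 1, "recv": 2, "done": 3, "finish": 4}

    def __init__(self, payload_width):
        self.payload_width = payload_width
        self.lines = []

    def _emit(self, op, payload=0):
        # Reject payloads that do not fit in the configured width.
        if not 0 <= payload < (1 << self.payload_width):
            raise ValueError(f"payload {payload} does not fit in {self.payload_width} bits")
        self.lines.append(f"{self.OPCODES[op]:04b}_{payload:0{self.payload_width}b}")
        return self  # returning self allows calls to be chained

    def send(self, data):
        return self._emit("send", data)

    def recv(self, expected):
        return self._emit("recv", expected)

    def wait(self, cycles=1):
        for _ in range(cycles):
            self._emit("wait")
        return self

    def done(self):
        return self._emit("done")

    def finish(self):
        return self._emit("finish")

    def assemble(self):
        """Return the whole trace as text, one entry per line."""
        return "\n".join(self.lines) + "\n"
```

A higher-level generator could then write `TraceAssembler(8).send(0xA5).recv(0x5A).wait(2).done().assemble()` instead of formatting bit strings by hand.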

Additional modules for bsg_fpu

In order to support RV64F

  1. Floating point divide (most likely implemented as FP inversion and then multiplication)
  2. Floating point square root
  3. Floating point classify (decode types of NaNs and subnormal numbers)
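
For the classify item, a small software reference model can help when writing tests. The sketch below decodes an IEEE 754 binary32 bit pattern into the categories a classify unit has to distinguish. It is a behavioral reference only; the function names and the string labels are made up here and do not describe any bsg_fpu interface.

```python
import struct


def classify_f32(bits):
    """Classify a 32-bit IEEE 754 single-precision bit pattern."""
    sign = (bits >> 31) & 1
    exp = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF
    if exp == 0xFF:
        if frac == 0:
            return "neg_inf" if sign else "pos_inf"
        # The most significant fraction bit separates quiet from signaling NaNs.
        return "quiet_nan" if (frac >> 22) & 1 else "signaling_nan"
    if exp == 0:
        if frac == 0:
            return "neg_zero" if sign else "pos_zero"
        return "neg_subnormal" if sign else "pos_subnormal"
    return "neg_normal" if sign else "pos_normal"


def float_to_bits(x):
    """Return the binary32 encoding of a Python float as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]
```

For example, `classify_f32(float_to_bits(1.0))` gives `"pos_normal"`, and `classify_f32(0x00000001)` gives `"pos_subnormal"`.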

efficiency bsg_parallel_in_serial_out_dynamic

This module uses a full twofer FIFO to hold the packet even though data is draining out serially. Much like SIPO, this can be improved by using one-element FIFOs, except for the top flit, which is two-element. This way, it can accept a new word even if it is uncertain whether the last flit will be dequeued. The logic is a little more complicated because the location of the last flit changes.

unlikely but important async fifo gotcha

,.r_data_o(r_data_o[0+:(width_p - control_width_p)] )

Although the async FIFO ensures that the SRAM is never written in the same place as it is currently being read, in a synthesized SRAM it is theoretically possible for changes resulting from a write to propagate to the output. This can happen either because the FIFO is empty and a write is being done to where the read is currently pointing, or simply because the MUX that selects among register items is glitchy. Depending on synthesis results, async writes could therefore cause temporary glitches at inopportune times that violate the setup and hold times of the destination flops. A more stable solution would be to add a hardened data gate that prevents data changes from propagating when a read is not being performed.

Convert wormhole router to bsg_noc_links

Can we convert the wormhole routers from a ready-valid interface to bsg_ready_and_link_sif links? That would make it much easier to stitch together a mesh in the standard way. Preferably we can change this quickly before we make the off-chip network in BlackParrot, so that we don't have to redo work. Also, can we change local_x_cord_i to my_x_i for consistency?

Current

  (input clk_i
  ,input reset_i
  
  // Configuration
  ,input [x_cord_width_p-1:0] local_x_cord_i
  ,input [y_cord_width_p-1:0] local_y_cord_i
  
  // Input Traffics
  ,input [dirs_lp-1:0] valid_i // early
  ,input [dirs_lp-1:0][width_p-1:0] data_i 
  ,output [dirs_lp-1:0] ready_o // early
  
  // Output Traffics
  ,output [dirs_lp-1:0] valid_o // early
  ,output [dirs_lp-1:0][width_p-1:0] data_o 
  ,input [dirs_lp-1:0] ready_i // early

mesh_router

   (
    input clk_i
    , input reset_i

    , input  [dirs_lp-1:0][bsg_ready_and_link_sif_width_lp-1:0] link_i
    , output [dirs_lp-1:0][bsg_ready_and_link_sif_width_lp-1:0] link_o

    , input [x_cord_width_p-1:0] my_x_i
    , input [y_cord_width_p-1:0] my_y_i
    );

Request for structs as NOC links' data field

`define declare_bsg_then_ready_link_sif_s(in_data_width,in_struct_name)\
   typedef struct packed {                                                \
      logic       v;                                                      \
      logic       then_ready_rev;                                         \
      logic [in_data_width-1:0] data;                                     \
  } in_struct_name

In macros like the one above, data fields in NOC links are more often than not packets, and packets are almost always structs. It's hard to interpret the data section as is while debugging. Wouldn't it be better to take a "data type" argument instead of a data width?

I'm proposing we could replace in_data_width with something like data_type_s

`define declare_bsg_then_ready_link_sif_s(data_type_s,in_struct_name)\
   typedef struct packed {                                                \
      logic       v;                                                      \
      logic       then_ready_rev;                                         \
      data_type_s data;                                     \
  } in_struct_name

@proftaylor @tommydcjung @dpetrisko What do you think?

Unguarded delay statement in bsg_wormhole_router_generalized

%Warning-STMTDLY: /home/petrisko/bitbucket/pre-alpha-release/basejump_stl/bsg_noc/bsg_wormhole_router_generalized.v:45: Unsupported: Ignoring delay on this delayed statement.
%Warning-STMTDLY: Use "/* verilator lint_off STMTDLY */" and lint_on around source to disable this message.

I think we need `ifndef VERILATOR surrounding this. This will just get rid of the assertion in Verilator. Otherwise, the assertion may fire only in Verilator testbenches. Another solution would be to find a way to mask the assertion when it is not useful, without a delay (since delay is awful for many other reasons).

Should use bsg_cadenv to set up tests

Right now, if you run many of the tests in testing (testing/bsg_noc/bsg_wormhole_router_adapter_in), they expect you to have vcs on your PATH, along with license servers set up, etc.

We should have a public bsg_cadenv repo where users can set up the CAD variables needed and then Makefiles in BaseJump STL should use the correct variables and make no assumptions about environment.

This could be a shorter-term solution to #36, or possibly orthogonal. I don't know how license servers work with FuseSoC.

bsg_fifo_multiplexed

This would be a multiplexed FIFO implementation, which leverages a single hardened SRAM to store state. The idea is that on input, each element carries a tag identifying the channel it belongs to, and on output you can somehow pull data for whichever channel you want to access.

A lot of different implementations, TBD what the right interface is.
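
As one point in that design space, here is a minimal behavioral sketch in Python. It assumes the shared storage is statically partitioned, with a fixed number of entries per channel and per-channel read and write pointers. The class name, method names, and the static partitioning are all assumptions made for illustration; the issue leaves the real interface open.

```python
class MultiplexedFifo:
    """Several logical FIFOs sharing one storage array (behavioral model)."""

    def __init__(self, num_channels, els_per_channel):
        self.els = els_per_channel
        # A single flat array stands in for the one hardened SRAM.
        self.mem = [None] * (num_channels * els_per_channel)
        self.wptr = [0] * num_channels
        self.rptr = [0] * num_channels
        self.count = [0] * num_channels

    def ready(self, tag):
        """True if the channel selected by tag can accept another element."""
        return self.count[tag] < self.els

    def valid(self, tag):
        """True if the channel selected by tag has data to read."""
        return self.count[tag] > 0

    def enqueue(self, tag, data):
        if not self.ready(tag):
            return False
        self.mem[tag * self.els + self.wptr[tag]] = data
        self.wptr[tag] = (self.wptr[tag] + 1) % self.els
        self.count[tag] += 1
        return True

    def dequeue(self, tag):
        if not self.valid(tag):
            return None
        data = self.mem[tag * self.els + self.rptr[tag]]
        self.rptr[tag] = (self.rptr[tag] + 1) % self.els
        self.count[tag] -= 1
        return data
```

Static partitioning keeps the pointer logic simple but cannot lend unused space from an idle channel to a busy one; a shared free list would trade that simplicity for better utilization.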

Add 1R1W "small fifo"

Would use hardened 1R1W and a bsg_dff_en_bypass so that it can grab the read element out. Need to be careful about simultaneous read and write to same address of the 1R1W.

Improvement to bsg_mesh_stitch

Should be able to instantiate repeaters between nodes of the mesh, for long links.

proposed parameter:

parameter int repeater_depth_lp [max_y_dim_p*max_x_dim_p:0] = 
  '{1, 0, 0, 2, 3, 0}

Test for bsg_mem_banked_crossbar appears to be outdated/broken

Hi, I'm trying to evaluate the bsg_mem_banked_crossbar for a project I'm working on.
I can't seem to get the test to work correctly.
Using modelsim 2015, I only get

Error: Error at time: (time), no transaction in a cycle

I changed

bsg_mem_banked_crossbar #( .bank_size_p  (bank_size_lp)
                            ,.num_ports_p  (ports_lp)
                            ,.num_banks_p  (banks_lp)
                            ,.data_width_p (data_width_lp)
                           ) UUT
                           ( .clk_i   (clk)
                            ,.reset_i (reset)
                            ,.v_i     (test_input_v)
                            ,.w_i     (test_input_w)
                            ,.addr_i  (test_input_addr)
                            ,.data_i  (test_input_data)
                            ,.mask_i  (test_input_mask)
                            ,.yumi_o  (test_output_yumi)
                            ,.v_o     (test_output_v)
                            ,.data_o  (test_output_data)
                           );

to

bsg_mem_banked_crossbar #( .bank_size_p  (bank_size_lp)
                            ,.num_ports_p  (ports_lp)
                            ,.num_banks_p  (banks_lp)
                            ,.rr_lo_hi_p(0) 
                            ,.data_width_p (data_width_lp)
                           ) UUT
                           ( .clk_i   (clk)
                            ,.reset_i (reset)
                            ,.reverse_pr_i('0)
                            ,.v_i     (test_input_v)
                            ,.w_i     (test_input_w)
                            ,.addr_i  (test_input_addr)
                            ,.data_i  (test_input_data)
                            ,.mask_i  (test_input_mask)
                            ,.yumi_o  (test_output_yumi)
                            ,.v_o     (test_output_v)
                            ,.data_o  (test_output_data)
                           );

and messed around with the makefile to get vsim to stop complaining about missing files:

BSG_MISC_FILES = bsg_defines.v bsg_crossbar_o_by_i.v bsg_transpose.v bsg_cycle_counter.v bsg_encode_one_hot.v bsg_mux_one_hot.v bsg_round_robin_arb.v
to

BSG_MISC_FILES      =   bsg_defines.v bsg_crossbar_o_by_i.v bsg_transpose.v bsg_cycle_counter.v bsg_encode_one_hot.v bsg_mux_one_hot.v bsg_round_robin_arb.v

and

BSG_MEM_FILES = bsg_mem_1rw_sync.v bsg_mem_1rw_sync_mask_write_byte.v

to

BSG_MEM_FILES       =   bsg_mem_1rw_sync.v bsg_mem_1rw_sync_synth.v bsg_mem_1rw_sync_mask_write_byte.v bsg_mem_1rw_sync_mask_write_byte_synth.v

Dlatch fixes

  • Have input be clk_i and not en_i; we want these used in very specific cases (usually a negative clock dlatch, will grab and pass through the new value at the second half of the cycle).
  • Use always_latch and not always_ff
  • Add %m warning if it is instantiated; and i_know_this_is_a_bad_idea_p which disables the warning.

Unnamed generate blocks

I'm sure there are more, but these come up during BP compile

"/mnt/bsg/diskbits/petrisko/scratch/pre-alpha-release/basejump_stl/bsg_misc/bsg_mux_one_hot.v",
26
"/mnt/bsg/diskbits/petrisko/scratch/pre-alpha-release/basejump_stl/bsg_mem/bsg_mem_1rw_sync.v",
35
"/mnt/bsg/diskbits/petrisko/scratch/pre-alpha-release/basejump_stl/bsg_mem/bsg_mem_1rw_sync_mask_write_bit.v",
35
"/mnt/bsg/diskbits/petrisko/scratch/pre-alpha-release/basejump_stl/bsg_misc/bsg_priority_encode_one_hot_out.v",
24
"/mnt/bsg/diskbits/petrisko/scratch/pre-alpha-release/basejump_stl/bsg_misc/bsg_encode_one_hot.v",

RFC: Wrap $display statements in a DEBUG macro?

(x-post to BSG Manycore)
It seems unnecessary to have $display statements enabled for every simulation run. Printing information on every run can slow the simulator down, or clog the log with unnecessary information.

In BSG F1 we have a printing header. bsg_pr_debug statements are only enabled when the DEBUG macro is defined. This can be defined globally, or on a per-file basis.

Instead of putting raw $display statements in BaseJump code, it would be cool if we could wrap $display in a macro that is only enabled when DEBUG is defined. As in C/C++, we could enable this on a per-file basis (by putting `define DEBUG before `include "bsg_basejump_printing.vh") or globally when compiling the simulation executable.

bsg_wormhole_router_generalized is not Verilator compatible

Fix 1: (bare #1000 statement). Actual best practice for this is $assertoff ...delay... $asserton in the testbench, but this fix will work, too

`ifndef SYNTHESIS
`ifndef VERILATOR
    wire [dirs_lp-1:0][dirs_lp-1:0] matrix_out_in_transpose;

    bsg_transpose #(.width_p(dirs_lp),.els_p(dirs_lp)) tr (.i(routing_matrix_p[0])
                                                          ,.o(matrix_out_in_transpose)
                                                          );
    initial
      begin
        #1000;
        assert (routing_matrix_p[1] == matrix_out_in_transpose)
          else $error("inconsistent matrixes");
      end
`endif
`endif

But even with that, I get this error:

image
