Design and implementation of a pipelined Bfloat16 floating-point arithmetic unit in VHDL. The unit performs addition, subtraction, multiplication, division, and fused multiply-add/subtract operations. Bfloat16 is a 16-bit floating-point format developed at Google and currently used in its Tensor Processing Units (TPUs). Because it keeps the same 8-bit exponent as IEEE 754 single precision, Bfloat16 retains float32's dynamic range, which makes it useful for machine learning applications that work well with low-precision representations of data. This Bfloat16 unit will be used to add custom floating-point instructions to a RISC-V processor that can potentially serve as a hardware accelerator for machine learning applications. The design will also be tested on an FPGA and, if needed, modified to achieve optimal performance. Work is still in progress.
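To make the format concrete, here is a minimal Python sketch (independent of the VHDL design) showing how a Bfloat16 value relates to a float32: it is simply the top 16 bits of the float32 encoding (1 sign bit, 8 exponent bits, 7 mantissa bits). The truncation shown here corresponds to round-toward-zero; a hardware unit would typically implement round-to-nearest-even instead.

```python
import struct

def float_to_bfloat16_bits(x: float) -> int:
    """Truncate an IEEE-754 float32 to its top 16 bits (bfloat16).
    Bfloat16 keeps float32's sign bit and all 8 exponent bits but
    only 7 mantissa bits, so the dynamic range is preserved."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 16

def bfloat16_bits_to_float(b: int) -> float:
    """Reinterpret 16 bfloat16 bits as a float32 with a zero low half."""
    (x,) = struct.unpack(">f", struct.pack(">I", (b & 0xFFFF) << 16))
    return x

# 1.0 (float32: 0x3F800000) is exactly representable: bfloat16 0x3F80.
assert float_to_bfloat16_bits(1.0) == 0x3F80
assert bfloat16_bits_to_float(0x3F80) == 1.0
```

Note that values survive the round trip exactly only when their mantissa fits in 7 bits; everything else loses precision but not range, which is the trade-off the format is built around.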
Note: this project is still in its preliminary stages. For now, I have verified that the operations work correctly and produce the expected results. The next steps are to add more features (e.g. supporting different rounding modes and additional operations such as square root) and to optimize the design for faster execution.