Dissertation

Literature corpus for my MSc dissertation.

Author:

Subject

Many embedded systems are built around generic or dedicated processors. These processors provide hardware computation units of varying precision (e.g., 8-, 16-, 32- or 64-bit ALUs or FPUs). Some processing algorithms are designed to perform their calculations at a given precision. In some cases, however, computing at a lower precision speeds up those same calculations while keeping the accuracy sufficient for the desired functionality. This acceleration can bring several benefits (a short example follows the list):

  • A lower cost for the computing component: a less powerful component is generally cheaper,
  • Better energy efficiency: a less powerful component, or one running at a lower frequency/voltage, consumes less energy and dissipates less heat,
  • Better performance: reduced computing time makes it possible to work on larger datasets or to perform other calculations at constant cost.
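
As a concrete illustration of the trade-off, here is a minimal sketch (not taken from the report; plain C++ is chosen since the internship targets C/C++ code) comparing the same reduction computed in 32-bit and 64-bit floating point:

```cpp
// Minimal sketch: the same partial harmonic sum accumulated in float and
// in double, to show how much accuracy a lower precision actually costs.
#include <cstdio>
#include <cmath>

int main() {
    const int n = 10'000'000;
    float  sum_f = 0.0f;   // 32-bit accumulator (candidate precision)
    double sum_d = 0.0;    // 64-bit accumulator (reference)
    for (int i = 1; i <= n; ++i) {
        sum_f += 1.0f / static_cast<float>(i);
        sum_d += 1.0  / static_cast<double>(i);
    }
    // If this relative error is below the application's tolerance, the
    // 32-bit version is "sufficient": it halves memory traffic and fits
    // twice as many lanes per SIMD vector on most hardware.
    printf("float:  %.9f\n", sum_f);
    printf("double: %.17f\n", sum_d);
    printf("relative error: %e\n", std::fabs(sum_f - sum_d) / sum_d);
    return 0;
}
```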

The objective of the internship is to analyse a reference application, determine the parts of the code where precision can be reduced, and then implement the result on one or more hardware architectures to verify that the accuracy remains sufficient and that the performance gain is worthwhile. The trainee will have to become familiar with the existing numerical-precision analysis tools, then apply them to an algorithm in order to analyse, for each code portion, the loss of precision and whether that loss is acceptable. This step will be performed on sequential C or C++ code, so as to allow the use of tools that do not support parallelised code. This first stage will rely on LLVM or Clang.
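
One common way to quantify the loss of precision of a code portion is shadow execution: each value is carried in both the candidate precision and a high-precision reference, and their divergence is inspected after every portion. This is only an assumed illustration of the idea; the actual tools (e.g., LLVM/Clang-based instrumentation or Precimonious from the reading list) work differently. A minimal C++ sketch:

```cpp
// Minimal sketch (assumed technique, not the internship's actual tooling):
// shadow a float computation with a double twin to measure, per code
// portion, the divergence introduced by the reduced precision.
#include <cstdio>
#include <cmath>

// A value kept in both the candidate (float) and reference (double)
// precision, so the divergence can be read off at any point.
struct Shadowed {
    float  lo;
    double hi;
    double error() const { return std::fabs(static_cast<double>(lo) - hi); }
};

Shadowed add(Shadowed a, Shadowed b) { return { a.lo + b.lo, a.hi + b.hi }; }
Shadowed mul(Shadowed a, Shadowed b) { return { a.lo * b.lo, a.hi * b.hi }; }

int main() {
    Shadowed x{0.1f, 0.1};      // 0.1 is not exactly representable
    Shadowed acc{0.0f, 0.0};
    for (int i = 0; i < 1000; ++i)
        acc = add(acc, mul(x, x));   // the code portion under analysis
    // If the divergence stays below the application's tolerance, this
    // portion is a candidate for precision reduction.
    printf("accumulated divergence: %e\n", acc.error());
    return 0;
}
```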

In a second step, the application will be ported to an architecture featuring computing unit(s) of the desired precision (GPU or MPPA). Depending on the profiling and the results of the analysis, the application code will be modified to take advantage of the targeted computing units (i.e., exploiting an extended ISA with extra instructions that invoke hardware primitives). Finally, the overall architecture (processor + hardware primitives) will be built automatically, and the impact of a malicious alteration of the synthesis scripts will be illustrated, to motivate the need for cyber-protection when designing such an SoC.
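
To make the underlying idea concrete, here is a minimal host-side C++ sketch of mixed-precision iterative refinement, in the spirit of 1967-Moler and 2018-Haidar from the reading list below. It is an illustration under assumptions (a fixed 2x2 system, no GPU/MPPA intrinsics): the expensive solves run in float, and only the cheap residual is computed in double.

```cpp
// Minimal sketch of mixed-precision iterative refinement: solve A x = b
// with a low-precision solver, then correct the result using residuals
// computed in high precision.
#include <cstdio>
#include <cmath>

// Low-precision "solver": a fixed 2x2 system solved in float via Cramer.
static void solve_f(const float A[2][2], const float b[2], float x[2]) {
    float det = A[0][0] * A[1][1] - A[0][1] * A[1][0];
    x[0] = (b[0] * A[1][1] - b[1] * A[0][1]) / det;
    x[1] = (A[0][0] * b[1] - A[1][0] * b[0]) / det;
}

int main() {
    const double A[2][2]  = {{4.0, 1.0}, {1.0, 3.0}};
    const double b[2]     = {1.0, 2.0};
    const float  Af[2][2] = {{4.0f, 1.0f}, {1.0f, 3.0f}};

    // Initial solve entirely in low precision.
    float bf[2] = {static_cast<float>(b[0]), static_cast<float>(b[1])}, xf[2];
    solve_f(Af, bf, xf);
    double x[2] = {xf[0], xf[1]};

    for (int it = 0; it < 3; ++it) {
        // Residual r = b - A x: the only high-precision step.
        double r[2];
        for (int i = 0; i < 2; ++i)
            r[i] = b[i] - (A[i][0] * x[0] + A[i][1] * x[1]);
        // Correction solve A d = r, back in low precision.
        float rf[2] = {static_cast<float>(r[0]), static_cast<float>(r[1])}, d[2];
        solve_f(Af, rf, d);
        x[0] += d[0];
        x[1] += d[1];
        printf("iter %d: residual norm %e\n", it, std::hypot(r[0], r[1]));
    }
    printf("x = (%.15f, %.15f)\n", x[0], x[1]);  // exact: (1/11, 7/11)
    return 0;
}
```

The residual shrinks at each iteration even though the solver itself never leaves single precision, which is exactly the property that lets the bulk of the work run on reduced-precision hardware units.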

Structure of the literature review

Please look inside the research_report folder

Articles read and annotated

  • 1967-Moler: Iterative Refinement in Floating Point
  • 1989-Imel: Mixed-precision Floating Point Operations from a Single Instruction Opcode
  • 2000-Tong: Reducing Power by Optimizing the Necessary Precision/Range of Floating-Point Arithmetic
  • 2006-Moore: Cramming More Components onto Integrated Circuits
  • 2006-Strzodka: Pipelined mixed-precision Algorithms on FPGAs for Fast and Accurate PDE Solvers from Low-precision Components
  • 2007-Goddeke: Performance and Accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations
  • 2007-Yates: Fixed-point arithmetic: an introduction
  • 2008-Sun: High Performance Mixed-precision Linear Solver for FPGAs
  • 2009-Baboulin: Accelerating Scientific Computations with Mixed-Precision Algorithms
  • 2010-Clark: Solving lattice QCD systems of equations using mixed-precision solvers on GPUs
  • 2012-Chow: A Mixed-precision Monte-Carlo methodology for Reconfigurable Accelerator Systems
  • 2013-Darulova: Synthesis of fixed-point programs
  • 2013-LeGrand: SPFP: Speed without compromise - A Mixed-precision Model for GPU accelerated Molecular Dynamic Simulations
  • 2013-Rubio: Precimonious: Tuning Assistant for Floating-Point Precision
  • 2014-Horrowitz: Computing's Energy Problem (and what we can do about it)
  • 2014-XuanSang: From Smalltalk to Silicon: a methodology to turn Smalltalk code into FPGA
  • 2015-Nips: High-Performance Hardware for Machine Learning
  • 2016-Courbariaux: BinaryNet: Training deep neural networks with weights and activations constrained to +1 or -1
  • 2016-Hubara: Quantised neural networks: Training neural networks with low precision weights and activations
  • 2016-Park: FPGA based implementation of deep neural networks using on-chip memory only
  • 2016-Qiu: Going Deeper with Embedded FPGA Platform for Convolutional Neural Network
  • 2016-Zhao: F-CNN: An FPGA-based Framework for Training Convolutional Neural Networks
  • 2017-Liang: FP-BNN: Binarised neural network on FPGA
  • 2017-Micikevicius: Mixed-Precision Training
  • 2017-Umuroglu: FINN: A framework for fast, scalable binarised neural network inference
  • 2017-Xilinx: Reduce Power and Cost by Converting from Floating Point to Fixed Point
  • 2018-Abdelouahab: Accelerating CNN inference on FPGAs: A Survey
  • 2018-Blott: FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of quantised Neural Networks
  • 2018-Colangelo: Exploration of Low Numeric Precision Deep Learning Inference Using Intel FPGAs
  • 2018-Darulova: Sound Mixed-Precision Optimization with Rewriting
  • 2018-Haidar: Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed Up Mixed-Precision Iterative Refinement Solvers
  • 2018-Jia: Highly Scalable Deep Learning Training System With Mixed-Precision: Training ImageNet in Four Minutes
  • 2018-Joubert: Attacking the opioid epidemic: Determining the Epistatic and Pleiotropic Genetic Architectures for Chronic Pain and Opioid Addiction
  • 2018-Kurth: Exascale deep learning for climate analytics
  • 2018-LeGallo: Mixed-precision in-memory computing
  • 2018-Narang: Mixed-precision training
  • 2018-Rybalkin: FINN-L: Library Extensions and Design Trade-off Analysis for Variable Precision LSTM Networks on FPGAs
  • 2019-Ding: REQ-YOLO: A Resource-Aware, Efficient Quantisation Framework for Object Detection on FPGAs
  • 2019-Jahanshahi: TinyCNN: A Tiny Modular CNN Accelerator for Embedded FPGA
  • 2019-Wang: Deep neural network approximation for custom hardware: Where we've been, where we're going
  • 2019-Zhao: Automatic generation of multi-precision multi-arithmetic CNN accelerators for FPGAs
  • 2020-Bacchus: Accuracy, Training Time and Hardware Efficiency Trade-Offs for Quantized Neural Networks on FPGAs
  • 2020-Radu: Performance Aware Convolutional Neural Network Channel Pruning for Embedded GPUs

Important sites

Precision analysis:

Floating-point and fixed-point arithmetic:

Mixed-precision applications:

Deep-learning:

Planning and Deadlines

  • Research Report Draft - 16/03
  • Research Report Final - 08/04 3:30pm

Logs

15/02 Read and annotated 1989-Imel
16/02 Read and annotated 2006-Strzodka
17/02 Read 2009-Baboulin
19/02 Re-read 2006-Strzodka and read 2008-Sun.
21/02 Re-read 1989-Imel & 2009-Baboulin.
23/02 Corrections on 2006-Strzodka and 2008-Sun.
27/02 Read and annotated 2010-Clark.
28/02 Read and annotated 2013-LeGrand and 2018-Darulova.
29/02 Read and annotated 2018-LeGallo.
01/03 Found applications and uses of mixed-precision in diverse fields 2018-Joubert, 2018-Kurth.
02/03 Looking for material on fixed-point and floating-point arithmetic: the Computerphile series and the 2007-Yates document. Read and annotated 2013-Darulova.
06/03 Read and annotated 2014-XuanSang.
07/03 Learned about CNNs (Convolutional Neural Networks) through the CS231n Stanford lecture materials.
08/03 Read and annotated 2018-Colangelo.
09/03 Watched two talks on ML for FPGAs (Intel and Xilinx).
10/03 Read and annotated 2018-Jia and 2018-Narang ...
