---
date: 2025-04-15
---

# Resources for implementing reinforcement learning on FPGAs

:::{card} Background
I research energy-efficient reinforcement learning (RL). Here I have compiled implementation-related resources, e.g., frameworks, libraries, and readings.
:::

# Reading

- [Two introductory exercises](https://docs.google.com/document/d/1ChHiwLCDWpkwNDL77iRBc3D8tydXonJgaK2BvVYI3oE/edit) 
  1. Learning value functions 
  2. Neural network weight updates using gradient descent

  [Solutions](https://docs.google.com/document/d/19AIWZKPqRoKYul2Pj_rF7wnT_3zLpBTebSiR-1ogtYE/edit)
- [Chapter about RL](https://artint.info/3e/html/ArtInt3e.Ch13.html) from the book [Artificial Intelligence: Foundations of Computational Agents](https://artint.info)
  - free book from the University of British Columbia
  - useful if you want a more compact and different view than the [standard RL book by Sutton and Barto](http://incompleteideas.net/book/the-book.html)
- [an object-oriented approach to linear neural networks](https://d2l.ai/chapter_linear-regression/oo-design.html)
  - from an open source book about deep learning
  - does not implement neural networks from scratch, but uses many popular ML libraries
- [Deep Learning from Scratch](https://learning.oreilly.com/library/view/deep-learning-from/9781492041405/)
  - implements neural networks from scratch in Python
  - not free
  - I once prepared [slides for a short lecture](https://f.aydos.de/from-operations-to-neural-network.pdf) based on the book. Here is the [code](https://gitlab.com/goekce/from-operations-to-neural-network) used in the slides.
- [Neural Networks from Scratch](https://nnfs.io/)
  - similar to the previous book
  - not free
  - has many useful [animations](https://nnfs.io/neural_network_animations)
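
The two topics of the introductory exercises above, learning value functions and neural network weight updates using gradient descent, can be sketched in a few lines of plain Python. This is only a minimal illustration and is not taken from the exercises: the function names `td0_update` and `sgd_step`, the two-state example, and all numbers are hypothetical.

```python
# Hypothetical toy example; names and numbers are illustrative only.

def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Move V(state) by a step of size alpha toward the TD target r + gamma * V(next_state)."""
    target = reward + gamma * values[next_state]
    values[state] += alpha * (target - values[state])
    return values


def sgd_step(weights, bias, x, y, lr=0.1):
    """One gradient descent step for a linear neuron with loss 0.5 * (y_hat - y)**2."""
    y_hat = sum(w * xi for w, xi in zip(weights, x)) + bias
    error = y_hat - y  # derivative of the loss with respect to y_hat
    new_weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    new_bias = bias - lr * error
    return new_weights, new_bias


# One value update for state "A" after observing reward 0.5 and next state "B".
values = td0_update({"A": 0.0, "B": 1.0}, "A", reward=0.5, next_state="B")

# One weight update for a single training sample.
weights, bias = sgd_step([0.0, 0.0], 0.0, x=[1.0, 2.0], y=1.0)
print(values, weights, bias)
```

Both updates share the same shape: an estimate is nudged toward a target by a small step size, which is part of why value learning and neural network training combine naturally in deep RL.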

# Frameworks, libraries, tools

- [FINN+](https://github.com/eki-project/finn-plus) (based on [FINN](https://github.com/Xilinx/finn))
  - neural network inference on FPGAs
  - supports only a few neural network architectures
  - quantized neural networks
  - [FINN examples](https://github.com/Xilinx/finn-examples)
    - although the project is active, newer boards such as the PYNQ-Z2 or Alveo U50 are not tested
- [QONNX](https://github.com/fastmachinelearning/qonnx)
  - introduces quantized operators for ONNX
- [brevitas](https://github.com/Xilinx/brevitas)
  - quantized implementations of most PyTorch layers, e.g., `QuantConv1d`
  - enables low-precision arithmetic (8-bit, 4-bit, etc.), which reduces DSP usage and memory footprint
- [QKeras](https://github.com/google/qkeras)
  - quantized implementations of Keras layers
  - also provides activation functions, e.g., `smooth_sigmoid(x)`, `hard_sigmoid(x)`
  - includes an energy consumption estimator
- [Netron](https://github.com/lutzroeder/netron)
  - ONNX model visualization
- [rule4ml](https://github.com/IMPETUS-UdeS/rule4ml)
  - resource and latency estimation for ML on FPGA
- [hls4ml](https://github.com/fastmachinelearning/hls4ml)
  - converts neural network models to FPGA firmware
- [HLSFactory](https://github.com/sharc-lab/HLSFactory)
  - framework for running HLS on many configurations of a design and comparing the results
- [Vitis Libraries](https://github.com/Xilinx/Vitis_Libraries)
  - Vitis libraries for HLS
  - [docs](https://docs.amd.com/r/en-US/Vitis_Libraries/index.html)
- [Vitis AI](https://github.com/Xilinx/Vitis-AI)
  - more flexible AI inference than Brevitas and FINN
  - compiled code runs on a micro-coded DPU (deep learning processing unit)
  - [docs](https://xilinx.github.io/Vitis-AI/3.5/html/docs/workflow-system-integration)
- [Ramulator](https://github.com/CMU-SAFARI/ramulator2)
  - cycle-accurate RAM simulator including HBM
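
To make the low-precision arithmetic mentioned for the quantization libraries above more concrete, here is a minimal sketch of symmetric uniform quantization in plain Python. It does not use the API of Brevitas, QKeras, or QONNX; the functions `quantize` and `dequantize` and the example weights are hypothetical and only illustrate the general idea of mapping floats to narrow signed integers.

```python
# Generic illustration only; not the quantization scheme of any specific library.

def quantize(values, bits):
    """Symmetric uniform quantization of floats to signed integer codes."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


weights = [0.8, -0.35, 0.1, -1.0]
for bits in (8, 4):
    codes, scale = quantize(weights, bits)
    restored = dequantize(codes, scale)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print(f"{bits}-bit codes: {codes}, worst-case error: {worst:.4f}")
```

Fewer bits mean narrower multipliers and less memory per weight, at the price of a larger rounding error, which is bounded by half of `scale` in this scheme.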

# Neural network implementations

- [Neural network on an FPGA](https://github.com/erwanregy/Neural-Network-on-an-FPGA)
  - SystemVerilog
  - datatypes are fixed, probably no quantization possible
- [NeuralNetworkAccelerator](https://github.com/shayaanc4/NeuralNetworkAccelerator)
  - SystemVerilog
  - datatypes are fixed, probably no quantization possible
- [AccDNN](https://github.com/IBM/AccDNN)
  - AI model to Verilog
  - by IBM, but not maintained
- [SpinalHDL CNN accelerator](https://github.com/19801201/SpinalHDL_CNN_Accelerator)
  - common operators in CNN
  - configurable multiplier, etc.
  - no documentation
- [spatten](https://github.com/mit-han-lab/spatten)
  - sparse attention for LLMs, contains a hardware implementation
  - includes a [dot product implementation](https://github.com/mit-han-lab/spatten/blob/main/spatten_hardware/hardware/src/main/scala/spatten/DotProduct.scala)

# Math libraries

- [Chainsaw](https://github.com/Chainsaw-Team/Chainsaw)
  - hardware design library based on SpinalHDL
  - includes [a systolic array](https://github.com/Chainsaw-Team/Chainsaw/blob/master/src/main/scala/Chainsaw/examples/SystolicExample.scala)
- [SpinalHDL math library](https://github.com/tomverbeure/math)
  - floating point with user programmable exponent and mantissa
- [PiMAC](https://github.com/SteffenReith/PiMAC)
  - a pipelined multiplier in SpinalHDL 
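
As a rough illustration of what a floating-point format with a configurable exponent and mantissa width means numerically, the following sketch rounds a Python float to the nearest value of such a format. It does not model the SpinalHDL math library listed above. The function `round_to_format` is hypothetical and rests on several assumptions: an IEEE-like layout with the top exponent code reserved, round-to-nearest-even, no subnormals (small values are flushed to zero), and saturation instead of infinity.

```python
import math


def round_to_format(x, exp_bits, man_bits):
    """Round x to the nearest value of an assumed IEEE-like format.

    Assumptions: no subnormals (flush to zero), saturation at the largest
    finite value, round-to-nearest-even on the significand.
    """
    if x == 0.0:
        return 0.0
    bias = 2 ** (exp_bits - 1) - 1
    max_value = (2 - 2 ** -man_bits) * 2.0 ** bias
    min_normal = 2.0 ** (1 - bias)
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    if mag < min_normal:
        return sign * 0.0  # flush small values to zero
    _, e = math.frexp(mag)  # mag = m * 2**e with 0.5 <= m < 1
    step = 2.0 ** (e - 1 - man_bits)  # spacing of representable values near mag
    rounded = round(mag / step) * step  # Python's round() rounds ties to even
    return sign * min(rounded, max_value)


for exp_bits, man_bits in ((5, 10), (8, 7), (4, 3)):
    print(exp_bits, man_bits, round_to_format(3.14159, exp_bits, man_bits))
```

A sketch like this can help when choosing widths: the exponent width sets the dynamic range, and the mantissa width sets the relative precision.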

# Reinforcement learning implementations

- [HeteroRL](https://github.com/pgroupATusc/HeteroRL)
  - RL via DPC++ (SYCL) and Torch
- [Q learning acceleration](https://github.com/CatherineMeng/q-learning-accel-fpga)
  - three-stage pipeline
- [Proximal policy optimization (PPO) implementation](https://github.com/CatherineMeng/PPO_2CU)
  - not documented
- [A Toolkit for benchmarking FPGA-accelerated Reinforcement Learning](https://github.com/pgroupATusc/FGYM)
  - not maintained
- [Experiences with Vitis AI for deep RL](https://synergy.cs.vt.edu/pubs/papers/chaudhury-vitis-ai-hpec2024.pdf)
  - pruning increases performance
  - tried only on GPUs, not on FPGAs
- [Vitis HLS based reinforcement learning code](https://github.com/CatherineMeng/CPU_FPGA_RL)
  - no documentation
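
As background for the Q-learning accelerators listed above, here is a minimal software sketch of tabular Q-learning on a tiny chain environment. The environment, the function names, and the hyperparameters are hypothetical. The split of the update into read, compute, and write-back steps is only an illustrative assumption and is not taken from any of the linked repositories.

```python
import random

N_STATES = 5        # chain of states 0..4; reaching state 4 ends the episode
ACTIONS = (-1, +1)  # action index 0 moves left, index 1 moves right


def env_step(state, action_idx):
    """Deterministic toy environment: reward 1 only when the goal state is reached."""
    next_state = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_update(q, state, action_idx, reward, next_state, done, alpha=0.5, gamma=0.9):
    # Step 1: read the stored values.
    q_sa = q[state][action_idx]
    max_next = 0.0 if done else max(q[next_state])
    # Step 2: compute the new estimate from the TD target.
    new_q = q_sa + alpha * (reward + gamma * max_next - q_sa)
    # Step 3: write the result back to the table.
    q[state][action_idx] = new_q


def train(episodes=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Random behavior policy; Q-learning is off-policy, so it still
            # learns the values of the greedy policy.
            action_idx = rng.randrange(len(ACTIONS))
            next_state, reward, done = env_step(state, action_idx)
            q_update(q, state, action_idx, reward, next_state, done)
            state = next_state
    return q


q_table = train()
greedy_policy = [max(range(len(ACTIONS)), key=lambda i: q_table[s][i])
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

After enough episodes, the greedy policy should pick the "right" action in every non-terminal state. The table read, the arithmetic, and the table write are the operations a hardware implementation has to schedule for every update.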
