Accelerator Functional Unit Developer’s Guide for Intel® FPGA Programmable Acceleration Card

ID 683129
Date 7/20/2020
Public

4.1. AFU Design Components

Figure 1. AFU High Level Block Diagram

A typical AFU design includes the following components:
  • RTL description of the algorithm or function being accelerated
  • RTL description that implements the base requirements OPAE places on AFUs (for example, the Device Feature Header (DFH) and the AFU ID in MMIO space). See the CCI-P Reference Manual for details on these requirements.
  • Supportive infrastructure
    • Logic to map AFU CSRs into MMIO space
    • Memory mastering logic
      • FPGA to host memory access
      • Local FPGA memory access
  • Debug and performance monitoring
    • Signal Tap with the Remote Debug feature
    • Performance monitoring and counters within the scope of the AFU
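
The CSR-to-MMIO mapping item above can be pictured with a small host-side software model. This is a sketch only, not RTL and not the OPAE API. The register offsets (DFH at 0x00, AFU ID low and high words at 0x08 and 0x10) and the DFH bit positions are assumptions based on the commonly documented CCI-P layout and should be checked against the CCI-P Reference Manual. The class name, the scratch register at 0x80, and the example AFU ID are hypothetical.

```python
import uuid

MMIO_SIZE = 256 * 1024      # 256 KB AFU MMIO space, as stated in this section
MASK64 = (1 << 64) - 1

# Assumed register offsets; verify against the CCI-P Reference Manual.
DFH_OFFSET = 0x00
AFU_ID_L_OFFSET = 0x08
AFU_ID_H_OFFSET = 0x10
SCRATCH_OFFSET = 0x80       # hypothetical read/write user CSR

# Assumed DFH fields: feature type in bits [63:60], end-of-list flag in bit 40.
DFH_TYPE_AFU = 0x1
DFH_TYPE_SHIFT = 60
DFH_EOL_BIT = 40


class AfuCsrModel:
    """Software stand-in for the AFU logic that decodes MMIO accesses to CSRs."""

    def __init__(self, afu_id):
        # Read-only registers that identify the AFU to software.
        self._ro = {
            DFH_OFFSET: (DFH_TYPE_AFU << DFH_TYPE_SHIFT) | (1 << DFH_EOL_BIT),
            AFU_ID_L_OFFSET: afu_id.int & MASK64,
            AFU_ID_H_OFFSET: afu_id.int >> 64,
        }
        # Read/write registers owned by the accelerated function.
        self._rw = {SCRATCH_OFFSET: 0}

    @staticmethod
    def _check(offset):
        # This model accepts only 64-bit aligned accesses inside the MMIO space.
        if offset % 8 != 0 or not (0 <= offset < MMIO_SIZE):
            raise ValueError(f"invalid MMIO offset: {offset:#x}")

    def read64(self, offset):
        self._check(offset)
        if offset in self._ro:
            return self._ro[offset]
        return self._rw.get(offset, 0)   # unmapped offsets read as zero here

    def write64(self, offset, value):
        self._check(offset)
        if offset in self._rw:
            self._rw[offset] = value & MASK64
        # Writes to read-only or unmapped offsets are dropped in this model.


# Hypothetical AFU ID, used only for illustration.
EXAMPLE_ID = uuid.UUID("12345678-1234-5678-1234-567812345678")
afu = AfuCsrModel(EXAMPLE_ID)
afu.write64(SCRATCH_OFFSET, 0xDEADBEEF)
```

In this model, software would identify the AFU by reading the two AFU ID words and reassembling the 128-bit value before touching any function-specific CSR.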

The interfaces that OPAE provides for host and local memory access are basic slave-access interfaces. The host can access only the AFU’s 256 KB MMIO space, so the AFU must implement DMA to move large workload data to and from host memory. The dma_afu sample AFU in the OPAE platform installation provides an example of moving data between host and local memory.
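
One way to picture this division of labor: the host passes small descriptors to the AFU through MMIO, and the AFU's DMA logic moves the bulk data. The sketch below splits a large transfer into chunks that such a descriptor could describe. The descriptor fields, the 64-byte alignment, and the 1 MB maximum chunk size are illustrative assumptions, not the format used by the dma_afu sample.

```python
from dataclasses import dataclass

ALIGNMENT = 64          # assumed transfer granularity in bytes
MAX_CHUNK = 1 << 20     # assumed per-descriptor limit (1 MB)


@dataclass(frozen=True)
class DmaDescriptor:
    """Hypothetical descriptor: one contiguous copy from src to dst."""
    src_addr: int
    dst_addr: int
    length: int


def build_descriptors(src_addr, dst_addr, length, max_chunk=MAX_CHUNK):
    """Split one large copy into descriptors of at most max_chunk bytes."""
    if length <= 0:
        raise ValueError("length must be positive")
    if max_chunk <= 0 or max_chunk % ALIGNMENT:
        raise ValueError("max_chunk must be a positive multiple of ALIGNMENT")
    for name, value in (("src_addr", src_addr),
                        ("dst_addr", dst_addr),
                        ("length", length)):
        if value % ALIGNMENT:
            raise ValueError(f"{name} must be {ALIGNMENT}-byte aligned")

    descriptors = []
    offset = 0
    while offset < length:
        chunk = min(max_chunk, length - offset)
        descriptors.append(
            DmaDescriptor(src_addr + offset, dst_addr + offset, chunk))
        offset += chunk
    return descriptors


# Example: a 2.5 MB copy from a hypothetical host address to local address 0.
TOTAL_BYTES = 5 * (1 << 19)
descs = build_descriptors(0x1000_0000, 0x0, TOTAL_BYTES)
```

With these assumed limits, the 2.5 MB example yields three descriptors: two of 1 MB and one of 512 KB.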

The FIM reports illegal accesses made on the CCI-P interface and provides performance monitoring capabilities that the host can access through the FME in the FIU. Any error handling and performance monitoring within the AFU itself must be implemented by the AFU developer.
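
As an illustration of AFU-scoped performance monitoring, the sketch below shows how host software might turn two raw counter snapshots into a throughput figure. The counter names, the 64-bit counter width, and the 200 MHz clock frequency are hypothetical; an actual AFU exposes whatever counters its designer chooses to implement.

```python
def counter_delta(start, end, width=64):
    """Difference between two snapshots of a free-running counter.

    The modulo handles a single wraparound of a counter that is
    `width` bits wide (width is an assumption of this sketch).
    """
    return (end - start) % (1 << width)


def throughput_bytes_per_sec(bytes_moved, cycles, clock_hz):
    """Average throughput over an interval measured in AFU clock cycles."""
    if cycles <= 0 or clock_hz <= 0:
        raise ValueError("cycles and clock_hz must be positive")
    return bytes_moved * clock_hz / cycles


# Hypothetical snapshots of an AFU cycle counter and a byte counter.
CLOCK_HZ = 200_000_000
cycles = counter_delta(start=1_000, end=2_001_000)
bytes_moved = counter_delta(start=0, end=64 * 1_000_000)
rate = throughput_bytes_per_sec(bytes_moved, cycles, CLOCK_HZ)
```

Taking deltas between snapshots, rather than reading absolute values, lets the host measure one workload at a time without resetting the counters.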

The FIM supports remote debug of the AFU through the FME, which connects to an OPAE tool that hosts the debug connection over TCP. The AFU designer must instrument the AFU with debug instances and nodes using tools such as Signal Tap. The nlb_mode_0_stp sample AFU in the OPAE platform installation provides an example of enabling an AFU for remote debug with Signal Tap over a TCP connection.